I tried running the tests with high concurrency. We can add TestSimpleRpcScheduler and TestFIFOCompactionPolicy to the flaky list.
On Tue, Feb 5, 2019 at 9:23 AM Andrew Purtell <[email protected]> wrote:
> Thanks. JIRAs for flaky tests welcome. Better if there are patches too. :-)
>
> > On Feb 5, 2019, at 1:25 AM, Xu Cang <[email protected]> wrote:
> >
> > Thanks to Peter's link. I checked my failed tests; they are all in this
> > flaky tests list. (Going to create some JIRAs for flaky tests if there
> > aren't any now)
> >
> > +1 for this release now.
> >
> > Best,
> > Xu
> >
> >> On Tue, Feb 5, 2019 at 12:36 AM Peter Somogyi <[email protected]> wrote:
> >>
> >> Just like Xu Cang I ran into similar test failures on Debian and many of
> >> these are on the flaky list for branch-1 with 100% flakiness.
> >>
> >> https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html
> >>
> >> These are the failed tests for me:
> >> regionserver.TestRegionServerMetrics
> >> client.TestRegionLocationCaching
> >> client.TestHCM
> >> client.TestClientOperationInterrupt
> >> util.TestHBaseFsck
> >> master.cleaner.TestSnapshotFromMaster
> >> master.TestMasterShutdown
> >> replication.TestReplicationDroppedTables
> >> filter.TestFilterListOrOperatorWithBlkCnt
> >> mapreduce.TestSecureLoadIncrementalHFilesSplitRecovery
> >> mapreduce.TestLoadIncrementalHFilesSplitRecovery
> >>
> >> On Tue, Feb 5, 2019 at 7:30 AM 张铎(Duo Zhang) <[email protected]> wrote:
> >>
> >>> I think HBASE-21727 should be partially reverted since it removed a
> >>> public method in HBaseConfiguration which is marked as IA.Public?
> >>>
> >>> [email protected] <[email protected]> wrote on Tue, Feb 5, 2019 at 1:54 AM:
> >>>
> >>>> Based on my own testing I was going to vote +1. I built 1.5.0 from
> >>>> source, and ran it with the tip of the Phoenix 4.x.
> >>>> I regularly load a lot of data, execute Phoenix queries, etc. Nothing
> >>>> undue, nothing undue in the logs either.
> >>>> I'll try to reproduce the test failures. Since Andy can't reproduce
> >>>> them there is something flaky, most likely it's the tests, but that's,
> >>>> of course, hard to say.
> >>>> -- Lars
> >>>>
> >>>> On Saturday, February 2, 2019, 4:03:53 PM PST, Andrew Purtell <[email protected]> wrote:
> >>>>
> >>>> Thanks. As I am not able to produce those unit test results we will
> >>>> need your help to diagnose the issues. Please file JIRAs as needed,
> >>>> post the test output detail, etc. Thanks for trying the candidate out!
> >>>>
> >>>> The ITBLL results may be a tool usage problem. The numbers in the
> >>>> failure messages you posted are too round. I expect real failures to
> >>>> produce more irregular numbers. ITBLL can be a bit hard to use. Contact
> >>>> me offline and I’ll give you notes on how I ran the tests myself.
> >>>>
> >>>>> On Feb 2, 2019, at 3:45 PM, Xu Cang <[email protected]> wrote:
> >>>>>
> >>>>> 2 jars sha12 verification: pass.
> >>>>> Basic UI check: pass.
> >>>>> Unit test. Some failures in hbase-server package. (see details
> >>>>> below, not sure if these are flaky tests)
> >>>>> ITBLL tests with slowDeterministic and serverKilling monkey. Both got
> >>>>> some failures. (Not sure if this is my environment issue since I am
> >>>>> using *my laptop* to conduct this testing)
> >>>>> Not voting for now since I have some doubts regarding my testing
> >>>>> result. Will keep looking.
> >>>>>
> >>>>> - *Unit test failure: (failures are reproducible)*
> >>>>>
> >>>>> [INFO] Results:
> >>>>> [INFO]
> >>>>> [ERROR] Failures:
> >>>>> [ERROR] TestRegionLocationCaching.testCachingForHTableMultiPut:133->checkRegionLocationIsCached:148 Expected non-zero number of cached region locations. Actual: 0
> >>>>> [ERROR] TestRegionLocationCaching.testCachingForHTableMultiplexerMultiPut:95->checkRegionLocationIsCached:148 Expected non-zero number of cached region locations. Actual: 0
> >>>>> [ERROR] TestRegionLocationCaching.testCachingForHTableMultiplexerSinglePut:73->checkRegionLocationIsCached:148 Expected non-zero number of cached region locations. Actual: 0
> >>>>> [ERROR] TestRegionLocationCaching.testCachingForHTableSinglePut:116->checkRegionLocationIsCached:148 Expected non-zero number of cached region locations. Actual: 0
> >>>>> [ERROR] TestReplicasClient.testHedgedRead:595 expected:<0> but was:<1>
> >>>>> [ERROR] TestFilterListOrOperatorWithBlkCnt.testMultiRowRangeWithFilterListOrOperatorWithBlkCnt:127 expected:<4> but was:<5>
> >>>>> [ERROR] TestRegionServerMetrics.testRequestCount:137 Metrics Counters should be equal expected:<59> but was:<89>
> >>>>> [INFO]
> >>>>> [ERROR] Tests run: 1870, Failures: 7, Errors: 0, Skipped: 17
> >>>>>
> >>>>> - *ITBLL testing result: (failures are reproducible)*
> >>>>>
> >>>>> ./bin/hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList Loop 1 1 25000000 /tmp/itbll 1 -m slowDeterministic
> >>>>>
> >>>>> 2019-02-01 23:31:03,746 INFO [main] mapreduce.Job: map 100% reduce 100%
> >>>>> 2019-02-01 23:31:03,746 INFO [main] mapreduce.Job: Job job_local221831554_0003 completed successfully
> >>>>> 2019-02-01 23:31:03,758 INFO [main] mapreduce.Job: 175000018
> >>>>> Input split bytes=679
> >>>>> Combine input records=0
> >>>>> Combine output records=0
> >>>>> Reduce input groups=75000000
> >>>>> Reduce shuffle bytes=5175000018
> >>>>> Reduce input records=150000000
> >>>>> Reduce output records=0
> >>>>> Spilled Records=561948580
> >>>>> Shuffled Maps =3
> >>>>> Failed Shuffles=0
> >>>>> Merged Map outputs=3
> >>>>> GC time elapsed (ms)=1859
> >>>>> Total committed heap usage (bytes)=1846542336
> >>>>> HBase Counters
> >>>>> BYTES_IN_REMOTE_RESULTS=0
> >>>>> BYTES_IN_RESULTS=31125001574
> >>>>> MILLIS_BETWEEN_NEXTS=934178
> >>>>> NOT_SERVING_REGION_EXCEPTION=7
> >>>>> NUM_SCANNER_RESTARTS=0
> >>>>> NUM_SCAN_RESULTS_STALE=0
> >>>>> REGIONS_SCANNED=12
> >>>>> REMOTE_RPC_CALLS=0
> >>>>> REMOTE_RPC_RETRIES=0
> >>>>> ROWS_FILTERED=12
> >>>>> ROWS_SCANNED=75000012
> >>>>> RPC_CALLS=14607
> >>>>> RPC_RETRIES=7
> >>>>> Shuffle Errors
> >>>>> BAD_ID=0
> >>>>> CONNECTION=0
> >>>>> IO_ERROR=0
> >>>>> WRONG_LENGTH=0
> >>>>> WRONG_MAP=0
> >>>>> WRONG_REDUCE=0
> >>>>> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Verify$Counts
> >>>>> REFERENCED=75000000
> >>>>> File Input Format Counters
> >>>>> Bytes Read=0
> >>>>> File Output Format Counters
> >>>>> Bytes Written=108
> >>>>> 2019-02-01 23:31:03,764 ERROR [main] test.IntegrationTestBigLinkedList$Verify: *Expected referenced count does not match with actual referenced count. expected referenced=25000000 ,actual=75000000*
> >>>>>
> >>>>> ./bin/hbase org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList Loop 1 4 1000000 /tmp/itbll 4 -m slowDeterministic
> >>>>> 2019-02-02 00:34:27,009 ERROR [main] test.IntegrationTestBigLinkedList$Verify: Expected referenced count does not match with actual referenced count. expected referenced=4000000 ,actual=79000000
> >>>>>
> >>>>>> On Fri, Feb 1, 2019 at 2:17 PM Andrew Purtell <[email protected]> wrote:
> >>>>>>
> >>>>>> The first HBase 1.5.0 release candidate (RC0) is available for
> >>>>>> download at
> >>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC0/ and
> >>>>>> Maven artifacts are available in the temporary repository
> >>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1250/
> >>>>>>
> >>>>>> The git tag corresponding to the candidate is '1.5.0RC0'
> >>>>>> (ce6a6014da).
> >>>>>>
> >>>>>> A detailed source and binary compatibility report for this release
> >>>>>> is available for your review at
> >>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC0/compat-check-report.html
> >>>>>> . I do not believe there are any reported compatibility issues that
> >>>>>> are in violation of our compatibility policy for minor releases, but
> >>>>>> if you find something and feel differently, please file a JIRA.
> >>>>>>
> >>>>>> A list of the 88 issues resolved in this release can be found at
> >>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
> >>>>>> changelog of the last branch-1.4 release, 1.4.9.
> >>>>>>
> >>>>>> Please try out the candidate and vote +1/0/-1.
> >>>>>>
> >>>>>> The vote will be open for at least 72 hours. Unless there is an
> >>>>>> objection, I will try to close it Thursday, February 28, 2019 if we
> >>>>>> have sufficient votes.
> >>>>>>
> >>>>>> Prior to making this announcement I made the following preflight
> >>>>>> checks:
> >>>>>>
> >>>>>> RAT check passes (7u80)
> >>>>>> Unit test suite passes (7u80, 8u181)
> >>>>>> Opened the UI in a browser, poked around
> >>>>>> LTT load 100M rows with 100% verification and 20% updates (8u181)
> >>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
> >>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
> >>>>>>
> >>>>>> Some of this testing was done with recent 1.5.0-SNAPSHOT versions.
> >>>>>> During the month of February I plan to perform a number of
> >>>>>> additional tests, including performance regression checks. As more
> >>>>>> results become available I will post them to this thread.
> >>>>>>
> >>>>>> --
> >>>>>> Best regards,
> >>>>>> Andrew

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
