And if they pass in my environment, then what should we call it? I have no
doubt you are seeing failures. Therefore, can you please file JIRAs and
attach information that can help identify a fix? Thanks.
> On Apr 11, 2019, at 8:35 PM, Yu Li <car...@gmail.com> wrote:
>
> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
> and on two different environments separately, so it sums up to six stable
> failures for each case, and from my perspective this is not flaky.
>
> IIRC, no such issue was observed last time when verifying 1.4.7 on the
> same environments; I will double check.
>
> Best Regards,
> Yu
>
>
> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <andrew.purt...@gmail.com> wrote:
>
>> It looks like there are two failure cases, and these look like flakes.
>>
>> The wrong FS assertions are not something I see when I run these tests
>> myself, and I am not able to investigate something I can't reproduce.
>> What I suggest, since you can reproduce, is to do a git bisect to find
>> the commit that introduced the problem. Then we can revert it. As an
>> alternative we can open a JIRA, report the problem, temporarily @Ignore
>> the test, and continue. This latter option should only be taken if we
>> are fairly confident it is a test-only problem.
>>
>> The connect exceptions are interesting. I see these sometimes when the
>> suite is executed (not this particular case, though), but when the
>> failed test is executed by itself it always passes. It is possible some
>> change to classes related to the minicluster, or to startup or shutdown
>> timing, is the cause, but it is test-time flaky behavior. I'm not happy
>> about this, but it doesn't actually fail the release, because the
>> failure is never repeatable when the test is run standalone.
>>
>> In general it would be great if some attention were paid to test
>> cleanliness on branch-1. As RM I'm not in a position to insist that
>> everything be perfect or there will never be another 1.x release,
>> certainly not from branch-1. So tests which fail repeatedly block a
>> release, IMHO, but flakes do not.
>>
>>
>>> On Apr 10, 2019, at 11:20 PM, Yu Li <car...@gmail.com> wrote:
>>>
>>> -1
>>>
>>> Observed many UT failures when checking the source package (tried
>>> multiple rounds on two different environments, macOS and Linux, got the
>>> same result), including (but not limited to):
>>>
>>> TestBulkLoad:
>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)
>>> Time elapsed: 0.083 s <<< ERROR!
>>> java.lang.IllegalArgumentException: Wrong FS:
>>> file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29,
>>> expected: hdfs://localhost:55938
>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>>>
>>> TestStoreFile:
>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
>>> Time elapsed: 0.083 s <<< ERROR!
>>> java.net.ConnectException: Call From localhost/127.0.0.1 to
>>> localhost:55938 failed on connection exception:
>>> java.net.ConnectException: Connection refused; For more details see:
>>> http://wiki.apache.org/hadoop/ConnectionRefused
>>>   at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>>>   at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>>>
>>> TestHFile:
>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)
>>> Time elapsed: 0.08 s <<< ERROR!
>>> java.net.ConnectException: Call From
>>> z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on
>>> connection exception: java.net.ConnectException: Connection refused; For
>>> more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>>>   at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>> Caused by: java.net.ConnectException: Connection refused
>>>   at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>
>>> TestBlocksScanned:
>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)
>>> Time elapsed: 0.069 s <<< ERROR!
>>> java.lang.IllegalArgumentException: Wrong FS:
>>> hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08,
>>> expected: file:///
>>>   at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>>>   at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>>>
>>> And please let me know if there is any known issue I'm not aware of. Thanks.
>>>
>>> Best Regards,
>>> Yu
>>>
>>>
>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <car...@gmail.com> wrote:
>>>>
>>>> The performance report LGTM, thanks! (And sorry for the lag, due to the
>>>> Qingming Festival holiday here in China.)
>>>>
>>>> Still verifying the release; just some quick feedback: observed some
>>>> incompatible changes in the compatibility report, including
>>>> HBASE-21492/HBASE-21684, which are worth a reminder in the release note.
>>>>
>>>> Unrelated but noticeable: the 1.4.9 release note URL is invalid on
>>>> https://hbase.apache.org/downloads.html
>>>>
>>>> Best Regards,
>>>> Yu
>>>>
>>>>
>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <apurt...@apache.org> wrote:
>>>>>
>>>>> The difference is basically noise per the usual YCSB evaluation.
>>>>> Small differences in workloads D and F (slightly worse) and workload E
>>>>> (slightly better) do not indicate serious regression.
>>>>>
>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>>>>> c3.8xlarge x 5
>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>>>>> Hadoop 2.9.2
>>>>> Init: Load 100M rows and snapshot
>>>>> Run: Delete table, clone and redeploy from snapshot, run 10M operations
>>>>> Args: -threads 100 -target 50000
>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
>>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
>>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
>>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
>>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
>>>>>
>>>>> YCSB Workload A
>>>>>
>>>>> target 50k ops/s                        1.4.9    1.5.0
>>>>>
>>>>> [OVERALL], RunTime(ms)                 200592   200583
>>>>> [OVERALL], Throughput(ops/sec)          49852    49855
>>>>> [READ], AverageLatency(us)                544      559
>>>>> [READ], MinLatency(us)                    267      292
>>>>> [READ], MaxLatency(us)                 165631   185087
>>>>> [READ], 95thPercentileLatency(us)         738      742
>>>>> [READ], 99thPercentileLatency(us)        1877     1961
>>>>> [UPDATE], AverageLatency(us)             1370     1181
>>>>> [UPDATE], MinLatency(us)                  702      646
>>>>> [UPDATE], MaxLatency(us)               180735   177279
>>>>> [UPDATE], 95thPercentileLatency(us)      1943     1652
>>>>> [UPDATE], 99thPercentileLatency(us)      3257     3085
>>>>>
>>>>> YCSB Workload B
>>>>>
>>>>> target 50k ops/s                        1.4.9    1.5.0
>>>>>
>>>>> [OVERALL], RunTime(ms)                 200599   200581
>>>>> [OVERALL], Throughput(ops/sec)          49850    49855
>>>>> [READ], AverageLatency(us)                454      471
>>>>> [READ], MinLatency(us)                    203      213
>>>>> [READ], MaxLatency(us)                 183423   174207
>>>>> [READ], 95thPercentileLatency(us)         563      599
>>>>> [READ], 99thPercentileLatency(us)        1360     1172
>>>>> [UPDATE], AverageLatency(us)             1064     1029
>>>>> [UPDATE], MinLatency(us)                  746      726
>>>>> [UPDATE], MaxLatency(us)               163455   101631
>>>>> [UPDATE], 95thPercentileLatency(us)      1327     1157
>>>>> [UPDATE], 99thPercentileLatency(us)      2241     1898
>>>>>
>>>>> YCSB Workload C
>>>>>
>>>>> target 50k ops/s                        1.4.9    1.5.0
>>>>>
>>>>> [OVERALL], RunTime(ms)                 200541   200538
>>>>> [OVERALL], Throughput(ops/sec)          49865    49865
>>>>> [READ], AverageLatency(us)                332      327
>>>>> [READ], MinLatency(us)                    175      179
>>>>> [READ], MaxLatency(us)                 210559   170367
>>>>> [READ], 95thPercentileLatency(us)         410      396
>>>>> [READ], 99thPercentileLatency(us)         871      892
>>>>>
>>>>> YCSB Workload D
>>>>>
>>>>> target 50k ops/s                        1.4.9    1.5.0
>>>>>
>>>>> [OVERALL], RunTime(ms)                 200579   200562
>>>>> [OVERALL], Throughput(ops/sec)          49855    49859
>>>>> [READ], AverageLatency(us)                487      547
>>>>> [READ], MinLatency(us)                    210      214
>>>>> [READ], MaxLatency(us)                 192255   177535
>>>>> [READ], 95thPercentileLatency(us)         973     1529
>>>>> [READ], 99thPercentileLatency(us)        1836     2683
>>>>> [INSERT], AverageLatency(us)             1239     1152
>>>>> [INSERT], MinLatency(us)                  807      788
>>>>> [INSERT], MaxLatency(us)               184575   148735
>>>>> [INSERT], 95thPercentileLatency(us)      1496     1243
>>>>> [INSERT], 99thPercentileLatency(us)      2965     2495
>>>>>
>>>>> YCSB Workload E
>>>>>
>>>>> target 10k ops/s                        1.4.9    1.5.0
>>>>>
>>>>> [OVERALL], RunTime(ms)                 100605   100568
>>>>> [OVERALL], Throughput(ops/sec)           9939     9943
>>>>> [SCAN], AverageLatency(us)               3548     2687
>>>>> [SCAN], MinLatency(us)                    696      678
>>>>> [SCAN], MaxLatency(us)                1059839   238463
>>>>> [SCAN], 95thPercentileLatency(us)        8327     6791
>>>>> [SCAN], 99thPercentileLatency(us)       17647    14415
>>>>> [INSERT], AverageLatency(us)             2688     1555
>>>>> [INSERT], MinLatency(us)                  887      815
>>>>> [INSERT], MaxLatency(us)               173311   154623
>>>>> [INSERT], 95thPercentileLatency(us)      4455     2571
>>>>> [INSERT], 99thPercentileLatency(us)      9303     5375
>>>>>
>>>>> YCSB Workload F
>>>>>
>>>>> target 50k ops/s                        1.4.9    1.5.0
>>>>>
>>>>> [OVERALL], RunTime(ms)                 200562   204178
>>>>> [OVERALL], Throughput(ops/sec)          49859    48976
>>>>> [READ], AverageLatency(us)                856     1137
>>>>> [READ], MinLatency(us)                    262      257
>>>>> [READ], MaxLatency(us)                 205567   222335
>>>>> [READ], 95thPercentileLatency(us)        2365     3475
>>>>> [READ], 99thPercentileLatency(us)        3099     4143
>>>>> [READ-MODIFY-WRITE], AverageLatency(us)           2559     2917
>>>>> [READ-MODIFY-WRITE], MinLatency(us)               1100     1034
>>>>> [READ-MODIFY-WRITE], MaxLatency(us)             208767   204799
>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us)    5747     7627
>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us)    7203     8919
>>>>> [UPDATE], AverageLatency(us)             1700     1777
>>>>> [UPDATE], MinLatency(us)                  737      687
>>>>> [UPDATE], MaxLatency(us)                97983    94271
>>>>> [UPDATE], 95thPercentileLatency(us)      3377     4147
>>>>> [UPDATE], 99thPercentileLatency(us)      4147     4831
>>>>>
>>>>>
>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <car...@gmail.com> wrote:
>>>>>>
>>>>>> Thanks for the efforts, boss.
>>>>>>
>>>>>> Since it's a new minor release, do we have a performance comparison
>>>>>> report against 1.4.9, as we did when releasing 1.4.0? If so, any
>>>>>> reference? Many thanks!
>>>>>>
>>>>>> Best Regards,
>>>>>> Yu
>>>>>>
>>>>>>
>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <apurt...@apache.org> wrote:
>>>>>>
>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for
>>>>>>> download at
>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and
>>>>>>> Maven artifacts are available in the temporary repository
>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>>>>>>>
>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3' (b0bc7225c5).
>>>>>>>
>>>>>>> A detailed source and binary compatibility report for this release is
>>>>>>> available for your review at
>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>>>>>>>
>>>>>>> A list of the 115 issues resolved in this release can be found at
>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
>>>>>>>
>>>>>>> Please try out the candidate and vote +1/0/-1.
>>>>>>>
>>>>>>> The vote will be open for at least 72 hours. Unless there is an
>>>>>>> objection, I will try to close it Friday, April 12, 2019 if we have
>>>>>>> sufficient votes.
>>>>>>>
>>>>>>> Prior to making this announcement I made the following preflight
>>>>>>> checks:
>>>>>>>
>>>>>>> RAT check passes (7u80)
>>>>>>> Unit test suite passes (7u80, 8u181)*
>>>>>>> Opened the UI in a browser, poked around
>>>>>>> LTT load 100M rows with 100% verification and 20% updates (8u181)
>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
>>>>>>>
>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905. These
>>>>>>> flaky tests do not represent serious test failures that would prevent
>>>>>>> a release.
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Andrew
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Andrew
>>>>>
>>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>>>> decrepit hands
>>>>>    - A23, Crosstalk
>>>>
>>
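
[Editor's note: the YCSB protocol Andrew describes above (load 100M rows,
snapshot, then replay 10M operations per workload at -threads 100 -target
50000, with workload E at 10k ops/s) could be driven by a wrapper like the
sketch below. The install path, the `hbase10` binding name, and the table name
`usertable` are assumptions, not taken from the thread; only the column family
name 'u' comes from the quoted table descriptor. The script only prints the
commands it would run.]

```shell
#!/bin/sh
# Hedged sketch of the YCSB run protocol described in the thread. Anything not
# in the email (paths, binding 'hbase10', table 'usertable') is an assumption;
# commands are echoed rather than executed.
YCSB_HOME=${YCSB_HOME:-/opt/ycsb}          # hypothetical install location
COMMON="-p table=usertable -p columnfamily=u -threads 100"

# Load phase: 100M rows, after which the table would be snapshotted.
echo "$YCSB_HOME/bin/ycsb load hbase10 -P workloads/workloada \
  -p recordcount=100000000 $COMMON -target 50000"

# Run phase: per workload, delete the table, clone from the snapshot,
# then run 10M operations (workload E was run at a 10k ops/s target).
for w in a b c d e f; do
  target=50000
  [ "$w" = e ] && target=10000
  echo "$YCSB_HOME/bin/ycsb run hbase10 -P workloads/workload$w \
    -p operationcount=10000000 $COMMON -target $target"
done
```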