Confirmed that the listed cases pass against the 1.4.7 source (they are all in the first part of hbase-server, so the result came out quickly). Also confirmed that the test run order is the same.
Will try 1.5.0 again to rule out environment differences caused by time. If 1.5.0 still fails, I will start a git bisect to locate the first bad commit.

I was also expecting an easy pass and a +1 as always, to save time and effort, but obviously no luck. However, it's good to find the issue early if there really is one, before the release is announced.

Best Regards,
Yu

On Fri, 12 Apr 2019 at 12:16, Yu Li <car...@gmail.com> wrote:

> Fine, let's focus on verifying whether it's a real problem rather than
> arguing about wording, after all that's not my intention...
>
> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I was
> using the same env and all tests passed w/o issue, that's where my concern
> lies and the main reason I gave a -1 vote. I'm running against 1.4.7 source
> on the same now and let's see the result.
>
> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
>
> Best Regards,
> Yu
>
>
> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <andrew.purt...@gmail.com>
> wrote:
>
>> I believe the test execution order matters. We run some tests in
>> parallel. The ordering of tests is determined by readdir() results and this
>> differs from host to host and checkout to checkout. So when you see a
>> repeatable group of failures, that’s great. And when someone else doesn’t
>> see those same tests fail, or they cannot be reproduced when running by
>> themselves, the commonly accepted term of art for this is “flaky”.
>>
>>
>> > On Apr 11, 2019, at 8:52 PM, Yu Li <car...@gmail.com> wrote:
>> >
>> > Sorry but I'd call it "possible environment related problem" or "some
>> > feature may not work well in specific environment", rather than a flaky.
>> >
>> > Will check against 1.4.7 released source package before opening any JIRA.
>> >
>> > Best Regards,
>> > Yu
>> >
>> >
>> > On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <andrew.purt...@gmail.com>
>> > wrote:
>> >
>> >> And if they pass in my environment, then what should we call it then.
>> >> I have no doubt you are seeing failures. Therefore can you please file
>> >> JIRAs and attach information that can help identify a fix. Thanks.
>> >>
>> >>> On Apr 11, 2019, at 8:35 PM, Yu Li <car...@gmail.com> wrote:
>> >>>
>> >>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
>> >>> and on two different env separately, so it sums up to 6 times stable
>> >>> failure for each case, and from my perspective this is not flaky.
>> >>>
>> >>> IIRC last time when verifying 1.4.7 on the same env no such issue observed,
>> >>> will double check.
>> >>>
>> >>> Best Regards,
>> >>> Yu
>> >>>
>> >>>
>> >>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <andrew.purt...@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> There are two failure cases it looks like. And this looks like flakes.
>> >>>>
>> >>>> The wrong FS assertions are not something I see when I run these tests
>> >>>> myself. I am not able to investigate something I can’t reproduce. What I
>> >>>> suggest is since you can reproduce do a git bisect to find the commit that
>> >>>> introduced the problem. Then we can revert it. As an alternative we can
>> >>>> open a JIRA, report the problem, temporarily @ignore the test, and
>> >>>> continue. This latter option only should be done if we are fairly confident
>> >>>> it is a test only problem.
>> >>>>
>> >>>> The connect exceptions are interesting. I see these sometimes when the
>> >>>> suite is executed, not this particular case, but when the failed test is
>> >>>> executed by itself it always passes. It is possible some change to classes
>> >>>> related to the minicluster or startup or shutdown timing are the cause, but
>> >>>> it is test time flaky behavior. I’m not happy about this but it doesn’t
>> >>>> actually fail the release because the failure is never repeatable when the
>> >>>> test is run standalone.
>> >>>>
>> >>>> In general it would be great if some attention was paid to test
>> >>>> cleanliness on branch-1. As RM I’m not in a position to insist that
>> >>>> everything is perfect or there will never be another 1.x release, certainly
>> >>>> not from branch-1. So, tests which fail repeatedly block a release IMHO but
>> >>>> flakes do not.
>> >>>>
>> >>>>
>> >>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <car...@gmail.com> wrote:
>> >>>>>
>> >>>>> -1
>> >>>>>
>> >>>>> Observed many UT failures when checking the source package (tried multiple
>> >>>>> rounds on two different environments, MacOs and Linux, got the same
>> >>>>> result), including (but not limited to):
>> >>>>>
>> >>>>> TestBulkload:
>> >>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)
>> >>>>> Time elapsed: 0.083 s <<< ERROR!
>> >>>>> java.lang.IllegalArgumentException: Wrong FS:
>> >>>>> file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29,
>> >>>>> expected: hdfs://localhost:55938
>> >>>>> at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>> >>>>> at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>> >>>>> at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>> >>>>>
>> >>>>> TestStoreFile:
>> >>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
>> >>>>> Time elapsed: 0.083 s <<< ERROR!
>> >>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938
>> >>>>> failed on connection exception: java.net.ConnectException: Connection
>> >>>>> refused; For more details see:
>> >>>>> http://wiki.apache.org/hadoop/ConnectionRefused
>> >>>>> at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>> >>>>> at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>> >>>>>
>> >>>>> TestHFile:
>> >>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile) Time elapsed:
>> >>>>> 0.08 s <<< ERROR!
>> >>>>> java.net.ConnectException: Call From
>> >>>>> z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on
>> >>>>> connection exception: java.net.ConnectException: Connection refused; For
>> >>>>> more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>> >>>>> at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>> >>>>> Caused by: java.net.ConnectException: Connection refused
>> >>>>> at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>> >>>>>
>> >>>>> TestBlocksScanned:
>> >>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)
>> >>>>> Time elapsed: 0.069 s <<< ERROR!
>> >>>>> java.lang.IllegalArgumentException: Wrong FS:
>> >>>>> hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08,
>> >>>>> expected: file:///
>> >>>>> at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>> >>>>> at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>> >>>>>
>> >>>>> And please let me know if any known issue I'm not aware of. Thanks.
>> >>>>>
>> >>>>> Best Regards,
>> >>>>> Yu
>> >>>>>
>> >>>>>
>> >>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <car...@gmail.com> wrote:
>> >>>>>>
>> >>>>>> The performance report LGTM, thanks! (and sorry for the lag due to
>> >>>>>> Qingming Festival Holiday here in China)
>> >>>>>>
>> >>>>>> Still verifying the release, just some quick feedback: observed some
>> >>>>>> incompatible changes in compatibility report including
>> >>>>>> HBASE-21492/HBASE-21684 and worth a reminder in ReleaseNote.
>> >>>>>>
>> >>>>>> Irrelative but noticeable: the 1.4.9 release note URL is invalid on
>> >>>>>> https://hbase.apache.org/downloads.html
>> >>>>>>
>> >>>>>> Best Regards,
>> >>>>>> Yu
>> >>>>>>
>> >>>>>>
>> >>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <apurt...@apache.org> wrote:
>> >>>>>>>
>> >>>>>>> The difference is basically noise per the usual YCSB evaluation. Small
>> >>>>>>> differences in workloads D and F (slightly worse) and workload E (slightly
>> >>>>>>> better) that do not indicate serious regression.
>> >>>>>>>
>> >>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>> >>>>>>> c3.8xlarge x 5
>> >>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>> >>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
>> >>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>> >>>>>>> Hadoop 2.9.2
>> >>>>>>> Init: Load 100 M rows and snapshot
>> >>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
>> >>>>>>> Args: -threads 100 -target 50000
>> >>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY
>> >>>>>>> => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING =>
>> >>>>>>> 'ROW_INDEX_V1', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS =>
>> >>>>>>> '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE =>
>> >>>>>>> '0'}
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> YCSB Workload A
>> >>>>>>>
>> >>>>>>> target 50k/op/s                                 1.4.9    1.5.0
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms)                          200592   200583
>> >>>>>>> [OVERALL], Throughput(ops/sec)                  49852    49855
>> >>>>>>> [READ], AverageLatency(us)                      544      559
>> >>>>>>> [READ], MinLatency(us)                          267      292
>> >>>>>>> [READ], MaxLatency(us)                          165631   185087
>> >>>>>>> [READ], 95thPercentileLatency(us)               738      742
>> >>>>>>> [READ], 99thPercentileLatency(us)               1877     1961
>> >>>>>>> [UPDATE], AverageLatency(us)                    1370     1181
>> >>>>>>> [UPDATE], MinLatency(us)                        702      646
>> >>>>>>> [UPDATE], MaxLatency(us)                        180735   177279
>> >>>>>>> [UPDATE], 95thPercentileLatency(us)             1943     1652
>> >>>>>>> [UPDATE], 99thPercentileLatency(us)             3257     3085
>> >>>>>>>
>> >>>>>>> YCSB Workload B
>> >>>>>>>
>> >>>>>>> target 50k/op/s                                 1.4.9    1.5.0
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms)                          200599   200581
>> >>>>>>> [OVERALL], Throughput(ops/sec)                  49850    49855
>> >>>>>>> [READ], AverageLatency(us)                      454      471
>> >>>>>>> [READ], MinLatency(us)                          203      213
>> >>>>>>> [READ], MaxLatency(us)                          183423   174207
>> >>>>>>> [READ], 95thPercentileLatency(us)               563      599
>> >>>>>>> [READ], 99thPercentileLatency(us)               1360     1172
>> >>>>>>> [UPDATE], AverageLatency(us)                    1064     1029
>> >>>>>>> [UPDATE], MinLatency(us)                        746      726
>> >>>>>>> [UPDATE], MaxLatency(us)                        163455   101631
>> >>>>>>> [UPDATE], 95thPercentileLatency(us)             1327     1157
>> >>>>>>> [UPDATE], 99thPercentileLatency(us)             2241     1898
>> >>>>>>>
>> >>>>>>> YCSB Workload C
>> >>>>>>>
>> >>>>>>> target 50k/op/s                                 1.4.9    1.5.0
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms)                          200541   200538
>> >>>>>>> [OVERALL], Throughput(ops/sec)                  49865    49865
>> >>>>>>> [READ], AverageLatency(us)                      332      327
>> >>>>>>> [READ], MinLatency(us)                          175      179
>> >>>>>>> [READ], MaxLatency(us)                          210559   170367
>> >>>>>>> [READ], 95thPercentileLatency(us)               410      396
>> >>>>>>> [READ], 99thPercentileLatency(us)               871      892
>> >>>>>>>
>> >>>>>>> YCSB Workload D
>> >>>>>>>
>> >>>>>>> target 50k/op/s                                 1.4.9    1.5.0
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms)                          200579   200562
>> >>>>>>> [OVERALL], Throughput(ops/sec)                  49855    49859
>> >>>>>>> [READ], AverageLatency(us)                      487      547
>> >>>>>>> [READ], MinLatency(us)                          210      214
>> >>>>>>> [READ], MaxLatency(us)                          192255   177535
>> >>>>>>> [READ], 95thPercentileLatency(us)               973      1529
>> >>>>>>> [READ], 99thPercentileLatency(us)               1836     2683
>> >>>>>>> [INSERT], AverageLatency(us)                    1239     1152
>> >>>>>>> [INSERT], MinLatency(us)                        807      788
>> >>>>>>> [INSERT], MaxLatency(us)                        184575   148735
>> >>>>>>> [INSERT], 95thPercentileLatency(us)             1496     1243
>> >>>>>>> [INSERT], 99thPercentileLatency(us)             2965     2495
>> >>>>>>>
>> >>>>>>> YCSB Workload E
>> >>>>>>>
>> >>>>>>> target 10k/op/s                                 1.4.9    1.5.0
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms)                          100605   100568
>> >>>>>>> [OVERALL], Throughput(ops/sec)                  9939     9943
>> >>>>>>> [SCAN], AverageLatency(us)                      3548     2687
>> >>>>>>> [SCAN], MinLatency(us)                          696      678
>> >>>>>>> [SCAN], MaxLatency(us)                          1059839  238463
>> >>>>>>> [SCAN], 95thPercentileLatency(us)               8327     6791
>> >>>>>>> [SCAN], 99thPercentileLatency(us)               17647    14415
>> >>>>>>> [INSERT], AverageLatency(us)                    2688     1555
>> >>>>>>> [INSERT], MinLatency(us)                        887      815
>> >>>>>>> [INSERT], MaxLatency(us)                        173311   154623
>> >>>>>>> [INSERT], 95thPercentileLatency(us)             4455     2571
>> >>>>>>> [INSERT], 99thPercentileLatency(us)             9303     5375
>> >>>>>>>
>> >>>>>>> YCSB Workload F
>> >>>>>>>
>> >>>>>>> target 50k/op/s                                 1.4.9    1.5.0
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms)                          200562   204178
>> >>>>>>> [OVERALL], Throughput(ops/sec)                  49859    48976
>> >>>>>>> [READ], AverageLatency(us)                      856      1137
>> >>>>>>> [READ], MinLatency(us)                          262      257
>> >>>>>>> [READ], MaxLatency(us)                          205567   222335
>> >>>>>>> [READ], 95thPercentileLatency(us)               2365     3475
>> >>>>>>> [READ], 99thPercentileLatency(us)               3099     4143
>> >>>>>>> [READ-MODIFY-WRITE], AverageLatency(us)         2559     2917
>> >>>>>>> [READ-MODIFY-WRITE], MinLatency(us)             1100     1034
>> >>>>>>> [READ-MODIFY-WRITE], MaxLatency(us)             208767   204799
>> >>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us)  5747     7627
>> >>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us)  7203     8919
>> >>>>>>> [UPDATE], AverageLatency(us)                    1700     1777
>> >>>>>>> [UPDATE], MinLatency(us)                        737      687
>> >>>>>>> [UPDATE], MaxLatency(us)                        97983    94271
>> >>>>>>> [UPDATE], 95thPercentileLatency(us)             3377     4147
>> >>>>>>> [UPDATE], 99thPercentileLatency(us)             4147     4831
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <car...@gmail.com> wrote:
>> >>>>>>>>
>> >>>>>>>> Thanks for the efforts boss.
>> >>>>>>>>
>> >>>>>>>> Since it's a new minor release, do we have performance comparison report
>> >>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any reference? Many
>> >>>>>>>> thanks!
>> >>>>>>>>
>> >>>>>>>> Best Regards,
>> >>>>>>>> Yu
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <apurt...@apache.org> wrote:
>> >>>>>>>>
>> >>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for download at
>> >>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and Maven
>> >>>>>>>>> artifacts are available in the temporary repository
>> >>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>> >>>>>>>>>
>> >>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3' (b0bc7225c5).
>> >>>>>>>>>
>> >>>>>>>>> A detailed source and binary compatibility report for this release is
>> >>>>>>>>> available for your review at
>> >>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>> >>>>>>>>> .
>> >>>>>>>>>
>> >>>>>>>>> A list of the 115 issues resolved in this release can be found at
>> >>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
>> >>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
>> >>>>>>>>>
>> >>>>>>>>> Please try out the candidate and vote +1/0/-1.
>> >>>>>>>>>
>> >>>>>>>>> The vote will be open for at least 72 hours. Unless objection I will try to
>> >>>>>>>>> close it Friday April 12, 2019 if we have sufficient votes.
>> >>>>>>>>>
>> >>>>>>>>> Prior to making this announcement I made the following preflight checks:
>> >>>>>>>>>
>> >>>>>>>>> RAT check passes (7u80)
>> >>>>>>>>> Unit test suite passes (7u80, 8u181)*
>> >>>>>>>>> Opened the UI in a browser, poked around
>> >>>>>>>>> LTT load 100M rows with 100% verification and 20% updates (8u181)
>> >>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
>> >>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
>> >>>>>>>>>
>> >>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905. These flaky
>> >>>>>>>>> tests do not represent serious test failures that would prevent a release.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> Best regards,
>> >>>>>>>>> Andrew
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Best regards,
>> >>>>>>> Andrew
>> >>>>>>>
>> >>>>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>> >>>>>>> decrepit hands
>> >>>>>>> - A23, Crosstalk
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>> >