I'm testing a change that keeps the change to CompactionTool but drops the unit test. Will let you know how it goes.
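
In case it helps anyone reviewing the YCSB figures quoted further down in this thread, here is a minimal Python sketch for turning a pair of columns into a relative change. The names `pct_change`, `samples`, and `summarize` are made up for this illustration and are not part of YCSB or HBase; the numbers are copied from the quoted Workload D, E and F tables (latencies in microseconds).

```python
# Relative change between the 1.4.9 and 1.5.0 columns of a YCSB summary.
# All names here are hypothetical helpers, not YCSB or HBase APIs.

def pct_change(old, new):
    """Percentage change from old to new; positive means the value went up."""
    return (new - old) / old * 100.0

# (workload, metric) -> (1.4.9 value, 1.5.0 value), copied from the quoted tables
samples = {
    ("D", "READ 95th percentile"): (973, 1529),
    ("E", "SCAN average"): (3548, 2687),
    ("E", "INSERT average"): (2688, 1555),
    ("F", "READ average"): (856, 1137),
}

def summarize(rows):
    """Return {key: percent change rounded to one decimal} for each row."""
    return {key: round(pct_change(old, new), 1) for key, (old, new) in rows.items()}

if __name__ == "__main__":
    for (workload, metric), delta in summarize(samples).items():
        print(f"Workload {workload} {metric}: {delta:+.1f}%")
```

For latency rows a positive value means 1.5.0 was slower on that metric and a negative value means it was faster. How much of that is run-to-run noise is a separate judgment that this sketch does not attempt.
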
On Wed, Apr 17, 2019 at 10:28 AM Xu Cang <xc...@salesforce.com.invalid> wrote:

> I just saw this email, Andrew. Should I re-open HBASE-21959? And revert it
> before we understand/fix why it caused the test failure?
> Regarding the failing test, do you mean this one "TestBlocksRead"?
> Thanks,
>
> Xu
>
> On Tue, Apr 16, 2019 at 9:47 PM Andrew Purtell <andrew.purt...@gmail.com>
> wrote:
>
> > I've bisected twice and it lands on this commit:
> >
> > commit 6bc46bb10920c1c335b784b01d2a326db1a3d587 (HEAD, refs/bisect/bad)
> > HBASE-21959 CompactionTool should close the store it uses for
> > compacting files, in order to properly archive compacted files.
> >
> > hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java | 2 ++
> > hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactionTool.java | 100
> >
> > At first glance it's hard to see how this change is relevant, but it does
> > introduce a new unit test.
> >
> >
> > On Tue, Apr 16, 2019 at 7:48 PM Andrew Purtell <andrew.purt...@gmail.com>
> > wrote:
> >
> > > I’ve been able to reproduce it sometimes too and am bisecting. It may be
> > > an interaction between test cases, not a failure per se, but does seem to
> > > have a recent cause, as you pointed out. I’ll be looking at it.
> > >
> > > Thank you for your kind consideration and for revoking your veto.
> > >
> > > A coprocessor API fix was just committed to branch-1 so I want to roll a
> > > new RC soon to include it. There is also an issue open to improve the
> > > behavior of the UI when the profiler link is clicked but system support is
> > > not available.
> > >
> > > > On Apr 16, 2019, at 7:40 PM, Yu Li <car...@gmail.com> wrote:
> > > >
> > > > After more investigation, the ConnectionRefused exception could be
> > > > reproduced with "mvn -Dtest=<case_name> test" after a complete run of all
> > > > cases through "mvn -PrunAllTests clean test", but cannot by a clean
> > > > standalone run (with "mvn *clean* test"). So now I'm more convinced it's
> > > > some kind of environment chaos caused by parallel execution of test cases,
> > > > and not a blocker issue.
> > > >
> > > > @Andrew It seems to me that kerby jar is not included in our binary
> > > > package, so I'm not sure whether a new RC is required by HBASE-22219.
> > > > Anyway I'd like to revoke my -1 vote now. Thanks.
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > >> On Tue, 16 Apr 2019 at 10:19, Yu Li <car...@gmail.com> wrote:
> > > >>
> > > >> Sorry for the late response due to job priority.
> > > >>
> > > >> This ConnectionRefused issue cannot be reproduced on my laptop (MacOS
> > > >> 10.14.4) but could on the linux env. And I've checked and confirmed it
> > > >> could pass with 1.4.7/1.4.9 source package but stably failed with 1.5.0,
> > > >> performing a git bisect now, will report back later.
> > > >>
> > > >> Best Regards,
> > > >> Yu
> > > >>
> > > >>
> > > >> On Sat, 13 Apr 2019 at 00:38, Andrew Purtell <andrew.purt...@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> I also see the occasional ConnectionRefused errors. They don’t reproduce
> > > >>> if you run the test standalone. I also only see them on a Linux dev host.
> > > >>> That may be enough to find by bisect the commit that introduced this
> > > >>> behavior. Working on it. There is a JIRA filed for this one. Search for
> > > >>> “TestBlocksRead” and label “branch-1”.
> > > >>>
> > > >>> Thanks for the investigations.
> > > >>>
> > > >>>> On Apr 12, 2019, at 6:36 AM, Yu Li <car...@gmail.com> wrote:
> > > >>>>
> > > >>>> Quick updates:
> > > >>>>
> > > >>>> W/ patch of HBASE-22219 or say upgrading kerby version to 1.0.1, the
> > > >>>> failures listed above in the 1st part of hbase-server disappeared.
> > > >>>>
> > > >>>> However, in the 2nd part of hbase-server UT there're still many
> > > >>>> ConnectionRefused exceptions (17 errors in total) as shown below, which
> > > >>>> could be reproduced easily with -Dtest=xxx command on my environments,
> > > >>>> still checking the root cause.
> > > >>>>
> > > >>>> [INFO] Running org.apache.hadoop.hbase.regionserver.TestBlocksRead
> > > >>>> [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 0.853 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestBlocksRead
> > > >>>> [ERROR] testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead) Time elapsed: 0.17 s <<< ERROR!
> > > >>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35669 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> > > >>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
> > > >>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> > > >>>> Caused by: java.net.ConnectException: Connection refused
> > > >>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
> > > >>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> > > >>>>
> > > >>>> Best Regards,
> > > >>>> Yu
> > > >>>>
> > > >>>>
> > > >>>>> On Fri, 12 Apr 2019 at 13:11, Yu Li <car...@gmail.com> wrote:
> > > >>>>>
> > > >>>>> I have no doubt that you've run the tests locally before announcing a
> > > >>>>> release as you're always a great RM boss. And this shows one value of
> > > >>>>> verifying release, that different voter has different environments.
> > > >>>>>
> > > >>>>> Now I think the failures may be kerberos related, since I possibly has
> > > >>>>> changed some system configuration when doing Flink testing on this env
> > > >>>>> weeks ago. Located one issue (HBASE-22219) which also observed in 1.4.7,
> > > >>>>> will further investigate.
> > > >>>>>
> > > >>>>> Best Regards,
> > > >>>>> Yu
> > > >>>>>
> > > >>>>>
> > > >>>>> On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <andrew.purt...@gmail.com>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>>> “However it's good to find the issue earlier if there
> > > >>>>>> really is any, before release announced.”
> > > >>>>>>
> > > >>>>>> I run the complete unit test suite before announcing a release candidate.
> > > >>>>>> Just to be clear.
> > > >>>>>>
> > > >>>>>> Totally agree we should get these problems sorted before an actual
> > > >>>>>> release. My policy is to cancel a RC if anyone vetoes for this reason...
> > > >>>>>> want as much coverage and varying environments as we can manage.
> > > >>>>>>
> > > >>>>>> Thank you for your help so far and I hope the failures you see result in
> > > >>>>>> analysis and fixes that lead to better test stability.
> > > >>>>>>
> > > >>>>>>> On Apr 11, 2019, at 9:32 PM, Yu Li <car...@gmail.com> wrote:
> > > >>>>>>>
> > > >>>>>>> Confirmed in 1.4.7 source the listed out cases passed (all in the 1st part
> > > >>>>>>> of hbase-server so the result comes out quickly.)... Also confirmed the
> > > >>>>>>> test ran order are the same...
> > > >>>>>>>
> > > >>>>>>> Will try 1.5.0 again to prevent the environment difference caused by time.
> > > >>>>>>> If 1.5.0 still fails, will start to do the git bisect to locate the first
> > > >>>>>>> bad commit.
> > > >>>>>>>
> > > >>>>>>> Was also expecting an easy pass and +1 as always to save time and efforts,
> > > >>>>>>> but obvious no luck. However it's good to find the issue earlier if there
> > > >>>>>>> really is any, before release announced.
> > > >>>>>>>
> > > >>>>>>> Best Regards,
> > > >>>>>>> Yu
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>> On Fri, 12 Apr 2019 at 12:16, Yu Li <car...@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>> Fine, let's focus on verifying whether it's a real problem rather than
> > > >>>>>>>> arguing about wording, after all that's not my intention...
> > > >>>>>>>>
> > > >>>>>>>> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I was
> > > >>>>>>>> using the same env and all tests passed w/o issue, that's where my concern
> > > >>>>>>>> lies and the main reason I gave a -1 vote. I'm running against 1.4.7 source
> > > >>>>>>>> on the same now and let's see the result.
> > > >>>>>>>>
> > > >>>>>>>> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
> > > >>>>>>>>
> > > >>>>>>>> Best Regards,
> > > >>>>>>>> Yu
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <andrew.purt...@gmail.com>
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> I believe the test execution order matters. We run some tests in
> > > >>>>>>>>> parallel. The ordering of tests is determined by readdir() results and this
> > > >>>>>>>>> differs from host to host and checkout to checkout. So when you see a
> > > >>>>>>>>> repeatable group of failures, that’s great. And when someone else doesn’t
> > > >>>>>>>>> see those same tests fail, or they cannot be reproduced when running by
> > > >>>>>>>>> themselves, the commonly accepted term of art for this is “flaky”.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>> On Apr 11, 2019, at 8:52 PM, Yu Li <car...@gmail.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Sorry but I'd call it "possible environment related problem" or "some
> > > >>>>>>>>>> feature may not work well in specific environment", rather than a flaky.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Will check against 1.4.7 released source package before opening any JIRA.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best Regards,
> > > >>>>>>>>>> Yu
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <andrew.purt...@gmail.com>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> And if they pass in my environment, then what should we call it then. I
> > > >>>>>>>>>>> have no doubt you are seeing failures. Therefore can you please file JIRAs
> > > >>>>>>>>>>> and attach information that can help identify a fix. Thanks.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <car...@gmail.com> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
> > > >>>>>>>>>>>> and on two different env separately, so it sums up to 6 times stable
> > > >>>>>>>>>>>> failure for each case, and from my perspective this is not flaky.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> IIRC last time when verifying 1.4.7 on the same env no such issue observed,
> > > >>>>>>>>>>>> will double check.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Best Regards,
> > > >>>>>>>>>>>> Yu
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <andrew.purt...@gmail.com>
> > > >>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> There are two failure cases it looks like. And this looks like flakes.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> The wrong FS assertions are not something I see when I run these tests
> > > >>>>>>>>>>>>> myself. I am not able to investigate something I can’t reproduce. What I
> > > >>>>>>>>>>>>> suggest is since you can reproduce do a git bisect to find the commit that
> > > >>>>>>>>>>>>> introduced the problem. Then we can revert it. As an alternative we can
> > > >>>>>>>>>>>>> open a JIRA, report the problem, temporarily @ignore the test, and
> > > >>>>>>>>>>>>> continue. This latter option only should be done if we are fairly confident
> > > >>>>>>>>>>>>> it is a test only problem.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> The connect exceptions are interesting. I see these sometimes when the
> > > >>>>>>>>>>>>> suite is executed, not this particular case, but when the failed test is
> > > >>>>>>>>>>>>> executed by itself it always passes. It is possible some change to classes
> > > >>>>>>>>>>>>> related to the minicluster or startup or shutdown timing are the cause, but
> > > >>>>>>>>>>>>> it is test time flaky behavior. I’m not happy about this but it doesn’t
> > > >>>>>>>>>>>>> actually fail the release because the failure is never repeatable when the
> > > >>>>>>>>>>>>> test is run standalone.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> In general it would be great if some attention was paid to test
> > > >>>>>>>>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist that
> > > >>>>>>>>>>>>> everything is perfect or there will never be another 1.x release, certainly
> > > >>>>>>>>>>>>> not from branch-1. So, tests which fail repeatedly block a release IMHO but
> > > >>>>>>>>>>>>> flakes do not.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <car...@gmail.com> wrote:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> -1
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Observed many UT failures when checking the source package (tried multiple
> > > >>>>>>>>>>>>>> rounds on two different environments, MacOs and Linux, got the same
> > > >>>>>>>>>>>>>> result), including (but not limited to):
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> TestBulkload:
> > > >>>>>>>>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad) Time elapsed: 0.083 s <<< ERROR!
> > > >>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
> > > >>>>>>>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
> > > >>>>>>>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
> > > >>>>>>>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> TestStoreFile:
> > > >>>>>>>>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile) Time elapsed: 0.083 s <<< ERROR!
> > > >>>>>>>>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> > > >>>>>>>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
> > > >>>>>>>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> TestHFile:
> > > >>>>>>>>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile) Time elapsed: 0.08 s <<< ERROR!
> > > >>>>>>>>>>>>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> > > >>>>>>>>>>>>>>     at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> > > >>>>>>>>>>>>>> Caused by: java.net.ConnectException: Connection refused
> > > >>>>>>>>>>>>>>     at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> TestBlocksScanned:
> > > >>>>>>>>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned) Time elapsed: 0.069 s <<< ERROR!
> > > >>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
> > > >>>>>>>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
> > > >>>>>>>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> And please let me know if any known issue I'm not aware of. Thanks.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Best Regards,
> > > >>>>>>>>>>>>>> Yu
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <car...@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag due to
> > > >>>>>>>>>>>>>>> Qingming Festival Holiday here in China)
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Still verifying the release, just some quick feedback: observed some
> > > >>>>>>>>>>>>>>> incompatible changes in compatibility report including
> > > >>>>>>>>>>>>>>> HBASE-21492/HBASE-21684 and worth a reminder in ReleaseNote.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Irrelative but noticeable: the 1.4.9 release note URL is invalid on
> > > >>>>>>>>>>>>>>> https://hbase.apache.org/downloads.html
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Best Regards,
> > > >>>>>>>>>>>>>>> Yu
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <apurt...@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> The difference is basically noise per the usual YCSB evaluation. Small
> > > >>>>>>>>>>>>>>>> differences in workloads D and F (slightly worse) and workload E (slightly
> > > >>>>>>>>>>>>>>>> better) that do not indicate serious regression.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
> > > >>>>>>>>>>>>>>>> c3.8xlarge x 5
> > > >>>>>>>>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
> > > >>>>>>>>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
> > > >>>>>>>>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
> > > >>>>>>>>>>>>>>>> Hadoop 2.9.2
> > > >>>>>>>>>>>>>>>> Init: Load 100 M rows and snapshot
> > > >>>>>>>>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
> > > >>>>>>>>>>>>>>>> Args: -threads 100 -target 50000
> > > >>>>>>>>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY
> > > >>>>>>>>>>>>>>>> => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING =>
> > > >>>>>>>>>>>>>>>> 'ROW_INDEX_V1', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS =>
> > > >>>>>>>>>>>>>>>> '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE =>
> > > >>>>>>>>>>>>>>>> '0'}
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload A
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 50k/op/s                      1.4.9    1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)               200592   200583
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)       49852    49855
> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us)           544      559
> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us)               267      292
> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us)               165631   185087
> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us)    738      742
> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us)    1877     1961
> > > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us)         1370     1181
> > > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us)             702      646
> > > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us)             180735   177279
> > > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us)  1943     1652
> > > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us)  3257     3085
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload B
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 50k/op/s                      1.4.9    1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)               200599   200581
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)       49850    49855
> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us)           454      471
> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us)               203      213
> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us)               183423   174207
> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us)    563      599
> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us)    1360     1172
> > > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us)         1064     1029
> > > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us)             746      726
> > > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us)             163455   101631
> > > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us)  1327     1157
> > > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us)  2241     1898
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload C
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 50k/op/s                      1.4.9    1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)               200541   200538
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)       49865    49865
> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us)           332      327
> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us)               175      179
> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us)               210559   170367
> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us)    410      396
> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us)    871      892
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload D
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 50k/op/s                      1.4.9    1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)               200579   200562
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)       49855    49859
> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us)           487      547
> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us)               210      214
> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us)               192255   177535
> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us)    973      1529
> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us)    1836     2683
> > > >>>>>>>>>>>>>>>> [INSERT], AverageLatency(us)         1239     1152
> > > >>>>>>>>>>>>>>>> [INSERT], MinLatency(us)             807      788
> > > >>>>>>>>>>>>>>>> [INSERT], MaxLatency(us)             184575   148735
> > > >>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us)  1496     1243
> > > >>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us)  2965     2495
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload E
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 10k/op/s                      1.4.9    1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)               100605   100568
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)       9939     9943
> > > >>>>>>>>>>>>>>>> [SCAN], AverageLatency(us)           3548     2687
> > > >>>>>>>>>>>>>>>> [SCAN], MinLatency(us)               696      678
> > > >>>>>>>>>>>>>>>> [SCAN], MaxLatency(us)               1059839  238463
> > > >>>>>>>>>>>>>>>> [SCAN], 95thPercentileLatency(us)    8327     6791
> > > >>>>>>>>>>>>>>>> [SCAN], 99thPercentileLatency(us)    17647    14415
> > > >>>>>>>>>>>>>>>> [INSERT], AverageLatency(us)         2688     1555
> > > >>>>>>>>>>>>>>>> [INSERT], MinLatency(us)             887      815
> > > >>>>>>>>>>>>>>>> [INSERT], MaxLatency(us)             173311   154623
> > > >>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us)  4455     2571
> > > >>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us)  9303     5375
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload F
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 50k/op/s                                 1.4.9    1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)                          200562   204178
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)                  49859    48976
> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us)                      856      1137
> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us)                          262      257
> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us)                          205567   222335
> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us)               2365     3475
> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us)               3099     4143
> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us)         2559     2917
> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us)             1100     1034
> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us)             208767   204799
> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us)  5747     7627
> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us)  7203     8919
> > > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us)                    1700     1777
> > > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us)                        737      687
> > > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us)                        97983    94271
> > > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us)             3377     4147
> > > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us)             4147     4831
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <car...@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Thanks for the efforts boss.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Since it's a new minor release, do we have performance comparison report
> > > >>>>>>>>>>>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any reference? Many
> > > >>>>>>>>>>>>>>>>> thanks!
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Best Regards,
> > > >>>>>>>>>>>>>>>>> Yu
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <apurt...@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for download at
> > > >>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and Maven
> > > >>>>>>>>>>>>>>>>>> artifacts are available in the temporary repository
> > > >>>>>>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3' (b0bc7225c5).
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> A detailed source and binary compatibility report for this release is
> > > >>>>>>>>>>>>>>>>>> available for your review at
> > > >>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html .
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> A list of the 115 issues resolved in this release can be found at
> > > >>>>>>>>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
> > > >>>>>>>>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The vote will be open for at least 72 hours. Unless objection I will try to
> > > >>>>>>>>>>>>>>>>>> close it Friday April 12, 2019 if we have sufficient votes.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Prior to making this announcement I made the following preflight checks:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> RAT check passes (7u80)
> > > >>>>>>>>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
> > > >>>>>>>>>>>>>>>>>> Opened the UI in a browser, poked around
> > > >>>>>>>>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates (8u181)
> > > >>>>>>>>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
> > > >>>>>>>>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905. These flaky
> > > >>>>>>>>>>>>>>>>>> tests do not represent serious test failures that would prevent a release.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> --
> > > >>>>>>>>>>>>>>>>>> Best regards,
> > > >>>>>>>>>>>>>>>>>> Andrew
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> --
> > > >>>>>>>>>>>>>>>> Best regards,
> > > >>>>>>>>>>>>>>>> Andrew
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
> > > >>>>>>>>>>>>>>>> decrepit hands
> > > >>>>>>>>>>>>>>>>    - A23, Crosstalk
> >
> >
> > --
> > Best regards,
> > Andrew
> >
> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > decrepit hands
> >    - A23, Crosstalk
>

--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk