[jira] [Commented] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing
[ https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546018#comment-16546018 ]

stack commented on HBASE-20875:
-------------------------------

bq. In the jitviewer I also noticed that in the hot path (the read path in this case) there are instances where it says 'callee too big', because of which inlining does not happen.

Yeah. If we can make it inline by making the methods smaller and/or simplifying the options/types, it usually goes faster.

> MemStoreLABImp::copyIntoCell uses 7% CPU when writing
> -----------------------------------------------------
>
>                 Key: HBASE-20875
>                 URL: https://issues.apache.org/jira/browse/HBASE-20875
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Performance
>    Affects Versions: 2.0.1
>            Reporter: stack
>            Assignee: stack
>            Priority: Major
>             Fix For: 2.0.2
>
>         Attachments: 0001-HBASE-20875-MemStoreLABImp-copyIntoCell-uses-7-CPU-w.patch, 2.0707.baseline.91935.cpu.svg, 2.0711.patched.145414.cpu.svg, HBASE-20875.master.001.patch, HBASE-20875.master.002.patch, Screen Shot 2018-07-11 at 9.52.46 PM.png
>
>
> Looks like this with a PE random write loading:
> {code}
> ./hbase/bin/hbase --config ~/conf_hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --presplit=40 --size=30 --columns=10 --valueSize=100 randomWrite 200
> {code}
> ... against a single server.
> {code}
> 12.47%  perf-91935.map  [.] Lorg/apache/hadoop/hbase/BBKVComparator;::compare
> 10.42%  libjvm.so       [.] ParNewGeneration::copy_to_survivor_space_avoiding_promotion_undo(ParScanThreadState*, oopDesc*, unsigned long, markOopDesc*)
>  6.78%  perf-91935.map  [.] Lorg/apache/hadoop/hbase/regionserver/MemStoreLABImpl;::copyCellInto
> {code}
> These are the top CPU consumers, captured with perf-map-agent ./bin/perf-java-top...

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
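The "callee too big" message comes from HotSpot's bytecode-size inlining thresholds (-XX:MaxInlineSize for cold call sites, -XX:FreqInlineSize for hot ones). A minimal sketch of the remedy discussed above, with hypothetical names rather than HBase code: keep the common fast path in a tiny method and push rare handling out to a separate method, so the caller stays within the inliner's budget.

```java
// Sketch only: demonstrates the "split the method" technique for the JIT,
// not the actual MemStoreLABImpl code.
class InlineFriendly {

    // Small fast-path method: trivial bytecode, easy for HotSpot to inline.
    static int copySize(int requested, int limit) {
        if (requested <= limit) {
            return requested;              // common case
        }
        return slowPath(requested, limit); // rare case pushed out of line
    }

    // Cold path: its size no longer counts against the caller's inlining budget.
    private static int slowPath(int requested, int limit) {
        // imagine expensive handling here (logging, fallback allocation, ...)
        return limit;
    }

    public static void main(String[] args) {
        // Run with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining to see
        // which call sites HotSpot inlines or rejects ("callee too big").
        System.out.println(copySize(10, 100));  // 10
        System.out.println(copySize(500, 100)); // 100
    }
}
```

The jitviewer mentioned in the comment visualizes exactly this per-call-site inlining decision, so before/after runs show whether the split helped.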
[jira] [Commented] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing
[ https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546016#comment-16546016 ]

ramkrishna.s.vasudevan commented on HBASE-20875:
------------------------------------------------

In the jitviewer I also noticed that in the hot path (the read path in this case) there are instances where it says 'callee too big', because of which inlining does not happen.
[jira] [Commented] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing
[ https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546012#comment-16546012 ]

ramkrishna.s.vasudevan commented on HBASE-20875:
------------------------------------------------

Nice one. +1 on the patch.
[jira] [Comment Edited] (HBASE-20901) Reducing region replica has no effect
[ https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545988#comment-16545988 ]

Ted Yu edited comment on HBASE-20901 at 7/17/18 3:27 AM:
---------------------------------------------------------

{code}
+  public static byte[] getRegionStateColumn(int replicaId) {
{code}
The new methods can be package private, right?

was (Author: yuzhih...@gmail.com):
{code}
+  public static byte[] getRegionStateColumn(int replicaId) {
{code}
The new methods can be private, right (only accessed in MetaTableAccessor)?

> Reducing region replica has no effect
> -------------------------------------
>
>                 Key: HBASE-20901
>                 URL: https://issues.apache.org/jira/browse/HBASE-20901
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>            Priority: Major
>              Labels: replica
>         Attachments: HBASE-20901.patch
>
>
> While reducing the region replica count, the server name (sn) and state columns of the replica are not deleted, causing the AssignmentManager to think that these regions are CLOSED and assign them again.
[jira] [Commented] (HBASE-20901) Reducing region replica has no effect
[ https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545988#comment-16545988 ]

Ted Yu commented on HBASE-20901:
--------------------------------

{code}
+  public static byte[] getRegionStateColumn(int replicaId) {
{code}
The new methods can be private, right (only accessed in MetaTableAccessor)?
[jira] [Updated] (HBASE-18201) add UT and docs for DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kuan-Po Tseng updated HBASE-18201:
----------------------------------
    Attachment: HBASE-18201.master.005.patch

> add UT and docs for DataBlockEncodingTool
> -----------------------------------------
>
>                 Key: HBASE-18201
>                 URL: https://issues.apache.org/jira/browse/HBASE-18201
>             Project: HBase
>          Issue Type: Sub-task
>          Components: tooling
>            Reporter: Chia-Ping Tsai
>            Assignee: Kuan-Po Tseng
>            Priority: Minor
>              Labels: beginner
>         Attachments: HBASE-18201.master.001.patch, HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, HBASE-18201.master.003.patch, HBASE-18201.master.004.patch, HBASE-18201.master.005.patch, HBASE-18201.master.005.patch, HBASE-18201.master.005.patch
>
>
> There are no examples, documents, or tests for DataBlockEncodingTool. We should make it friendly if any use case exists. Otherwise, we should just get rid of it, because DataBlockEncodingTool presumes that the cell implementation returned from DataBlockEncoder is KeyValue. This assumption may obstruct the cleanup of KeyValue references in the read/write path of the code base.
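The problem the description names (a tool presuming one concrete Cell implementation) can be sketched in plain Java. These are simplified, hypothetical stand-ins for the real HBase `Cell`/`KeyValue` types, only to illustrate why the downcast obstructs cleanup:

```java
// Illustrative sketch, not the real HBase classes.
interface Cell {
    int getValueLength();
}

class KeyValue implements Cell {
    public int getValueLength() { return 42; }
    int getSerializedSize() { return 100; }  // KeyValue-only method
}

// A second implementation, e.g. one backed by an off-heap ByteBuffer.
class ByteBufferCell implements Cell {
    public int getValueLength() { return 42; }
}

class EncodingTool {
    // Fragile: presumes every Cell is a KeyValue, so it throws
    // ClassCastException as soon as an encoder returns anything else.
    static int sizePresumingKeyValue(Cell c) {
        return ((KeyValue) c).getSerializedSize();
    }

    // Robust: uses only the Cell interface, so any implementation works.
    static int sizeViaInterface(Cell c) {
        return c.getValueLength();
    }
}
```

Every such downcast is one more place that must keep returning real `KeyValue` objects, which is exactly what blocks removing `KeyValue` from the read/write path.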
[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close
[ https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545985#comment-16545985 ]

Hadoop QA commented on HBASE-20704:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 52s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 1m 44s{color} | {color:red} hbase-server generated 1 new + 187 unchanged - 1 fixed = 188 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 12s{color} | {color:red} hbase-server: The patch generated 9 new + 53 unchanged - 0 fixed = 62 total (was 53) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 31s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 3s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}146m 50s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}189m 29s{color} | {color:black} {color} |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20704 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931863/HBASE-20704.004.draft.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux a48e258cbb76 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 2997b6d071 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| javac | https://builds.apache.org/job/PreCommit-HBASE-Build/13643/artifact/patchprocess/diff-compile-javac-hbase-server.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/13643/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
| Test Results |
[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts
[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545969#comment-16545969 ]

Duo Zhang commented on HBASE-20846:
-----------------------------------

Thanks [~stack]. I checked the code: for a procedure in ROLLEDBACK state we will call store.delete to remove it, so we should not update the lock operation any more. And we are getting closer. Let me check the failed UTs.

> Restore procedure locks when master restarts
> --------------------------------------------
>
>                 Key: HBASE-20846
>                 URL: https://issues.apache.org/jira/browse/HBASE-20846
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.1.0
>            Reporter: Allan Yang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 3.0.0, 2.0.2, 2.1.1
>
>         Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one while investigating a ModifyTableProcedure that got stuck while a MoveRegionProcedure was going on after a master restart.
> Though this issue can be solved by HBASE-20752, I discovered something else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared lock. So, when an UnassignProcedure is spawned, it will not check the table's shared lock, since it is sure that its parent (MoveRegionProcedure) has acquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no need to take
> // shared lock here. Otherwise, take shared lock.
> if (!procedure.hasParent()
>     && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
> }
> {code}
> But that is not the case when the master was restarted. The child procedure (UnassignProcedure) is executed first after the restart. Though it has a parent (MoveRegionProcedure), the parent apparently does not hold the table's lock.
> So it begins to execute without holding the table's shared lock, and a ModifyTableProcedure can acquire the table's exclusive lock and execute at the same time, which is not possible if the master was not restarted.
> This would cause a stuck procedure before HBASE-20752; since HBASE-20752 has been fixed, I wrote a simple UT to reproduce this case.
> I think we don't have to check the parent for the table's shared lock. It is a shared lock, right? I think we can acquire it every time we need it.
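The suggestion at the end of the description (just acquire the shared lock every time, since shared locks admit multiple holders) can be modeled with a plain `ReentrantReadWriteLock`. This is a sketch of the locking idea only, not the procedure framework's actual lock implementation:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Model: a table lock where a shared (read) lock can be held by many
// holders at once, so a child procedure can safely acquire it itself
// instead of assuming its parent already holds it -- an assumption that,
// as the comment notes, breaks after a master restart.
class TableLockModel {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    boolean acquireShared()    { return lock.readLock().tryLock(); }
    void releaseShared()       { lock.readLock().unlock(); }
    boolean acquireExclusive() { return lock.writeLock().tryLock(); }
}
```

Both the parent (MoveRegionProcedure) and child (UnassignProcedure) can take the shared lock; while either holds it, an exclusive acquire (the ModifyTableProcedure case) fails, which is exactly the mutual exclusion the restart scenario loses.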
[jira] [Commented] (HBASE-20873) Update doc for Endpoint-based Export
[ https://issues.apache.org/jira/browse/HBASE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545966#comment-16545966 ]

Chia-Ping Tsai commented on HBASE-20873:
----------------------------------------

Nice docs. +1. I'm on vacation, so the patch will be committed later.

> Update doc for Endpoint-based Export
> ------------------------------------
>
>                 Key: HBASE-20873
>                 URL: https://issues.apache.org/jira/browse/HBASE-20873
>             Project: HBase
>          Issue Type: Improvement
>          Components: documentation
>    Affects Versions: 2.0.0
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Minor
>         Attachments: HBASE-20873.master.001.patch
>
>
> The current documentation on the usage is a little vague. I'd like to take a stab at expanding it, based on my experience.
[jira] [Updated] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing
[ https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allan Yang updated HBASE-20893:
-------------------------------
    Attachment: HBASE-20893.branch-2.0.002.patch

> Data loss if splitting region while ServerCrashProcedure executing
> ------------------------------------------------------------------
>
>                 Key: HBASE-20893
>                 URL: https://issues.apache.org/jira/browse/HBASE-20893
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>         Attachments: HBASE-20893.branch-2.0.001.patch, HBASE-20893.branch-2.0.002.patch
>
>
> A similar case to HBASE-20878.
[jira] [Commented] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing
[ https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545949#comment-16545949 ]

Hadoop QA commented on HBASE-20875:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 18s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 19s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 44s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}156m 51s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}200m 15s{color} | {color:black} {color} |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20875 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931851/HBASE-20875.master.002.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 10b47f4d2675 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 2997b6d071 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/13642/testReport/ |
| Max. process+thread count | 5200 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/13642/console |
| Powered by |
[jira] [Commented] (HBASE-20873) Update doc for Endpoint-based Export
[ https://issues.apache.org/jira/browse/HBASE-20873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545941#comment-16545941 ]

Wei-Chiu Chuang commented on HBASE-20873:
-----------------------------------------

[~chia7712] mind taking a look? Thanks.
[jira] [Updated] (HBASE-20901) Reducing region replica has no effect
[ https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ankit Singhal updated HBASE-20901:
----------------------------------
    Status: Patch Available  (was: Open)
[jira] [Work started] (HBASE-20901) Reducing region replica has no effect
[ https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-20901 started by Ankit Singhal.
---------------------------------------------
[jira] [Work stopped] (HBASE-20901) Reducing region replica has no effect
[ https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-20901 stopped by Ankit Singhal.
---------------------------------------------
[jira] [Updated] (HBASE-20901) Reducing region replica has no effect
[ https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ankit Singhal updated HBASE-20901:
----------------------------------
    Description: While reducing the region replica count, the server name (sn) and state columns of the replica are not deleted, causing the AssignmentManager to think that these regions are CLOSED and assign them again.
[jira] [Updated] (HBASE-20901) Reducing region replica has no effect
[ https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ankit Singhal updated HBASE-20901:
----------------------------------
    Attachment: HBASE-20901.patch
[jira] [Updated] (HBASE-20901) Reducing region replica has no effect
[ https://issues.apache.org/jira/browse/HBASE-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ankit Singhal updated HBASE-20901:
----------------------------------
    Environment:     (was: While reducing the region replica, server name(sn) and state column of the replica are not getting deleted, resulting in assignment manager to think that these regions are CLOSED and assign them again.)
[jira] [Created] (HBASE-20901) Reducing region replica has no effect
Ankit Singhal created HBASE-20901:
-------------------------------------

             Summary: Reducing region replica has no effect
                 Key: HBASE-20901
                 URL: https://issues.apache.org/jira/browse/HBASE-20901
             Project: HBase
          Issue Type: Bug
    Affects Versions: 2.0.0
         Environment: While reducing the region replica, server name(sn) and state column of the replica are not getting deleted, resulting in assignment manager to think that these regions are CLOSED and assign them again.
            Reporter: Ankit Singhal
            Assignee: Ankit Singhal
[jira] [Updated] (HBASE-20867) RS may get killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allan Yang updated HBASE-20867:
-------------------------------
    Attachment: HBASE-20867.branch-2.0.004.patch

> RS may get killed while master restarts
> ---------------------------------------
>
>                 Key: HBASE-20867
>                 URL: https://issues.apache.org/jira/browse/HBASE-20867
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>         Attachments: HBASE-20867.branch-2.0.001.patch, HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch, HBASE-20867.branch-2.0.004.patch
>
>
> If the master is dispatching an RPC call to an RS when aborting, a connection exception may be thrown by the RPC layer (an IOException with a "Connection closed" message in this case). The RSProcedureDispatcher will regard it as an un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, which will expire the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should handle those kinds of connection exceptions in RSProcedureDispatcher and retry the RPC call.
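The fix the description proposes (classify "Connection closed" IOExceptions as transient and retry the remote call instead of expiring the RS) can be sketched as a small retry loop. All names here are hypothetical illustrations, not the real RSProcedureDispatcher code:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Sketch of retry-on-transient-connection-error dispatch (hypothetical names).
class RetryingDispatcher {

    // Heuristic from the issue description: a "Connection closed" IOException
    // during a master restart is transient, not a sign of a dead RS.
    static boolean isRetryableConnectionError(IOException e) {
        String msg = e.getMessage();
        return msg != null && msg.contains("Connection closed");
    }

    static <T> T callWithRetries(Callable<T> rpc, int maxAttempts) throws Exception {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return rpc.call();
            } catch (IOException e) {
                if (!isRetryableConnectionError(e)) {
                    throw e;  // genuinely fatal: surface it to remoteCallFailed
                }
                last = e;     // transient: master restarting, try again
            }
        }
        throw last;           // exhausted the retry budget
    }
}
```

A production version would also back off between attempts and cap total retry time, but the key point is the classification step: only connection-level failures are retried, everything else still fails the remote call.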
[jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters
[ https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545914#comment-16545914 ]

Hudson commented on HBASE-18477:
--------------------------------

Results for branch HBASE-18477 [build #266 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/266/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/266//General_Nightly_Build_Report/]
(/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/266//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/266//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) {color:green}+1 source release artifact{color} -- See build output for details.
(x) {color:red}-1 client integration test{color} -- Failed when running client tests on top of Hadoop 2. [see log for details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/266//artifact/output-integration/hadoop-2.log]. (note that this means we didn't run on Hadoop 3)

> Umbrella JIRA for HBase Read Replica clusters
> ---------------------------------------------
>
>                 Key: HBASE-18477
>                 URL: https://issues.apache.org/jira/browse/HBASE-18477
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Zach York
>            Assignee: Zach York
>            Priority: Major
>         Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope doc_v2.docx, HBase Read-Replica Clusters Scope doc_v2.pdf
>
>
> Recently, changes (such as HBASE-17437) have unblocked HBase from running with a root directory external to the cluster (such as in Amazon S3). This means that the data is stored outside of the cluster and remains accessible after the cluster has been terminated. One use case that is often asked about is pointing multiple clusters at one root directory (sharing the data) to gain read resiliency in the case of a cluster failure.
>
> This is an umbrella JIRA to contain all the tasks necessary to create a read-replica HBase cluster that is pointed at the same root directory.
>
> This requires:
> making the read-replica cluster read-only (no metadata or data operations);
> separating the hbase:meta table for each cluster (otherwise HBase gets confused by multiple clusters trying to update the meta table with their IP addresses);
> adding refresh functionality for the meta table to ensure new metadata is picked up on the read-replica cluster;
> adding refresh functionality for the HFiles of a given table to ensure new data is picked up on the read-replica cluster.
>
> This can be used with any existing cluster that is backed by an external filesystem.
>
> Please note that this feature is still quite manual (with the potential for automation later).
>
> More information on this particular feature can be found here: https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
[jira] [Created] (HBASE-20900) Improve FsDelegationToken to support KMS delegation tokens
Wei-Chiu Chuang created HBASE-20900: --- Summary: Improve FsDelegationToken to support KMS delegation tokens Key: HBASE-20900 URL: https://issues.apache.org/jira/browse/HBASE-20900 Project: HBase Issue Type: Sub-task Reporter: Wei-Chiu Chuang Assignee: Wei-Chiu Chuang Currently FsDelegationToken acquires only an HDFS delegation token. Any tool that uses it to access files in an encryption zone could fail because it lacks a KMS delegation token. We should fix that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20899) Add Hadoop KMS dependency and basic HDFS at-rest encryption tests
Wei-Chiu Chuang created HBASE-20899: --- Summary: Add Hadoop KMS dependency and basic HDFS at-rest encryption tests Key: HBASE-20899 URL: https://issues.apache.org/jira/browse/HBASE-20899 Project: HBase Issue Type: Sub-task Components: encryption Affects Versions: 2.0.0 Reporter: Wei-Chiu Chuang Assignee: Wei-Chiu Chuang We should start by adding hadoop-kms dependency in HBase test scope, and add basic HDFS at-rest encryption tests using the hadoop-kms dependency. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545899#comment-16545899 ] huaxiang sun commented on HBASE-20697: -- [~stack] The fix is generic: getAllRegionLocations was not caching all regions' locations; instead, it cached only the first entry. With the fix, the region replica case is also taken care of. I think we need to backport this to 1.2 and 1.3 as well. > Can't cache All region locations of the specify table by calling > table.getRegionLocator().getAllRegionLocations() > - > > Key: HBASE-20697 > URL: https://issues.apache.org/jira/browse/HBASE-20697 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.2.6, 2.0.1 >Reporter: zhaoyuan >Assignee: zhaoyuan >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.4.6, 2.0.2, 2.2.0, 2.1.1 > > Attachments: HBASE-20697.branch-1.2.001.patch, > HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, > HBASE-20697.branch-1.2.004.patch, HBASE-20697.branch-1.addendum.patch, > HBASE-20697.master.001.patch, HBASE-20697.master.002.patch, > HBASE-20697.master.002.patch, HBASE-20697.master.003.patch > > > When we upgrade and restart a new version of an application that reads and > writes to HBase, we get some operation timeouts. The timeouts are expected > because when the application restarts, it does not hold any region location > cache and must communicate with ZooKeeper and the meta regionserver to get region > locations. > We want to avoid these timeouts, so we do warmup work; as far as I am > concerned, the method table.getRegionLocator().getAllRegionLocations() should > fetch all region locations and cache them. However, it didn't work well. > There were still a lot of timeouts, which confused me. 
> I dug into the source code and found the following:
> {code:java}
> // code placeholder
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
>
> // In MetaCache
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte [] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It collects all regions into one RegionLocations object and caches only the first non-null region location. Then, when we put to or get from HBase, we call getCachedLocation()
> {code:java}
> // code placeholder
> public RegionLocations getCachedLocation(final TableName tableName, final byte [] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It chooses the first location as possibleRegion, which can mismatch the row. > So did I forget something, or am I wrong somewhere? If this is indeed a bug, I think it is not very hard to fix. > Hope committers and PMC review this! > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
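The miss described in the report can be reproduced with plain collections: when all regions end up cached under a single entry keyed by the first region's start key, a floorEntry lookup for a row served by a later region finds that one entry, fails the end-key check, and reports a miss. A minimal stand-in sketch (String keys in place of byte[], hypothetical names, not HBase code):

```java
import java.util.Map;
import java.util.TreeMap;

public class MetaCacheSketch {
    // Simulated meta cache: region start key -> that region's end key.
    static final TreeMap<String, String> cache = new TreeMap<>();

    // Mimics MetaCache.getCachedLocation: floor lookup by row, then check
    // that the cached region's end key is past the row.
    static String lookup(String row) {
        Map.Entry<String, String> e = cache.floorEntry(row);
        if (e == null) {
            return null;  // cache miss
        }
        String endKey = e.getValue();
        // "" stands in for HConstants.EMPTY_END_ROW (last region in table).
        if (endKey.isEmpty() || endKey.compareTo(row) > 0) {
            return e.getKey();  // cache hit: start key of the covering region
        }
        return null;  // row belongs to a later region: miss
    }

    public static void main(String[] args) {
        // Buggy state: only the first region ["", "m") was cached, even
        // though the table also has a second region ["m", "").
        cache.put("", "m");
        System.out.println(lookup("a") != null);  // true: first region hit
        System.out.println(lookup("z") != null);  // false: miss despite the warmup

        // Fixed state: one cache entry per region start key.
        cache.put("m", "");
        System.out.println(lookup("z") != null);  // true: second region hit
    }
}
```

Caching one entry per region start key, as the patch does, is what makes the floor lookup land on the right region.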
[jira] [Updated] (HBASE-20686) Asyncfs should retry upon RetryStartFileException
[ https://issues.apache.org/jira/browse/HBASE-20686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HBASE-20686: Issue Type: Sub-task (was: Bug) Parent: HBASE-20898 > Asyncfs should retry upon RetryStartFileException > - > > Key: HBASE-20686 > URL: https://issues.apache.org/jira/browse/HBASE-20686 > Project: HBase > Issue Type: Sub-task > Components: asyncclient >Affects Versions: 2.0.0-beta-1 > Environment: HBase 2.0, Hadoop 3 with at-rest encryption >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Major > Attachments: HBASE-20686.master.001.patch, > HBASE-20686.master.002.patch > > > In Hadoop 2.6 and above, the HDFS client retries on RetryStartFileException when > the NameNode experiences an encryption-zone-related issue. The code lives in > DFSOutputStream#newStreamForCreate(). (HDFS-6970) > In HBase 2's asyncfs implementation, > FanOutOneBlockAsyncDFSOutputHelper#createOutput() is essentially an imitation of > HDFS's DFSOutputStream#newStreamForCreate(). However, it does not retry upon > RetryStartFileException, so it is less resilient to such issues. > Also, DFSOutputStream#newStreamForCreate() unwraps RemoteExceptions, but > asyncfs does not. Therefore, HBase gets different exceptions than before. > Filing this jira to get this corrected. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
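The missing behavior is a bounded retry around the create call, of the shape DFSOutputStream#newStreamForCreate uses. A self-contained sketch of that pattern — the exception class and createOnce() here are local stand-ins, not the real HDFS API:

```java
import java.io.IOException;

public class RetryCreateSketch {
    // Local stand-in for HDFS's RetryStartFileException (not the real class).
    static class RetryStartFileException extends IOException {}

    // Simulated namenode create(): fails this many times, then succeeds.
    static int failuresLeft = 2;

    static String createOnce(String path) throws IOException {
        if (failuresLeft-- > 0) {
            throw new RetryStartFileException();
        }
        return "stream:" + path;
    }

    // The retry pattern HBASE-20686 proposes adding to the asyncfs
    // createOutput path: retry the transient exception, give up after a cap.
    static String createWithRetries(String path, int maxRetries) throws IOException {
        for (int attempt = 0; ; attempt++) {
            try {
                return createOnce(path);
            } catch (RetryStartFileException e) {
                if (attempt >= maxRetries) {
                    throw new IOException("Too many create retries for " + path, e);
                }
                // Transient encryption-zone condition on the NameNode: retry.
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(createWithRetries("/enc/zone/file", 10));  // stream:/enc/zone/file
    }
}
```

In the real asyncfs code the retry would wrap the namenode create RPC, and RemoteExceptions would be unwrapped before classification.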
[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545896#comment-16545896 ] stack commented on HBASE-20697: --- So, this issue fixes caching of region replicas? We weren't doing it previous? > Can't cache All region locations of the specify table by calling > table.getRegionLocator().getAllRegionLocations() > - > > Key: HBASE-20697 > URL: https://issues.apache.org/jira/browse/HBASE-20697 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.2.6, 2.0.1 >Reporter: zhaoyuan >Assignee: zhaoyuan >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.4.6, 2.0.2, 2.2.0, 2.1.1 > > Attachments: HBASE-20697.branch-1.2.001.patch, > HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, > HBASE-20697.branch-1.2.004.patch, HBASE-20697.branch-1.addendum.patch, > HBASE-20697.master.001.patch, HBASE-20697.master.002.patch, > HBASE-20697.master.002.patch, HBASE-20697.master.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20898) Improve support for HDFS at-rest encryption
Wei-Chiu Chuang created HBASE-20898: --- Summary: Improve support for HDFS at-rest encryption Key: HBASE-20898 URL: https://issues.apache.org/jira/browse/HBASE-20898 Project: HBase Issue Type: Umbrella Components: encryption Affects Versions: 2.0.0 Environment: HBase 2 on Hadoop 2.6.0+ (HDFS at-rest encryption) Reporter: Wei-Chiu Chuang Assignee: Wei-Chiu Chuang *Note* this has nothing to do with HBase's Transparent Encryption of Data At Rest. HDFS's at-rest encryption is "transparent" in that encrypt/decrypt itself doesn't require client-side changes. However, in practice there are a few cases that need to be taken care of. For example, accessing KMS requires KMS delegation tokens; if HBase tools get only HDFS delegation tokens, they will fail to access files in an HDFS encryption zone. Cases such as HBASE-20403 suggest that in some situations HBase behaves differently on an HDFS-encrypted cluster. I propose an umbrella jira to revisit HDFS at-rest encryption support in the various HBase subcomponents and tools, add additional tests, and enhance the tools as we visit them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545890#comment-16545890 ] stack commented on HBASE-20697: --- [~zghaobac] Thanks for the backport. > Can't cache All region locations of the specify table by calling > table.getRegionLocator().getAllRegionLocations() > - > > Key: HBASE-20697 > URL: https://issues.apache.org/jira/browse/HBASE-20697 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.2.6, 2.0.1 >Reporter: zhaoyuan >Assignee: zhaoyuan >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.4.6, 2.0.2, 2.2.0, 2.1.1 > > Attachments: HBASE-20697.branch-1.2.001.patch, > HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, > HBASE-20697.branch-1.2.004.patch, HBASE-20697.branch-1.addendum.patch, > HBASE-20697.master.001.patch, HBASE-20697.master.002.patch, > HBASE-20697.master.002.patch, HBASE-20697.master.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close
[ https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545873#comment-16545873 ] Francis Liu edited comment on HBASE-20704 at 7/17/18 12:14 AM: --- {quote}expecting eventual GC to call a finalizer that cleans things up {quote} AFAIK it should get cleaned up either via another next() rpc (which fails because the region is closed) or via scanner lease expiration processing. The readers won't be garbage until the scanner state is cleaned up. In any case it would create objects that give GC more work, though that doesn't sound like it will be significant, and it is generally just part of normal operation, i.e. scan leases expiring and pauses between next() rpc calls. The trade-off is that we now have concurrent threads accessing a map during StoreFileScanner creation and close for streaming scans. The overhead may be negligible, assuming streaming scans are meant for large scans. I've attached a rough patch showing how it would look. Let me know what you think. was (Author: toffer): {quote}expecting eventual GC to call a finalizer that cleans things up {quote} AFAIK it should get cleaned up either via another next() rpc (which fails because the region is closed) or via scanner lease expiration processing. The readers won't be garbage until the scanner state is cleaned up. In any case it would create objects that give GC more work. The trade-off is that we now have concurrent threads accessing a map during StoreFileScanner creation and close for streaming scans. The overhead may be negligible, assuming streaming scans are meant for large scans. I've attached a rough patch showing how it would look. Let me know what you think. 
> Sometimes some compacted storefiles are not archived on region close > > > Key: HBASE-20704 > URL: https://issues.apache.org/jira/browse/HBASE-20704 > Project: HBase > Issue Type: Bug > Components: Compaction >Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0 >Reporter: Francis Liu >Assignee: Francis Liu >Priority: Critical > Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch, > HBASE-20704.003.patch, HBASE-20704.004.draft.patch > > > During region close, compacted files which have not yet been archived by the > discharger are archived as part of the region closing process. It is > important that these files are wholly archived to ensure data consistency: > a storefile containing delete tombstones can be archived while older > storefiles containing cells that were supposed to be deleted are left > unarchived, thereby un-deleting those cells. > On region close a compacted storefile is skipped from archiving if it has > read references (i.e. open scanners). This behavior is correct when the > discharger chore runs, but on region close consistency is of course more > important, so we should add a special case to ignore any references on the > storefile and go ahead and archive it. > The attached patch contains a unit test that reproduces the problem and the > proposed fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
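The two policies in the description — the discharger skipping files that still have readers versus region close archiving unconditionally — can be contrasted in a small stand-in sketch. CompactedFile and its reference counter are illustrative, not HBase classes:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ArchiveSketch {
    static class CompactedFile {
        final String name;
        final int readers;  // open scanner references on this storefile
        CompactedFile(String name, int readers) { this.name = name; this.readers = readers; }
    }

    // Discharger chore policy: skip files that still have readers.
    static List<String> dischargerArchive(List<CompactedFile> files) {
        List<String> archived = new ArrayList<>();
        for (Iterator<CompactedFile> it = files.iterator(); it.hasNext(); ) {
            CompactedFile f = it.next();
            if (f.readers == 0) {
                archived.add(f.name);
                it.remove();
            }
        }
        return archived;
    }

    // Region-close policy (the proposed fix): archive everything, ignoring
    // references, so a tombstone-bearing file is never archived while an
    // older file holding the "deleted" cells is left behind.
    static List<String> closeArchive(List<CompactedFile> files) {
        List<String> archived = new ArrayList<>();
        for (CompactedFile f : files) {
            archived.add(f.name);
        }
        files.clear();
        return archived;
    }

    public static void main(String[] args) {
        List<CompactedFile> compacted = new ArrayList<>();
        compacted.add(new CompactedFile("older-with-deleted-cells", 1));
        compacted.add(new CompactedFile("newer-with-tombstones", 0));

        // The discharger leaves the referenced older file behind...
        System.out.println(dischargerArchive(compacted));  // [newer-with-tombstones]
        // ...so on close the remaining file must be archived regardless.
        System.out.println(closeArchive(compacted));       // [older-with-deleted-cells]
    }
}
```

The inconsistency arises exactly when only the first call runs before close and the second never ignores the references.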
[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close
[ https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545873#comment-16545873 ] Francis Liu commented on HBASE-20704: - {quote}expecting eventual GC to call a finalizer that cleans things up {quote} AFAIK it should get cleaned up either via another next() rpc (which fails because the region is closed) or via scanner lease expiration processing. The readers won't be garbage until the scanner state is cleaned up. In any case it would create objects that give GC more work. The trade-off is that we now have concurrent threads accessing a map during StoreFileScanner creation and close for streaming scans. The overhead may be negligible, assuming streaming scans are meant for large scans. I've attached a rough patch showing how it would look. Let me know what you think. > Sometimes some compacted storefiles are not archived on region close > > > Key: HBASE-20704 > URL: https://issues.apache.org/jira/browse/HBASE-20704 > Project: HBase > Issue Type: Bug > Components: Compaction >Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0 >Reporter: Francis Liu >Assignee: Francis Liu >Priority: Critical > Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch, > HBASE-20704.003.patch, HBASE-20704.004.draft.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64
[ https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545869#comment-16545869 ] Hudson commented on HBASE-20884: Results for branch branch-1.3 [build #394 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > Replace usage of our Base64 implementation with java.util.Base64 > > > Key: HBASE-20884 > URL: https://issues.apache.org/jira/browse/HBASE-20884 > Project: HBase > Issue Type: Task >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1 > > Attachments: HBASE-20884.branch-1.001.patch, > HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch > > > We have a public domain implementation of Base64 that is copied into our code > base and infrequently receives updates. We should replace usage of that with > the new Java 8 java.util.Base64 where possible. > For the migration, I propose a phased approach. > * Deprecate on 1.x and 2.x to signal to users that this is going away. > * Replace usages on branch-2 and master with j.u.Base64 > * Delete our implementation of Base64 on master. > Does this seem in line with our API compatibility requirements? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
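As a quick illustration of the replacement, java.util.Base64 (available since Java 8) covers the common encode/decode round trip with no project-local class:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Sketch {
    public static void main(String[] args) {
        byte[] raw = "hbase".getBytes(StandardCharsets.UTF_8);

        // Encode bytes to a Base64 string.
        String encoded = Base64.getEncoder().encodeToString(raw);
        System.out.println(encoded);  // aGJhc2U=

        // Decode back to the original bytes.
        byte[] decoded = Base64.getDecoder().decode(encoded);
        System.out.println(new String(decoded, StandardCharsets.UTF_8));  // hbase
    }
}
```

Callers needing URL-safe or MIME variants can use Base64.getUrlEncoder() / Base64.getMimeEncoder() instead.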
[jira] [Commented] (HBASE-20889) PE scan is failing with NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545870#comment-16545870 ] Hudson commented on HBASE-20889: Results for branch branch-1.3 [build #394 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/394//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > PE scan is failing with NullPointerException > > > Key: HBASE-20889 > URL: https://issues.apache.org/jira/browse/HBASE-20889 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.3 >Reporter: Vikas Vishwakarma >Assignee: Ted Yu >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.6 > > Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt > > > Command used > {code:java} > ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > > scan1{code} > PE scan 1 is failing with NullPointer > {code:java} > java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920) > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > 
org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530) > at > org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close
[ https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francis Liu updated HBASE-20704: Attachment: HBASE-20704.004.draft.patch > Sometimes some compacted storefiles are not archived on region close > > > Key: HBASE-20704 > URL: https://issues.apache.org/jira/browse/HBASE-20704 > Project: HBase > Issue Type: Bug > Components: Compaction >Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0 >Reporter: Francis Liu >Assignee: Francis Liu >Priority: Critical > Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch, > HBASE-20704.003.patch, HBASE-20704.004.draft.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts
[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545861#comment-16545861 ] Hadoop QA commented on HBASE-20846: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 38s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 46s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 19s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 10s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 35s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 22s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 47s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 8s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s{color} | {color:red} hbase-procedure: The patch generated 1 new + 38 unchanged - 14 fixed = 39 total (was 52) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s{color} | {color:green} hbase-server: The patch generated 0 new + 316 unchanged - 7 fixed = 316 total (was 323) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 7s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 46s{color} | {color:red} hbase-procedure in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}197m 24s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 7s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}255m 33s{color} |
[jira] [Updated] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing
[ https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-20875: -- Attachment: HBASE-20875.master.001.patch > MemStoreLABImp::copyIntoCell uses 7% CPU when writing > - > > Key: HBASE-20875 > URL: https://issues.apache.org/jira/browse/HBASE-20875 > Project: HBase > Issue Type: Sub-task > Components: Performance >Affects Versions: 2.0.1 >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.0.2 > > Attachments: > 0001-HBASE-20875-MemStoreLABImp-copyIntoCell-uses-7-CPU-w.patch, > 2.0707.baseline.91935.cpu.svg, 2.0711.patched.145414.cpu.svg, > HBASE-20875.master.001.patch, HBASE-20875.master.002.patch, Screen Shot > 2018-07-11 at 9.52.46 PM.png > > > Looks like this with a PE random write loading: > {code} > ./hbase/bin/hbase --config ~/conf_hbase > org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --presplit=40 > --size=30 --columns=10 --valueSize=100 randomWrite 200 > {code} > ... against a single server. > {code} > 12.47% perf-91935.map > [.] Lorg/apache/hadoop/hbase/BBKVComparator;::compare > 10.42% libjvm.so > [.] > ParNewGeneration::copy_to_survivor_space_avoiding_promotion_undo(ParScanThreadState*, > oopDesc*, unsigned long, markOopDesc*) > 6.78% perf-91935.map > [.] > Lorg/apache/hadoop/hbase/regionserver/MemStoreLABImpl;::copyCellInto > > {code} > These are top CPU consumers using perf-map-agent ./bin/perf-java-top... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20875) MemStoreLABImp::copyIntoCell uses 7% CPU when writing
[ https://issues.apache.org/jira/browse/HBASE-20875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-20875: -- Attachment: HBASE-20875.master.002.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20889) PE scan is failing with NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545810#comment-16545810 ] Hudson commented on HBASE-20889: Results for branch branch-1.4 [build #387 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > PE scan is failing with NullPointerException > > > Key: HBASE-20889 > URL: https://issues.apache.org/jira/browse/HBASE-20889 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.3 >Reporter: Vikas Vishwakarma >Assignee: Ted Yu >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.6 > > Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt > > > Command used > {code:java} > ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > > scan1{code} > PE scan 1 is failing with NullPointer > {code:java} > java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920) > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > 
org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530) > at > org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64
[ https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545809#comment-16545809 ] Hudson commented on HBASE-20884: Results for branch branch-1.4 [build #387 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/387//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > Replace usage of our Base64 implementation with java.util.Base64 > > > Key: HBASE-20884 > URL: https://issues.apache.org/jira/browse/HBASE-20884 > Project: HBase > Issue Type: Task >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1 > > Attachments: HBASE-20884.branch-1.001.patch, > HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch > > > We have a public domain implementation of Base64 that is copied into our code > base and infrequently receives updates. We should replace usage of that with > the new Java 8 java.util.Base64 where possible. > For the migration, I propose a phased approach. > * Deprecate on 1.x and 2.x to signal to users that this is going away. > * Replace usages on branch-2 and master with j.u.Base64 > * Delete our implementation of Base64 on master. > Does this seem in line with our API compatibility requirements? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf
[ https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545773#comment-16545773 ] Vladimir Rodionov commented on HBASE-20894: --- What are the criteria for "better", [~mdrob]? Protobuf is much heavier on CPU and memory than Java serialization. From a performance point of view, I do not think protobuf is faster, but I would gladly accept perf numbers. Protobuf adds additional (useless) generated code to the HBase code base. What else? Yes, the BucketCache is a totally internal feature whose serialized data is not supposed to be exposed to the public. > Move BucketCache from java serialization to protobuf > > > Key: HBASE-20894 > URL: https://issues.apache.org/jira/browse/HBASE-20894 > Project: HBase > Issue Type: Task > Components: BucketCache >Affects Versions: 2.0.0 >Reporter: Mike Drob >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-20894.WIP-2.patch, HBASE-20894.WIP.patch > > > We should use a better serialization format instead of Java Serialization for > the BucketCache entry persistence. > Suggested by Chris McCown, who does not appear to have a JIRA account. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
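One axis of "better" that is easy to measure is compactness. The sketch below, which uses a hypothetical Entry class standing in for a BucketCache index record (not HBase's actual backing-map types), shows how much framing Java serialization adds around a 12-byte payload; whether protobuf also wins on CPU is a separate question that, as noted above, needs real benchmark numbers.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class JavaSerializationOverhead {
  // Hypothetical stand-in for a cache index entry: one long + one int,
  // i.e. 12 bytes of actual data.
  static class Entry implements Serializable {
    private static final long serialVersionUID = 1L;
    long offset;
    int length;
    Entry(long offset, int length) {
      this.offset = offset;
      this.length = length;
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
      oos.writeObject(new Entry(0L, 0));
    }
    // Everything beyond the 12 payload bytes is Java serialization framing:
    // stream header, class descriptor, field names and types.
    System.out.println("serialized bytes: " + buf.size() + " (raw payload: 12)");
  }
}
```

The per-object overhead amortizes when many entries share one stream (the class descriptor is written once), so the gap matters most for small, frequently persisted records.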
[jira] [Commented] (HBASE-15654) Optimize client's MetaCache handling
[ https://issues.apache.org/jira/browse/HBASE-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545721#comment-16545721 ] huaxiang sun commented on HBASE-15654: -- Linked HBASE-20697 with this JIRA. > Optimize client's MetaCache handling > > > Key: HBASE-15654 > URL: https://issues.apache.org/jira/browse/HBASE-15654 > Project: HBase > Issue Type: Umbrella > Components: Client >Affects Versions: 1.3.0 >Reporter: Mikhail Antonov >Assignee: Mikhail Antonov >Priority: Critical > Fix For: 3.0.0, 1.5.0, 2.2.0 > > > This is an umbrella JIRA to track all individual issues, bugfixes and small > optimizations around the MetaCache (region locations cache) in the client. > The motivation is that under load one could see spikes in the number of > requests going to meta, reaching tens of thousands of requests per second. > That covers issues where we clear entries from the location cache unnecessarily, as > well as where we do more lookups than necessary when entries are legitimately > evicted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf
[ https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545671#comment-16545671 ] Mike Drob commented on HBASE-20894: --- Want to check with folks to see if there is agreement that this is a reasonable approach to take. A few design questions: * Should I be building a separate IO Engine implementation to do this instead of trying to handle it inline? * Is it ok to put the messy PB logic in the persist/retrieve methods, or should it move into the various classes as toPB/fromPB methods? I see examples of both in our code. * What is the difference for PB between writeTo and writeDelimitedTo (and the corresponding read methods)? * Are my protobuf message definitions fine or do they need to be organized differently? I haven't spent much thought on these. Regarding my previous question, I think recording the cache size and IO Engine class seems fine, but tracking the backing map class is probably not necessary. Also, maybe we can simplify the logic and not worry about the old serialization types - it's "just" a cache hint anyway, so nothing critical is lost if it doesn't come up with the RS. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
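On the writeTo vs writeDelimitedTo question: writeTo emits only the raw message bytes with no framing, so two messages written back-to-back cannot be split apart again, while writeDelimitedTo prefixes each message with its length (protobuf uses a varint). A minimal sketch of the framing idea in plain Java, with a 4-byte length prefix instead of a varint and byte arrays standing in for serialized messages:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class DelimitedFraming {
  // Frame one message: length prefix, then the message bytes.
  static void writeDelimited(DataOutputStream out, byte[] msg) throws IOException {
    out.writeInt(msg.length); // protobuf's writeDelimitedTo uses a varint here
    out.write(msg);
  }

  // Because each message carries its own length, a reader can pull
  // messages off the stream one at a time.
  static List<byte[]> readAllDelimited(DataInputStream in) throws IOException {
    List<byte[]> msgs = new ArrayList<>();
    while (in.available() > 0) {
      byte[] msg = new byte[in.readInt()];
      in.readFully(msg);
      msgs.add(msg);
    }
    return msgs;
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    writeDelimited(out, "entry-1".getBytes(StandardCharsets.UTF_8));
    writeDelimited(out, "entry-2".getBytes(StandardCharsets.UTF_8));
    List<byte[]> back = readAllDelimited(
        new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
    System.out.println(back.size());                              // 2
    System.out.println(new String(back.get(1), StandardCharsets.UTF_8)); // entry-2
  }
}
```

For a persistence file holding one top-level message, plain writeTo/parseFrom is enough; the delimited variants matter if multiple messages share one stream.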
[jira] [Updated] (HBASE-20894) Move BucketCache from java serialization to protobuf
[ https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-20894: -- Attachment: HBASE-20894.WIP-2.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545637#comment-16545637 ] Andrew Purtell commented on HBASE-20866: Oh, I see the commit was done. Updated this JIRA. I opened two subtasks for follow-up. > HBase 1.x scan performance degradation compared to 0.98 version > --- > > Key: HBASE-20866 > URL: https://issues.apache.org/jira/browse/HBASE-20866 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.2 >Reporter: Vikas Vishwakarma >Assignee: Vikas Vishwakarma >Priority: Critical > Fix For: 1.3.3 > > Attachments: HBASE-20866.branch-1.3.001.patch, > HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch > > > Internally, while testing 1.3 as part of migration from 0.98 to 1.3, we > observed degradation in scan performance for Phoenix queries varying > from a few tens of percent up to 200% depending on the query being executed. We tried > a simple native HBase scan and there too we saw up to 40% degradation in > performance when the number of column qualifiers is high (40-50+). > To identify the root cause of the performance difference between 0.98 and 1.3 we > carried out a lot of experiments with profiling and git bisect iterations; > however, we were not able to identify any particular source of scan > performance degradation, and it looked like an accumulated degradation > of 5-10% over various enhancements and refactorings. > We identified a few major enhancements, like partialResult handling, > ScannerContext with heartbeat processing, time/size limiting, and RPC > refactoring, each of which could have contributed a small degradation in > performance which, put together, could be leading to a large overall degradation. > One of the changes is > [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which > implements partialResult handling. In ClientScanner.java the results received > from the server are cached on the client side by converting the result array into > an ArrayList. 
This function gets called in a loop depending on the number of > rows in the scan result. For example, for tens of millions of rows scanned, this > can be called on the order of millions of times. > In almost all cases, 99% of the time (except for handling partial results, > etc.), we are just taking the resultsFromServer array, converting it into an ArrayList > resultsToAddToCache in addResultsToList(..) and then iterating over the list > again and adding it to the cache in loadCache(..), as given in the code path below > In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → > addResultsToList(..) → > {code:java} > loadCache() { > ... > List<Result> resultsToAddToCache = > getResultsToAddToCache(values, callable.isHeartbeatMessage()); > ... > … > for (Result rs : resultsToAddToCache) { > rs = filterLoadedCell(rs); > cache.add(rs); > ... > } > } > getResultsToAddToCache(..) { > .. > final boolean isBatchSet = scan != null && scan.getBatch() > 0; > final boolean allowPartials = scan != null && > scan.getAllowPartialResults(); > .. > if (allowPartials || isBatchSet) { > addResultsToList(resultsToAddToCache, resultsFromServer, 0, > (null == resultsFromServer ? 0 : resultsFromServer.length)); > return resultsToAddToCache; > } > ... > } > private void addResultsToList(List<Result> outputList, Result[] inputArray, > int start, int end) { > if (inputArray == null || start < 0 || end > inputArray.length) return; > for (int i = start; i < end; i++) { > outputList.add(inputArray[i]); > } > }{code} > > It looks like we can avoid the result array to ArrayList conversion > (resultsFromServer --> resultsToAddToCache) for the first case, which is also > the most frequent case, and instead directly take the values array returned > by the callable and add it to the cache without converting it into an ArrayList. 
> I have taken both these flags allowPartials and isBatchSet out in loadCache() > and I am directly adding values to the scanner cache if the above condition > passes, instead of converting it into an ArrayList by calling > getResultsToAddToCache(). For example: > {code:java} > protected void loadCache() throws IOException { > Result[] values = null; > .. > final boolean isBatchSet = scan != null && scan.getBatch() > 0; > final boolean allowPartials = scan != null && scan.getAllowPartialResults(); > .. > for (;;) { > try { > values = call(callable, caller, scannerTimeout); > .. > } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) { > .. > } > if (allowPartials || isBatchSet) { // DIRECTLY COPY values TO CACHE > if (values != null) { > for (int v = 0; v < values.length; v++) { > Result rs = values[v]; > > cache.add(rs); > ... > } else { // DO ALL THE REGULAR PARTIAL RESULT
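The optimization described above, skipping the intermediate ArrayList when no partial-result stitching is needed, can be sketched with plain collections. In this simplified sketch a Queue stands in for ClientScanner's cache and String for Result; neither method is HBase's actual code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class DirectCacheAdd {
  // The original shape: copy the server's array into a temporary list,
  // then iterate that list again to fill the cache. Two passes and one
  // intermediate allocation per RPC response.
  static void addViaIntermediateList(Queue<String> cache, String[] values) {
    if (values == null) {
      return;
    }
    List<String> toAdd = new ArrayList<>(values.length);
    for (String v : values) {
      toAdd.add(v);
    }
    for (String v : toAdd) {
      cache.add(v);
    }
  }

  // The proposed shape for the common case (allowPartials || isBatchSet):
  // add the array contents straight to the cache, one pass, no copy.
  static void addDirectly(Queue<String> cache, String[] values) {
    if (values == null) {
      return;
    }
    for (String v : values) {
      cache.add(v);
    }
  }

  public static void main(String[] args) {
    Queue<String> cache = new ArrayDeque<>();
    addDirectly(cache, new String[] {"row1", "row2"});
    System.out.println(cache.size()); // 2
  }
}
```

Both methods leave the cache in the same state; the direct version simply avoids the per-response allocation and second iteration, which is what adds up over millions of calls.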
[jira] [Updated] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4
[ https://issues.apache.org/jira/browse/HBASE-20896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-20896: --- Fix Version/s: 1.4.6 1.5.0 > Port HBASE-20866 to branch-1 and branch-1.4 > > > Key: HBASE-20896 > URL: https://issues.apache.org/jira/browse/HBASE-20896 > Project: HBase > Issue Type: Sub-task >Reporter: Andrew Purtell >Assignee: Vikas Vishwakarma >Priority: Major > Fix For: 1.5.0, 1.4.6 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20897) Port HBASE-20866 to branch-2 and up
[ https://issues.apache.org/jira/browse/HBASE-20897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-20897: --- Fix Version/s: 2.2.0 2.0.2 2.1.0 3.0.0 > Port HBASE-20866 to branch-2 and up > --- > > Key: HBASE-20897 > URL: https://issues.apache.org/jira/browse/HBASE-20897 > Project: HBase > Issue Type: Sub-task >Reporter: Andrew Purtell >Assignee: Vikas Vishwakarma >Priority: Major > Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20897) Port HBASE-20866 to branch-2 and up
Andrew Purtell created HBASE-20897: -- Summary: Port HBASE-20866 to branch-2 and up Key: HBASE-20897 URL: https://issues.apache.org/jira/browse/HBASE-20897 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Vikas Vishwakarma -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20896) Port HBASE-20866 to branch-1 and branch-1.4
Andrew Purtell created HBASE-20896: -- Summary: Port HBASE-20866 to branch-1 and branch-1.4 Key: HBASE-20896 URL: https://issues.apache.org/jira/browse/HBASE-20896 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Vikas Vishwakarma -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-20866: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: (was: 1.4.6) (was: 1.2.7) (was: 1.5.0) Status: Resolved (was: Patch Available)
[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545635#comment-16545635 ] Andrew Purtell commented on HBASE-20866: Ok. Then the same advice applies, after committing set the fix version here to what was committed and open another JIRA for follow on work (with fix versions set appropriately there). Thanks!
[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf
[ https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545614#comment-16545614 ] Mike Drob commented on HBASE-20894: --- Hmm, starting to try to do the actual read/write here and maybe we don't need the cache size or the class names recorded. Will leave them in for now, and then prune them later if we can get away with it. > Move BucketCache from java serialization to protobuf > > > Key: HBASE-20894 > URL: https://issues.apache.org/jira/browse/HBASE-20894 > Project: HBase > Issue Type: Task > Components: BucketCache >Affects Versions: 2.0.0 >Reporter: Mike Drob >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-20894.WIP.patch > > > We should use a better serialization format instead of Java Serialization for > the BucketCache entry persistence. > Suggested by Chris McCown, who does not appear to have a JIRA account. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20889) PE scan is failing with NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-20889: --- Fix Version/s: 1.4.6 1.5.0 > PE scan is failing with NullPointerException > > > Key: HBASE-20889 > URL: https://issues.apache.org/jira/browse/HBASE-20889 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.3 >Reporter: Vikas Vishwakarma >Assignee: Ted Yu >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.6 > > Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt > > > Command used > {code:java} > ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > > scan1{code} > PE scan 1 is failing with NullPointer > {code:java} > java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920) > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530) > at > org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess
[ https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-20895: --- Description: {noformat} 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This looks like it could be a use-after-close problem if there is concurrent access to a Connection. In process() we might store a null back to the 'data' field. Meanwhile in readAndProcess() we have a case where we might be blocked on a channel read and then after coming back from the read we go to use 'data' after a null has been written back, leading to an NPE. {quote}count = channelRead(channel, data); 1761 ---> if (count >= 0 && *data.remaining()* == 0) { process(); }{quote} Whether an NPE happens or not is going to depend on the timing of the store back to 'data' in another thread and use of 'data' in this thread and whether or not the JVM has optimized away a reload of 'data' (it's not declared volatile). We should do a null check here just to be defensive. We should also look at whether concurrent access to the Connection is happening and intended. The above is just a theory. We should also look at other execution sequences that could lead to 'data' being null in this location. At a glance I didn't find one but the store to 'data' happens behind conditionals so it is possible. 
was: {noformat} 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This looks like it could be a use after close problem if there is concurrent access to a Connection. In process() we might store a null back to the 'data' field. Meanwhile in readAndProcess() we have a case where we might be blocked on a channel read and then after coming back from the read we go to use 'data' after a null has been written back, leading to a NPE. {quote}count = channelRead(channel, data); 1761 ---> if (count >= 0 && *data.remaining()* == 0) Unknown macro: \{ process(); }{quote} Whether a NPE happens or not is going to depend on the timing of the store back to 'data' in another thread and use of 'data' in this thread and whether or not the JVM has optimized away a reload of 'data' (it's not declared volatile) We should do a null check here just to be defensive. We should also look at whether concurrent access to the Connection is happening and intended.The above is just a theory. We should also look at other execution sequences that could lead to 'data' being null in this location. At a glance I didn't find one the store to 'data' happens behind conditionals so it is possible. 
> NPE in RpcServer#readAndProcess > --- > > Key: HBASE-20895 > URL: https://issues.apache.org/jira/browse/HBASE-20895 > Project: HBase > Issue Type: Bug > Components: rpc >Affects Versions: 1.3.2 >Reporter: Andrew Purtell >Assignee: Monani Mihir >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.6 > > > {noformat} > 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - > RpcServer.listener,port=60020: Caught exception while reading: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) > at > org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) > at > org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) > at >
[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess
[ https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-20895: --- Description: {noformat} 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This looks like it could be a use-after-close problem if there is concurrent access to a Connection. In process() we might store a null back to the 'data' field. Meanwhile in readAndProcess() we have a case where we might be blocked on a channel read and then after coming back from the read we go to use 'data' after a null has been written back, leading to an NPE. {quote}count = channelRead(channel, data); 1761 ---> if (count >= 0 && *data.remaining()* == 0) { process(); }{quote} Whether an NPE happens or not is going to depend on the timing of the store back to 'data' in another thread and use of 'data' in this thread and whether or not the JVM has optimized away a reload of 'data' (it's not declared volatile). We should do a null check here just to be defensive. We should also look at whether concurrent access to the Connection is happening and intended. The above is just a theory. We should also look at other execution sequences that could lead to 'data' being null in this location. 
At a glance I didn't find one, but the store to 'data' happens behind conditionals so it is possible. was: {noformat} 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This looks like it could be a use after close problem if there is concurrent access to a Connection. In process() we might store a null back to the 'data' field. Meanwhile in readAndProcess() we have a case where we might be blocked on a channel read and then after coming back from the read we go to use 'data' after a null has been written back, leading to a NPE. {quote}count = channelRead(channel, data); 1761 ---> if (count >= 0 && *data.remaining()* == 0) { process(); } {quote} Whether a NPE happens or not is going to depend on the timing of the store back to 'data' in another thread and use of 'data' in this thread and whether or not the JVM has optimized away a reload of 'data' (it's not declared volatile) We should do a null check here just to be defensive. We should also look at whether concurrent access to the Connection is happening and intended.The above is just a theory. We should also look at other execution sequences that could lead to 'data' being null in this location. At a glance I didn't find one but 'data' is allocated behind conditionals so it is possible. 
> NPE in RpcServer#readAndProcess > --- > > Key: HBASE-20895 > URL: https://issues.apache.org/jira/browse/HBASE-20895 > Project: HBase > Issue Type: Bug > Components: rpc >Affects Versions: 1.3.2 >Reporter: Andrew Purtell >Assignee: Monani Mihir >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.6 > > > {noformat} > 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - > RpcServer.listener,port=60020: Caught exception while reading: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) > at > org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) > at > org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) > at >
[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess
[ https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-20895: --- Description: {noformat} 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This looks like it could be a use-after-close problem if there is concurrent access to a Connection. In process() we might store a null back to the 'data' field. Meanwhile in readAndProcess() we have a case where we might be blocked on a channel read and then after coming back from the read we go to use 'data' after a null has been written back, leading to an NPE. {quote}count = channelRead(channel, data); 1761 ---> if (count >= 0 && *data.remaining()* == 0) { process(); } {quote} Whether an NPE happens or not is going to depend on the timing of the store back to 'data' in another thread and use of 'data' in this thread and whether or not the JVM has optimized away a reload of 'data' (it's not declared volatile). We should do a null check here just to be defensive. We should also look at whether concurrent access to the Connection is happening and intended. The above is just a theory. We should also look at other execution sequences that could lead to 'data' being null in this location. At a glance I didn't find one but 'data' is allocated behind conditionals so it is possible. 
was: {noformat} 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This looks like it could be a use after close problem if there is concurrent access to a Connection. In process() we might store a null back to the 'data' field. Meanwhile in readAndProcess() we have a case where we might be blocked on a channel read and then after coming back from the read we go to use 'data' after a null has been written back, leading to a NPE. {quote} count = channelRead(channel, data); 1761 ---> if (count >= 0 && *data.remaining()* == 0) \{ process(); \} {quote} Whether a NPE happens or not is going to depend on the timing of the store back to 'data' in another thread and use of 'data' in this thread and whether or not the JVM has optimized away a reload of 'data' (it's not declared volatile) We should do a null check here just to be defensive. We should also look at whether the concurrent access to the Connection is intended. 
> NPE in RpcServer#readAndProcess > --- > > Key: HBASE-20895 > URL: https://issues.apache.org/jira/browse/HBASE-20895 > Project: HBase > Issue Type: Bug > Components: rpc >Affects Versions: 1.3.2 >Reporter: Andrew Purtell >Assignee: Monani Mihir >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.6 > > > {noformat} > 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - > RpcServer.listener,port=60020: Caught exception while reading: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) > at > org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) > at > org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) > at > org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at >
[jira] [Updated] (HBASE-20895) NPE in RpcServer#readAndProcess
[ https://issues.apache.org/jira/browse/HBASE-20895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-20895: --- Description: {noformat} 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This looks like it could be a use after close problem if there is concurrent access to a Connection. In process() we might store a null back to the 'data' field. Meanwhile in readAndProcess() we have a case where we might be blocked on a channel read and then after coming back from the read we go to use 'data' after a null has been written back, leading to a NPE. {quote} count = channelRead(channel, data); 1761 ---> if (count >= 0 && *data.remaining()* == 0) \{ process(); \} {quote} Whether a NPE happens or not is going to depend on the timing of the store back to 'data' in another thread and use of 'data' in this thread and whether or not the JVM has optimized away a reload of 'data' (it's not declared volatile) We should do a null check here just to be defensive. We should also look at whether the concurrent access to the Connection is intended. 
was: {noformat} 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This looks like it could be a use after close problem if there is concurrent access to a Connection. In process() we might store a null back to the 'data' field. Meanwhile in readAndProcess() we have a case where we might be blocked on a channel read and then after coming back from the read we go to use 'data' after a null has been written back, leading to a NPE. {quote} count = channelRead(channel, data); 1761 ---> if (count >= 0 && *data.remaining()* == 0) { // count==0 if dataLength == 0 process(); } {quote} Whether a NPE happens or not is going to depend on the timing of the store back to 'data' in another thread and use of 'data' in this thread and whether or not the JVM has optimized away a reload of 'data' (it's not declared volatile) We should do a null check here just to be defensive. We should also look at whether the concurrent access to the Connection is intended. 
> NPE in RpcServer#readAndProcess > --- > > Key: HBASE-20895 > URL: https://issues.apache.org/jira/browse/HBASE-20895 > Project: HBase > Issue Type: Bug > Components: rpc >Affects Versions: 1.3.2 >Reporter: Andrew Purtell >Assignee: Monani Mihir >Priority: Major > Fix For: 1.5.0, 1.3.3, 1.4.6 > > > {noformat} > 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - > RpcServer.listener,port=60020: Caught exception while reading: > java.lang.NullPointerException > at > org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) > at > org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) > at > org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) > at > org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > This looks like it could be a use after close problem if there is concurrent > access to a Connection. > In process() we
[jira] [Created] (HBASE-20895) NPE in RpcServer#readAndProcess
Andrew Purtell created HBASE-20895: -- Summary: NPE in RpcServer#readAndProcess Key: HBASE-20895 URL: https://issues.apache.org/jira/browse/HBASE-20895 Project: HBase Issue Type: Bug Components: rpc Affects Versions: 1.3.2 Reporter: Andrew Purtell Assignee: Monani Mihir Fix For: 1.5.0, 1.3.3, 1.4.6 {noformat} 2018-07-10 16:25:55,005 DEBUG [.sfdc.net,port=60020] ipc.RpcServer - RpcServer.listener,port=60020: Caught exception while reading: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1761) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:949) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:730) at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:706) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This looks like it could be a use after close problem if there is concurrent access to a Connection. In process() we might store a null back to the 'data' field. Meanwhile in readAndProcess() we have a case where we might be blocked on a channel read and then after coming back from the read we go to use 'data' after a null has been written back, leading to a NPE. {quote} count = channelRead(channel, data); 1761 ---> if (count >= 0 && *data.remaining()* == 0) { // count==0 if dataLength == 0 process(); } {quote} Whether a NPE happens or not is going to depend on the timing of the store back to 'data' in another thread and use of 'data' in this thread and whether or not the JVM has optimized away a reload of 'data' (it's not declared volatile) We should do a null check here just to be defensive. We should also look at whether the concurrent access to the Connection is intended. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
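The defensive null check the description proposes could look like the sketch below. This is a standalone illustration under the assumption that another thread may null out 'data' concurrently; the field and method names only loosely mirror RpcServer's, and this is not the actual patch.

```java
import java.nio.ByteBuffer;

// Standalone sketch of the defensive check discussed above: read the
// shared 'data' field once into a local, and bail out if another thread
// has already nulled it (e.g. a concurrent close). Names loosely mirror
// RpcServer but this is not the real class.
public class ReadAndProcessSketch {
    // The real field is not volatile; it is volatile here only so the
    // cross-thread store is visible in this self-contained sketch.
    volatile ByteBuffer data;

    // Returns bytes consumed, or -1 if the connection was torn down
    // concurrently and 'data' is gone.
    int readAndProcess(ByteBuffer incoming) {
        ByteBuffer local = data;   // single read of the shared field
        if (local == null) {
            return -1;             // defensive: avoid the NPE on data.remaining()
        }
        int count = Math.min(local.remaining(), incoming.remaining());
        for (int i = 0; i < count; i++) {
            local.put(incoming.get());
        }
        if (local.remaining() == 0) {
            process(local);        // full request buffered; hand it off
        }
        return count;
    }

    void process(ByteBuffer full) {
        data = null;               // in the real server, this store is what races with the read
    }

    public static void main(String[] args) {
        ReadAndProcessSketch conn = new ReadAndProcessSketch();
        conn.data = ByteBuffer.allocate(4);
        System.out.println(conn.readAndProcess(ByteBuffer.wrap(new byte[] {1, 2, 3, 4})));
        // After process() nulled 'data', the defensive check returns -1 instead of throwing.
        System.out.println(conn.readAndProcess(ByteBuffer.wrap(new byte[] {5})));
    }
}
```

Copying the field to a local before checking it matters: without `local`, the JVM is free to load `data` twice (once for the null check, once for `remaining()`), reintroducing the window the check was meant to close.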
[jira] [Commented] (HBASE-20889) PE scan is failing with NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545527#comment-16545527 ] Hudson commented on HBASE-20889: SUCCESS: Integrated in Jenkins build HBase-1.3-IT #435 (See [https://builds.apache.org/job/HBase-1.3-IT/435/]) HBASE-20889 PE scan is failing with NullPointerException (tedyu: rev 08f9837795164e1603825a382d9bb1cab9c2cb3e) * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java > PE scan is failing with NullPointerException > > > Key: HBASE-20889 > URL: https://issues.apache.org/jira/browse/HBASE-20889 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.3 >Reporter: Vikas Vishwakarma >Assignee: Ted Yu >Priority: Major > Fix For: 1.3.3 > > Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt > > > Command used > {code:java} > ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > > scan1{code} > PE scan 1 is failing with NullPointer > {code:java} > java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920) > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530) > at > org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429) > at > 
org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20894) Move BucketCache from java serialization to protobuf
[ https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545516#comment-16545516 ] Mike Drob commented on HBASE-20894: --- Attaching WIP patch that includes the proposed new proto definitions. Would appreciate some review before I start trying to glue that in to the code paths, since I don't have a ton of experience with protos in general. > Move BucketCache from java serialization to protobuf > > > Key: HBASE-20894 > URL: https://issues.apache.org/jira/browse/HBASE-20894 > Project: HBase > Issue Type: Task > Components: BucketCache >Affects Versions: 2.0.0 >Reporter: Mike Drob >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-20894.WIP.patch > > > We should use a better serialization format instead of Java Serialization for > the BucketCache entry persistence. > Suggested by Chris McCown, who does not appear to have a JIRA account. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20894) Move BucketCache from java serialization to protobuf
[ https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-20894: -- Attachment: HBASE-20894.WIP.patch > Move BucketCache from java serialization to protobuf > > > Key: HBASE-20894 > URL: https://issues.apache.org/jira/browse/HBASE-20894 > Project: HBase > Issue Type: Task > Components: BucketCache >Affects Versions: 2.0.0 >Reporter: Mike Drob >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-20894.WIP.patch > > > We should use a better serialization format instead of Java Serialization for > the BucketCache entry persistence. > Suggested by Chris McCown, who does not appear to have a JIRA account. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20894) Move BucketCache from java serialization to protobuf
[ https://issues.apache.org/jira/browse/HBASE-20894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-20894: -- Description: We should use a better serialization format instead of Java Serialization for the BucketCache entry persistence. Suggested by Chris McCown, who does not appear to have a JIRA account. was:We should use a better serialization format instead of Java Serialization for the BucketCache entry persistence. > Move BucketCache from java serialization to protobuf > > > Key: HBASE-20894 > URL: https://issues.apache.org/jira/browse/HBASE-20894 > Project: HBase > Issue Type: Task > Components: BucketCache >Affects Versions: 2.0.0 >Reporter: Mike Drob >Priority: Major > Fix For: 3.0.0 > > > We should use a better serialization format instead of Java Serialization for > the BucketCache entry persistence. > Suggested by Chris McCown, who does not appear to have a JIRA account. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20894) Move BucketCache from java serialization to protobuf
Mike Drob created HBASE-20894: - Summary: Move BucketCache from java serialization to protobuf Key: HBASE-20894 URL: https://issues.apache.org/jira/browse/HBASE-20894 Project: HBase Issue Type: Task Components: BucketCache Affects Versions: 2.0.0 Reporter: Mike Drob Fix For: 3.0.0 We should use a better serialization format instead of Java Serialization for the BucketCache entry persistence. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
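To illustrate why an explicit, versioned wire format is preferable to Java serialization for persisting cache metadata, here is a hedged stand-in: the actual .proto definitions are in the WIP patch, so this sketch uses a manual length-prefixed encoding with a magic/version header. The field names and the MAGIC constant are invented for illustration, but the sketch shows the same key properties protobuf brings: no arbitrary-class deserialization on read, an explicit field layout, and room to evolve the format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Stand-in sketch (NOT the actual HBase protos): an explicit, versioned
// encoding for a cache entry. Unlike ObjectInputStream, decoding never
// instantiates classes named by the input, and a version byte lets the
// format evolve without breaking old persisted files outright.
public class ExplicitEntrySketch {
    static final int MAGIC = 0xB0C4E7; // hypothetical file marker
    static final byte VERSION = 1;

    static byte[] encode(long offset, int length, String blockKey) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(MAGIC);
        out.writeByte(VERSION);
        out.writeLong(offset);
        out.writeInt(length);
        out.writeUTF(blockKey);
        return bos.toByteArray();
    }

    static String decode(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        if (in.readInt() != MAGIC) throw new IOException("not a cache entry file");
        byte version = in.readByte();
        if (version != VERSION) throw new IOException("unsupported version " + version);
        long offset = in.readLong();
        int length = in.readInt();
        String key = in.readUTF();
        return key + "@" + offset + "+" + length;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(decode(encode(4096L, 65536, "hfile-block-0")));
    }
}
```

Protobuf gives the same guarantees with generated builders and backward-compatible field numbering, which is why it is the better fit here than hand-rolled framing.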
[jira] [Updated] (HBASE-20889) PE scan is failing with NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-20889: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 1.3.3 Status: Resolved (was: Patch Available) Thanks for the review, Vikas > PE scan is failing with NullPointerException > > > Key: HBASE-20889 > URL: https://issues.apache.org/jira/browse/HBASE-20889 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.3 >Reporter: Vikas Vishwakarma >Assignee: Ted Yu >Priority: Major > Fix For: 1.3.3 > > Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt > > > Command used > {code:java} > ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > > scan1{code} > PE scan 1 is failing with NullPointer > {code:java} > java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920) > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530) > at > org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20889) PE scan is failing with NullPointerException
[ https://issues.apache.org/jira/browse/HBASE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-20889: --- Summary: PE scan is failing with NullPointerException (was: PE scan is failing with NullPointer) > PE scan is failing with NullPointerException > > > Key: HBASE-20889 > URL: https://issues.apache.org/jira/browse/HBASE-20889 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.3 >Reporter: Vikas Vishwakarma >Assignee: Ted Yu >Priority: Major > Attachments: 20889.branch-1.3.txt, 20889.branch-1.3.v2.txt > > > Command used > {code:java} > ~/current/bigdata-hbase/hbase/hbase/bin/hbase pe --nomapred scan 1 > > scan1{code} > PE scan 1 is failing with NullPointer > {code:java} > java.io.IOException: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation.doLocalClients(PerformanceEvaluation.java:447) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runTest(PerformanceEvaluation.java:1920) > at > org.apache.hadoop.hbase.PerformanceEvaluation.run(PerformanceEvaluation.java:2305) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.hadoop.hbase.PerformanceEvaluation.main(PerformanceEvaluation.java:2326) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hbase.PerformanceEvaluation$ScanTest.testTakedown(PerformanceEvaluation.java:1530) > at > org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1165) > at > org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1896) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:429) > at > org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:424) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20883) HMaster Read / Write Requests Per Sec across RegionServers, currently only Total Requests Per Sec
[ https://issues.apache.org/jira/browse/HBASE-20883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545463#comment-16545463 ] Andrew Purtell commented on HBASE-20883: bq. , it doesn't seem like it hurts to expose that same information in JMX. It doesn't, and I was talking about the UI, so let's be clear about that. > HMaster Read / Write Requests Per Sec across RegionServers, currently only > Total Requests Per Sec > -- > > Key: HBASE-20883 > URL: https://issues.apache.org/jira/browse/HBASE-20883 > Project: HBase > Issue Type: Improvement > Components: Admin, master, metrics, monitoring, UI, Usability >Affects Versions: 1.1.2 >Reporter: Hari Sekhon >Priority: Major > > HMaster currently shows Requests Per Second per RegionServer under HMaster > UI's /master-status page -> Region Servers -> Base Stats section in the Web > UI. > Please add Reads Per Second and Writes Per Second per RegionServer alongside > this in the HMaster UI, and also expose the Read/Write/Total requests per sec > information in the HMaster JMX API. > This will make it easier to find read or write hotspotting on HBase as a > combined total will minimize and mask differences between RegionServers. For > example, we do 30,000 reads/sec but only 900 writes/sec to each RegionServer, > so write skew will be masked as it won't show enough significant difference > in the much larger combined Total Requests Per Second stat. > For now I've written a Python tool to calculate this info from RegionServers > JMX read/write/total request counts but since HMaster is collecting this info > anyway it shouldn't be a big change to improve it to also show Reads / Writes > Per Sec as well as Total. > Find my tools for more granular Read/Write Requests Per Sec Per Regionserver > and also Per Region at my [PyTools github > repo|https://github.com/harisekhon/pytools] along with a selection of other > HBase tools I've used for performance debugging over the years. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
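The rate computation described in HBASE-20883 above — deriving reads/sec and writes/sec from the cumulative request counters each RegionServer exposes — boils down to differencing two counter samples, which a polling tool like the one mentioned can do per server and per region. A minimal sketch; the sample values are made up for illustration, and the actual JMX attribute names are not part of this thread:

```java
// Sketch: per-second rate from two samples of a cumulative request counter,
// as a tool polling RegionServer JMX endpoints would compute it.
// The counter values below are illustrative, not real HBase metrics.
public class RequestRate {
    /** Rate per second between two samples of a monotonically increasing counter. */
    public static double perSecond(long prevCount, long currCount, long intervalMillis) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        return (currCount - prevCount) * 1000.0 / intervalMillis;
    }

    public static void main(String[] args) {
        // e.g. readRequestCount sampled 10 seconds apart: 1,000,000 -> 1,300,000
        System.out.println(perSecond(1_000_000L, 1_300_000L, 10_000L)); // 30000.0
    }
}
```

Computing this per counter (read, write) rather than only on the combined total is exactly what keeps a 900 writes/sec skew from being drowned out by 30,000 reads/sec.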
[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545465#comment-16545465 ] Hadoop QA commented on HBASE-18201: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 57s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 29s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 22s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 5m 11s{color} | {color:blue} branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 40s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 57s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 36s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 5m 22s{color} | {color:blue} patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 34s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 23s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}202m 42s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}280m 53s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.snapshot.TestMobSecureExportSnapshot | | | hadoop.hbase.snapshot.TestExportSnapshot | | | hadoop.hbase.snapshot.TestMobExportSnapshot
[jira] [Commented] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing
[ https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545441#comment-16545441 ] Hadoop QA commented on HBASE-20893: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 1s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2.0 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 1s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 12s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 48s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 8s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 35s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 45s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} branch-2.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 10m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 35s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 0s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. 
{color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 2s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 25s{color} | {color:red} hbase-server generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}123m 38s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}186m 50s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 | | JIRA Issue | HBASE-20893 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931791/HBASE-20893.branch-2.0.001.patch | | Optional Tests | asflicense cc unit hbaseprotoc javac javadoc findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 98df0921cd94 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12
[jira] [Commented] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64
[ https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545367#comment-16545367 ] Hudson commented on HBASE-20884: SUCCESS: Integrated in Jenkins build HBase-1.3-IT #433 (See [https://builds.apache.org/job/HBase-1.3-IT/433/]) HBASE-20884 Reclassify Base64 as IA.Private (mdrob: rev 830d105eade8b8549418a0bcd6a8915bdcef82f4) * (edit) hbase-common/src/main/java/org/apache/hadoop/hbase/util/Base64.java > Replace usage of our Base64 implementation with java.util.Base64 > > > Key: HBASE-20884 > URL: https://issues.apache.org/jira/browse/HBASE-20884 > Project: HBase > Issue Type: Task >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1 > > Attachments: HBASE-20884.branch-1.001.patch, > HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch > > > We have a public domain implementation of Base64 that is copied into our code > base and infrequently receives updates. We should replace usage of that with > the new Java 8 java.util.Base64 where possible. > For the migration, I propose a phased approach. > * Deprecate on 1.x and 2.x to signal to users that this is going away. > * Replace usages on branch-2 and master with j.u.Base64 > * Delete our implementation of Base64 on master. > Does this seem in line with our API compatibility requirements? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64
[ https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545348#comment-16545348 ] Hudson commented on HBASE-20884: SUCCESS: Integrated in Jenkins build HBase-1.2-IT #1133 (See [https://builds.apache.org/job/HBase-1.2-IT/1133/]) HBASE-20884 Reclassify Base64 as IA.Private (mdrob: rev fe7306ebc5bb5b8e0103c2db27961da63b6db8a1) * (edit) hbase-common/src/main/java/org/apache/hadoop/hbase/util/Base64.java > Replace usage of our Base64 implementation with java.util.Base64 > > > Key: HBASE-20884 > URL: https://issues.apache.org/jira/browse/HBASE-20884 > Project: HBase > Issue Type: Task >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1 > > Attachments: HBASE-20884.branch-1.001.patch, > HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch > > > We have a public domain implementation of Base64 that is copied into our code > base and infrequently receives updates. We should replace usage of that with > the new Java 8 java.util.Base64 where possible. > For the migration, I propose a phased approach. > * Deprecate on 1.x and 2.x to signal to users that this is going away. > * Replace usages on branch-2 and master with j.u.Base64 > * Delete our implementation of Base64 on master. > Does this seem in line with our API compatibility requirements? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20867) RS may get killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545330#comment-16545330 ] Allan Yang commented on HBASE-20867: {code} Will we be stuck there for ever when master shutdown? The reason we close the connection when shutdown master is that we want the operations against the connection fail quickly and give up immediately. {code} It won't get stuck. Retrying happens in a thread pool in RemoteProcedureDispatcher, which we shut down when stopping. > RS may get killed while master restarts > --- > > Key: HBASE-20867 > URL: https://issues.apache.org/jira/browse/HBASE-20867 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-20867.branch-2.0.001.patch, > HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch > > > If the master is dispatching an RPC call to an RS when aborting, a connection > exception may be thrown by the RPC layer (an IOException with a "Connection > closed" message in this case). The RSProcedureDispatcher will regard it as an > un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, > which will expire the RS. > Actually, the RS is very healthy; only the master is restarting. > I think we should deal with those kinds of connection exceptions in > RSProcedureDispatcher and retry the RPC call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
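The fix argued for in HBASE-20867 amounts to classifying connection-level failures as retryable before they ever reach UnassignProcedure.remoteCallFailed. A hedged sketch of that classification; isRetryableConnectionError is a hypothetical helper, not actual RSProcedureDispatcher code:

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical classification of RPC failures: connection-level errors,
// such as the "Connection closed" IOException seen while the master
// restarts, are treated as retryable rather than as grounds for expiring
// a healthy RegionServer.
public class RetryClassifier {
    public static boolean isRetryableConnectionError(Throwable t) {
        if (t instanceof ConnectException) {
            return true;
        }
        // Per the issue description, the RPC layer surfaces a closing master
        // as a plain IOException whose message mentions the closed connection.
        return t instanceof IOException
            && t.getMessage() != null
            && t.getMessage().contains("Connection closed");
    }

    public static void main(String[] args) {
        System.out.println(isRetryableConnectionError(new IOException("Connection closed"))); // true
        System.out.println(isRetryableConnectionError(new IOException("Region not online"))); // false
    }
}
```

On such a failure the dispatcher would resubmit the call to its retry thread pool instead of failing the procedure; shutting that pool down on master stop is what keeps retries from hanging forever, per the comment above.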
[jira] [Commented] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"
[ https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545317#comment-16545317 ] Hadoop QA commented on HBASE-20853: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 30s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 39s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 7s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 49s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 48s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 16s{color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 45m 5s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-20853 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931798/HBASE-20853.master.003.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux d0b4ca9a9bc3 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 2997b6d071 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC3 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/13640/testReport/ | | Max. process+thread count | 270 (vs. ulimit of 1) | | modules | C: hbase-client U: hbase-client | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/13640/console | | Powered by |
[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts
[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545309#comment-16545309 ] stack commented on HBASE-20846: --- bq. Now the biggest problem is that, the original code does not allow storing ROLLEDBACK procedure into the procedure store. You can't store ROLLBACK steps as we do forward steps; the framework does not currently support this. Shout if you want more detail. > Restore procedure locks when master restarts > > > Key: HBASE-20846 > URL: https://issues.apache.org/jira/browse/HBASE-20846 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.0.2, 2.1.1 > > Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, > HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, > HBASE-20846.patch > > > Found this one when investigating a ModifyTableProcedure that got stuck while there > was a MoveRegionProcedure going on after a master restart. > Though this issue can be solved by HBASE-20752, I discovered something > else. > Before a MoveRegionProcedure can execute, it will hold the table's shared > lock, so when an UnassignProcedure is spawned, it will not check the > table's shared lock since it is sure that its parent (MoveRegionProcedure) has > acquired the table's lock. > {code:java} > // If there is parent procedure, it would have already taken xlock, so no > need to take > // shared lock here. Otherwise, take shared lock. > if (!procedure.hasParent() > && waitTableQueueSharedLock(procedure, table) == null) { > return true; > } > {code} > But this is not the case when the master is restarted. The child > procedure (UnassignProcedure) will be executed first after the restart. Though it > has a parent (MoveRegionProcedure), apparently the parent didn't hold the > table's lock. > So, since it began to execute without holding the table's shared lock, a > ModifyTableProcedure can acquire the table's exclusive lock and execute at the > same time, which is not possible if the master was not restarted. > This would cause a hang before HBASE-20752, but since HBASE-20752 is fixed, > I wrote a simple UT to repro this case. > I think we don't have to check the parent for the table's shared lock. It is a > shared lock, right? I think we can acquire it every time we need it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
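The shared-vs-exclusive argument in the HBASE-20846 description can be illustrated with a plain ReentrantReadWriteLock standing in for the procedure framework's table lock — an analogy, not the actual MasterProcedureScheduler implementation: a shared lock can be re-taken at any time, while an exclusive lock stays blocked as long as any shared holder remains.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Analogy for the table lock: taking the shared (read) lock again always
// succeeds, and the exclusive (write) lock cannot be granted while any
// shared holder exists -- which is why having the child procedure
// re-acquire the shared lock is safe and closes the post-restart race.
public class TableLockSketch {
    public static boolean exclusiveGrantedWhileShared() throws InterruptedException {
        ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();
        tableLock.readLock().lock();  // the parent procedure's shared lock
        tableLock.readLock().lock();  // the child re-acquires: always succeeds
        // A would-be ModifyTableProcedure asking for the exclusive lock:
        boolean granted = tableLock.writeLock().tryLock(10, TimeUnit.MILLISECONDS);
        tableLock.readLock().unlock();
        tableLock.readLock().unlock();
        return granted;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(exclusiveGrantedWhileShared()); // false
    }
}
```

The bug arises precisely because, after a restart, the child ran without any shared hold in place, so nothing blocked the exclusive request.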
[jira] [Commented] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool
[ https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545274#comment-16545274 ] Mike Drob commented on HBASE-20870: --- +1 > Wrong HBase root dir in ITBLL's Search Tool > --- > > Key: HBASE-20870 > URL: https://issues.apache.org/jira/browse/HBASE-20870 > Project: HBase > Issue Type: Bug > Components: integration tests >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Minor > Attachments: HBASE-20870.branch-2.0.001.patch, > HBASE-20870.branch-2.0.002.patch > > > When using IntegrationTestBigLinkedList's Search tool, it always fails since > it tries to read WALs from a wrong HBase root dir. It turned out that when > initializing IntegrationTestingUtility in IntegrationTestBigLinkedList, its > superclass HBaseTestingUtility changes hbase.rootdir to a random local dir. > That is not wrong, since HBaseTestingUtility is mostly used by the minicluster, > but for IntegrationTest runs on distributed clusters we should change it > back. > Here is the error info. > {code:java} > 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting > hbase.rootdir to > /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb > 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running > command-line tool java.io.FileNotFoundException: File > file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs > does not exist > at > org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool
[ https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545267#comment-16545267 ] Hadoop QA commented on HBASE-20870: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} branch-2.0 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 50s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 37s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 16s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} branch-2.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 13s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 58s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}116m 3s{color} | {color:red} hbase-server in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s{color} | {color:green} hbase-it in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}160m 20s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore | | | hadoop.hbase.regionserver.TestCompactingMemStore | | | hadoop.hbase.regionserver.TestDefaultMemStore | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 | | JIRA Issue | HBASE-20870 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931773/HBASE-20870.branch-2.0.002.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 08493ac87cb7 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Updated] (HBASE-20884) Replace usage of our Base64 implementation with java.util.Base64
[ https://issues.apache.org/jira/browse/HBASE-20884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob updated HBASE-20884: -- Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to branch-1 family, thanks for reviews Andrew and Ted. > Replace usage of our Base64 implementation with java.util.Base64 > > > Key: HBASE-20884 > URL: https://issues.apache.org/jira/browse/HBASE-20884 > Project: HBase > Issue Type: Task >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Fix For: 3.0.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2, 2.1.1 > > Attachments: HBASE-20884.branch-1.001.patch, > HBASE-20884.branch-1.002.patch, HBASE-20884.master.001.patch > > > We have a public domain implementation of Base64 that is copied into our code > base and infrequently receives updates. We should replace usage of that with > the new Java 8 java.util.Base64 where possible. > For the migration, I propose a phased approach. > * Deprecate on 1.x and 2.x to signal to users that this is going away. > * Replace usages on branch-2 and master with j.u.Base64 > * Delete our implementation of Base64 on master. > Does this seem in line with our API compatibility requirements? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20846) Restore procedure locks when master restarts
[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-20846: -- Attachment: HBASE-20846-v2.patch > Restore procedure locks when master restarts > > > Key: HBASE-20846 > URL: https://issues.apache.org/jira/browse/HBASE-20846 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.0.2, 2.1.1 > > Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, > HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, > HBASE-20846.patch > > > Found this one when investigating a ModifyTableProcedure that got stuck while there > was a MoveRegionProcedure going on after a master restart. > Though this issue can be solved by HBASE-20752, I discovered something > else. > Before a MoveRegionProcedure can execute, it will hold the table's shared > lock. So, when an UnassignProcedure is spawned, it will not check the > table's shared lock since it is sure that its parent (MoveRegionProcedure) has > acquired the table's lock. > {code:java} > // If there is parent procedure, it would have already taken xlock, so no > need to take > // shared lock here. Otherwise, take shared lock. > if (!procedure.hasParent() > && waitTableQueueSharedLock(procedure, table) == null) { > return true; > } > {code} > But this is not the case when the master is restarted. The child > procedure (UnassignProcedure) will be executed first after the restart. Though it > has a parent (MoveRegionProcedure), apparently the parent doesn't hold the > table's lock. > So, since it began to execute without holding the table's shared lock, a > ModifyTableProcedure can acquire the table's exclusive lock and execute at the > same time, which is not possible if the master was not restarted. > This would cause a hang before HBASE-20752, but since HBASE-20752 has been fixed, > I wrote a simple UT to reproduce this case. 
> I think we don't have to check the parent for the table's shared lock. It is a > shared lock, right? I think we can acquire it every time we need it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
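The suggestion above — just take the shared lock again — works because shared locks compose: any number of holders can share a table as long as no exclusive holder exists. A toy sketch of that point (all names are illustrative; this is not the actual MasterProcedureScheduler code):

```java
// Minimal shared/exclusive table-lock sketch. The point from the comment
// above: a recovered child procedure can simply re-acquire the shared lock
// after a master restart, instead of assuming its parent still holds it,
// and this still keeps exclusive takers (e.g. ModifyTableProcedure) out.
// Illustrative only; not actual HBase code.
public class TableLockSketch {
  private int sharedHolders = 0;
  private boolean exclusiveHeld = false;

  /** A procedure (parent or child) takes the shared lock; fails only if an exclusive holder exists. */
  public synchronized boolean tryAcquireShared() {
    if (exclusiveHeld) {
      return false;
    }
    sharedHolders++;
    return true;
  }

  public synchronized void releaseShared() {
    sharedHolders--;
  }

  /** Exclusive lock (e.g. for a ModifyTableProcedure): blocked while any shared holder remains. */
  public synchronized boolean tryAcquireExclusive() {
    if (exclusiveHeld || sharedHolders > 0) {
      return false;
    }
    exclusiveHeld = true;
    return true;
  }

  public static void main(String[] args) {
    TableLockSketch lock = new TableLockSketch();
    // Parent and child each take the shared lock independently after restart.
    System.out.println(lock.tryAcquireShared());    // true
    System.out.println(lock.tryAcquireShared());    // true: repeating is harmless
    System.out.println(lock.tryAcquireExclusive()); // false: exclusive taker must wait
  }
}
```

With this semantics, having the recovered UnassignProcedure acquire the shared lock itself costs nothing and closes the window in which an exclusive lock could sneak in.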
[jira] [Commented] (HBASE-20878) Data loss if merging regions while ServerCrashProcedure executing
[ https://issues.apache.org/jira/browse/HBASE-20878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545260#comment-16545260 ] Hadoop QA commented on HBASE-20878: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red}419m 52s{color} | {color:red} Docker failed to build yetus/hbase:6f01af0. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HBASE-20878 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931746/HBASE-20878.branch-2.0.003.patch | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/13633/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > Data loss if merging regions while ServerCrashProcedure executing > - > > Key: HBASE-20878 > URL: https://issues.apache.org/jira/browse/HBASE-20878 > Project: HBase > Issue Type: Sub-task > Components: amv2 >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Critical > Fix For: 3.0.0, 2.0.2, 2.1.1 > > Attachments: HBASE-20878.branch-2.0.001.patch, > HBASE-20878.branch-2.0.002.patch, HBASE-20878.branch-2.0.003.patch > > > In MergeTableRegionsProcedure, we close the regions to merge using > UnassignProcedure. But if the RS these regions are on crashes, a > ServerCrashProcedure will execute at the same time. UnassignProcedures will > be blocked until all logs are split. But since these regions are closed for > merging, the regions won't open again, so the recovered.edits in the region dirs > won't be replayed, and thus data will be lost. > I provided a test to reproduce this case. I suspect the split region procedure > also has this kind of problem. I will check later -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"
[ https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545238#comment-16545238 ] Balazs Meszaros commented on HBASE-20853: - I added a default implementation for getTableDescriptor. I did not change getName because, unlike getDescriptor, it does not throw an exception. > Polish "Add defaults to Table Interface so Implementors don't have to" > -- > > Key: HBASE-20853 > URL: https://issues.apache.org/jira/browse/HBASE-20853 > Project: HBase > Issue Type: Sub-task > Components: API >Reporter: stack >Assignee: Balazs Meszaros >Priority: Major > Labels: beginner, beginners > Fix For: 3.0.0, 2.0.2, 2.1.1 > > Attachments: HBASE-20853.master.001.patch, > HBASE-20853.master.002.patch, HBASE-20853.master.003.patch > > > This issue is to address feedback that came in after commit on the parent > (FYI [~chia7712]). See tail of parent issue and amendment attached to parent > adding better defaults to the Table Interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20853) Polish "Add defaults to Table Interface so Implementors don't have to"
[ https://issues.apache.org/jira/browse/HBASE-20853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balazs Meszaros updated HBASE-20853: Attachment: HBASE-20853.master.003.patch > Polish "Add defaults to Table Interface so Implementors don't have to" > -- > > Key: HBASE-20853 > URL: https://issues.apache.org/jira/browse/HBASE-20853 > Project: HBase > Issue Type: Sub-task > Components: API >Reporter: stack >Assignee: Balazs Meszaros >Priority: Major > Labels: beginner, beginners > Fix For: 3.0.0, 2.0.2, 2.1.1 > > Attachments: HBASE-20853.master.001.patch, > HBASE-20853.master.002.patch, HBASE-20853.master.003.patch > > > This issue is to address feedback that came in after commit on the parent > (FYI [~chia7712]). See tail of parent issue and amendment attached to parent > adding better defaults to the Table Interface. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20883) HMaster Read / Write Requests Per Sec across RegionServers, currently only Total Requests Per Sec
[ https://issues.apache.org/jira/browse/HBASE-20883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545231#comment-16545231 ] Hari Sekhon commented on HBASE-20883: - [~andrewcheng] thanks for mentioning the other ticket, but it's not exactly the same issue. That one asks for more accurate counting to account for multi requests. I'm just asking that Read and Write Requests Per Sec be shown in the UI next to each RegionServer, which already shows Total Requests Per Sec, to make it easier to detect read or write skew. > HMaster Read / Write Requests Per Sec across RegionServers, currently only > Total Requests Per Sec > -- > > Key: HBASE-20883 > URL: https://issues.apache.org/jira/browse/HBASE-20883 > Project: HBase > Issue Type: Improvement > Components: Admin, master, metrics, monitoring, UI, Usability >Affects Versions: 1.1.2 >Reporter: Hari Sekhon >Priority: Major > > HMaster currently shows Requests Per Second per RegionServer under HMaster > UI's /master-status page -> Region Servers -> Base Stats section in the Web > UI. > Please add Reads Per Second and Writes Per Second per RegionServer alongside > this in the HMaster UI, and also expose the Read/Write/Total requests per sec > information in the HMaster JMX API. > This will make it easier to find read or write hotspotting on HBase, as a > combined total will minimize and mask differences between RegionServers. For > example, we do 30,000 reads/sec but only 900 writes/sec to each RegionServer, > so write skew will be masked as it won't show enough significant difference > in the much larger combined Total Requests Per Second stat. > For now I've written a Python tool to calculate this info from RegionServers' > JMX read/write/total request counts, but since HMaster is collecting this info > anyway it shouldn't be a big change to improve it to also show Reads / Writes > Per Sec as well as Total. 
> Find my tools for more granular Read/Write Requests Per Sec Per Regionserver > and also Per Region at my [PyTools github > repo|https://github.com/harisekhon/pytools] along with a selection of other > HBase tools I've used for performance debugging over the years. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20883) HMaster Read / Write Requests Per Sec across RegionServers, currently only Total Requests Per Sec
[ https://issues.apache.org/jira/browse/HBASE-20883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545229#comment-16545229 ] Hari Sekhon commented on HBASE-20883: - {quote}This won't scale {quote} HMaster UI already shows Total Requests Per Sec next to each RegionServer, which I think is already calculated from readRequestCount + writeRequestCount or totalRequestCount differentials. It's just two more columns to expose that information in the existing table. I already have OpenTSDB, but it's handy for some tools and scripts to be able to get this information from HBase directly. Perhaps you don't want to have to set up OpenTSDB on HBase just to debug somebody's HBase installation, and since it appears that HMaster is already collecting and averaging this information, it doesn't seem like it hurts to expose the same information in JMX. > HMaster Read / Write Requests Per Sec across RegionServers, currently only > Total Requests Per Sec > -- > > Key: HBASE-20883 > URL: https://issues.apache.org/jira/browse/HBASE-20883 > Project: HBase > Issue Type: Improvement > Components: Admin, master, metrics, monitoring, UI, Usability >Affects Versions: 1.1.2 >Reporter: Hari Sekhon >Priority: Major > > HMaster currently shows Requests Per Second per RegionServer under HMaster > UI's /master-status page -> Region Servers -> Base Stats section in the Web > UI. > Please add Reads Per Second and Writes Per Second per RegionServer alongside > this in the HMaster UI, and also expose the Read/Write/Total requests per sec > information in the HMaster JMX API. > This will make it easier to find read or write hotspotting on HBase, as a > combined total will minimize and mask differences between RegionServers. For > example, we do 30,000 reads/sec but only 900 writes/sec to each RegionServer, > so write skew will be masked as it won't show enough significant difference > in the much larger combined Total Requests Per Second stat. 
> For now I've written a Python tool to calculate this info from RegionServers > JMX read/write/total request counts but since HMaster is collecting this info > anyway it shouldn't be a big change to improve it to also show Reads / Writes > Per Sec as well as Total. > Find my tools for more granular Read/Write Requests Per Sec Per Regionserver > and also Per Region at my [PyTools github > repo|https://github.com/harisekhon/pytools] along with a selection of other > HBase tools I've used for performance debugging over the years. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
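The per-second figures discussed above are simple differentials over two samples of the cumulative JMX counters. A rough sketch of that calculation (class and method names are illustrative, not actual HMaster or JMX API):

```java
// Sketch: deriving read/write requests-per-second from two samples of the
// cumulative counters (readRequestCount / writeRequestCount), the same way
// Total Requests Per Sec is presumably derived from totalRequestCount.
// All names are illustrative; this is not actual HBase code.
public class RequestRateSketch {
  /** Rate from two cumulative counter samples taken intervalSeconds apart. */
  public static double ratePerSec(long earlierCount, long laterCount, long intervalSeconds) {
    if (intervalSeconds <= 0 || laterCount < earlierCount) {
      return 0.0; // counter reset (e.g. RS restart) or bad interval: report zero
    }
    return (laterCount - earlierCount) / (double) intervalSeconds;
  }

  public static void main(String[] args) {
    // readRequestCount went from 1,000 to 301,000 over 10s -> 30,000 reads/sec
    System.out.println(ratePerSec(1_000L, 301_000L, 10)); // 30000.0
    // writeRequestCount went from 5,000 to 14,000 over 10s -> 900 writes/sec
    System.out.println(ratePerSec(5_000L, 14_000L, 10));  // 900.0
  }
}
```

The 30,000 vs 900 example mirrors the skew scenario in the issue description: side-by-side read and write rates expose it immediately, while the combined total hides it.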
[jira] [Updated] (HBASE-20867) RS may get killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-20867: --- Summary: RS may get killed while master restarts (was: RS may got killed while master restarts) > RS may get killed while master restarts > --- > > Key: HBASE-20867 > URL: https://issues.apache.org/jira/browse/HBASE-20867 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-20867.branch-2.0.001.patch, > HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch > > > If the master is dispatching an RPC call to an RS when aborting, a connection > exception may be thrown by the RPC layer (an IOException with a "Connection > closed" message in this case). The RSProcedureDispatcher will regard it as an > un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, > which will expire the RS. > Actually, the RS is perfectly healthy; only the master is restarting. > I think we should handle these kinds of connection exceptions in > RSProcedureDispatcher and retry the RPC call -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts
[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545192#comment-16545192 ] Hadoop QA commented on HBASE-20846: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 37s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 51s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 21s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 9s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 36s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 8s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s{color} | {color:red} hbase-procedure: The patch generated 1 new + 38 unchanged - 14 fixed = 39 total (was 52) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s{color} | {color:green} hbase-server: The patch generated 0 new + 316 unchanged - 7 fixed = 316 total (was 323) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 5s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 27s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 42s{color} | {color:red} hbase-procedure in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}241m 18s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 50s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}294m 57s{color} |
[jira] [Commented] (HBASE-20867) RS may got killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545187#comment-16545187 ] Duo Zhang commented on HBASE-20867: --- Will we be stuck there forever when the master shuts down? The reason we close the connection when shutting down the master is that we want operations against the connection to fail quickly and give up immediately. The patch LGTM; the above is my only concern. Thanks. > RS may got killed while master restarts > --- > > Key: HBASE-20867 > URL: https://issues.apache.org/jira/browse/HBASE-20867 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-20867.branch-2.0.001.patch, > HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch > > > If the master is dispatching an RPC call to an RS when aborting, a connection > exception may be thrown by the RPC layer (an IOException with a "Connection > closed" message in this case). The RSProcedureDispatcher will regard it as an > un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, > which will expire the RS. > Actually, the RS is perfectly healthy; only the master is restarting. > I think we should handle these kinds of connection exceptions in > RSProcedureDispatcher and retry the RPC call -- This message was sent by Atlassian JIRA (v7.6.3#76005)
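The retry idea from the issue description could look roughly like the sketch below: treat the "Connection closed" IOException as transient and retry a bounded number of times before giving up (and, in the real code, expiring the RS). Class and method names are illustrative, not the actual RSProcedureDispatcher API; the bound also addresses the concern above about being stuck forever.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Sketch of bounded retry on "Connection closed" errors. Illustrative only;
// not the actual RSProcedureDispatcher code.
public class RetryDispatchSketch {
  /** Heuristic: a "Connection closed" IOException may just mean the master is restarting. */
  static boolean isRetryableConnectionError(IOException e) {
    String msg = e.getMessage();
    return msg != null && msg.contains("Connection closed");
  }

  /** Runs the remote call, retrying connection-closed failures up to maxAttempts times. */
  public static <T> T dispatch(Callable<T> remoteCall, int maxAttempts) throws Exception {
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return remoteCall.call();
      } catch (IOException e) {
        if (!isRetryableConnectionError(e)) {
          throw e; // genuinely un-retryable: propagate immediately
        }
        last = e; // transient: try again
      }
    }
    throw last; // bounded: give up here instead of retrying forever
  }

  public static void main(String[] args) throws Exception {
    int[] tries = {0};
    // Simulated remote call: fails twice with "Connection closed", then succeeds.
    String result = dispatch(() -> {
      if (++tries[0] < 3) {
        throw new IOException("Connection closed");
      }
      return "ok";
    }, 5);
    System.out.println(result + " after " + tries[0] + " attempts");
  }
}
```

A bounded attempt count (or a deadline) keeps the quick-fail behavior on genuine shutdown while tolerating a restarting master.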
[jira] [Updated] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing
[ https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20893: --- Attachment: HBASE-20893.branch-2.0.001.patch > Data loss if splitting region while ServerCrashProcedure executing > -- > > Key: HBASE-20893 > URL: https://issues.apache.org/jira/browse/HBASE-20893 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-20893.branch-2.0.001.patch > > > Similar case as HBASE-20878. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing
[ https://issues.apache.org/jira/browse/HBASE-20893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20893: --- Status: Patch Available (was: Open) > Data loss if splitting region while ServerCrashProcedure executing > -- > > Key: HBASE-20893 > URL: https://issues.apache.org/jira/browse/HBASE-20893 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.1, 3.0.0, 2.1.0 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > > Similar case as HBASE-20878. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20893) Data loss if splitting region while ServerCrashProcedure executing
Allan Yang created HBASE-20893: -- Summary: Data loss if splitting region while ServerCrashProcedure executing Key: HBASE-20893 URL: https://issues.apache.org/jira/browse/HBASE-20893 Project: HBase Issue Type: Sub-task Affects Versions: 2.0.1, 3.0.0, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang Similar case as HBASE-20878. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20867) RS may got killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545122#comment-16545122 ] Hadoop QA commented on HBASE-20867: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 29s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2.0 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 2s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 22s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 31s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 43s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 58s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} branch-2.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 12s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 25s{color} | {color:red} hbase-client: The patch generated 1 new + 13 unchanged - 2 fixed = 14 total (was 15) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 44s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 24s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 57s{color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}183m 26s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}231m 5s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 | | JIRA Issue | HBASE-20867 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931749/HBASE-20867.branch-2.0.003.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux c4f9dd9c6829 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | branch-2.0 / 5594f0b9fd | | maven | version: Apache Maven 3.5.4
[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545111#comment-16545111 ] Kuan-Po Tseng commented on HBASE-18201: --- It seems the failed test isn't related to this patch; resubmitting the patch. > add UT and docs for DataBlockEncodingTool > - > > Key: HBASE-18201 > URL: https://issues.apache.org/jira/browse/HBASE-18201 > Project: HBase > Issue Type: Sub-task > Components: tooling >Reporter: Chia-Ping Tsai >Assignee: Kuan-Po Tseng >Priority: Minor > Labels: beginner > Attachments: HBASE-18201.master.001.patch, > HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, > HBASE-18201.master.003.patch, HBASE-18201.master.004.patch, > HBASE-18201.master.005.patch, HBASE-18201.master.005.patch > > > There are no examples, documents, or tests for DataBlockEncodingTool. We should > make it friendly if any use case exists. Otherwise, we should just get rid of > it, because DataBlockEncodingTool presumes that the implementation of the cell > returned from DataBlockEncoder is KeyValue. This presumption may obstruct the > cleanup of KeyValue references in the read/write path of the code base. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-18201) add UT and docs for DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kuan-Po Tseng updated HBASE-18201: -- Attachment: HBASE-18201.master.005.patch > add UT and docs for DataBlockEncodingTool > - > > Key: HBASE-18201 > URL: https://issues.apache.org/jira/browse/HBASE-18201 > Project: HBase > Issue Type: Sub-task > Components: tooling >Reporter: Chia-Ping Tsai >Assignee: Kuan-Po Tseng >Priority: Minor > Labels: beginner > Attachments: HBASE-18201.master.001.patch, > HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, > HBASE-18201.master.003.patch, HBASE-18201.master.004.patch, > HBASE-18201.master.005.patch, HBASE-18201.master.005.patch > > > There are no examples, documents, or tests for DataBlockEncodingTool. We should > make it friendly if any use case exists. Otherwise, we should just get rid of > it, because DataBlockEncodingTool presumes that the implementation of the cell > returned from DataBlockEncoder is KeyValue. This presumption may obstruct the > cleanup of KeyValue references in the read/write path of the code base. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool
[ https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20870: --- Attachment: (was: HBASE-20870.branch-2.0.002.patch) > Wrong HBase root dir in ITBLL's Search Tool > --- > > Key: HBASE-20870 > URL: https://issues.apache.org/jira/browse/HBASE-20870 > Project: HBase > Issue Type: Bug > Components: integration tests >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Minor > Attachments: HBASE-20870.branch-2.0.001.patch, > HBASE-20870.branch-2.0.002.patch > > > When using IntegrationTestBigLinkedList's Search tool, it always fails since > it tries to read WALs in the wrong HBase root dir. It turned out that when > initializing IntegrationTestingUtility in IntegrationTestBigLinkedList, its > superclass HBaseTestingUtility changes hbase.rootdir to a local random > dir. That is not wrong, since HBaseTestingUtility is mostly used by the minicluster, > but for IntegrationTest runs on distributed clusters we should change it > back. > Here is the error info: > {code:java} > 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting > hbase.rootdir to > /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb > 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running > command-line tool java.io.FileNotFoundException: File > file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs > does not exist > at > org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool
[ https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20870: --- Attachment: HBASE-20870.branch-2.0.002.patch > Wrong HBase root dir in ITBLL's Search Tool > --- > > Key: HBASE-20870 > URL: https://issues.apache.org/jira/browse/HBASE-20870 > Project: HBase > Issue Type: Bug > Components: integration tests >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Minor > Attachments: HBASE-20870.branch-2.0.001.patch, > HBASE-20870.branch-2.0.002.patch > > > When using IntegrationTestBigLinkedList's Search tool, it always fails since > it tries to read WALs in the wrong HBase root dir. It turned out that when > initializing IntegrationTestingUtility in IntegrationTestBigLinkedList, its > superclass HBaseTestingUtility changes hbase.rootdir to a local random > dir. That is not wrong, since HBaseTestingUtility is mostly used by the minicluster, > but for IntegrationTest runs on distributed clusters we should change it > back. > Here is the error info: > {code:java} > 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting > hbase.rootdir to > /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb > 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running > command-line tool java.io.FileNotFoundException: File > file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs > does not exist > at > org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
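The fix described in the issue — remember the configured root dir before the testing utility diverts it to a local scratch dir, and restore it for distributed runs — can be sketched as below. A plain Map stands in for Hadoop's Configuration, and all class and method names are illustrative, not actual HBase API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the save-and-restore pattern for hbase.rootdir. Illustrative
// only; a Map<String,String> stands in for org.apache.hadoop.conf.Configuration.
public class RootDirSketch {
  public static final String ROOT_DIR_KEY = "hbase.rootdir";

  /** Mimics HBaseTestingUtility pointing rootdir at a local scratch dir; returns the prior value. */
  public static String divertToLocal(Map<String, String> conf, String scratchDir) {
    return conf.put(ROOT_DIR_KEY, scratchDir); // caller keeps the returned value to restore later
  }

  /** For runs against a real distributed cluster, put the original root dir back. */
  public static void restore(Map<String, String> conf, String original) {
    if (original != null) {
      conf.put(ROOT_DIR_KEY, original);
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(ROOT_DIR_KEY, "hdfs://cluster/hbase");
    String saved = divertToLocal(conf, "/tmp/test-data/scratch");
    System.out.println(conf.get(ROOT_DIR_KEY)); // the local scratch dir
    restore(conf, saved);
    System.out.println(conf.get(ROOT_DIR_KEY)); // the real cluster root dir again
  }
}
```

The key point is capturing the original value at the moment of the swap, so tools that later read WALs resolve them under the real cluster root dir rather than the minicluster scratch dir.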
[jira] [Commented] (HBASE-18201) add UT and docs for DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545068#comment-16545068 ] Hadoop QA commented on HBASE-18201: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 34s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 4s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 26s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 16s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 5m 17s{color} | {color:blue} branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 44s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 16s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 27s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 6m 4s{color} | {color:blue} patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 16s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 52s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 13s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}139m 45s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 6s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}222m 31s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce
[jira] [Commented] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool
[ https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545046#comment-16545046 ] Hadoop QA commented on HBASE-20870: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} branch-2.0 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 40s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 8s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 33s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 6s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} branch-2.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 5s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 16s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 20s{color} | {color:red} hbase-server in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s{color} | {color:green} hbase-it in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}148m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestDefaultMemStore | | | hadoop.hbase.regionserver.TestCompactingMemStore | | | hadoop.hbase.regionserver.TestCompactingToCellFlatMapMemStore | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 | | JIRA Issue | HBASE-20870 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931748/HBASE-20870.branch-2.0.002.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 7a5153609589 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544979#comment-16544979 ] Vikas Vishwakarma commented on HBASE-20866: --- The code in branch-1.4 onwards is similar to the master branch and will require considerable change to implement the above change in those branches. But once done, it should be easy to apply the same from branch-1.4 to the master branch. Will work on the same and update. [~apurtell] So for now I was able to commit the patch only to the 1.3 branch. > HBase 1.x scan performance degradation compared to 0.98 version > --- > > Key: HBASE-20866 > URL: https://issues.apache.org/jira/browse/HBASE-20866 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.2 >Reporter: Vikas Vishwakarma >Assignee: Vikas Vishwakarma >Priority: Critical > Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6 > > Attachments: HBASE-20866.branch-1.3.001.patch, > HBASE-20866.branch-1.3.002.patch, HBASE-20866.branch-1.3.003.patch > > > Internally, while testing 1.3 as part of migration from 0.98 to 1.3, we > observed degradation in scan performance for Phoenix queries varying > from a few tens of percent up to 200% depending on the query being executed. We tried > a simple native HBase scan, and there also we saw up to 40% degradation in > performance when the number of column qualifiers is high (40-50+). > To identify the root cause of the performance diff between 0.98 and 1.3 we > carried out a lot of experiments with profiling and git bisect iterations; > however, we were not able to identify any particular source of scan > performance degradation, and it looked like this is an accumulated degradation > of 5-10% over various enhancements and refactoring. > We identified a few major enhancements, like partialResult handling, > ScannerContext with heartbeat processing, time/size limiting, and RPC > refactoring, that could each have contributed a small degradation in > performance, which put together could lead to a large overall degradation. 
> One of the changes is > [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which > implements partialResult handling. In ClientScanner.java the results received > from the server are cached on the client side by converting the result array into > an ArrayList. This function gets called in a loop depending on the number of > rows in the scan result; for example, for tens of millions of rows scanned, this > can be called on the order of millions of times. > In almost all cases, 99% of the time (except for handling partial results, > etc.), we are just taking resultsFromServer, converting it into an ArrayList > resultsToAddToCache in addResultsToList(..), and then iterating over the list > again and adding it to the cache in loadCache(..), as in the code path below: > In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → > addResultsToList(..) → > {code:java} > loadCache() { > ... > List resultsToAddToCache = > getResultsToAddToCache(values, callable.isHeartbeatMessage()); > ... > … > for (Result rs : resultsToAddToCache) { > rs = filterLoadedCell(rs); > cache.add(rs); > ... > } > } > getResultsToAddToCache(..) { > .. > final boolean isBatchSet = scan != null && scan.getBatch() > 0; > final boolean allowPartials = scan != null && > scan.getAllowPartialResults(); > .. > if (allowPartials || isBatchSet) { > addResultsToList(resultsToAddToCache, resultsFromServer, 0, > (null == resultsFromServer ? 0 : resultsFromServer.length)); > return resultsToAddToCache; > } > ... 
> } > private void addResultsToList(List outputList, Result[] inputArray, > int start, int end) { > if (inputArray == null || start < 0 || end > inputArray.length) return; > for (int i = start; i < end; i++) { > outputList.add(inputArray[i]); > } > }{code} > > It looks like we can avoid the result array to ArrayList conversion > (resultsFromServer --> resultsToAddToCache) for the first case, which is also > the most frequent case, and instead directly take the values array returned > by the callable and add it to the cache without converting it into an ArrayList. > I have taken both of these flags, allowPartials and isBatchSet, out into loadCache(), > and I directly add values to the scanner cache if the above condition > passes, instead of converting it into an ArrayList by calling > getResultsToAddToCache(). For example: > {code:java} > protected void loadCache() throws IOException { > Result[] values = null; > .. > final boolean isBatchSet = scan != null && scan.getBatch() > 0; > final boolean allowPartials = scan != null && scan.getAllowPartialResults(); > .. > for (;;) { > try { > values = call(callable, caller, scannerTimeout); > .. > }
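The copy-avoidance this comment describes can be illustrated in isolation. This is a hedged sketch, not the actual ClientScanner code: String stands in for Result, and the class and method names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch of the optimization: skip the intermediate ArrayList and add the
// server's result array straight into the client-side cache.
public class ScanCacheSketch {
    final Queue<String> cache = new ArrayDeque<>();

    // Old path: array -> intermediate ArrayList -> cache (extra pass, extra copy).
    void loadCacheViaList(String[] resultsFromServer) {
        List<String> resultsToAddToCache = new ArrayList<>();
        for (String r : resultsFromServer) {
            resultsToAddToCache.add(r);
        }
        for (String r : resultsToAddToCache) {
            cache.add(r);
        }
    }

    // Patched path: add straight from the array, one pass, no intermediate list.
    void loadCacheDirect(String[] resultsFromServer) {
        for (String r : resultsFromServer) {
            cache.add(r);
        }
    }
}
```

Both paths leave the cache in the same state; the direct path just avoids allocating and re-walking a throwaway list on every RPC round trip, which matters when it runs millions of times.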
[jira] [Updated] (HBASE-20846) Restore procedure locks when master restarts
[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-20846: -- Attachment: (was: HBASE-20846-v1.patch) > Restore procedure locks when master restarts > > > Key: HBASE-20846 > URL: https://issues.apache.org/jira/browse/HBASE-20846 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.0.2, 2.1.1 > > Attachments: HBASE-20846-v1.patch, HBASE-20846.branch-2.0.002.patch, > HBASE-20846.branch-2.0.patch, HBASE-20846.patch > > > Found this one when investigating a ModifyTableProcedure that got stuck while there > was a MoveRegionProcedure going on after a master restart. > Though this issue can be solved by HBASE-20752, I discovered something > else. > Before a MoveRegionProcedure can execute, it will hold the table's shared > lock. So, when an UnassignProcedure is spawned, it will not check the > table's shared lock, since it is sure that its parent (MoveRegionProcedure) has > acquired the table's lock. > {code:java} > // If there is parent procedure, it would have already taken xlock, so no > need to take > // shared lock here. Otherwise, take shared lock. > if (!procedure.hasParent() > && waitTableQueueSharedLock(procedure, table) == null) { > return true; > } > {code} > But that is not the case when the master is restarted. The child > procedure (UnassignProcedure) will be executed first after the restart. Though it > has a parent (MoveRegionProcedure), apparently the parent didn't hold the > table's lock. > So, since it began to execute without holding the table's shared lock, a > ModifyTableProcedure could acquire the table's exclusive lock and execute at the > same time, which is not possible if the master was not restarted. > This would cause a stall before HBASE-20752. But since HBASE-20752 is fixed, > I wrote a simple UT to reproduce this case. 
> I think we don't have to check the parent for the table's shared lock. It is a > shared lock, right? I think we can acquire it every time we need it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
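The "just take the shared lock every time we need it" suggestion can be sketched with a plain ReentrantReadWriteLock. This is an assumption-laden illustration, not HBase's actual procedure scheduler: the class and method names are invented for the example.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: a procedure always acquires the table's shared (read) lock itself,
// rather than skipping acquisition because a parent is assumed to hold it.
// Read locks admit any number of holders, so re-acquiring is cheap and safe,
// while a writer (e.g. a ModifyTableProcedure taking the exclusive lock) is
// blocked for the duration of the body.
public class SharedLockSketch {
    private final ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();

    boolean runWithSharedLock(Runnable body) {
        tableLock.readLock().lock();
        try {
            body.run();
            return true;
        } finally {
            tableLock.readLock().unlock();
        }
    }
}
```

Because the read lock is reentrant, a child procedure acquiring it while its parent already holds it costs almost nothing, which is why the parent check can simply be dropped.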
[jira] [Updated] (HBASE-20846) Restore procedure locks when master restarts
[ https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-20846: -- Attachment: HBASE-20846-v1.patch > Restore procedure locks when master restarts > > > Key: HBASE-20846 > URL: https://issues.apache.org/jira/browse/HBASE-20846 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0 >Reporter: Allan Yang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0, 2.0.2, 2.1.1 > > Attachments: HBASE-20846-v1.patch, HBASE-20846.branch-2.0.002.patch, > HBASE-20846.branch-2.0.patch, HBASE-20846.patch > > > Found this one when investigating a ModifyTableProcedure that got stuck while there > was a MoveRegionProcedure going on after a master restart. > Though this issue can be solved by HBASE-20752, I discovered something > else. > Before a MoveRegionProcedure can execute, it will hold the table's shared > lock. So, when an UnassignProcedure is spawned, it will not check the > table's shared lock, since it is sure that its parent (MoveRegionProcedure) has > acquired the table's lock. > {code:java} > // If there is parent procedure, it would have already taken xlock, so no > need to take > // shared lock here. Otherwise, take shared lock. > if (!procedure.hasParent() > && waitTableQueueSharedLock(procedure, table) == null) { > return true; > } > {code} > But that is not the case when the master is restarted. The child > procedure (UnassignProcedure) will be executed first after the restart. Though it > has a parent (MoveRegionProcedure), apparently the parent didn't hold the > table's lock. > So, since it began to execute without holding the table's shared lock, a > ModifyTableProcedure could acquire the table's exclusive lock and execute at the > same time, which is not possible if the master was not restarted. > This would cause a stall before HBASE-20752. But since HBASE-20752 is fixed, > I wrote a simple UT to reproduce this case. > I think we don't have to check the parent for the table's shared lock. 
It is a > shared lock, right? I think we can acquire it every time we need it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20867) RS may got killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20867: --- Issue Type: Sub-task (was: Bug) Parent: HBASE-20828 > RS may got killed while master restarts > --- > > Key: HBASE-20867 > URL: https://issues.apache.org/jira/browse/HBASE-20867 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-20867.branch-2.0.001.patch, > HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch > > > If the master is dispatching an RPC call to an RS when aborting, a connection > exception may be thrown by the RPC layer (an IOException with a "Connection > closed" message in this case). The RSProcedureDispatcher will regard it as an > un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, > which will expire the RS. > Actually, the RS is very healthy; only the master is restarting. > I think we should deal with those kinds of connection exceptions in > RSProcedureDispatcher and retry the RPC call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
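The proposed handling can be sketched as a small classifier. This is a hypothetical illustration, not the actual RSProcedureDispatcher logic: the class name, method name, and matched message strings are assumptions made for the example.

```java
import java.io.IOException;

// Sketch: classify connection-level failures (e.g. the "Connection closed"
// IOException seen while the master restarts) as retryable, so the dispatcher
// would retry the RPC instead of expiring a healthy region server.
public class DispatchRetrySketch {
    static boolean isRetryableConnectionError(IOException e) {
        String msg = e.getMessage();
        return msg != null
            && (msg.contains("Connection closed") || msg.contains("Connection reset"));
    }
}
```

A real implementation would likely also bound the number of retries and back off between attempts, since an RS that is genuinely gone should still eventually be expired.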
[jira] [Commented] (HBASE-20867) RS may got killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544935#comment-16544935 ] Allan Yang commented on HBASE-20867: [~Apache9], can you review this one? Thanks! > RS may got killed while master restarts > --- > > Key: HBASE-20867 > URL: https://issues.apache.org/jira/browse/HBASE-20867 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-20867.branch-2.0.001.patch, > HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch > > > If the master is dispatching an RPC call to an RS when aborting, a connection > exception may be thrown by the RPC layer (an IOException with a "Connection > closed" message in this case). The RSProcedureDispatcher will regard it as an > un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, > which will expire the RS. > Actually, the RS is very healthy; only the master is restarting. > I think we should deal with those kinds of connection exceptions in > RSProcedureDispatcher and retry the RPC call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20867) RS may got killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20867: --- Attachment: HBASE-20867.branch-2.0.003.patch > RS may got killed while master restarts > --- > > Key: HBASE-20867 > URL: https://issues.apache.org/jira/browse/HBASE-20867 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Attachments: HBASE-20867.branch-2.0.001.patch, > HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch > > > If the master is dispatching an RPC call to an RS when aborting, a connection > exception may be thrown by the RPC layer (an IOException with a "Connection > closed" message in this case). The RSProcedureDispatcher will regard it as an > un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, > which will expire the RS. > Actually, the RS is very healthy; only the master is restarting. > I think we should deal with those kinds of connection exceptions in > RSProcedureDispatcher and retry the RPC call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20870) Wrong HBase root dir in ITBLL's Search Tool
[ https://issues.apache.org/jira/browse/HBASE-20870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20870: --- Attachment: HBASE-20870.branch-2.0.002.patch > Wrong HBase root dir in ITBLL's Search Tool > --- > > Key: HBASE-20870 > URL: https://issues.apache.org/jira/browse/HBASE-20870 > Project: HBase > Issue Type: Bug > Components: integration tests >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Minor > Attachments: HBASE-20870.branch-2.0.001.patch, > HBASE-20870.branch-2.0.002.patch > > > When using IntegrationTestBigLinkedList's Search tool, it always fails since > it tries to read WALs in the wrong HBase root dir. It turned out that when > initializing IntegrationTestingUtility in IntegrationTestBigLinkedList, its > super class HBaseTestingUtility changes hbase.rootdir to a local random > dir. That is not wrong, since HBaseTestingUtility is mostly used by the minicluster, > but for IntegrationTest runs on distributed clusters we should change it > back. > Here is the error info: > {code:java} > 2018-07-11 16:35:49,679 DEBUG [main] hbase.HBaseCommonTestingUtility: Setting > hbase.rootdir to > /home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb > 2018-07-11 16:35:50,736 ERROR [main] util.AbstractHBaseTool: Error running > command-line tool java.io.FileNotFoundException: File > file:/home/hadoop/target/test-data/deb67611-2737-4696-abe9-32a7783df7bb/WALs > does not exist > at > org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:431) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)