[jira] [Updated] (HBASE-19293) Support add a disabled state replication peer directly
[ https://issues.apache.org/jira/browse/HBASE-19293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang updated HBASE-19293:
-----------------------------------
        Resolution: Fixed
     Fix Version/s: 3.0.0
            Status: Resolved  (was: Patch Available)

Pushed to master and branch-2. Thanks all for reviewing.

> Support add a disabled state replication peer directly
> ------------------------------------------------------
>
>                 Key: HBASE-19293
>                 URL: https://issues.apache.org/jira/browse/HBASE-19293
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>             Fix For: 3.0.0, 2.0.0-beta-1
>
>         Attachments: HBASE-19293.master.001.patch, HBASE-19293.master.002.patch, HBASE-19293.master.003.patch
>
> Currently, when you add a replication peer, its default state is enabled. To add a disabled replication peer, you need to add the peer first and then disable it, so it takes two steps where one should do.
> Use case for adding a disabled replication peer: a user wants to sync data from a cluster A to a new peer cluster.
> 1. Add a disabled replication peer and configure the table in the peer config.
> 2. Take a snapshot of the table and export the snapshot to the peer cluster.
> 3. Restore the snapshot in the peer cluster.
> 4. Enable the peer and wait for all accumulated replication logs to be replicated to the peer cluster.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HBASE-19311) Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster
[ https://issues.apache.org/jira/browse/HBASE-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260366#comment-16260366 ]

Duo Zhang commented on HBASE-19311:
-----------------------------------
{quote}
Do we need two tools? Maybe TestAcidGuarantees.java can be a plain java unit test now?
{quote}
Is there a compatibility issue? This will also be committed to branch-1, I think.

{quote}
All our tools (unless they are very old and no one touched them) extend AbstractHBaseTool now instead of directly extending Tool since the former takes care of a bunch of boilerplate work. Let's do the same here?
{quote}
I do not want to spend too much time on this; this issue aims to fix the long running time only. You can open a new issue to rewrite the tool if you want. Actually, the parameters read from the Configuration in the run method should be passed on the command line. But again, that is a compatibility issue... Thanks.

> Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-19311
>                 URL: https://issues.apache.org/jira/browse/HBASE-19311
>             Project: HBase
>          Issue Type: Improvement
>          Components: test
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>         Attachments: HBASE-19311.patch
[jira] [Commented] (HBASE-19311) Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster
[ https://issues.apache.org/jira/browse/HBASE-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260364#comment-16260364 ]

Hadoop QA commented on HBASE-19311:
-----------------------------------
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec | 0m 9s | Docker mode activated. |
|| || || || Prechecks ||
|  0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || master Compile Tests ||
|  0 | mvndep | 0m 24s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 51s | master passed |
| +1 | compile | 1m 11s | master passed |
| +1 | checkstyle | 1m 29s | master passed |
| +1 | shadedjars | 6m 30s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 40s | master passed |
|| || || || Patch Compile Tests ||
|  0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 56s | the patch passed |
| +1 | compile | 1m 4s | the patch passed |
| +1 | javac | 1m 4s | the patch passed |
| -1 | checkstyle | 1m 5s | hbase-server: The patch generated 5 new + 13 unchanged - 7 fixed = 18 total (was 20) |
| +1 | checkstyle | 0m 21s | hbase-it: The patch generated 0 new + 0 unchanged - 3 fixed = 0 total (was 3) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 5m 2s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 55m 35s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. |
| +1 | javadoc | 0m 45s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 104m 15s | hbase-server in the patch passed. |
| +1 | unit | 0m 33s | hbase-it in the patch passed. |
| +1 | asflicense | 0m 34s | The patch does not generate ASF License warnings. |
|    |      | 183m 24s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19311 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12898608/HBASE-19311.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 14e01e9401c0 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
[jira] [Updated] (HBASE-18309) Support multi threads in CleanerChore
[ https://issues.apache.org/jira/browse/HBASE-18309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Chan updated HBASE-18309:
------------------------------
    Status: Patch Available  (was: Open)

> Support multi threads in CleanerChore
> -------------------------------------
>
>                 Key: HBASE-18309
>                 URL: https://issues.apache.org/jira/browse/HBASE-18309
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: binlijin
>            Assignee: Reid Chan
>         Attachments: HBASE-18309.master.001.patch, HBASE-18309.master.002.patch, HBASE-18309.master.004.patch, HBASE-18309.master.005.patch, HBASE-18309.master.006.patch, HBASE-18309.master.007.patch, HBASE-18309.master.008.patch, HBASE-18309.master.009.patch, HBASE-18309.master.010.patch, HBASE-18309.master.011.patch, HBASE-18309.master.012.patch, space_consumption_in_archive.png
>
> There is only one thread in LogCleaner to clean oldWALs, and in our big cluster we found this is not enough. The number of files under oldWALs reached the max-directory-items limit of HDFS and caused region server crashes, so we use multiple threads in LogCleaner and the crashes no longer happen.
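The bottleneck described above, a single cleaner thread falling behind while the oldWALs directory grows toward HDFS's max-directory-items limit, comes down to fanning the delete checks out over a worker pool. A minimal sketch of that idea (hypothetical Python illustration only, not the actual CleanerChore code; `is_deletable` and `delete` stand in for the cleaner delegate and the filesystem call):

```python
from concurrent.futures import ThreadPoolExecutor

def clean_old_wals(files, is_deletable, delete, num_threads=4):
    """Delete every file the delegate deems deletable, spreading the
    work over num_threads workers instead of a single thread."""
    def maybe_delete(f):
        if is_deletable(f):
            delete(f)
            return 1
        return 0
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() preserves per-file independence; workers only contend
        # on the (thread-safe) delete sink, not on each other.
        return sum(pool.map(maybe_delete, files))

# Toy run: files are (name, age) pairs; everything older than 5 goes.
files = [("wal-%d" % i, i) for i in range(10)]
deleted = []
n = clean_old_wals(files,
                   is_deletable=lambda f: f[1] < 5,
                   delete=deleted.append)
print(n)  # 5
```

The same shape applies whether the unit of work is a file or a subdirectory; the patch's real concern (sizing the pool, sharing it across cleaner chores) sits outside this sketch.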
[jira] [Updated] (HBASE-18309) Support multi threads in CleanerChore
[ https://issues.apache.org/jira/browse/HBASE-18309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Chan updated HBASE-18309:
------------------------------
    Attachment: HBASE-18309.master.012.patch

Trigger QA again.
[jira] [Updated] (HBASE-18309) Support multi threads in CleanerChore
[ https://issues.apache.org/jira/browse/HBASE-18309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Chan updated HBASE-18309:
------------------------------
    Attachment: (was: HBASE-18309.master.012.patch)
[jira] [Updated] (HBASE-18309) Support multi threads in CleanerChore
[ https://issues.apache.org/jira/browse/HBASE-18309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Chan updated HBASE-18309:
------------------------------
    Status: Open  (was: Patch Available)
[jira] [Commented] (HBASE-18309) Support multi threads in CleanerChore
[ https://issues.apache.org/jira/browse/HBASE-18309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260349#comment-16260349 ]

Hadoop QA commented on HBASE-18309:
-----------------------------------
(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
|  0 | reexec | 0m 9s | Docker mode activated. |
|| || || || Prechecks ||
|  0 | findbugs | 0m 1s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 4m 30s | master passed |
| +1 | compile | 0m 41s | master passed |
| +1 | checkstyle | 1m 5s | master passed |
| +1 | shadedjars | 5m 53s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 27s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 4m 29s | the patch passed |
| +1 | compile | 0m 41s | the patch passed |
| +1 | javac | 0m 41s | the patch passed |
| +1 | checkstyle | 1m 5s | hbase-server: The patch generated 0 new + 184 unchanged - 4 fixed = 184 total (was 188) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 51s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 54m 5s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. |
| +1 | javadoc | 0m 29s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 106m 39s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
|    |      | 179m 47s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-18309 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12898606/HBASE-18309.master.012.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 09211c8d705d 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 8f806ab486 |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/9937/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/9937/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/9937/console |
| Powered by | Apache Yetus 0.6.0 http://yetus.apache.org |
[jira] [Commented] (HBASE-19293) Support add a disabled state replication peer directly
[ https://issues.apache.org/jira/browse/HBASE-19293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260347#comment-16260347 ]

Guanghao Zhang commented on HBASE-19293:
----------------------------------------
bq. On commit fix the whitespace issues
The whitespace result may not be right; the patch didn't introduce any whitespace. Let me check it again on commit.

bq. Needs a release note so others can read about this nice improvement.
Updated the release note.
[jira] [Updated] (HBASE-19293) Support add a disabled state replication peer directly
[ https://issues.apache.org/jira/browse/HBASE-19293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang updated HBASE-19293:
-----------------------------------
    Release Note:
        Added a boolean parameter to Admin/AsyncAdmin's addReplicationPeer method which indicates whether the new replication peer's state is enabled or disabled. Meanwhile, you can use the shell command to add an enabled/disabled replication peer. The STATE parameter is optional and the default state is enabled.

        hbase> add_peer '1', CLUSTER_KEY => "server1.cie.com:2181:/hbase", STATE => "ENABLED"
        hbase> add_peer '1', CLUSTER_KEY => "server1.cie.com:2181:/hbase", STATE => "DISABLED"
[jira] [Commented] (HBASE-18965) Create alternate API to processRowsWithLock() that doesn't take RowProcessor as an argument
[ https://issues.apache.org/jira/browse/HBASE-18965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260338#comment-16260338 ]

stack commented on HBASE-18965:
-------------------------------
Moving out of beta-1. Ain't going to be done for beta-1.

> Create alternate API to processRowsWithLock() that doesn't take RowProcessor as an argument
> -------------------------------------------------------------------------------------------
>
>                 Key: HBASE-18965
>                 URL: https://issues.apache.org/jira/browse/HBASE-18965
>             Project: HBase
>          Issue Type: Improvement
>          Components: Coprocessors
>    Affects Versions: 2.0.0-alpha-3
>            Reporter: Umesh Agashe
>            Assignee: Umesh Agashe
>             Fix For: 2.0.0-beta-2
>
> Create an alternate API to processRowsWithLock() that doesn't take RowProcessor as an argument. Also write an example showing how coprocessors and batchMutate() can be used instead of RowProcessors.
[jira] [Updated] (HBASE-18965) Create alternate API to processRowsWithLock() that doesn't take RowProcessor as an argument
[ https://issues.apache.org/jira/browse/HBASE-18965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-18965:
--------------------------
    Fix Version/s: (was: 2.0.0-beta-1)
                   2.0.0-beta-2
[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations
[ https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260324#comment-16260324 ]

stack commented on HBASE-19301:
-------------------------------
Go for it.

> Provide way for CPs to create short circuited connection with custom configurations
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-19301
>                 URL: https://issues.apache.org/jira/browse/HBASE-19301
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Coprocessors
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0-beta-1
>
>         Attachments: HBASE-19301.patch
>
> Over in HBASE-18359 we have discussions about this.
> Right now HBase provides getConnection() in RegionCPEnv, MasterCPEnv, etc., but this returns a pre-created connection (per server) that uses the configs from hbase-site.xml on that server.
> Phoenix needs to create a connection in a CP with some custom configs. Putting these custom changes in hbase-site.xml is harmful, as that would affect all connections created on that server.
> This issue is for providing an overloaded getConnection(Configuration) API.
[jira] [Commented] (HBASE-19293) Support add a disabled state replication peer directly
[ https://issues.apache.org/jira/browse/HBASE-19293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260320#comment-16260320 ]

stack commented on HBASE-19293:
-------------------------------
[~zghaobac] Our Sean is gone till Feb. I think he'd be good w/ your changes given the explanations. I took a look over the patch. LGTM. On commit fix the whitespace issues if you don't mind. Needs a release note so others can read about this nice improvement. Perhaps paste your nice additions to the replication command into the release note? Thanks.
[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations
[ https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260319#comment-16260319 ]

Anoop Sam John commented on HBASE-19301:
----------------------------------------
Ok, I will change it. I hope Stack is not really against it. :-) Stop me if I am wrong, boss.
[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations
[ https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260318#comment-16260318 ]

Guanghao Zhang commented on HBASE-19301:
----------------------------------------
bq. We can do in another issue?
Agree.

bq. Ya getConnection() or createConnection() both names having their good parts. I would like to go with getConnection() too because of what Stack said.
I prefer createConnection(conf).
[jira] [Commented] (HBASE-19092) Make Tag IA.LimitedPrivate and expose for CPs
[ https://issues.apache.org/jira/browse/HBASE-19092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260311#comment-16260311 ]

ramkrishna.s.vasudevan commented on HBASE-19092:
------------------------------------------------
bq. its just not clean in hbase2. I think we can live w/ that (needs good doc).
[~saint@gmail.com] So you agree with putting getTags and getTag in ExtendedCell, and adding some docs around it? And should the doc clearly say that users need to do a typecast, or assume the cell in a CP is always an ExtendedCell?

> Make Tag IA.LimitedPrivate and expose for CPs
> ---------------------------------------------
>
>                 Key: HBASE-19092
>                 URL: https://issues.apache.org/jira/browse/HBASE-19092
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Coprocessors
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 2.0.0-beta-1
>
>         Attachments: HBASE-19092-branch-2.patch, HBASE-19092-branch-2_5.patch, HBASE-19092-branch-2_5.patch, HBASE-19092.branch-2.0.02.patch, HBASE-19092_001-branch-2.patch, HBASE-19092_001.patch, HBASE-19092_002-branch-2.patch, HBASE-19092_002.patch
>
> We need to make Tag IA.LimitedPrivate, as some use cases (like the timeline server) are trying to use tags. The same topic was discussed on dev@ and also in HBASE-18995.
> Shall we target this for beta1 - cc [~saint@gmail.com].
> Once we do this, all related util methods and APIs should also move to LimitedPrivate util classes.
[jira] [Commented] (HBASE-19123) Purge 'complete' support from Coprocesor Observers
[ https://issues.apache.org/jira/browse/HBASE-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260306#comment-16260306 ]

stack commented on HBASE-19123:
-------------------------------
Looks like I pushed this patch to master branch here unintentionally:

commit 08544e54a999df16cb0cef7cf45a17da1eeef42d
Author: Michael Stack
Date:   Thu Nov 16 18:46:27 2017 -0800

    HBASE-19123 Purge 'complete' support from Coprocesor Observers

Leaving it in. Pushed to branch-2.

> Purge 'complete' support from Coprocesor Observers
> --------------------------------------------------
>
>                 Key: HBASE-19123
>                 URL: https://issues.apache.org/jira/browse/HBASE-19123
>             Project: HBase
>          Issue Type: Task
>          Components: Coprocessors
>            Reporter: stack
>            Assignee: stack
>             Fix For: 2.0.0-beta-1
>
>         Attachments: HBASE-19123.master.001.patch, HBASE-19123.master.002.patch, HBASE-19123.master.003.patch, HBASE-19123.master.004.patch
>
> Up on the dev list under '[DISCUSSION] Removing the bypass semantic from the Coprocessor APIs', we are discussing purging 'complete'. Unless there is an objection, let's purge it for beta-1.
> [~andrew.purt...@gmail.com] says the following up on the dev list:
> It would simplify the theory of operation for coprocessors if we can assume either the entire chain will complete or one of the coprocessors in the chain will throw an exception that not only terminates processing of the rest of the chain but also the operation in progress.
> Security coprocessors interrupt processing by throwing an exception, which is meant to propagate all the way back to the user.
> I think it's more than fair to ask the same question about 'complete' as we did about 'bypass': Does anyone use it? Is it needed?
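The semantics argued for above (either the whole observer chain completes, or an exception from one observer aborts both the rest of the chain and the operation in progress) can be sketched generically. This is a hypothetical Python illustration of that contract only, not the HBase coprocessor host; `run_chain`, the observer lambdas, and `AccessDeniedException` here are all made up for the example:

```python
class AccessDeniedException(Exception):
    """Stand-in for a security coprocessor's veto."""

def run_chain(observers, op):
    """Run every observer hook in order. Any exception propagates,
    terminating both the remaining chain and the operation itself."""
    for obs in observers:
        obs(op)   # may raise; no 'complete'/'bypass' escape hatch
    return op()   # the operation proceeds only if no hook raised

calls = []
# Happy path: the whole chain runs, then the operation runs.
ok = run_chain([lambda op: calls.append("audit")], lambda: "done")

# Veto path: the second observer raises, so the third observer and
# the operation itself never execute.
def deny(op):
    raise AccessDeniedException("no permission")

try:
    run_chain([lambda op: calls.append("audit2"), deny,
               lambda op: calls.append("never runs")], lambda: "done")
except AccessDeniedException:
    pass
```

With only these two outcomes, a coprocessor author never has to reason about a chain that was partially short-circuited yet still let the operation through, which is the simplification the dev-list quote is after.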
[jira] [Updated] (HBASE-19123) Purge 'complete' support from Coprocesor Observers
[ https://issues.apache.org/jira/browse/HBASE-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-19123:
--------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Pushed to branch-2 and master. Resolving.
[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations
[ https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260304#comment-16260304 ] Guanghao Zhang commented on HBASE-19301: What I care about is how to prevent users from misusing these methods. If we return the RegionServer's shared Connection, the user must not close it. If we return a newly created connection, the user should close it to avoid a resource leak. > Provide way for CPs to create short circuited connection with custom > configurations > --- > > Key: HBASE-19301 > URL: https://issues.apache.org/jira/browse/HBASE-19301 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19301.patch > > > Over in HBASE-18359 we have discussions for this. > Right now HBase provide getConnection() in RegionCPEnv, MasterCPEnv etc. But > this returns a pre created connection (per server). This uses the configs at > hbase-site.xml at that server. > Phoenix needs creating connection in CP with some custom configs. Having this > custom changes in hbase-site.xml is harmful as that will affect all > connections been created at that server. > This issue is for providing an overloaded getConnection(Configuration) API -- This message was sent by Atlassian JIRA (v6.4.14#64029)
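The two ownership models Guanghao contrasts — a server-owned connection the caller must not close, versus a caller-owned one the caller must close — can be made explicit in the API shape itself. A minimal plain-Java sketch with invented names (this is not the actual CoprocessorEnvironment interface):

```java
// Sketch of the two contracts: a shared connection the caller must NOT
// close, versus a caller-owned connection the caller MUST close.
public class ConnectionOwnership {
    static class Connection implements AutoCloseable {
        final boolean shared;
        boolean closed;
        Connection(boolean shared) { this.shared = shared; }
        @Override public void close() {
            if (shared) throw new UnsupportedOperationException("shared connection; do not close");
            closed = true;
        }
    }

    static final Connection SHARED = new Connection(true);

    /** Server-owned, pre-created connection (analogous to getConnection()). */
    static Connection getConnection() { return SHARED; }

    /** Caller-owned connection with custom config (analogous to createConnection(conf)). */
    static Connection createConnection(String customConf) { return new Connection(false); }

    public static void main(String[] args) {
        boolean rejected = false;
        try { getConnection().close(); } catch (UnsupportedOperationException e) { rejected = true; }
        Connection own;
        // try-with-resources makes the "you created it, you close it" rule hard to forget
        try (Connection c = createConnection("custom-hbase-site")) { own = c; }
        System.out.println(rejected + " " + own.closed); // true true
    }
}
```

Rejecting close() on the shared connection (or documenting it loudly), plus pushing users of the custom-config variant toward try-with-resources, addresses both misuse cases in the comment.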
[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations
[ https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260302#comment-16260302 ] Anoop Sam John commented on HBASE-19301: bq.I left a "TODO: Use Table.put(Put) instead." in AccessControlLists when work for HBASE-18500. This can be fixed if we use the new created connection to get ACL Table. Can we do that in another issue? I would like to stick to what the subject says. Ya, both names, getConnection() and createConnection(), have their good parts. I would like to go with getConnection() too, because of what Stack said. You OK, [~zghaobac]? > Provide way for CPs to create short circuited connection with custom > configurations > --- > > Key: HBASE-19301 > URL: https://issues.apache.org/jira/browse/HBASE-19301 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19301.patch > > > Over in HBASE-18359 we have discussions for this. > Right now HBase provide getConnection() in RegionCPEnv, MasterCPEnv etc. But > this returns a pre created connection (per server). This uses the configs at > hbase-site.xml at that server. > Phoenix needs creating connection in CP with some custom configs. Having this > custom changes in hbase-site.xml is harmful as that will affect all > connections been created at that server. > This issue is for providing an overloaded getConnection(Configuration) API -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19298) CellScanner should be declared as IA.Public
[ https://issues.apache.org/jira/browse/HBASE-19298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260301#comment-16260301 ] stack commented on HBASE-19298: --- bq. Is it the consensus in our community? If so, I'm fine to close this issue as Won't Fix. [~chia7712] Our Sean is gone till Feb FYI as of today (For the best of reasons -- smile). I'd be good w/ CellScanner being public. It is a dumb interface that is pervasive; e.g. Result implements it. On the patch, what's up w/ the ExtendedCellScanner? Why do we need it? Thanks. > CellScanner should be declared as IA.Public > --- > > Key: HBASE-19298 > URL: https://issues.apache.org/jira/browse/HBASE-19298 > Project: HBase > Issue Type: Task >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19298.v0.patch > > > User can create the {{CellScanner}} via IA.Public {{CellUtil}}, hence > {{CellScanner}} should be IA.Public. However, the {{CellScanner}} is used in > the server code base so making {{CellScanner}} IA.Public may flaw our HBASE > in the future. In my opinion, we should introduce the {{ExtendedCellScanner}} > to replace the {{CellScanner}} for server code. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
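The split Chia-Ping proposes — a small public interface for users, an extended internal one for server code — is a standard narrowing pattern. A toy sketch (the interface names mirror the proposal; the members are invented for illustration):

```java
public class ScannerSplit {
    /** Public-facing, intentionally minimal (IA.Public in HBase terms). */
    interface CellScanner {
        boolean advance();
        String current();
    }

    /** Server-internal extension; evolving it does not break the public contract. */
    interface ExtendedCellScanner extends CellScanner {
        long currentSequenceId(); // hypothetical server-only detail
    }

    // Server code constructs the wide type but can hand out the narrow one.
    static ExtendedCellScanner over(String[] cells) {
        return new ExtendedCellScanner() {
            int i = -1;
            public boolean advance() { return ++i < cells.length; }
            public String current() { return cells[i]; }
            public long currentSequenceId() { return i; }
        };
    }

    public static void main(String[] args) {
        CellScanner pub = over(new String[] {"a", "b"}); // users see only the narrow type
        StringBuilder sb = new StringBuilder();
        while (pub.advance()) sb.append(pub.current());
        System.out.println(sb); // ab
    }
}
```

The point of the extra interface is exactly Chia-Ping's concern: once CellScanner is IA.Public its surface is frozen, so server-only needs go on the extension instead.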
[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations
[ https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260294#comment-16260294 ] stack commented on HBASE-19301: --- Good stuff [~anoop.hbase] See the [~zghaobac] comment. [~zghaobac] I hear you. The nice thing about the getConnection(Configuration) is that there is a bit of symmetry w/ the existing getConnection. > Provide way for CPs to create short circuited connection with custom > configurations > --- > > Key: HBASE-19301 > URL: https://issues.apache.org/jira/browse/HBASE-19301 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19301.patch > > > Over in HBASE-18359 we have discussions for this. > Right now HBase provide getConnection() in RegionCPEnv, MasterCPEnv etc. But > this returns a pre created connection (per server). This uses the configs at > hbase-site.xml at that server. > Phoenix needs creating connection in CP with some custom configs. Having this > custom changes in hbase-site.xml is harmful as that will affect all > connections been created at that server. > This issue is for providing an overloaded getConnection(Configuration) API -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19312) Find out why sometimes we need to spend more than one second to get the cluster id
[ https://issues.apache.org/jira/browse/HBASE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-19312: -- Priority: Blocker (was: Major) Fix Version/s: 2.0.0 Component/s: Zookeeper Client asyncclient This is a very critical problem, as sleeping in the foreground has a very bad performance impact on an asynchronous program. Marked as a blocker for the 2.0.0 release. If we cannot find a way to purge this in Curator, then I think we need to drop Curator and implement our own read-only zk client. Thanks. > Find out why sometimes we need to spend more than one second to get the > cluster id > -- > > Key: HBASE-19312 > URL: https://issues.apache.org/jira/browse/HBASE-19312 > Project: HBase > Issue Type: Bug > Components: asyncclient, Client, Zookeeper >Reporter: Duo Zhang >Priority: Blocker > Fix For: 2.0.0 > > > See the discussion in HBASE-19266. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260291#comment-16260291 ] Duo Zhang commented on HBASE-19266: --- +1 on #2. TestAcidGuarantees is already slow enough even without different memstore compaction policies... > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations
[ https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260290#comment-16260290 ] Anoop Sam John commented on HBASE-19301: Thanks for the detailed comment, Stack. Ya, agree on all points. We were never doing the short circuit in a way that avoids the PB marshal/unmarshal stuff. I believe I mentioned this in the other sub-task you did for providing the OnlineRegions return from RegionCPEnv. Calling APIs on Region will be much cheaper, as we then deal with POJOs, not PBs. I will remove that statement from the javadoc around CPEnv.getConnection() and instead add a TODO to check it later. I don't know how we can avoid the PB overhead: for the Admin/Client stub we need PB-based args, so it is not really avoidable? Will add the doc as you said. Let me fix the javadoc issues reported by QA. > Provide way for CPs to create short circuited connection with custom > configurations > --- > > Key: HBASE-19301 > URL: https://issues.apache.org/jira/browse/HBASE-19301 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19301.patch > > > Over in HBASE-18359 we have discussions for this. > Right now HBase provide getConnection() in RegionCPEnv, MasterCPEnv etc. But > this returns a pre created connection (per server). This uses the configs at > hbase-site.xml at that server. > Phoenix needs creating connection in CP with some custom configs. Having this > custom changes in hbase-site.xml is harmful as that will affect all > connections been created at that server. > This issue is for providing an overloaded getConnection(Configuration) API -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19301) Provide way for CPs to create short circuited connection with custom configurations
[ https://issues.apache.org/jira/browse/HBASE-19301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260288#comment-16260288 ] Guanghao Zhang commented on HBASE-19301: How about naming it createConnection(conf)? That makes it clear to users that this method creates a new connection, which they should maintain (and close) themselves. I left a "TODO: Use Table.put(Put) instead." in AccessControlLists while working on HBASE-18500. That can be fixed if we use the newly created connection to get the ACL table. > Provide way for CPs to create short circuited connection with custom > configurations > --- > > Key: HBASE-19301 > URL: https://issues.apache.org/jira/browse/HBASE-19301 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19301.patch > > > Over in HBASE-18359 we have discussions for this. > Right now HBase provide getConnection() in RegionCPEnv, MasterCPEnv etc. But > this returns a pre created connection (per server). This uses the configs at > hbase-site.xml at that server. > Phoenix needs creating connection in CP with some custom configs. Having this > custom changes in hbase-site.xml is harmful as that will affect all > connections been created at that server. > This issue is for providing an overloaded getConnection(Configuration) API -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260285#comment-16260285 ] Chia-Ping Tsai commented on HBASE-19266: [~tedyu] [~Apache9] Any suggestions about #2? > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260279#comment-16260279 ] Chia-Ping Tsai edited comment on HBASE-19266 at 11/21/17 5:51 AM: -- bq. What is the running time before HBASE-19200? before: 655.004s HBASE-19200: 865.749s bq. I can see that we spend more than 1 second to retrieve the cluster id, too long I think. Will open a issue for this. The cost i observed is same with you.:) I don't think the 1 second will kill us, and we don't have to revert HBASE-19200. The root cause is that {{TestAcidGuarantees}} is a parameterized test, so any small cost will make the executed time become even larger. There are many solutions about speeding up {{TestAcidGuarantees}}. # start mini cluster once (HBASE-19311) # separate {{TestAcidGuarantees}} #1 will be addressed by [~Apache9], and it is a good solution as the initialization of mini cluster is slower than any phases. #2 is necessary also I think as it not only speed up the {{TestAcidGuarantees}} but also help us to debug for different MemStore in the future. was (Author: chia7712): bq. What is the running time before HBASE-19200? before: 655.004s HBASE-19200: 865.749s bq. I can see that we spend more than 1 second to retrieve the cluster id, too long I think. Will open a issue for this. The cost i observed is same with you.:) I don't think the 1 second will kill us, and we don't have to revert HBASE-19200. The root cause is that {{TestAcidGuarantees}} is a parameterized test, so any small cost will make the executed time become even larger. There are many solution about speeding up {{TestAcidGuarantees}}. # start mini cluster once (HBASE-19311) # separate {{TestAcidGuarantees}} #1 will be addressed by [~Apache9], and it is a good solution as the initialization of mini cluster is slower than any phases. 
#2 is necessary also I think as it not only speed up the {{TestAcidGuarantees}} but also help us to debug for different MemStore in the future. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19304) KEEP_DELETED_CELLS should ignore case
[ https://issues.apache.org/jira/browse/HBASE-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260281#comment-16260281 ] Chia-Ping Tsai commented on HBASE-19304: Will commit it with checkstyle fix. > KEEP_DELETED_CELLS should ignore case > -- > > Key: HBASE-19304 > URL: https://issues.apache.org/jira/browse/HBASE-19304 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0-alpha-4 >Reporter: Sergey Soldatov >Assignee: Sergey Soldatov >Priority: Blocker > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19304-v1.patch > > > Since HBASE-12363 we start using an enum instead of boolean for > keep_deleted_cells. In ColumnFamilyDescriptorBuilder we are using valueOf to > find out the value of the property. But there is a problem: all values in > ENUM are uppercase, so if we provide the value in lowercase (and java Boolean > returns it in lowercase in toString), the table creation may fail with an > exception: > {code} > java.io.IOException: java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.hbase.KeepDeletedCells.true > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1028) > at > org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:891) > at > org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:859) > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6966) > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6923) > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6894) > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6850) > at > org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6801) > at > org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:285) > at > 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:110) > at > org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.IllegalArgumentException: No enum constant > org.apache.hadoop.hbase.KeepDeletedCells.true > at java.lang.Enum.valueOf(Enum.java:238) > at > org.apache.hadoop.hbase.KeepDeletedCells.valueOf(KeepDeletedCells.java:30) > at > org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder$ModifyableColumnFamilyDescriptor.lambda$getStringOrDefault$23(ColumnFamilyDescriptorBuilder.java:719) > at > org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder$ModifyableColumnFamilyDescriptor.getOrDefault(ColumnFamilyDescriptorBuilder.java:727) > at > org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder$ModifyableColumnFamilyDescriptor.getStringOrDefault(ColumnFamilyDescriptorBuilder.java:719) > at > org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder$ModifyableColumnFamilyDescriptor.getKeepDeletedCells(ColumnFamilyDescriptorBuilder.java:901) > at > org.apache.hadoop.hbase.regionserver.ScanInfo.(ScanInfo.java:69) > at org.apache.hadoop.hbase.regionserver.HStore.(HStore.java:265) > at > org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:5485) > at > org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:992) > at > org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:989) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
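The failure in the trace above — Enum.valueOf is case-sensitive, so the lowercase "true" that Boolean.toString() produces fails against the uppercase TRUE constant — is easy to reproduce and to guard against. A standalone sketch (the KeepDeletedCells enum here is a local stand-in, not the actual HBase class):

```java
import java.util.Locale;

public class EnumCaseDemo {
    // Stand-in for org.apache.hadoop.hbase.KeepDeletedCells
    enum KeepDeletedCells { FALSE, TRUE, TTL }

    /** Case-insensitive parse: normalise before calling the strict valueOf. */
    static KeepDeletedCells parse(String v) {
        return KeepDeletedCells.valueOf(v.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        boolean strictFails = false;
        try {
            KeepDeletedCells.valueOf("true"); // what the old code effectively did
        } catch (IllegalArgumentException e) {
            strictFails = true; // "No enum constant ...KeepDeletedCells.true"
        }
        // The lenient parse accepts both the lowercase literal and Boolean's toString().
        System.out.println(strictFails + " " + parse("true") + " " + parse(String.valueOf(Boolean.TRUE)));
        // prints: true TRUE TRUE
    }
}
```

Normalising with Locale.ROOT avoids a second, subtler bug: locale-sensitive uppercasing (e.g. the Turkish dotless i) can also break valueOf.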
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260279#comment-16260279 ] Chia-Ping Tsai commented on HBASE-19266: bq. What is the running time before HBASE-19200? before: 655.004s HBASE-19200: 865.749s bq. I can see that we spend more than 1 second to retrieve the cluster id, too long I think. Will open a issue for this. The cost i observed is same with you.:) I don't think the 1 second will kill us, and we don't have to revert HBASE-19200. The root cause is that {{TestAcidGuarantees}} is a parameterized test, so any small cost will make the executed time become even larger. There are many solution about speeding up {{TestAcidGuarantees}}. # start mini cluster once (HBASE-19311) # separate {{TestAcidGuarantees}} #1 will be addressed by [~Apache9], and it is a good solution as the initialization of mini cluster is slower than any phases. #2 is necessary also I think as it not only speed up the {{TestAcidGuarantees}} but also help us to debug for different MemStore in the future. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
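Chia-Ping's point — a parameterized test multiplies any fixed per-run cost by the number of parameters — can be put in rough numbers. A toy cost model (the 655s/866s figures above are the real measurements; the constants below are purely illustrative assumptions):

```java
public class ParamCostModel {
    // Illustrative, NOT measured: fixed per-run setup vs. the actual test body.
    static final int CLUSTER_START_SECONDS = 60;
    static final int TEST_BODY_SECONDS = 100;
    static final int POLICIES = 4; // e.g. NONE, BASIC, EAGER, ADAPTIVE memstore policies

    /** Current shape: the mini cluster starts once per parameter. */
    static int perParameterCluster() {
        return POLICIES * (CLUSTER_START_SECONDS + TEST_BODY_SECONDS);
    }

    /** HBASE-19311 shape: the mini cluster starts once, shared by all parameters. */
    static int sharedCluster() {
        return CLUSTER_START_SECONDS + POLICIES * TEST_BODY_SECONDS;
    }

    public static void main(String[] args) {
        // Sharing the cluster removes (POLICIES - 1) startups from the total.
        System.out.println(perParameterCluster() + " vs " + sharedCluster()); // 640 vs 460
    }
}
```

The same arithmetic explains why even a one-second regression per connection (the cluster-id retrieval) shows up amplified in the parameterized total.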
[jira] [Commented] (HBASE-19312) Find out why sometimes we need to spend more than one second to get the cluster id
[ https://issues.apache.org/jira/browse/HBASE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260271#comment-16260271 ] Duo Zhang commented on HBASE-19312: --- [~mdrob] hey curator committer, could you please explain this line? https://github.com/apache/curator/blob/6d36a4793b31cdacaf4bbf6554e05d68bc680001/curator-framework/src/main/java/org/apache/curator/framework/imps/CuratorFrameworkImpl.java#L943 {code} if ( !operationAndData.isConnectionRequired() || client.isConnected() ) { operationAndData.callPerformBackgroundOperation(); } else { client.getZooKeeper(); // important - allow connection resets, timeouts, etc. to occur if ( operationAndData.getElapsedTimeMs() >= client.getConnectionTimeoutMs() ) { throw new CuratorConnectionLossException(); } operationAndData.sleepFor(1, TimeUnit.SECONDS); // <= here queueOperation(operationAndData); } {code} Why is there a sleep if the client is not connected yet? In fact, I do not want to block the client thread, so I do not call blockUntilConnected and instead use the asynchronous API of CuratorFramework (inBackground) to get data. But sadly, if the connection to zk is not yet established, we will sleep 1 second here... Is this intentional? If yes, then what is the correct way to write fully asynchronous code with Curator? Thanks. > Find out why sometimes we need to spend more than one second to get the > cluster id > -- > > Key: HBASE-19312 > URL: https://issues.apache.org/jira/browse/HBASE-19312 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang > > See the discussion in HBASE-19266. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
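For contrast with the thread-blocking sleepFor in the Curator snippet above, a fully asynchronous retry can reschedule itself on a timer instead of parking a thread. A generic sketch using only java.util.concurrent (this is not Curator's API, just the shape a non-blocking retry takes):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class AsyncRetry {
    // Daemon timer thread so the JVM can exit without an explicit shutdown.
    static final ScheduledExecutorService TIMER = Executors.newSingleThreadScheduledExecutor(
        r -> { Thread t = new Thread(r); t.setDaemon(true); return t; });

    /**
     * Retries {@code attempt} until it yields a non-null value, waiting
     * {@code delayMs} between tries WITHOUT blocking any caller thread:
     * the wait lives in the scheduler, not in a sleeping thread.
     */
    static <T> CompletableFuture<T> retry(Supplier<T> attempt, long delayMs) {
        CompletableFuture<T> result = new CompletableFuture<>();
        tryOnce(attempt, delayMs, result);
        return result;
    }

    private static <T> void tryOnce(Supplier<T> attempt, long delayMs, CompletableFuture<T> result) {
        T value = attempt.get();
        if (value != null) {
            result.complete(value);
        } else {
            TIMER.schedule(() -> tryOnce(attempt, delayMs, result), delayMs, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate "connection not up yet": the first two polls fail.
        AtomicInteger polls = new AtomicInteger();
        String id = retry(() -> polls.incrementAndGet() < 3 ? null : "cluster-id", 10)
            .get(2, TimeUnit.SECONDS);
        System.out.println(id + " after " + polls.get() + " polls");
    }
}
```

The difference from the quoted Curator code is that the caller's thread is free between attempts; only a shared timer thread wakes up to retry, which is what an inBackground-style client would want.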
[jira] [Created] (HBASE-19312) Find out why sometimes we need to spend more than one second to get the cluster id
Duo Zhang created HBASE-19312: - Summary: Find out why sometimes we need to spend more than one second to get the cluster id Key: HBASE-19312 URL: https://issues.apache.org/jira/browse/HBASE-19312 Project: HBase Issue Type: Bug Reporter: Duo Zhang -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19312) Find out why sometimes we need to spend more than one second to get the cluster id
[ https://issues.apache.org/jira/browse/HBASE-19312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-19312: -- Description: See the discussion in HBASE-19266. > Find out why sometimes we need to spend more than one second to get the > cluster id > -- > > Key: HBASE-19312 > URL: https://issues.apache.org/jira/browse/HBASE-19312 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang > > See the discussion in HBASE-19266. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19290) Reduce zk request when doing split log
[ https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260258#comment-16260258 ] Hadoop QA commented on HBASE-19290: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 9s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 23s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 8s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 53s{color} | {color:red} hbase-server: The patch generated 1 new + 5 unchanged - 0 fixed = 6 total (was 5) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 25s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 46m 55s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 96m 24s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}160m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 | | JIRA Issue | HBASE-19290 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12898598/HBASE-19290.master.003.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux f0af9476849a 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 8f806ab486 | | maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) | | Default Java | 1.8.0_151 | | checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/9935/artifact/patchprocess/diff-checkstyle-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/9935/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output |
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260257#comment-16260257 ] Duo Zhang commented on HBASE-19266: --- You can do what you like, I do not care. I will do my job. Thanks. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19311) Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster
[ https://issues.apache.org/jira/browse/HBASE-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260252#comment-16260252 ] Appy commented on HBASE-19311: -- Reason makes sense. Had a very quick look. Some high level comments. Do we need two tools? Maybe TestAcidGuarantees.java can be a plain Java unit test now? All our tools (unless they are very old and no one touched them) extend AbstractHBaseTool now instead of directly extending Tool since the former takes care of a bunch of boilerplate work. Let's do the same here? {quote} util = getTestingUtil(getConf()); util.initializeCluster(SERVER_COUNT); conf = getConf(); -conf.set(HConstants.HREGION_MEMSTORE_FLUSH_SIZE, String.valueOf(128*1024)); +conf.set(HConstants.HREGION_MEMSTORE_FLUSH_SIZE, String.valueOf(128 * 1024)); // prevent aggressive region split conf.set(HConstants.HBASE_REGION_SPLIT_POLICY_KEY, -ConstantSizeRegionSplitPolicy.class.getName()); -this.setConf(util.getConfiguration()); + ConstantSizeRegionSplitPolicy.class.getName()); -// replace the HBaseTestingUtility in the unit test with the integration test's -// IntegrationTestingUtility -tag = new TestAcidGuarantees(CompactingMemStore.COMPACTING_MEMSTORE_TYPE_DEFAULT); -tag.setHBaseTestingUtil(util); +this.setConf(util.getConfiguration()); +tool = new AcidGuaranteesTestTool(); +tool.setConf(conf); {quote} That's confusing. Aren't {{conf=getConf()}} and {{util.getConfiguration()}} the same after we have done {{ util = getTestingUtil(getConf());}}? > Promote TestAcidGuarantees to LargeTests and start mini cluster once to make > it faster > -- > > Key: HBASE-19311 > URL: https://issues.apache.org/jira/browse/HBASE-19311 > Project: HBase > Issue Type: Improvement > Components: test >Reporter: Duo Zhang >Assignee: Duo Zhang > Attachments: HBASE-19311.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
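Appy's suggestion about AbstractHBaseTool is about hoisting shared boilerplate into a base class via the template-method pattern. A greatly simplified sketch of that shape (not the real AbstractHBaseTool, whose actual API handles options, help text, and more):

```java
// Minimal sketch: the base class owns arg handling and exit-code mapping
// once; subclasses only fill in doWork().
public class ToolSketch {
    abstract static class AbstractTool {
        protected String[] args;

        /** Boilerplate lives here: stash args, run, map exceptions to exit codes. */
        final int run(String[] args) {
            this.args = args;
            try {
                return doWork();
            } catch (Exception e) {
                // The real base class also prints usage/help and parses options here.
                return 1;
            }
        }

        protected abstract int doWork() throws Exception;
    }

    static class AcidGuaranteesTool extends AbstractTool {
        @Override protected int doWork() {
            if (args.length == 0) throw new IllegalArgumentException("need a policy argument");
            return 0;
        }
    }

    public static void main(String[] args) {
        AcidGuaranteesTool t = new AcidGuaranteesTool();
        System.out.println(t.run(new String[] {"ADAPTIVE"}) + " " + t.run(new String[0])); // 0 1
    }
}
```

The payoff is exactly what the comment says: each concrete tool stops re-implementing argument plumbing and error handling.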
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260250#comment-16260250 ] Ted Yu commented on HBASE-19266: QA would not run known flaky tests, including TestAcidGuarantees. How about reverting HBASE-19200 first ? This way, HBASE-19311 can run thru QA bot and you have enough time to find out what (from HBASE-19200) caused the test to run slower. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260249#comment-16260249 ] Duo Zhang commented on HBASE-19266: --- I guess there is a retry in the background; the retry interval for Curator is 1 second. Let me see if we can get the retry log. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260246#comment-16260246 ] Duo Zhang commented on HBASE-19266: --- And I'm still trying to find out the actual problem. In TestZKAsyncRegistry the getClusterId runs pretty fast... > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260243#comment-16260243 ] Duo Zhang commented on HBASE-19266: --- Please see HBASE-19311; most of the time cost comes from starting the mini cluster every time. Thanks. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260242#comment-16260242 ] Ted Yu commented on HBASE-19266: Please work on shortening the runtime of cluster id retrieval first (so that TestAcidGuarantees no longer times out). > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19311) Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster
[ https://issues.apache.org/jira/browse/HBASE-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260239#comment-16260239 ] Duo Zhang commented on HBASE-19311: --- [~chia7712] [~tedyu] FYI. > Promote TestAcidGuarantees to LargeTests and start mini cluster once to make > it faster > -- > > Key: HBASE-19311 > URL: https://issues.apache.org/jira/browse/HBASE-19311 > Project: HBase > Issue Type: Improvement > Components: test >Reporter: Duo Zhang >Assignee: Duo Zhang > Attachments: HBASE-19311.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19311) Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster
[ https://issues.apache.org/jira/browse/HBASE-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-19311: -- Assignee: Duo Zhang Status: Patch Available (was: Open) > Promote TestAcidGuarantees to LargeTests and start mini cluster once to make > it faster > -- > > Key: HBASE-19311 > URL: https://issues.apache.org/jira/browse/HBASE-19311 > Project: HBase > Issue Type: Improvement > Components: test >Reporter: Duo Zhang >Assignee: Duo Zhang > Attachments: HBASE-19311.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19311) Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster
[ https://issues.apache.org/jira/browse/HBASE-19311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-19311: -- Attachment: HBASE-19311.patch > Promote TestAcidGuarantees to LargeTests and start mini cluster once to make > it faster > -- > > Key: HBASE-19311 > URL: https://issues.apache.org/jira/browse/HBASE-19311 > Project: HBase > Issue Type: Improvement > Components: test >Reporter: Duo Zhang > Attachments: HBASE-19311.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19311) Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster
Duo Zhang created HBASE-19311: - Summary: Promote TestAcidGuarantees to LargeTests and start mini cluster once to make it faster Key: HBASE-19311 URL: https://issues.apache.org/jira/browse/HBASE-19311 Project: HBase Issue Type: Improvement Components: test Reporter: Duo Zhang -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260238#comment-16260238 ] Duo Zhang commented on HBASE-19266: --- I rewrote TestAcidGuarantees to start the mini cluster only once, and now it runs faster; I think we should promote it to LargeTests. It does not make sense for a medium test to run 10 minutes or more. Will open an issue to upload the patch. Another problem is about Curator: I can see that we spend more than 1 second retrieving the cluster id, which is too long, I think. Will open an issue for this. Thanks. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
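The "start the mini cluster only once" change Duo describes boils down to sharing one expensive fixture across all test methods instead of paying startup cost per test. A minimal self-contained sketch of that idea, with a hypothetical {{MiniClusterStandIn}} in place of HBase's real testing utility:

```java
// Sketch of the shared-fixture idea behind the TestAcidGuarantees rewrite.
// MiniClusterStandIn is hypothetical (not HBaseTestingUtility); its
// constructor stands in for the multi-second cluster startup.
public class SharedFixture {
    static int startCount = 0;
    private static MiniClusterStandIn cluster;

    static class MiniClusterStandIn {
        MiniClusterStandIn() {
            startCount++; // stands in for the slow mini-cluster startup
        }
    }

    // Every "test" fetches the cluster through here, so the expensive
    // startup happens at most once no matter how many tests run.
    static synchronized MiniClusterStandIn getCluster() {
        if (cluster == null) {
            cluster = new MiniClusterStandIn();
        }
        return cluster;
    }
}
```

In a JUnit test class the same effect is usually achieved with a static field initialized in a {{@BeforeClass}} method and torn down in {{@AfterClass}}, which is presumably what the patch does.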
[jira] [Updated] (HBASE-18309) Support multi threads in CleanerChore
[ https://issues.apache.org/jira/browse/HBASE-18309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-18309: -- Status: Patch Available (was: Open) > Support multi threads in CleanerChore > - > > Key: HBASE-18309 > URL: https://issues.apache.org/jira/browse/HBASE-18309 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: binlijin >Assignee: Reid Chan > Attachments: HBASE-18309.master.001.patch, > HBASE-18309.master.002.patch, HBASE-18309.master.004.patch, > HBASE-18309.master.005.patch, HBASE-18309.master.006.patch, > HBASE-18309.master.007.patch, HBASE-18309.master.008.patch, > HBASE-18309.master.009.patch, HBASE-18309.master.010.patch, > HBASE-18309.master.011.patch, HBASE-18309.master.012.patch, > space_consumption_in_archive.png > > > There is only one thread in LogCleaner to clean oldWALs and in our big > cluster we find this is not enough. The number of files under oldWALs reach > the max-directory-items limit of HDFS and cause region server crash, so we > use multi threads for LogCleaner and the crash not happened any more. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
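The issue description above says a single LogCleaner thread could not drain oldWALs fast enough, so the file count hit HDFS's max-directory-items limit. The fix is to fan the deletions out over a small thread pool. The sketch below is only an illustrative stand-in under that assumption — the real patch modifies HBase's CleanerChore, and {{ParallelCleaner}} is a hypothetical name.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the multi-threaded cleaner idea: submit each
// candidate file to a fixed pool so a large backlog drains in parallel
// instead of serially through one thread.
public class ParallelCleaner {
    public static int clean(List<String> files, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger deleted = new AtomicInteger();
        for (String f : files) {
            pool.submit(() -> {
                // A real cleaner would call fs.delete(path) here after
                // checking the file is safe to remove.
                deleted.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return deleted.get();
    }
}
```

With N threads the directory drains roughly N times faster (bounded by NameNode throughput), which is what keeps oldWALs under the max-directory-items limit.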
[jira] [Updated] (HBASE-18309) Support multi threads in CleanerChore
[ https://issues.apache.org/jira/browse/HBASE-18309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-18309: -- Attachment: HBASE-18309.master.012.patch Updated per [~chia7712]'s review > Support multi threads in CleanerChore > - > > Key: HBASE-18309 > URL: https://issues.apache.org/jira/browse/HBASE-18309 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: binlijin >Assignee: Reid Chan > Attachments: HBASE-18309.master.001.patch, > HBASE-18309.master.002.patch, HBASE-18309.master.004.patch, > HBASE-18309.master.005.patch, HBASE-18309.master.006.patch, > HBASE-18309.master.007.patch, HBASE-18309.master.008.patch, > HBASE-18309.master.009.patch, HBASE-18309.master.010.patch, > HBASE-18309.master.011.patch, HBASE-18309.master.012.patch, > space_consumption_in_archive.png > > > There is only one thread in LogCleaner to clean oldWALs and in our big > cluster we find this is not enough. The number of files under oldWALs reach > the max-directory-items limit of HDFS and cause region server crash, so we > use multi threads for LogCleaner and the crash not happened any more. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19148) Edit of default configuration
[ https://issues.apache.org/jira/browse/HBASE-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-19148: --- Status: Patch Available (was: Open) > Edit of default configuration > - > > Key: HBASE-19148 > URL: https://issues.apache.org/jira/browse/HBASE-19148 > Project: HBase > Issue Type: Bug > Components: defaults >Reporter: stack >Priority: Blocker > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19148.master.001.patch > > > Remove cruft and mythologies. Make descriptions more digestible. Change > defaults given experience. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18309) Support multi threads in CleanerChore
[ https://issues.apache.org/jira/browse/HBASE-18309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-18309: -- Status: Open (was: Patch Available) > Support multi threads in CleanerChore > - > > Key: HBASE-18309 > URL: https://issues.apache.org/jira/browse/HBASE-18309 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: binlijin >Assignee: Reid Chan > Attachments: HBASE-18309.master.001.patch, > HBASE-18309.master.002.patch, HBASE-18309.master.004.patch, > HBASE-18309.master.005.patch, HBASE-18309.master.006.patch, > HBASE-18309.master.007.patch, HBASE-18309.master.008.patch, > HBASE-18309.master.009.patch, HBASE-18309.master.010.patch, > HBASE-18309.master.011.patch, space_consumption_in_archive.png > > > There is only one thread in LogCleaner to clean oldWALs and in our big > cluster we find this is not enough. The number of files under oldWALs reach > the max-directory-items limit of HDFS and cause region server crash, so we > use multi threads for LogCleaner and the crash not happened any more. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)
[ https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260212#comment-16260212 ] Vladimir Rodionov commented on HBASE-17852: --- {quote} My gut reaction is that the number of backups which would need to be retained in the system (e.g. rows in the hbase backup "system" table) would have to be quite large to even grow beyond a single region (many thousands to millions). As such, the snapshot restore isn't much more than grabbing the write lock and replacing some one data file and some Region metadata. This is on my list today to investigate confirm. {quote} Yes, [~elserj], you are right. For the vast majority of deployments, the backup system table will fit in a single region. It is metadata, not data. Therefore, creating a snapshot and restoring from it are very lightweight operations. That was the major reason I chose the rollback-via-snapshot approach. > Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental > backup) > > > Key: HBASE-17852 > URL: https://issues.apache.org/jira/browse/HBASE-17852 > Project: HBase > Issue Type: Sub-task >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-17852-v1.patch, HBASE-17852-v2.patch, > HBASE-17852-v3.patch, HBASE-17852-v4.patch, HBASE-17852-v5.patch, > HBASE-17852-v6.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19200) make hbase-client only depend on ZKAsyncRegistry and ZNodePaths
[ https://issues.apache.org/jira/browse/HBASE-19200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260211#comment-16260211 ] Ted Yu commented on HBASE-19200: I ran TestAcidGuarantees with commit 31234eb862fdd7ee4917a2f74001182565ffbfd9 : {code} [INFO] Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 710.702 s - in org.apache.hadoop.hbase.TestAcidGuarantees {code} With this change: {code} [INFO] Apache HBase - Server .. FAILURE [15:07 min] {code} Same machine, Linux 3.10.0-327.28.3.el7.x86_64 #1 SMP Thu Aug 18 19:05:49 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux {code} Maven home: /apache-maven-3.5.0 Java version: 1.8.0_131, vendor: Oracle Corporation {code} [~chia7712] first discovered what caused TestAcidGuarantees to hang. > make hbase-client only depend on ZKAsyncRegistry and ZNodePaths > --- > > Key: HBASE-19200 > URL: https://issues.apache.org/jira/browse/HBASE-19200 > Project: HBase > Issue Type: Task > Components: Client, Zookeeper >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19200-v1.patch, HBASE-19200-v2.patch, > HBASE-19200-v3.patch, HBASE-19200-v4.patch, HBASE-19200-v5.patch, > HBASE-19200.patch > > > So that we can move most of the zookeeper related code out of hbase-client > module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19148) Edit of default configuration
[ https://issues.apache.org/jira/browse/HBASE-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-19148: --- Attachment: HBASE-19148.master.001.patch Attaching a patch for the retry number to see the Hadoop QA results. > Edit of default configuration > - > > Key: HBASE-19148 > URL: https://issues.apache.org/jira/browse/HBASE-19148 > Project: HBase > Issue Type: Bug > Components: defaults >Reporter: stack >Priority: Blocker > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19148.master.001.patch > > > Remove cruft and mythologies. Make descriptions more digestible. Change > defaults given experience. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19290) Reduce zk request when doing split log
[ https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260196#comment-16260196 ] binlijin commented on HBASE-19290: -- bq. Why randomize? Can be constant? No particular reason to randomize; I changed it to a constant. bq. So there are 2 available splitters, and one grabbed task, we don't stop here and keep hammering zk? Yes. bq. Can do it in "if" condition itself? Yes, that works; done. bq. That while condition is just to handle spurious wakeups. See Object#wait. You can definitely remove the second sleep (unless there's a concrete reason not to). The while loop is entered only when seq_start == taskReadySeq.get(); whenever splitLogZNode's children change, taskReadySeq is incremented, so we will not re-enter the while (seq_start == taskReadySeq.get()) {} loop and keep trying to grab tasks and issuing zk requests. > Reduce zk request when doing split log > -- > > Key: HBASE-19290 > URL: https://issues.apache.org/jira/browse/HBASE-19290 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: binlijin > Attachments: HBASE-19290.master.001.patch, > HBASE-19290.master.002.patch, HBASE-19290.master.003.patch > > > We observe once the cluster has 1000+ nodes and when hundreds of nodes abort > and doing split log, the split is very very slow, and we find the > regionserver and master wait on the zookeeper response, so we need to reduce > zookeeper request and pressure for big cluster. > (1) Reduce request to rsZNode, every time calculateAvailableSplitters will > get rsZNode's children from zookeeper, when cluster is huge, this is heavy. > This patch reduce the request. > (2) When the regionserver has max split tasks running, it may still trying to > grab task and issue zookeeper request, we should sleep and wait until we can > grab tasks again. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
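The wait-in-a-loop idiom Appy and binlijin are discussing can be shown self-contained: Object#wait must sit inside a while loop guarding the real condition (spurious wakeups are allowed by the spec), and whoever changes the state must bump the sequence and notifyAll under the same lock. The field names below mirror the discussion, but this is an illustrative sketch, not the actual SplitLogWorker code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the taskReadySeq wait/notify pattern from the discussion above
// (names mirror the comments; not the real SplitLogWorker implementation).
public class TaskReadyWait {
    private final AtomicInteger taskReadySeq = new AtomicInteger();

    // Blocks until taskReadySeq advances past seqStart. The while loop is
    // mandatory: Object#wait may return spuriously, so the condition must
    // be re-checked every time the thread wakes up.
    public void waitForNewTasks(int seqStart) throws InterruptedException {
        synchronized (taskReadySeq) {
            while (seqStart == taskReadySeq.get()) {
                taskReadySeq.wait();
            }
        }
    }

    // Called when splitLogZNode's children change: advance the sequence and
    // wake all waiting splitters so they try to grab tasks again.
    public void onTasksChanged() {
        synchronized (taskReadySeq) {
            taskReadySeq.incrementAndGet();
            taskReadySeq.notifyAll();
        }
    }

    public int currentSeq() {
        return taskReadySeq.get();
    }
}
```

Because the sequence check and the wait happen under one lock, a change between the check and the wait cannot be missed — which is why the extra sleep Appy flagged is unnecessary.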
[jira] [Commented] (HBASE-19293) Support add a disabled state replication peer directly
[ https://issues.apache.org/jira/browse/HBASE-19293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260194#comment-16260194 ] Guanghao Zhang commented on HBASE-19293: Ping [~busbey]. Any more concerns? > Support add a disabled state replication peer directly > -- > > Key: HBASE-19293 > URL: https://issues.apache.org/jira/browse/HBASE-19293 > Project: HBase > Issue Type: Improvement >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-19293.master.001.patch, > HBASE-19293.master.002.patch, HBASE-19293.master.003.patch > > > Now when add a replication peer, the default state is enabled. If you want > add a disabled replication peer, you need add a peer first, then disable it. > It need two step to finish now. > Use case for add a disabled replication peer. When user want sync data from a > cluster A to a new peer cluster. > 1. Add a disabled replication peer. And config the table to peer config. > 2. Take a snapshot of table and export snapshot to peer cluster. > 3. Restore snapshot in peer cluster. > 4. Enable the peer and wait all stuck replication log replicated to peer > cluster. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19290) Reduce zk request when doing split log
[ https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] binlijin updated HBASE-19290: - Attachment: HBASE-19290.master.003.patch > Reduce zk request when doing split log > -- > > Key: HBASE-19290 > URL: https://issues.apache.org/jira/browse/HBASE-19290 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: binlijin > Attachments: HBASE-19290.master.001.patch, > HBASE-19290.master.002.patch, HBASE-19290.master.003.patch > > > We observe once the cluster has 1000+ nodes and when hundreds of nodes abort > and doing split log, the split is very very slow, and we find the > regionserver and master wait on the zookeeper response, so we need to reduce > zookeeper request and pressure for big cluster. > (1) Reduce request to rsZNode, every time calculateAvailableSplitters will > get rsZNode's children from zookeeper, when cluster is huge, this is heavy. > This patch reduce the request. > (2) When the regionserver has max split tasks running, it may still trying to > grab task and issue zookeeper request, we should sleep and wait until we can > grab tasks again. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-17165) Add retry to LoadIncrementalHFiles tool
[ https://issues.apache.org/jira/browse/HBASE-17165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260149#comment-16260149 ] Zach York commented on HBASE-17165: --- [~stack] Can you take a look when you get a chance? Testing looks good. > Add retry to LoadIncrementalHFiles tool > --- > > Key: HBASE-17165 > URL: https://issues.apache.org/jira/browse/HBASE-17165 > Project: HBase > Issue Type: Improvement > Components: hbase, HFile, tooling >Affects Versions: 2.0.0, 1.2.3 >Reporter: Mike Grimes >Assignee: Mike Grimes >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-17165.branch-1.001.patch, > HBASE-17165.branch-1.001.patch, HBASE-17165.branch-1.002.patch, > HBASE-17165.branch-1.002.patch, HBASE-17165.branch-1.2.001.patch, > HBASE-17165.branch-1.2.002.patch, HBASE-17165.branch-1.2.003.patch, > HBASE-17165.branch-1.2.004.patch, HBASE-17165.master.001.patch, > HBASE-17165.master.002.patch, HBASE-17165.master.003.patch, > HBASE-17165.master.004.patch, HBASE-17165.master.004.patch > > Original Estimate: 0h > Remaining Estimate: 0h > > As using the LoadIncrementalHFiles tool with S3 as the filesystem is prone to > failing due to FileNotFoundExceptions due to inconsistency, simple, > configurable retry logic was added. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-17165) Add retry to LoadIncrementalHFiles tool
[ https://issues.apache.org/jira/browse/HBASE-17165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260147#comment-16260147 ] Hadoop QA commented on HBASE-17165: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 9s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 22s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 27s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 46m 51s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 96m 45s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}161m 11s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 | | JIRA Issue | HBASE-17165 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12898560/HBASE-17165.master.004.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 445b59c3823d 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 8f806ab486 | | maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) | | Default Java | 1.8.0_151 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/9934/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/9934/console | | Powered by | Apache Yetus 0.6.0 http://yetus.apache.org | This message was automatically generated. > Add retry to LoadIncrementalHFiles tool > --- > > Key: HBASE-17165 > URL:
[jira] [Commented] (HBASE-19163) "Maximum lock count exceeded" from region server's batch processing
[ https://issues.apache.org/jira/browse/HBASE-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260131#comment-16260131 ] Hadoop QA commented on HBASE-19163: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 32s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 55s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 51s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 51m 10s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}114m 48s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}187m 9s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 | | JIRA Issue | HBASE-19163 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12898553/HBASE-19163.master.006.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 1d68fe596716 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 12:48:20 UTC 2017 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 8f806ab486 | | maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) | | Default Java | 1.8.0_151 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/9933/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/9933/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/9933/console | | Powered by | Apache Yetus 0.6.0 http://yetus.apache.org | This message was automatically generated. > "Maximum lock count exceeded" from region server's batch
[jira] [Commented] (HBASE-19305) hbase alter table very slowly
[ https://issues.apache.org/jira/browse/HBASE-19305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260128#comment-16260128 ] Elians Wan commented on HBASE-19305: Ok,thank you very much! > hbase alter table very slowly > - > > Key: HBASE-19305 > URL: https://issues.apache.org/jira/browse/HBASE-19305 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.0 > Environment: OS: CentOS release 6.5 (Final) > hadoop : hadoop-2.5.1 > hbase:hbase-1.3.0 > zookeeper:zookeeper-3.4.6 >Reporter: Elians Wan > Labels: beginner > > when i alter hbase ttl, i found the alter cannot success for more than one > day, the table size is not large enough. > The the table size is : 303.6 G /hbase/data/default/SqlMetaData_Ver2 > The alter begin time is : > 2017-11-17 15:14:22,128 > but util 2017-11-20 18:00, the alter action has not been completed,so I do > not know what went wrong ? > Someone can help me? Thank you very much! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HBASE-19290) Reduce zk request when doing split log
[ https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260121#comment-16260121 ] Appy edited comment on HBASE-19290 at 11/21/17 1:09 AM: bq. int sleepTime = RandomUtils.nextInt(0, 100) + 500; Why randomize? Can be constant? bq if (taskGrabbed == 0 && !shouldStop) { So there are 2 available splitters, and one grabbed task, we don't stop here and keep hammering zk? {quote} int idx = (i + offset) % paths.size(); 446 // don't call ZKSplitLog.getNodeName() because that will lead to 447 // double encoding of the path name 448 taskGrabbed += grabTask(ZNodePaths.joinZNode(watcher.znodePaths.splitLogZNode, paths.get(idx))) ? 1 : 0; {quote} Can do it in "if" condition itself? bq. taskReadySeq.wait may not execute because it has condition. That while condition is just to handle spurious wakeups. See Object#wait. You can definitely remove the second sleep (unless there's a concrete reason not to). was (Author: appy): bq. int sleepTime = RandomUtils.nextInt(0, 100) + 500; Why randomize? Can be constant? bq if (taskGrabbed == 0 && !shouldStop) { So there are 2 available splitters, and one grabbed task, we don't stop here and keep hammering zk? Probably change taskGrabbed {quote} int idx = (i + offset) % paths.size(); 446 // don't call ZKSplitLog.getNodeName() because that will lead to 447 // double encoding of the path name 448 taskGrabbed += grabTask(ZNodePaths.joinZNode(watcher.znodePaths.splitLogZNode, paths.get(idx))) ? 1 : 0; {quote} Can do it in "if" condition itself? bq. taskReadySeq.wait may not execute because it has condition. That while condition is just to handle spurious wakeups. See Object#wait. You can definitely remove the second sleep (unless there's a concrete reason not to). 
> Reduce zk request when doing split log > -- > > Key: HBASE-19290 > URL: https://issues.apache.org/jira/browse/HBASE-19290 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: binlijin > Attachments: HBASE-19290.master.001.patch, > HBASE-19290.master.002.patch > > > We observe that once the cluster has 1000+ nodes and hundreds of nodes abort > and do split log, the split is very, very slow, and we find the > regionserver and master waiting on the zookeeper response, so we need to reduce > zookeeper requests and pressure for big clusters. > (1) Reduce requests to rsZNode: every time calculateAvailableSplitters will > get rsZNode's children from zookeeper; when the cluster is huge, this is heavy. > This patch reduces the requests. > (2) When the regionserver has the max split tasks running, it may still try to > grab tasks and issue zookeeper requests; we should sleep and wait until we can > grab tasks again. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19290) Reduce zk request when doing split log
[ https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260121#comment-16260121 ] Appy commented on HBASE-19290: -- bq. int sleepTime = RandomUtils.nextInt(0, 100) + 500; Why randomize? Can it be a constant? bq. if (taskGrabbed == 0 && !shouldStop) { So if there are 2 available splitters and one grabbed a task, we don't stop here and keep hammering zk? Probably change taskGrabbed {quote} int idx = (i + offset) % paths.size(); // don't call ZKSplitLog.getNodeName() because that will lead to // double encoding of the path name taskGrabbed += grabTask(ZNodePaths.joinZNode(watcher.znodePaths.splitLogZNode, paths.get(idx))) ? 1 : 0; {quote} Can we do it in the "if" condition itself? bq. taskReadySeq.wait may not execute because it has a condition. That while condition is just there to handle spurious wakeups; see Object#wait. You can definitely remove the second sleep (unless there's a concrete reason not to). > Reduce zk request when doing split log > -- > > Key: HBASE-19290 > URL: https://issues.apache.org/jira/browse/HBASE-19290 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: binlijin > Attachments: HBASE-19290.master.001.patch, > HBASE-19290.master.002.patch > > > We observe that once the cluster has 1000+ nodes and hundreds of nodes abort > and do split log, the split is very, very slow, and we find the > regionserver and master waiting on the zookeeper response, so we need to reduce > zookeeper requests and pressure for big clusters. > (1) Reduce requests to rsZNode: every time calculateAvailableSplitters will > get rsZNode's children from zookeeper; when the cluster is huge, this is heavy. > This patch reduces the requests. > (2) When the regionserver has the max split tasks running, it may still try to > grab tasks and issue zookeeper requests; we should sleep and wait until we can > grab tasks again. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
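A minimal, self-contained sketch of the loop shape Appy suggests above — testing the grab result in the "if" itself and stopping as soon as a task is acquired, instead of accumulating a counter and hammering ZooKeeper. Everything here is illustrative: `grabTask` is a stub standing in for the real SplitLogWorker method that talks to ZooKeeper, and the class/method names are invented for this sketch, not HBase code:

```java
import java.util.Arrays;
import java.util.List;

public class GrabTaskSketch {
    // Stand-in for SplitLogWorker#grabTask(path); the real one contends on a
    // ZooKeeper znode. Here a task is "grabbed" if its path looks available.
    static boolean grabTask(String path) {
        return path.endsWith("-available");
    }

    /** Scans tasks starting at a random offset; returns 1 once a grab succeeds, else 0. */
    static int grabOne(List<String> paths, int offset) {
        for (int i = 0; i < paths.size(); i++) {
            int idx = (i + offset) % paths.size();
            // Test-and-act directly in the "if" condition, per the review comment,
            // so we stop issuing further requests after the first successful grab.
            if (grabTask(paths.get(idx))) {
                return 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("t1", "t2-available", "t3");
        System.out.println(grabOne(paths, 0)); // stops at the first grabbed task
        System.out.println(grabOne(Arrays.asList("t1", "t3"), 0)); // none grabbed
    }
}
```

The behavioral point of the sketch is only the early exit: once a task is grabbed, the worker should stop probing the remaining znodes in this pass.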
[jira] [Commented] (HBASE-19287) master hangs forever if RecoverMeta send assign meta region request to target server fail
[ https://issues.apache.org/jira/browse/HBASE-19287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260100#comment-16260100 ] stack commented on HBASE-19287: --- Thanks for digging in here [~easyliangjob] Yeah. That looks like a bind. Master is starting up. Assigns hbase:meta. Assign fails because the server crashes soon after the assign. Master is starting up so it queues crash processing (because ServerCrashProcedure can't run until after HMaster comes up). Should the assign of hbase:meta be synchronous so it can timeout/verify the hbase:meta assign, the important piece needed to get us up off the ground? Or at least, if we get a crash for the server we are currently trying to assign hbase:meta to during startup, we should notice and recalibrate the assign? In the AMv2 doc., I claim that I talk 'elsewhere' about the fact that assigns never timeout and we need a timeout... But there is no 'elsewhere' where I talk of timeout (smile): https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.y71bhu8smpqp The general contract is that we assign async and we will continue w/ the assign forever unless we get notice of a server crash; see ServerCrashProcedure for how it fails any ongoing assigns/unassigns as part of its cleanup. 
> master hangs forever if RecoverMeta send assign meta region request to target > server fail > - > > Key: HBASE-19287 > URL: https://issues.apache.org/jira/browse/HBASE-19287 > Project: HBase > Issue Type: Bug >Reporter: Yi Liang > > 2017-11-10 19:26:56,019 INFO [ProcExecWrkr-1] > procedure.RecoverMetaProcedure: pid=138, > state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure > failedMetaServer=null, splitWal=true; Retaining meta assignment to > server=hadoop-slave1.hadoop,16020,1510341981454 > 2017-11-10 19:26:56,029 INFO [ProcExecWrkr-1] procedure2.ProcedureExecutor: > Initialized subprocedures=[{pid=139, ppid=138, > state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, > region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454}] > 2017-11-10 19:26:56,067 INFO [ProcExecWrkr-2] > procedure.MasterProcedureScheduler: pid=139, ppid=138, > state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, > region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454 hbase:meta > hbase:meta,,1.1588230740 > 2017-11-10 19:26:56,071 INFO [ProcExecWrkr-2] assignment.AssignProcedure: > Start pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; > AssignProcedure table=hbase:meta, region=1588230740, > target=hadoop-slave1.hadoop,16020,1510341981454; rit=OFFLINE, > location=hadoop-slave1.hadoop,16020,1510341981454; forceNewPlan=false, > retain=false > 2017-11-10 19:26:56,224 INFO [ProcExecWrkr-4] zookeeper.MetaTableLocator: > Setting hbase:meta (replicaId=0) location in ZooKeeper as > hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:26:56,230 INFO [ProcExecWrkr-4] > assignment.RegionTransitionProcedure: Dispatch pid=139, ppid=138, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=hbase:meta, > region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454; > rit=OPENING, location=hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:26:56,382 INFO [ProcedureDispatcherTimeoutThread] > 
procedure.RSProcedureDispatcher: Using procedure batch rpc execution for > serverName=hadoop-slave2.hadoop,16020,1510341988652 version=2097152 > 2017-11-10 19:26:57,542 INFO [main-EventThread] > zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, > processing expiration [hadoop-slave2.hadoop,16020,1510341988652] > 2017-11-10 19:26:57,543 INFO [main-EventThread] master.ServerManager: Master > doesn't enable ServerShutdownHandler during initialization, delay expiring > server hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:26:58,875 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Registering > server=hadoop-slave1.hadoop,16020,1510342016106 > 2017-11-10 19:27:05,832 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Registering > server=hadoop-slave2.hadoop,16020,1510342023184 > 2017-11-10 19:27:05,832 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Triggering server recovery; existingServer > hadoop-slave2.hadoop,16020,1510341988652 looks stale, new > server:hadoop-slave2.hadoop,16020,1510342023184 > 2017-11-10 19:27:05,832 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Master doesn't enable ServerShutdownHandler during > initialization, delay expiring server hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:27:49,815 INFO >
[jira] [Resolved] (HBASE-19305) hbase alter table very slowly
[ https://issues.apache.org/jira/browse/HBASE-19305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-19305. --- Resolution: Invalid [~calvink] Please ask for help up on the user mailing list [1]. JIRA is not for help with runtime issues. Thanks. 1. http://hbase.apache.org/mail-lists.html > hbase alter table very slowly > - > > Key: HBASE-19305 > URL: https://issues.apache.org/jira/browse/HBASE-19305 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.0 > Environment: OS: CentOS release 6.5 (Final) > hadoop : hadoop-2.5.1 > hbase:hbase-1.3.0 > zookeeper:zookeeper-3.4.6 >Reporter: Elians Wan > Labels: beginner > > When I alter the HBase TTL, I found the alter could not succeed for more than one > day, and the table is not that large. > The table size is: 303.6 G /hbase/data/default/SqlMetaData_Ver2 > The alter begin time is: > 2017-11-17 15:14:22,128 > but until 2017-11-20 18:00 the alter action had not been completed, so I do > not know what went wrong. > Can someone help me? Thank you very much! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19287) master hangs forever if RecoverMeta send assign meta region request to target server fail
[ https://issues.apache.org/jira/browse/HBASE-19287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260014#comment-16260014 ] Yi Liang commented on HBASE-19287: -- Workers are stuck assigning hbase:meta; there seems to be no mechanism to time out a procedure. Still digging into the code > master hangs forever if RecoverMeta send assign meta region request to target > server fail > - > > Key: HBASE-19287 > URL: https://issues.apache.org/jira/browse/HBASE-19287 > Project: HBase > Issue Type: Bug >Reporter: Yi Liang > > 2017-11-10 19:26:56,019 INFO [ProcExecWrkr-1] > procedure.RecoverMetaProcedure: pid=138, > state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure > failedMetaServer=null, splitWal=true; Retaining meta assignment to > server=hadoop-slave1.hadoop,16020,1510341981454 > 2017-11-10 19:26:56,029 INFO [ProcExecWrkr-1] procedure2.ProcedureExecutor: > Initialized subprocedures=[{pid=139, ppid=138, > state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, > region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454}] > 2017-11-10 19:26:56,067 INFO [ProcExecWrkr-2] > procedure.MasterProcedureScheduler: pid=139, ppid=138, > state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, > region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454 hbase:meta > hbase:meta,,1.1588230740 > 2017-11-10 19:26:56,071 INFO [ProcExecWrkr-2] assignment.AssignProcedure: > Start pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; > AssignProcedure table=hbase:meta, region=1588230740, > target=hadoop-slave1.hadoop,16020,1510341981454; rit=OFFLINE, > location=hadoop-slave1.hadoop,16020,1510341981454; forceNewPlan=false, > retain=false > 2017-11-10 19:26:56,224 INFO [ProcExecWrkr-4] zookeeper.MetaTableLocator: > Setting hbase:meta (replicaId=0) location in ZooKeeper as > hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:26:56,230 INFO [ProcExecWrkr-4] > assignment.RegionTransitionProcedure: Dispatch 
pid=139, ppid=138, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=hbase:meta, > region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454; > rit=OPENING, location=hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:26:56,382 INFO [ProcedureDispatcherTimeoutThread] > procedure.RSProcedureDispatcher: Using procedure batch rpc execution for > serverName=hadoop-slave2.hadoop,16020,1510341988652 version=2097152 > 2017-11-10 19:26:57,542 INFO [main-EventThread] > zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, > processing expiration [hadoop-slave2.hadoop,16020,1510341988652] > 2017-11-10 19:26:57,543 INFO [main-EventThread] master.ServerManager: Master > doesn't enable ServerShutdownHandler during initialization, delay expiring > server hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:26:58,875 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Registering > server=hadoop-slave1.hadoop,16020,1510342016106 > 2017-11-10 19:27:05,832 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Registering > server=hadoop-slave2.hadoop,16020,1510342023184 > 2017-11-10 19:27:05,832 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Triggering server recovery; existingServer > hadoop-slave2.hadoop,16020,1510341988652 looks stale, new > server:hadoop-slave2.hadoop,16020,1510342023184 > 2017-11-10 19:27:05,832 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Master doesn't enable ServerShutdownHandler during > initialization, delay expiring server hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:27:49,815 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > client.RpcRetryingCallerImpl: started=38594 ms ago, cancelled=false, > msg=org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not > online on 
hadoop-slave2.hadoop,16020,1510342023184 > at > org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3290) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1370) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2401) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41544) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:278) > at >
[jira] [Commented] (HBASE-19287) master hangs forever if RecoverMeta send assign meta region request to target server fail
[ https://issues.apache.org/jira/browse/HBASE-19287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16260012#comment-16260012 ] Yi Liang commented on HBASE-19287: -- {code} 836 2017-11-20 23:05:24,829 INFO [ProcExecWrkr-2] client.AsyncRequestFutureImpl: #1, waiting for 1 actions to finish on table: hbase:meta 837 2017-11-20 23:05:28,570 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck ProcExecWrkr-2(pid=81) run time 13.8040sec 838 2017-11-20 23:05:33,571 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck ProcExecWrkr-2(pid=81) run time 18.8050sec 839 2017-11-20 23:05:34,836 INFO [ProcExecWrkr-2] client.AsyncRequestFutureImpl: #1, waiting for 1 actions to finish on table: hbase:meta 840 2017-11-20 23:05:38,572 WARN [ProcExecTimeout] procedure2.ProcedureExecutor: Worker stuck ProcExecWrkr-2(pid=81) run time 23.8060sec {code} > master hangs forever if RecoverMeta send assign meta region request to target > server fail > - > > Key: HBASE-19287 > URL: https://issues.apache.org/jira/browse/HBASE-19287 > Project: HBase > Issue Type: Bug >Reporter: Yi Liang > > 2017-11-10 19:26:56,019 INFO [ProcExecWrkr-1] > procedure.RecoverMetaProcedure: pid=138, > state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure > failedMetaServer=null, splitWal=true; Retaining meta assignment to > server=hadoop-slave1.hadoop,16020,1510341981454 > 2017-11-10 19:26:56,029 INFO [ProcExecWrkr-1] procedure2.ProcedureExecutor: > Initialized subprocedures=[{pid=139, ppid=138, > state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, > region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454}] > 2017-11-10 19:26:56,067 INFO [ProcExecWrkr-2] > procedure.MasterProcedureScheduler: pid=139, ppid=138, > state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, > region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454 hbase:meta > hbase:meta,,1.1588230740 > 2017-11-10 19:26:56,071 INFO 
[ProcExecWrkr-2] assignment.AssignProcedure: > Start pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; > AssignProcedure table=hbase:meta, region=1588230740, > target=hadoop-slave1.hadoop,16020,1510341981454; rit=OFFLINE, > location=hadoop-slave1.hadoop,16020,1510341981454; forceNewPlan=false, > retain=false > 2017-11-10 19:26:56,224 INFO [ProcExecWrkr-4] zookeeper.MetaTableLocator: > Setting hbase:meta (replicaId=0) location in ZooKeeper as > hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:26:56,230 INFO [ProcExecWrkr-4] > assignment.RegionTransitionProcedure: Dispatch pid=139, ppid=138, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=hbase:meta, > region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454; > rit=OPENING, location=hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:26:56,382 INFO [ProcedureDispatcherTimeoutThread] > procedure.RSProcedureDispatcher: Using procedure batch rpc execution for > serverName=hadoop-slave2.hadoop,16020,1510341988652 version=2097152 > 2017-11-10 19:26:57,542 INFO [main-EventThread] > zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, > processing expiration [hadoop-slave2.hadoop,16020,1510341988652] > 2017-11-10 19:26:57,543 INFO [main-EventThread] master.ServerManager: Master > doesn't enable ServerShutdownHandler during initialization, delay expiring > server hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:26:58,875 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Registering > server=hadoop-slave1.hadoop,16020,1510342016106 > 2017-11-10 19:27:05,832 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Registering > server=hadoop-slave2.hadoop,16020,1510342023184 > 2017-11-10 19:27:05,832 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Triggering server recovery; existingServer > hadoop-slave2.hadoop,16020,1510341988652 looks 
stale, new > server:hadoop-slave2.hadoop,16020,1510342023184 > 2017-11-10 19:27:05,832 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > master.ServerManager: Master doesn't enable ServerShutdownHandler during > initialization, delay expiring server hadoop-slave2.hadoop,16020,1510341988652 > 2017-11-10 19:27:49,815 INFO > [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] > client.RpcRetryingCallerImpl: started=38594 ms ago, cancelled=false, > msg=org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not > online on hadoop-slave2.hadoop,16020,1510342023184 > at >
[jira] [Commented] (HBASE-16890) Analyze the performance of AsyncWAL and fix the same
[ https://issues.apache.org/jira/browse/HBASE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259995#comment-16259995 ] Duo Zhang commented on HBASE-16890: --- It is designed to be single-threaded, and we can use multi WAL to increase performance. AsyncFSWAL#consume is non-blocking; all I/O is handled by netty asynchronously. And I think it is OK to set a lower timeout, you can have a try. One goal of the new DFSOutputStream is that it will fail fast without any retry when we hit an IO error: for the WAL we can just open a new file and write to it, no need to try recovering the old pipeline. Thanks. > Analyze the performance of AsyncWAL and fix the same > > > Key: HBASE-16890 > URL: https://issues.apache.org/jira/browse/HBASE-16890 > Project: HBase > Issue Type: Sub-task > Components: wal >Affects Versions: 2.0.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Blocker > Fix For: 2.0.0-beta-1 > > Attachments: AsyncWAL_disruptor.patch, AsyncWAL_disruptor_1 > (2).patch, AsyncWAL_disruptor_3.patch, AsyncWAL_disruptor_3.patch, > AsyncWAL_disruptor_4.patch, AsyncWAL_disruptor_6.patch, > HBASE-16890-rc-v2.patch, HBASE-16890-rc-v3.patch, > HBASE-16890-remove-contention-v1.patch, HBASE-16890-remove-contention.patch, > Screen Shot 2016-10-25 at 7.34.47 PM.png, Screen Shot 2016-10-25 at 7.39.07 > PM.png, Screen Shot 2016-10-25 at 7.39.48 PM.png, Screen Shot 2016-11-04 at > 5.21.27 PM.png, Screen Shot 2016-11-04 at 5.30.18 PM.png, async.svg, > classic.svg, contention.png, contention_defaultWAL.png > > > Tests reveal that AsyncWAL under load in single node cluster performs slower > than the Default WAL. This task is to analyze and see if we could fix it. > See some discussions in the tail of JIRA HBASE-15536. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
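The "multi WAL" Duo mentions refers to HBase's pluggable WAL provider, which shards the single per-regionserver WAL into several region-grouped WALs. As a hedged sketch of how it is typically enabled in hbase-site.xml (property names and the default group count are recalled from the HBase reference guide; verify them against your HBase version before relying on this):

```xml
<!-- Illustrative hbase-site.xml fragment; check your version's docs. -->
<property>
  <!-- Switch from the single default WAL to region-grouped multiple WALs. -->
  <name>hbase.wal.provider</name>
  <value>multiwal</value>
</property>
<property>
  <!-- Number of WAL groups per region server (assumed default: 2). -->
  <name>hbase.wal.regiongrouping.numgroups</name>
  <value>2</value>
</property>
```

With multiple WALs, each single-threaded consumer handles only a subset of regions, which is how the single-threaded design can still scale on a loaded server.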
[jira] [Updated] (HBASE-19310) Verify IntegrationTests don't rely on Rules outside of JUnit context
[ https://issues.apache.org/jira/browse/HBASE-19310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HBASE-19310: --- Description: {noformat} 2017-11-16 00:43:41,204 INFO [main] mapreduce.IntegrationTestImportTsv: Running test testGenerateAndLoad. Exception in thread "main" java.lang.NullPointerException at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:461) at org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.testGenerateAndLoad(IntegrationTestImportTsv.java:189) at org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.run(IntegrationTestImportTsv.java:229) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.main(IntegrationTestImportTsv.java:239) {noformat} (Potential line-number skew) {code} @Test public void testGenerateAndLoad() throws Exception { LOG.info("Running test testGenerateAndLoad."); final TableName table = TableName.valueOf(name.getMethodName()); {code} The JUnit framework sets the test method name inside of the JUnit {{Rule}}. When we invoke the test directly (ala {{hbase org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv}}), this {{getMethodName()}} returns {{null}} and we get the above stacktrace. Should make a pass over the ITs with main methods and {{Rule}}'s to make sure we don't have this lurking. Another alternative is to just remove the main methods and just force use of {{IntegrationTestsDriver}} instead. was: {noformat} 2017-11-16 00:43:41,204 INFO [main] mapreduce.IntegrationTestImportTsv: Running test testGenerateAndLoad. 
Exception in thread "main" java.lang.NullPointerException at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:461) at org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.testGenerateAndLoad(IntegrationTestImportTsv.java:189) at org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.run(IntegrationTestImportTsv.java:229) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.main(IntegrationTestImportTsv.java:239) {noformat} (Potential line-number skew) {code} @Test public void testGenerateAndLoad() throws Exception { LOG.info("Running test testGenerateAndLoad."); final TableName table = TableName.valueOf(name.getMethodName()); {code} The JUnit framework sets the test method name inside of the JUnit {{Rule}}. When we invoke the test directly (ala {{hbase org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv}}), this {{getMethodName()}} returns {{null}} and we get the above stacktrace. Should make a pass over the ITs with main methods and {{Rule}}s to make sure we don't have this lurking. Another alternative is to just remove the main methods and just force use of {{IntegrationTestsDriver}} instead. > Verify IntegrationTests don't rely on Rules outside of JUnit context > > > Key: HBASE-19310 > URL: https://issues.apache.org/jira/browse/HBASE-19310 > Project: HBase > Issue Type: Bug > Components: integration tests >Reporter: Romil Choksi >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0-beta-1 > > > {noformat} > 2017-11-16 00:43:41,204 INFO [main] mapreduce.IntegrationTestImportTsv: > Running test testGenerateAndLoad. 
> Exception in thread "main" java.lang.NullPointerException > at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:461) > at > org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.testGenerateAndLoad(IntegrationTestImportTsv.java:189) > at > org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.run(IntegrationTestImportTsv.java:229) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at > org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.main(IntegrationTestImportTsv.java:239) > {noformat} > (Potential line-number skew) > {code} > @Test > public void testGenerateAndLoad() throws Exception { > LOG.info("Running test testGenerateAndLoad."); > final TableName table = TableName.valueOf(name.getMethodName()); > {code} > The JUnit framework sets the test method name inside of the JUnit {{Rule}}. > When we invoke the test directly (ala {{hbase > org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv}}), this > {{getMethodName()}} returns {{null}} and we get the above stacktrace. > Should make a pass over the ITs with main methods and {{Rule}}'s to make sure > we don't have this lurking. Another alternative is to just remove the main > methods and just force use of {{IntegrationTestsDriver}} instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
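The NPE described above comes from JUnit's TestName rule returning null when the class is launched through its own main() instead of the JUnit runner. One possible mitigation (a sketch only, not the patch on this issue; the stub below stands in for org.junit.rules.TestName, and the fallback name is an invented example) is to guard the lookup with a fixed fallback:

```java
public class RuleGuardSketch {
    // Stand-in for org.junit.rules.TestName#getMethodName(), which returns
    // null when the test method runs outside the JUnit framework.
    static String getMethodName() {
        return null; // simulate direct invocation via main()
    }

    /** Resolve a table name that is safe in and out of the JUnit context. */
    static String tableName() {
        String method = getMethodName();
        // Fall back to a fixed name so TableName.valueOf() never sees null.
        return method != null ? method : "testGenerateAndLoad";
    }

    public static void main(String[] args) {
        System.out.println(tableName()); // prints the fallback name here
    }
}
```

Whether such guards are preferable to removing the main methods entirely (the IntegrationTestsDriver alternative mentioned in the description) is exactly the judgment call this issue is about.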
[jira] [Updated] (HBASE-17165) Add retry to LoadIncrementalHFiles tool
[ https://issues.apache.org/jira/browse/HBASE-17165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Grimes updated HBASE-17165: Attachment: HBASE-17165.master.004.patch HBASE-17165.branch-1.002.patch Reuploading to retry timed-out tests.. > Add retry to LoadIncrementalHFiles tool > --- > > Key: HBASE-17165 > URL: https://issues.apache.org/jira/browse/HBASE-17165 > Project: HBase > Issue Type: Improvement > Components: hbase, HFile, tooling >Affects Versions: 2.0.0, 1.2.3 >Reporter: Mike Grimes >Assignee: Mike Grimes >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-17165.branch-1.001.patch, > HBASE-17165.branch-1.001.patch, HBASE-17165.branch-1.002.patch, > HBASE-17165.branch-1.002.patch, HBASE-17165.branch-1.2.001.patch, > HBASE-17165.branch-1.2.002.patch, HBASE-17165.branch-1.2.003.patch, > HBASE-17165.branch-1.2.004.patch, HBASE-17165.master.001.patch, > HBASE-17165.master.002.patch, HBASE-17165.master.003.patch, > HBASE-17165.master.004.patch, HBASE-17165.master.004.patch > > Original Estimate: 0h > Remaining Estimate: 0h > > As using the LoadIncrementalHFiles tool with S3 as the filesystem is prone to > failing due to FileNotFoundExceptions due to inconsistency, simple, > configurable retry logic was added. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19308) batch processing does not consider partial success
[ https://issues.apache.org/jira/browse/HBASE-19308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huaxiang sun updated HBASE-19308: - Fix Version/s: 2.0.0 > batch processing does not consider partial success > -- > > Key: HBASE-19308 > URL: https://issues.apache.org/jira/browse/HBASE-19308 > Project: HBase > Issue Type: Bug >Reporter: huaxiang sun >Assignee: huaxiang sun > Fix For: 2.0.0 > > > While working on HBASE-19163, some unit test failures exposed an interesting > issue. > When the client sends a batch, at the region server it can be divided into > multiple minibatches. If one of the minibatches runs into some exception (a resource > issue, for example), the exception gets back to the client, which thinks that the > whole batch failed. The client will then send the whole batch again, which causes the > already-succeeded minibatches to be processed again. We need a > mechanism to allow partial success of the batch. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
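The fix direction the description implies — resubmitting only the operations that actually failed, rather than replaying the entire batch — can be sketched with plain collections. This is purely illustrative: the real client deals in per-region minibatches and Result objects, and the class and method names here are invented for the sketch:

```java
import java.util.ArrayList;
import java.util.List;

public class PartialRetrySketch {
    // Stand-in per-operation results: null marks a failed op to be retried.
    static List<Integer> failedIndexes(Object[] results) {
        List<Integer> retry = new ArrayList<>();
        for (int i = 0; i < results.length; i++) {
            if (results[i] == null) {
                retry.add(i); // only the failed ops get resent
            }
        }
        return retry;
    }

    public static void main(String[] args) {
        Object[] results = {"ok", null, "ok", null};
        // Succeeded ops at 0 and 2 are not replayed; only 1 and 3 are retried.
        System.out.println(failedIndexes(results));
    }
}
```

The point of tracking per-operation outcomes is exactly the one the description makes: without it, a retry re-executes minibatches that already succeeded.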
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259981#comment-16259981 ] Duo Zhang commented on HBASE-19266: --- Anyway, I do not think it is necessary to start/stop the mini cluster every time. We have 6 * 3 = 18 tests, and with one more parameter it will be 24; starting/stopping the mini cluster will cost more than 10 mins... > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19310) Verify IntegrationTests don't rely on Rules outside of JUnit context
Josh Elser created HBASE-19310: -- Summary: Verify IntegrationTests don't rely on Rules outside of JUnit context Key: HBASE-19310 URL: https://issues.apache.org/jira/browse/HBASE-19310 Project: HBase Issue Type: Bug Components: integration tests Reporter: Romil Choksi Assignee: Josh Elser Priority: Critical Fix For: 2.0.0-beta-1 {noformat} 2017-11-16 00:43:41,204 INFO [main] mapreduce.IntegrationTestImportTsv: Running test testGenerateAndLoad. Exception in thread "main" java.lang.NullPointerException at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:461) at org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.testGenerateAndLoad(IntegrationTestImportTsv.java:189) at org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.run(IntegrationTestImportTsv.java:229) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv.main(IntegrationTestImportTsv.java:239) {noformat} (Potential line-number skew) {code} @Test public void testGenerateAndLoad() throws Exception { LOG.info("Running test testGenerateAndLoad."); final TableName table = TableName.valueOf(name.getMethodName()); {code} The JUnit framework sets the test method name inside of the JUnit {{Rule}}. When we invoke the test directly (ala {{hbase org.apache.hadoop.hbase.mapreduce.IntegrationTestImportTsv}}), this {{getMethodName()}} returns {{null}} and we get the above stacktrace. Should make a pass over the ITs with main methods and {{Rule}}s to make sure we don't have this lurking. Another alternative is to just remove the main methods and just force use of {{IntegrationTestsDriver}} instead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259961#comment-16259961 ] Duo Zhang commented on HBASE-19266: --- What is the running time before HBASE-19200? It seems that we only have 5 read/writers for each test, not a big number. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-16574) Add backup / restore feature to refguide
[ https://issues.apache.org/jira/browse/HBASE-16574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259955#comment-16259955 ] Hudson commented on HBASE-16574: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4088 (See [https://builds.apache.org/job/HBase-Trunk_matrix/4088/]) HBASE-16574 Book updates for backup and restore (elserj: rev 8f806ab48643b16a975691bf6edf7887706327f1) * (add) src/main/site/resources/images/backup-app-components.png * (edit) src/main/asciidoc/book.adoc * (add) src/main/site/resources/images/backup-cloud-appliance.png * (add) src/main/site/resources/images/backup-intra-cluster.png * (add) src/main/asciidoc/_chapters/backup_restore.adoc * (add) src/main/site/resources/images/backup-dedicated-cluster.png > Add backup / restore feature to refguide > > > Key: HBASE-16574 > URL: https://issues.apache.org/jira/browse/HBASE-16574 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Frank Welsch > Labels: backup > Fix For: 2.0.0-beta-1 > > Attachments: B command-line tools and configuration (updated).pdf, > Backup-and-Restore-Apache_19Sep2016.pdf, HBASE-16574.001.patch, > HBASE-16574.002.patch, HBASE-16574.003.branch-2.patch, > HBASE-16574.004.branch-2.patch, HBASE-16574.005.branch-2.patch, > HBASE-16574.006.branch-2.patch, HBASE-16574.007.branch-2.patch, > HBASE-16574.008.branch-2.patch, HBASE-16574.009.branch-2.patch, > apache_hbase_reference_guide_004.pdf, apache_hbase_reference_guide_007.pdf, > apache_hbase_reference_guide_008.pdf, apache_hbase_reference_guide_009.pdf, > hbase-book-16574.003.pdf, hbase_reference_guide.v1.pdf > > > This issue is to add backup / restore feature description to hbase refguide. > The description should cover: > scenarios where backup / restore is used > backup / restore commands and sample usage > considerations in setup -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19300) TestMultithreadedTableMapper fails in branch-1.4
[ https://issues.apache.org/jira/browse/HBASE-19300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259954#comment-16259954 ] Ted Yu commented on HBASE-19300: After several test runs, it seems commit 108ea30e3d84150b64fd644747c2f17170594704 caused the test to fail. With commit daf1fc6f1f0ee54221bc26692cedcb6478958a28, the test passes. [~apurtell] FYI > TestMultithreadedTableMapper fails in branch-1.4 > > > Key: HBASE-19300 > URL: https://issues.apache.org/jira/browse/HBASE-19300 > Project: HBase > Issue Type: Test >Reporter: Ted Yu > > From > https://builds.apache.org/job/HBase-1.4/1023/jdk=JDK_1_7,label=Hadoop&&!H13/testReport/org.apache.hadoop.hbase.mapreduce/TestMultithreadedTableMapper/testMultithreadedTableMapper/ > : > {code} > java.lang.AssertionError > at > org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper.verify(TestMultithreadedTableMapper.java:195) > at > org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper.runTestOnTable(TestMultithreadedTableMapper.java:163) > at > org.apache.hadoop.hbase.mapreduce.TestMultithreadedTableMapper.testMultithreadedTableMapper(TestMultithreadedTableMapper.java:136) > {code} > I ran the test locally which failed. > Noticed the following in test output: > {code} > 2017-11-18 19:28:13,929 ERROR [hconnection-0x11db8653-shared--pool24-t9] > protobuf.ResponseConverter(425): Results sent from server=703. But only got 0 > results completely atclient. Resetting the scanner to scan again. > 2017-11-18 19:28:13,929 ERROR [hconnection-0x11db8653-shared--pool24-t3] > protobuf.ResponseConverter(425): Results sent from server=703. But only got 0 > results completely atclient. Resetting the scanner to scan again. > 2017-11-18 19:28:14,461 ERROR [hconnection-0x11db8653-shared--pool24-t8] > protobuf.ResponseConverter(432): Exception while reading cells from > result.Resetting the scanner toscan again. > org.apache.hadoop.hbase.DoNotRetryIOException: Results sent from server=703. 
> But only got 0 results completely at client. Resetting the scanner to scan > again. > at > org.apache.hadoop.hbase.protobuf.ResponseConverter.getResults(ResponseConverter.java:426) > at > org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:284) > at > org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:219) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:388) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:362) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:142) > at > org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 2017-11-18 19:28:14,464 ERROR [hconnection-0x11db8653-shared--pool24-t2] > protobuf.ResponseConverter(432): Exception while reading cells from > result.Resetting the scanner toscan again. 
> java.io.EOFException: Partial cell read > at > org.apache.hadoop.hbase.codec.BaseDecoder.rethrowEofException(BaseDecoder.java:86) > at org.apache.hadoop.hbase.codec.BaseDecoder.advance(BaseDecoder.java:70) > at > org.apache.hadoop.hbase.protobuf.ResponseConverter.getResults(ResponseConverter.java:419) > at > org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:284) > at > org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:219) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:388) > at > org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:362) > at > org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:142) > at > org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: Premature EOF from inputStream > at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:202) > at
[jira] [Commented] (HBASE-19299) Assert only one Connection is constructed when calculating splits in a MultiTableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259950#comment-16259950 ] Duo Zhang commented on HBASE-19299: --- Thanks [~stack]. > Assert only one Connection is constructed when calculating splits in a > MultiTableInputFormat > > > Key: HBASE-19299 > URL: https://issues.apache.org/jira/browse/HBASE-19299 > Project: HBase > Issue Type: Test > Components: test >Reporter: stack >Assignee: stack >Priority: Minor > Fix For: 2.0.0-beta-1 > > Attachments: > 0002-HBASE-19299-Assert-only-one-Connection-is-constructe.patch > > > Small test that verifies only one Connection is made when calculating splits. We > used to put up one per Table until recently and before that, a Connection per > table per split which put a heavy load on hbase:meta. Here is a test to > ensure we don't go back to the bad-old-days. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259948#comment-16259948 ] Ted Yu commented on HBASE-19266: bq. inject a special implementation to make the test run faster. We need to consider real life client behavior - not just in test environment. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19163) "Maximum lock count exceeded" from region server's batch processing
[ https://issues.apache.org/jira/browse/HBASE-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259945#comment-16259945 ] huaxiang sun commented on HBASE-19163: -- Post another patch to address unittest failures. [~saint@gmail.com], [~allan163], [~tedyu], comments? Thanks. > "Maximum lock count exceeded" from region server's batch processing > --- > > Key: HBASE-19163 > URL: https://issues.apache.org/jira/browse/HBASE-19163 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 3.0.0, 1.2.7, 2.0.0-alpha-3 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-19163-master-v001.patch, > HBASE-19163.master.001.patch, HBASE-19163.master.002.patch, > HBASE-19163.master.004.patch, HBASE-19163.master.005.patch, > HBASE-19163.master.006.patch, unittest-case.diff > > > In one of use cases, we found the following exception and replication is > stuck. > {code} > 2017-10-25 19:41:17,199 WARN [hconnection-0x28db294f-shared--pool4-t936] > client.AsyncProcess: #3, table=foo, attempt=5/5 failed=262836ops, last > exception: java.io.IOException: java.io.IOException: Maximum lock count > exceeded > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2215) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165) > Caused by: java.lang.Error: Maximum lock count exceeded > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.fullTryAcquireShared(ReentrantReadWriteLock.java:528) > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryAcquireShared(ReentrantReadWriteLock.java:488) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1327) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871) > at > 
org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:5163) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3018) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2877) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2819) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:753) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:715) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2148) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33656) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170) > ... 3 more > {code} > While we are still examining the data pattern, it is clear that there are too > many mutations in the batch against the same row; this exceeds the maximum > 64k shared lock count, which throws an error and fails the whole batch. > There are two approaches to solve this issue. > 1). When there are multiple mutations against the same row in the batch, we only > need to acquire the lock once for that row instead of acquiring the lock for > each mutation. > 2). We catch the error, process whatever has been acquired so far, and loop back. > With HBASE-17924, approach 1 seems easy to implement now. > Creating the jira; will post an update/patch as the investigation moves forward. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
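The 64k limit in the description comes straight from the JDK: ReentrantReadWriteLock keeps its shared hold count in 16 bits, so acquisitions beyond 65535 throw java.lang.Error("Maximum lock count exceeded"). The plain-JDK sketch below demonstrates the saturation point, and reduces approach 1 to its core idea (lock each distinct row once); the `rowsToLock` helper is illustrative only, not HRegion's actual code.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class MaxLockCountDemo {

    // Count how many shared (read) holds a single ReentrantReadWriteLock
    // accepts before the JDK throws Error("Maximum lock count exceeded").
    static int maxReadHolds() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        int acquired = 0;
        try {
            while (true) {
                lock.readLock().lock();
                acquired++;
            }
        } catch (Error e) {
            // The 16-bit shared count saturates at 65535 holds.
            return acquired;
        }
    }

    // Approach 1 from the description, reduced to its core: lock each
    // *distinct* row once, so the hold count is bounded by the number of
    // unique rows rather than the number of mutations in the batch.
    static Set<String> rowsToLock(List<String> mutationRows) {
        return new LinkedHashSet<>(mutationRows);
    }

    public static void main(String[] args) {
        System.out.println("shared holds before Error: " + maxReadHolds());
        List<String> batch = Arrays.asList("row1", "row1", "row1", "row2");
        // Four mutations, but only two row locks are needed.
        System.out.println("locks needed: " + rowsToLock(batch).size());
    }
}
```

With the reported batch of 262836 ops against few rows, per-mutation locking blows past 65535 holds, while per-distinct-row locking stays tiny.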
[jira] [Commented] (HBASE-19289) CommonFSUtils$StreamLacksCapabilityException: hflush when running test against hadoop3 beta1
[ https://issues.apache.org/jira/browse/HBASE-19289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259947#comment-16259947 ] Sean Busbey commented on HBASE-19289: - [~yuzhih...@gmail.com] v1 is expressly unsafe based on the discussion so far, please don't commit it. How onerous is converting the failed tests to use MiniDFSCluster instead of LocalFileSystem? Does this mean we're also broken in standalone mode? > CommonFSUtils$StreamLacksCapabilityException: hflush when running test > against hadoop3 beta1 > > > Key: HBASE-19289 > URL: https://issues.apache.org/jira/browse/HBASE-19289 > Project: HBase > Issue Type: Test >Reporter: Ted Yu > Attachments: 19289.v1.txt > > > As of commit d8fb10c8329b19223c91d3cda6ef149382ad4ea0 , I encountered the > following exception when running unit test against hadoop3 beta1: > {code} > testRefreshStoreFiles(org.apache.hadoop.hbase.regionserver.TestHStore) Time > elapsed: 0.061 sec <<< ERROR! > java.io.IOException: cannot get log writer > at > org.apache.hadoop.hbase.regionserver.TestHStore.initHRegion(TestHStore.java:215) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:220) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:195) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:190) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:185) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:179) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:173) > at > org.apache.hadoop.hbase.regionserver.TestHStore.testRefreshStoreFiles(TestHStore.java:962) > Caused by: > org.apache.hadoop.hbase.util.CommonFSUtils$StreamLacksCapabilityException: > hflush > at > org.apache.hadoop.hbase.regionserver.TestHStore.initHRegion(TestHStore.java:215) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:220) > at > 
org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:195) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:190) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:185) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:179) > at > org.apache.hadoop.hbase.regionserver.TestHStore.init(TestHStore.java:173) > at > org.apache.hadoop.hbase.regionserver.TestHStore.testRefreshStoreFiles(TestHStore.java:962) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259944#comment-16259944 ] Duo Zhang commented on HBASE-19266: --- ConnectionImplementation needs to fetch the cluster id from zk even without HBASE-19200 so a zk connection is always necessary. So this means that curator initialization is much slower than creating a ZooKeeper instance directly? [~mdrob] is this expected sir? And if connecting zk is the bottleneck, I think we can use the 'hbase.client.registry.impl ' to inject a special implementation to make the test run faster. Thanks. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
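The injection Duo suggests follows the usual configuration-driven pattern: the class named under a key is instantiated reflectively, so a test can swap in a stub that skips the zk round trip. A generic, HBase-free sketch of that pattern follows; the `Registry` interface and `StubRegistry` are hypothetical stand-ins for illustration, not HBase's actual client API.

```java
import java.util.Properties;

public class RegistryInjectionDemo {

    // Hypothetical stand-in for the interface the config key selects.
    public interface Registry {
        String getClusterId();
    }

    // A stub that answers immediately, skipping the zk round trip a real
    // implementation needs to fetch the cluster id.
    public static class StubRegistry implements Registry {
        public String getClusterId() {
            return "test-cluster-id";
        }
    }

    // Instantiate whatever class the key names, mirroring how a key like
    // 'hbase.client.registry.impl' would be consumed.
    static Registry createRegistry(Properties conf) throws Exception {
        String impl = conf.getProperty("hbase.client.registry.impl");
        return (Registry) Class.forName(impl).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        conf.setProperty("hbase.client.registry.impl", StubRegistry.class.getName());
        System.out.println(createRegistry(conf).getClusterId());
    }
}
```

A test would set the key to the stub's class name in its setup, leaving production configs pointing at the real, zk-backed implementation.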
[jira] [Updated] (HBASE-19163) "Maximum lock count exceeded" from region server's batch processing
[ https://issues.apache.org/jira/browse/HBASE-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huaxiang sun updated HBASE-19163: - Attachment: HBASE-19163.master.006.patch > "Maximum lock count exceeded" from region server's batch processing > --- > > Key: HBASE-19163 > URL: https://issues.apache.org/jira/browse/HBASE-19163 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 3.0.0, 1.2.7, 2.0.0-alpha-3 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-19163-master-v001.patch, > HBASE-19163.master.001.patch, HBASE-19163.master.002.patch, > HBASE-19163.master.004.patch, HBASE-19163.master.005.patch, > HBASE-19163.master.006.patch, unittest-case.diff > > > In one of use cases, we found the following exception and replication is > stuck. > {code} > 2017-10-25 19:41:17,199 WARN [hconnection-0x28db294f-shared--pool4-t936] > client.AsyncProcess: #3, table=foo, attempt=5/5 failed=262836ops, last > exception: java.io.IOException: java.io.IOException: Maximum lock count > exceeded > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2215) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165) > Caused by: java.lang.Error: Maximum lock count exceeded > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.fullTryAcquireShared(ReentrantReadWriteLock.java:528) > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryAcquireShared(ReentrantReadWriteLock.java:488) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1327) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871) > at > org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:5163) > at > 
org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3018) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2877) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2819) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:753) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:715) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2148) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33656) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170) > ... 3 more > {code} > While we are still examining the data pattern, it is clear that there are too > many mutations in the batch against the same row; this exceeds the maximum > 64k shared lock count, which throws an error and fails the whole batch. > There are two approaches to solve this issue. > 1). When there are multiple mutations against the same row in the batch, we only > need to acquire the lock once for that row instead of acquiring the lock for > each mutation. > 2). We catch the error, process whatever has been acquired so far, and loop back. > With HBASE-17924, approach 1 seems easy to implement now. > Creating the jira; will post an update/patch as the investigation moves forward. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19299) Assert only one Connection is constructed when calculating splits in a MultiTableInputFormat
[ https://issues.apache.org/jira/browse/HBASE-19299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259939#comment-16259939 ] Hudson commented on HBASE-19299: FAILURE: Integrated in Jenkins build HBase-2.0 #886 (See [https://builds.apache.org/job/HBase-2.0/886/]) HBASE-19299 Assert only one Connection is constructed when calculating (stack: rev eb17a2f285fab68cc722065336af1e7f65a02eb2) * (add) hbase-mapreduce/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableInputFormatBase.java > Assert only one Connection is constructed when calculating splits in a > MultiTableInputFormat > > > Key: HBASE-19299 > URL: https://issues.apache.org/jira/browse/HBASE-19299 > Project: HBase > Issue Type: Test > Components: test >Reporter: stack >Assignee: stack >Priority: Minor > Fix For: 2.0.0-beta-1 > > Attachments: > 0002-HBASE-19299-Assert-only-one-Connection-is-constructe.patch > > > Small test that verifies only one Connection is made when calculating splits. We > used to put up one per Table until recently and before that, a Connection per > table per split which put a heavy load on hbase:meta. Here is a test to > ensure we don't go back to the bad-old-days. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-16574) Add backup / restore feature to refguide
[ https://issues.apache.org/jira/browse/HBASE-16574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259940#comment-16259940 ] Hudson commented on HBASE-16574: FAILURE: Integrated in Jenkins build HBase-2.0 #886 (See [https://builds.apache.org/job/HBase-2.0/886/]) HBASE-16574 Book updates for backup and restore (elserj: rev 086a03797ef2e4d249f4ebfb1da02c8c9d9e4de7) * (edit) src/main/asciidoc/book.adoc * (add) src/main/site/resources/images/backup-intra-cluster.png * (add) src/main/site/resources/images/backup-cloud-appliance.png * (add) src/main/site/resources/images/backup-app-components.png * (add) src/main/site/resources/images/backup-dedicated-cluster.png * (add) src/main/asciidoc/_chapters/backup_restore.adoc > Add backup / restore feature to refguide > > > Key: HBASE-16574 > URL: https://issues.apache.org/jira/browse/HBASE-16574 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Frank Welsch > Labels: backup > Fix For: 2.0.0-beta-1 > > Attachments: B command-line tools and configuration (updated).pdf, > Backup-and-Restore-Apache_19Sep2016.pdf, HBASE-16574.001.patch, > HBASE-16574.002.patch, HBASE-16574.003.branch-2.patch, > HBASE-16574.004.branch-2.patch, HBASE-16574.005.branch-2.patch, > HBASE-16574.006.branch-2.patch, HBASE-16574.007.branch-2.patch, > HBASE-16574.008.branch-2.patch, HBASE-16574.009.branch-2.patch, > apache_hbase_reference_guide_004.pdf, apache_hbase_reference_guide_007.pdf, > apache_hbase_reference_guide_008.pdf, apache_hbase_reference_guide_009.pdf, > hbase-book-16574.003.pdf, hbase_reference_guide.v1.pdf > > > This issue is to add backup / restore feature description to hbase refguide. > The description should cover: > scenarios where backup / restore is used > backup / restore commands and sample usage > considerations in setup -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19266) TestAcidGuarantees should cover adaptive in-memory compaction
[ https://issues.apache.org/jira/browse/HBASE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259928#comment-16259928 ] Duo Zhang commented on HBASE-19266: --- Let me take a look first. > TestAcidGuarantees should cover adaptive in-memory compaction > - > > Key: HBASE-19266 > URL: https://issues.apache.org/jira/browse/HBASE-19266 > Project: HBase > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > > Currently TestAcidGuarantees populates 3 policies of (in-memory) compaction. > Adaptive in-memory compaction is new and should be added as 4th compaction > policy. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19285) Add per-table latency histograms
[ https://issues.apache.org/jira/browse/HBASE-19285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HBASE-19285: --- Attachment: HBASE-19285.002.branch-1.3.patch .002 Forgot to update the MetricsRegionServer constructor calls after needing to pass down the Configuration. > Add per-table latency histograms > > > Key: HBASE-19285 > URL: https://issues.apache.org/jira/browse/HBASE-19285 > Project: HBase > Issue Type: Bug > Components: metrics >Reporter: Clay B. >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0, 1.4.0, 1.3.3 > > Attachments: HBASE-19285.001.branch-1.3.patch, > HBASE-19285.002.branch-1.3.patch, HBaseTableLatencyMetrics.png > > > HBASE-17017 removed the per-region latency histograms (e.g. Get, Put, Scan at > p75, p85, etc) > HBASE-15518 added some per-table metrics, but not the latency histograms. > Given the previous conversations, it seems like it these per-table > aggregations weren't intentionally omitted, just never re-implemented after > the per-region removal. They're some really nice out-of-the-box metrics we > can provide to our users/admins as long as it's not detrimental. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-17852) Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)
[ https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259911#comment-16259911 ] Vladimir Rodionov commented on HBASE-17852: --- {quote} 2) recover from client side failure (and, probably, implicitly meant to include un-handled server-side failure conditions too). {quote} For 2, we have the backup repair tool; the client will be asked to run it the next time they try to run backup/restore/merge/delete > Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental > backup) > > > Key: HBASE-17852 > URL: https://issues.apache.org/jira/browse/HBASE-17852 > Project: HBase > Issue Type: Sub-task >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Fix For: 2.0.0-beta-1 > > Attachments: HBASE-17852-v1.patch, HBASE-17852-v2.patch, > HBASE-17852-v3.patch, HBASE-17852-v4.patch, HBASE-17852-v5.patch, > HBASE-17852-v6.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HBASE-18987) Raise value of HConstants#MAX_ROW_LENGTH
[ https://issues.apache.org/jira/browse/HBASE-18987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Esteban Gutierrez resolved HBASE-18987. --- Resolution: Later Resolving as Later since we could only do this with a new HFile format. > Raise value of HConstants#MAX_ROW_LENGTH > > > Key: HBASE-18987 > URL: https://issues.apache.org/jira/browse/HBASE-18987 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 1.0.0, 2.0.0 >Reporter: Esteban Gutierrez >Assignee: Esteban Gutierrez >Priority: Minor > Attachments: HBASE-18987.master.001.patch, > HBASE-18987.master.002.patch > > > Short.MAX_VALUE hasn't been a problem for a long time but one of our > customers ran into an edge case when the midKey used for the split point was > very close to Short.MAX_VALUE. When the split is submitted, we attempt to > create the two new daughter regions and we name those regions via > {{HRegionInfo.createRegionName()}} in order to be added to META. > Unfortunately, since {{HRegionInfo.createRegionName()}} uses midKey as the > startKey, the {{Put}} will fail because the row key length now fails checkRow, > causing the split to fail. > I tried a couple of alternatives to address this problem, e.g. truncating the > startKey, but the number of changes in the code isn't justified for this edge > condition. Since we already use {{Integer.MAX_VALUE - 1}} for > {{HConstants#MAXIMUM_VALUE_LENGTH}} it should be ok to use the same limit for > the maximum row key. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19309) Lower HConstants#MAX_ROW_LENGTH as guardrail in order to avoid HBASE-18987
Esteban Gutierrez created HBASE-19309: - Summary: Lower HConstants#MAX_ROW_LENGTH as guardrail in order to avoid HBASE-18987 Key: HBASE-19309 URL: https://issues.apache.org/jira/browse/HBASE-19309 Project: HBase Issue Type: Bug Components: HFile, regionserver Reporter: Esteban Gutierrez As discussed in HBASE-18987: the problem with a row near the maximum row size (Short.MAX_VALUE) is that when a split happens, the midkey could be that row, and the Put created to add the new entry in META will exceed the maximum row size since the new row key also includes the table name; that will cause the split to abort. Since it is not possible to raise the row key size limit in HFileV3, a reasonable solution is to reduce the maximum row key size so that it never exceeds Short.MAX_VALUE. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
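The overflow described above is simple arithmetic: the daughter region's META row key is built as roughly <table>,<startKey>,<regionId>, so a midkey already near Short.MAX_VALUE pushes the resulting row key past the checkRow limit. A back-of-the-envelope sketch, assuming this approximate layout (exact delimiters and the trailing encoded-name suffix of real region names are omitted):

```java
public class MetaRowOverflowDemo {

    // The row-length guardrail enforced by checkRow in the versions discussed.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE; // 32767

    // Approximate length of "<table>,<startKey>,<regionId>" — the row key
    // under which the daughter region would be inserted into META.
    static int metaRowLength(String table, int startKeyLen, String regionId) {
        return table.length() + 1 + startKeyLen + 1 + regionId.length();
    }

    public static void main(String[] args) {
        // A split point just a few bytes under the row-length limit...
        int midKeyLen = Short.MAX_VALUE - 5;
        // ...plus a 2-char table name and a 13-digit timestamp region id.
        int metaRow = metaRowLength("t1", midKeyLen, "1511222222222");
        // The META row key overflows, so the Put fails and the split aborts.
        System.out.println(metaRow + " > " + MAX_ROW_LENGTH + " is "
                + (metaRow > MAX_ROW_LENGTH));
    }
}
```

Lowering the user-facing maximum row length by the overhead of the table name and region id keeps the derived META row key under the limit, which is the guardrail this issue proposes.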
[jira] [Commented] (HBASE-19285) Add per-table latency histograms
[ https://issues.apache.org/jira/browse/HBASE-19285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259887#comment-16259887 ] Hadoop QA commented on HBASE-19285: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} branch-1.3 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 49s{color} | {color:green} branch-1.3 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} branch-1.3 passed with JDK v1.8.0_141 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} branch-1.3 passed with JDK v1.7.0_151 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 36s{color} | {color:green} branch-1.3 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 54s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 36s{color} | {color:green} branch-1.3 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} branch-1.3 passed with JDK v1.8.0_141 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} branch-1.3 passed with JDK v1.7.0_151 {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 1m 10s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 19s{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_141. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 19s{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_141. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 24s{color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_151. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 24s{color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_151. {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 10s{color} | {color:red} hbase-hadoop2-compat: The patch generated 1 new + 3 unchanged - 1 fixed = 4 total (was 4) {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 16s{color} | {color:red} hbase-server: The patch generated 1 new + 139 unchanged - 1 fixed = 140 total (was 140) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:red}-1{color} | {color:red} shadedjars {color} | {color:red} 1m 21s{color} | {color:red} patch has 28 errors when building our shaded downstream artifacts. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 2m 8s{color} | {color:red} The patch causes 28 errors with Hadoop v2.4.0. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 2m 55s{color} | {color:red} The patch causes 28 errors with Hadoop v2.4.1. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 3m 43s{color} | {color:red} The patch causes 28 errors with Hadoop v2.5.0. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 4m 32s{color} | {color:red} The patch causes 28 errors with Hadoop v2.5.1. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 5m 19s{color} | {color:red} The patch causes 28 errors with Hadoop v2.5.2. {color} | | {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 6m
[jira] [Comment Edited] (HBASE-19291) Use common header and footer for JSP pages
[ https://issues.apache.org/jira/browse/HBASE-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259869#comment-16259869 ] Appy edited comment on HBASE-19291 at 11/20/17 9:23 PM: [~Apache9] I accessed all the pages on a standalone cluster and they looked alright. Thanks for the review [~elserj]. I am absolutely comfortable since i tried it like 2-3 times. :) But let me wait for reply from Duo before committing. was (Author: appy): [~Apache9] I accessed all the pages on a standalone cluster and they looked alright. Thanks for the review [~elserj]. Let me wait for reply from Duo before committing. > Use common header and footer for JSP pages > -- > > Key: HBASE-19291 > URL: https://issues.apache.org/jira/browse/HBASE-19291 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy > Attachments: HBASE-19291.master.001.patch > > > Use header and footer in our *.jsp pages to avoid unnecessary redundancy > (copy-paste of code) > (Been sitting in my local repo for long, best to get following pesky > user-facing things fixed before the next major release) > Misc edits: > - Due to redundancy, new additions make it to some places but not others. For > eg there are missing links to "/logLevel", "/processRS.jsp" in few places. > - Fix processMaster.jsp wrongly pointing to rs-status instead of > master-status (probably due to copy paste from processRS.jsp) > - Deleted a bunch of extraneous "" in processMaster.jsp & processRS.jsp > - Added missing tag in snapshot.jsp > - Deleted fossils of html5shiv.js. It's uses and the js itself were deleted > in the commit "819aed4ccd073d818bfef5931ec8d248bfae5f1f" > - Fixed wrongly matched heading tags > - Deleted some unused variables > Tested: > Ran standalone cluster and opened each page to make sure it looked right. > Sidenote: > Looks like HBASE-3835 started the work of converting from jsp to jamon, but > the work didn't finish. Now we have a mix of jsp and jamon. Needs > reconciling, but later. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19257) Document tool to dump information from MasterProcWALs file
[ https://issues.apache.org/jira/browse/HBASE-19257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259867#comment-16259867 ] Ted Yu commented on HBASE-19257: How about using procwalprinter as the short name? > Document tool to dump information from MasterProcWALs file > -- > > Key: HBASE-19257 > URL: https://issues.apache.org/jira/browse/HBASE-19257 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu > > I was troubleshooting a customer case where a high number of files piled up under > the MasterProcWALs directory. > Gaining insight into a (sample) file from the MasterProcWALs dir would help find > the root cause. > This JIRA is to document ProcedureWALPrettyPrinter, which reads a proc WAL file > and prints (selected) information. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
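For context, the printer discussed above is normally run through the `hbase` launcher by its full class name. The invocation below follows the pattern of other HBase pretty-printers (the `-f` flag and the sample file path are assumptions to verify against the version in use):

```shell
# Dump one MasterProcWALs file; class name per the issue, -f flag assumed
# from similar HBase printers -- verify with your HBase version.
hbase org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALPrettyPrinter \
  -f hdfs://namenode/hbase/MasterProcWALs/state-00000000000000000001.log
```

A short alias (the "procwalprinter" short name proposed above) would let operators skip the long class name.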
[jira] [Comment Edited] (HBASE-19291) Use common header and footer for JSP pages
[ https://issues.apache.org/jira/browse/HBASE-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259869#comment-16259869 ] Appy edited comment on HBASE-19291 at 11/20/17 9:22 PM: [~Apache9] I accessed all the pages on a standalone cluster and they looked alright. Thanks for the review [~elserj]. Let me wait for a reply from Duo before committing. was (Author: appy): [~Apache9] I accessed all the pages on a standalone cluster and they looked alright. > Use common header and footer for JSP pages > -- > > Key: HBASE-19291 > URL: https://issues.apache.org/jira/browse/HBASE-19291 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy > Attachments: HBASE-19291.master.001.patch > > > Use a common header and footer in our *.jsp pages to avoid unnecessary redundancy > (copy-paste of code) > (This has been sitting in my local repo for a while; best to get the following pesky > user-facing things fixed before the next major release) > Misc edits: > - Due to redundancy, new additions make it to some places but not others. For > example, there are missing links to "/logLevel", "/processRS.jsp" in a few places. > - Fix processMaster.jsp wrongly pointing to rs-status instead of > master-status (probably due to copy-paste from processRS.jsp) > - Deleted a bunch of extraneous "" in processMaster.jsp & processRS.jsp > - Added a missing tag in snapshot.jsp > - Deleted fossils of html5shiv.js. Its uses and the js itself were deleted > in the commit "819aed4ccd073d818bfef5931ec8d248bfae5f1f" > - Fixed wrongly matched heading tags > - Deleted some unused variables > Tested: > Ran a standalone cluster and opened each page to make sure it looked right. > Sidenote: > Looks like HBASE-3835 started the work of converting from jsp to jamon, but > the work didn't finish. Now we have a mix of jsp and jamon. Needs > reconciling, but later. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19291) Use common header and footer for JSP pages
[ https://issues.apache.org/jira/browse/HBASE-19291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259869#comment-16259869 ] Appy commented on HBASE-19291: -- [~Apache9] I accessed all the pages on a standalone cluster and they looked alright. > Use common header and footer for JSP pages > -- > > Key: HBASE-19291 > URL: https://issues.apache.org/jira/browse/HBASE-19291 > Project: HBase > Issue Type: Bug >Reporter: Appy >Assignee: Appy > Attachments: HBASE-19291.master.001.patch > > > Use a common header and footer in our *.jsp pages to avoid unnecessary redundancy > (copy-paste of code) > (This has been sitting in my local repo for a while; best to get the following pesky > user-facing things fixed before the next major release) > Misc edits: > - Due to redundancy, new additions make it to some places but not others. For > example, there are missing links to "/logLevel", "/processRS.jsp" in a few places. > - Fix processMaster.jsp wrongly pointing to rs-status instead of > master-status (probably due to copy-paste from processRS.jsp) > - Deleted a bunch of extraneous "" in processMaster.jsp & processRS.jsp > - Added a missing tag in snapshot.jsp > - Deleted fossils of html5shiv.js. Its uses and the js itself were deleted > in the commit "819aed4ccd073d818bfef5931ec8d248bfae5f1f" > - Fixed wrongly matched heading tags > - Deleted some unused variables > Tested: > Ran a standalone cluster and opened each page to make sure it looked right. > Sidenote: > Looks like HBASE-3835 started the work of converting from jsp to jamon, but > the work didn't finish. Now we have a mix of jsp and jamon. Needs > reconciling, but later. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18987) Raise value of HConstants#MAX_ROW_LENGTH
[ https://issues.apache.org/jira/browse/HBASE-18987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259848#comment-16259848 ] Esteban Gutierrez commented on HBASE-18987: --- After some discussion offline with [~mdrob] around the same comments from [~anoopsamjohn], the only approach to address this correctly is a new HFileV4 format without this kind of limitation. For now I'm going to create a new issue to add a guard rail that rejects keys close to {{Short.MAX_VALUE}}, to avoid triggering this problem. > Raise value of HConstants#MAX_ROW_LENGTH > > > Key: HBASE-18987 > URL: https://issues.apache.org/jira/browse/HBASE-18987 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 1.0.0, 2.0.0 >Reporter: Esteban Gutierrez >Assignee: Esteban Gutierrez >Priority: Minor > Attachments: HBASE-18987.master.001.patch, > HBASE-18987.master.002.patch > > > Short.MAX_VALUE hasn't been a problem for a long time, but one of our > customers ran into an edge case when the midKey used for the split point was > very close to Short.MAX_VALUE. When the split is submitted, we attempt to > create the two new daughter regions and we name those regions via > {{HRegionInfo.createRegionName()}} so they can be added to META. > Unfortunately, since {{HRegionInfo.createRegionName()}} uses midKey as the > startKey, the {{Put}} will fail: the row key length will now fail checkRow, > causing the split to fail. > I tried a couple of alternatives to address this problem, e.g. truncating the > startKey, but the number of changes in the code isn't justified for this edge > condition. Since we already use {{Integer.MAX_VALUE - 1}} for > {{HConstants#MAXIMUM_VALUE_LENGTH}}, it should be ok to use the same limit for > the maximum row key. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
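The guard rail proposed in the comment above could look roughly like the following. This is a sketch, not HBase code: the class name, method name, and the overhead margin for the extra bytes {{createRegionName()}} appends on top of the start key are all illustrative.

```java
public class RowKeyGuard {
    // HBase's checkRow rejects row keys longer than Short.MAX_VALUE (32767).
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    // Hypothetical safety margin covering the bytes createRegionName()
    // adds on top of the start key (table name, timestamp, encoded id);
    // the real value would have to be derived from the region-name format.
    static final int REGION_NAME_OVERHEAD = 64;

    /** Reject keys so close to the limit that a later split would fail. */
    public static boolean isSafeRowKey(byte[] row) {
        return row != null && row.length <= MAX_ROW_LENGTH - REGION_NAME_OVERHEAD;
    }
}
```

Rejecting such keys at write time avoids the situation where the split itself fails much later, when midKey is reused as a daughter region's start key.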
[jira] [Commented] (HBASE-19308) batch processing does not consider partial success
[ https://issues.apache.org/jira/browse/HBASE-19308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259831#comment-16259831 ] huaxiang sun commented on HBASE-19308: -- I will upload a unit test case in HBASE-19163. > batch processing does not consider partial success > -- > > Key: HBASE-19308 > URL: https://issues.apache.org/jira/browse/HBASE-19308 > Project: HBase > Issue Type: Bug >Reporter: huaxiang sun >Assignee: huaxiang sun > > While working on HBASE-19163, some unit test failures exposed an interesting > issue. > When a client sends a batch, the region server can divide it into > multiple mini-batches. If one of the mini-batches runs into an exception (a > resource issue, for example), the exception gets back to the client, which thinks that the > whole batch failed. The client will then send the whole batch again, causing the > already-succeeded mini-batches to be processed again. We need a > mechanism to allow partial success of the batch. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19308) batch processing does not consider partial success
huaxiang sun created HBASE-19308: Summary: batch processing does not consider partial success Key: HBASE-19308 URL: https://issues.apache.org/jira/browse/HBASE-19308 Project: HBase Issue Type: Bug Reporter: huaxiang sun Assignee: huaxiang sun While working on HBASE-19163, some unit test failures exposed an interesting issue. When a client sends a batch, the region server can divide it into multiple mini-batches. If one of the mini-batches runs into an exception (a resource issue, for example), the exception gets back to the client, which thinks that the whole batch failed. The client will then send the whole batch again, causing the already-succeeded mini-batches to be processed again. We need a mechanism to allow partial success of the batch. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
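The partial-success mechanism the issue asks for can be sketched as follows. This is not the HBase client API; `MiniBatchSender` and the method names are hypothetical stand-ins showing the desired behavior: only the operations in failed mini-batches remain pending for retry, so already-succeeded mini-batches are never replayed.

```java
import java.util.ArrayList;
import java.util.List;

public class PartialBatchRetry {

    /** Hypothetical transport; returns true on success, false on a retriable failure. */
    public interface MiniBatchSender {
        boolean send(List<String> miniBatch);
    }

    /**
     * Split ops into mini-batches of miniBatchSize, send each one, and
     * return only the ops from mini-batches that failed (to be retried).
     */
    public static List<String> sendOnce(List<String> ops, int miniBatchSize,
                                        MiniBatchSender sender) {
        List<String> pending = new ArrayList<>();
        for (int i = 0; i < ops.size(); i += miniBatchSize) {
            List<String> mini = ops.subList(i, Math.min(i + miniBatchSize, ops.size()));
            if (!sender.send(mini)) {
                pending.addAll(mini);   // retry only this mini-batch later
            }
        }
        return pending;
    }
}
```

The key contract change is that the server (or client bookkeeping) must report failure per mini-batch rather than collapsing everything into one exception for the whole batch.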
[jira] [Commented] (HBASE-19304) KEEP_DELETED_CELLS should ignore case
[ https://issues.apache.org/jira/browse/HBASE-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259824#comment-16259824 ] Hadoop QA commented on HBASE-19304: ---
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 2m 39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 51s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 26s{color} | {color:red} hbase-client: The patch generated 1 new + 56 unchanged - 1 fixed = 57 total (was 57) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 32s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 47m 18s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 42s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 9s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 52s{color} | {color:black} {color} |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19304 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12898430/HBASE-19304-v1.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux a1c8bee62349 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 8f806ab486 |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/9930/artifact/patchprocess/diff-checkstyle-hbase-client.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/9930/testReport/ |
| modules | C: hbase-client U: hbase-client |
| Console output |
[jira] [Commented] (HBASE-19163) "Maximum lock count exceeded" from region server's batch processing
[ https://issues.apache.org/jira/browse/HBASE-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259823#comment-16259823 ] huaxiang sun commented on HBASE-19163: -- The test failures expose an interesting (critical?) issue. It is exposed by limiting the mini-batch size to 10k. If one batch is divided into multiple mini-batches at the region server side and the second mini-batch causes an exception, the client thinks that the whole batch failed and resends the whole batch. This causes the first mini-batch to be put into the table twice. We need a mechanism to avoid this case. I am going to create a jira to track it. > "Maximum lock count exceeded" from region server's batch processing > --- > > Key: HBASE-19163 > URL: https://issues.apache.org/jira/browse/HBASE-19163 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 3.0.0, 1.2.7, 2.0.0-alpha-3 >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-19163-master-v001.patch, > HBASE-19163.master.001.patch, HBASE-19163.master.002.patch, > HBASE-19163.master.004.patch, HBASE-19163.master.005.patch, unittest-case.diff > > > In one of our use cases, we found the following exception and replication was > stuck. 
> {code} > 2017-10-25 19:41:17,199 WARN [hconnection-0x28db294f-shared--pool4-t936] > client.AsyncProcess: #3, table=foo, attempt=5/5 failed=262836ops, last > exception: java.io.IOException: java.io.IOException: Maximum lock count > exceeded > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2215) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165) > Caused by: java.lang.Error: Maximum lock count exceeded > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.fullTryAcquireShared(ReentrantReadWriteLock.java:528) > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryAcquireShared(ReentrantReadWriteLock.java:488) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1327) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871) > at > org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:5163) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:3018) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2877) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2819) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:753) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:715) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2148) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33656) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170) > ... 
3 more > {code} > While we are still examining the data pattern, it is clear that there are too > many mutations in the batch against the same row; this exceeds the maximum > 64k shared lock count, which throws an error and fails the whole batch. > There are two approaches to solve this issue. > 1) If there are multiple mutations against the same row in the batch, we just > need to acquire the lock once for that row instead of acquiring it for > each mutation. > 2) We catch the error, process whatever was acquired, and loop back. > With HBASE-17924, approach 1 seems easy to implement now. > Creating the jira; will post updates/patches as the investigation moves forward. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
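Approach 1 above (lock once per distinct row) can be sketched like this. `lockForRow` is a hypothetical stand-in for {{HRegion.getRowLock()}}, not real HBase API; the point is that deduplicating rows before taking the shared lock keeps the hold count bounded by the number of distinct rows, so a batch with many mutations on one row cannot exceed the ReentrantReadWriteLock 64k shared-hold limit.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.function.Function;

public class RowLockDedup {

    /**
     * Acquire the shared row lock once per distinct row in the mini-batch,
     * instead of once per mutation. Returns the number of locks acquired.
     */
    public static int acquireRowLocks(List<String> rows,
                                      Function<String, ReadWriteLock> lockForRow) {
        Set<String> locked = new HashSet<>();
        int acquired = 0;
        for (String row : rows) {
            if (locked.add(row)) {        // only the first mutation for this row locks
                lockForRow.apply(row).readLock().lock();
                acquired++;
            }
        }
        return acquired;
    }
}
```

A real implementation would also have to release exactly the acquired locks on the error path, which is what approach 2 handles when the limit is hit anyway.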
[jira] [Updated] (HBASE-19285) Add per-table latency histograms
[ https://issues.apache.org/jira/browse/HBASE-19285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HBASE-19285: --- Status: Patch Available (was: Open) > Add per-table latency histograms > > > Key: HBASE-19285 > URL: https://issues.apache.org/jira/browse/HBASE-19285 > Project: HBase > Issue Type: Bug > Components: metrics >Reporter: Clay B. >Assignee: Josh Elser >Priority: Critical > Fix For: 2.0.0, 1.4.0, 1.3.3 > > Attachments: HBASE-19285.001.branch-1.3.patch, > HBaseTableLatencyMetrics.png > > > HBASE-17017 removed the per-region latency histograms (e.g. Get, Put, Scan at > p75, p85, etc.) > HBASE-15518 added some per-table metrics, but not the latency histograms. > Given the previous conversations, it seems like these per-table > aggregations weren't intentionally omitted, just never re-implemented after > the per-region removal. They're some really nice out-of-the-box metrics we > can provide to our users/admins as long as it's not detrimental. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
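The per-table aggregation idea amounts to routing each operation's latency into a histogram keyed by table name. A minimal sketch, where `Histogram` is a toy stand-in for the metrics-library histograms HBase actually uses (which track percentiles like p75/p85, not just counts):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class PerTableLatency {

    /** Toy histogram; a real one would bucket values to report percentiles. */
    public static class Histogram {
        final LongAdder count = new LongAdder();
        final LongAdder totalMs = new LongAdder();
        void update(long ms) { count.increment(); totalMs.add(ms); }
    }

    // One histogram per table, created lazily on first use.
    private final Map<String, Histogram> byTable = new ConcurrentHashMap<>();

    public void recordGet(String table, long latencyMs) {
        byTable.computeIfAbsent(table, t -> new Histogram()).update(latencyMs);
    }

    public long count(String table) {
        Histogram h = byTable.get(table);
        return h == null ? 0 : h.count.sum();
    }
}
```

The overhead concern raised in the issue comes down to this map lookup plus a histogram update on every operation's hot path, which is why it needs to stay cheap.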