[jira] [Commented] (HBASE-13197) Connection API cleanup
[ https://issues.apache.org/jira/browse/HBASE-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034186#comment-16034186 ]

Chia-Ping Tsai commented on HBASE-13197:
----------------------------------------

All sub-tasks have been resolved. Maybe it is time to mark this issue as Resolved?

> Connection API cleanup
> ----------------------
>
>                 Key: HBASE-13197
>                 URL: https://issues.apache.org/jira/browse/HBASE-13197
>             Project: HBase
>          Issue Type: Improvement
>          Components: API
>    Affects Versions: 2.0.0
>            Reporter: Mikhail Antonov
>            Assignee: Mikhail Antonov
>             Fix For: 2.0.0
>
> Please see some discussion in HBASE-12586. Basically, we seem to have several
> different ways of acquiring connections, most of which are marked as
> deprecated (that includes the HConnectionManager class and the notion of
> managed connections).

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
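The cleanup described above converged on a single factory-based entry point returning unmanaged connections that the caller owns and closes. The following is a minimal, self-contained sketch of that lifecycle model only; the {{Connection}}/{{ConnectionFactory}} names mirror the real HBase classes, but the bodies here are illustrative stand-ins, not HBase code:

```java
/**
 * Sketch of the caller-managed connection model that replaced the deprecated
 * HConnectionManager/"managed connection" paths. Illustrative only.
 */
public class ConnectionSketch {

  interface Connection extends AutoCloseable {
    boolean isClosed();
    @Override void close();
  }

  static class ConnectionFactory {
    /**
     * Every call creates a fresh, caller-owned connection; there is no hidden
     * reference counting or shared managed instance, so closing one connection
     * never affects another.
     */
    static Connection createConnection() {
      return new Connection() {
        private boolean closed;
        public boolean isClosed() { return closed; }
        public void close() { closed = true; }
      };
    }
  }
}
```

The design point is that connection lifetime is explicit: callers typically wrap the connection in try-with-resources rather than relying on a manager to decide when it may be released.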
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034175#comment-16034175 ]

Chia-Ping Tsai commented on HBASE-18145:
----------------------------------------

bq. withDelayedScannersClose?
okay.

bq. May be we need just one List for this scanners to be delayed closed? (deferred closed).. Like below?
Currently we have just one list for the scanners to be delayed closed. The "memStoreScannersAfterFlush" stores the new memstore scanners created after the flush. Am I overlooking something?

> The flush may cause the corrupt data for reading
> ------------------------------------------------
>
>                 Key: HBASE-18145
>                 URL: https://issues.apache.org/jira/browse/HBASE-18145
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>            Priority: Blocker
>             Fix For: 2.0.0, 1.4.0, 1.3.2
>
>         Attachments: HBASE-18145.v0.patch
>
> After HBASE-17887, the store scanner closes the memstore scanner while updating
> the inner scanners. The chunk which stores the current data may be reclaimed.
> So if the chunk is rewritten before we send the data to the client, the client
> will receive corrupt data.
> This issue also breaks the TestAcid* tests.
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034165#comment-16034165 ]

Anoop Sam John commented on HBASE-18145:
----------------------------------------

bq. How about this change? withHeapClose -> closeHeapAndCachedScanners
withDelayedScannersClose? Maybe we need just one List for the scanners to be delayed closed (deferred closed)? Like below:
private final List<KeyValueScanner> scannersForDelayedClose = new ArrayList<>();
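The single delayed-close list suggested above can be sketched in isolation. This is a hypothetical model of the pattern, not HBase's actual StoreScanner: scanners displaced by a flush are parked on one list and only closed once shipped() confirms the current cells were sent, so the memory chunks backing pending results cannot be reclaimed and rewritten underneath the client:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Self-contained sketch of a single delayed-close scanner list.
 * The names (KeyValueScanner, shipped) echo the discussion above but the
 * classes here are illustrative stand-ins, not HBase code.
 */
public class DelayedCloseSketch {

  interface KeyValueScanner {
    void close();
    boolean isClosed();
  }

  static class MemStoreScanner implements KeyValueScanner {
    private boolean closed;
    public void close() { closed = true; }
    public boolean isClosed() { return closed; }
  }

  static class StoreScannerSketch {
    // One list for every scanner whose close must be deferred until the
    // current results have been shipped to the client.
    private final List<KeyValueScanner> scannersForDelayedClose = new ArrayList<>();

    // On flush: instead of closing the old memstore scanner (whose chunks may
    // still back cells not yet sent), park it on the delayed-close list.
    void updateScanners(KeyValueScanner oldScanner) {
      scannersForDelayedClose.add(oldScanner);
    }

    // Once the RPC layer confirms the cells were shipped, the parked scanners
    // (and the chunks they pin) can be released safely.
    void shipped() {
      for (KeyValueScanner s : scannersForDelayedClose) {
        s.close();
      }
      scannersForDelayedClose.clear();
    }
  }
}
```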
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034155#comment-16034155 ]

Chia-Ping Tsai commented on HBASE-18145:
----------------------------------------

All failed tests have passed on my local run.
[jira] [Commented] (HBASE-18149) The setting rules for table-scope attributes and family-scope attributes should keep consistent
[ https://issues.apache.org/jira/browse/HBASE-18149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034154#comment-16034154 ]

Hadoop QA commented on HBASE-18149:
-----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 34s | Docker mode activated. |
| 0 | rubocop | 0m 0s | rubocop was not available. |
| 0 | ruby-lint | 0m 0s | Ruby-lint was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 54s | master passed |
| +1 | mvneclipse | 0m 42s | master passed |
| +1 | javadoc | 0m 22s | master passed |
| +1 | mvninstall | 0m 41s | the patch passed |
| +1 | mvneclipse | 0m 39s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 64m 23s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | javadoc | 0m 14s | the patch passed |
| -1 | unit | 15m 29s | hbase-shell in the patch failed. |
| +1 | asflicense | 0m 11s | The patch does not generate ASF License warnings. |
| | | 92m 35s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestShell |
| Timed out junit tests | org.apache.hadoop.hbase.client.rsgroup.TestShellRSGroups |

|| Subsystem || Report/Notes ||
| Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:757bf37 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870912/HBASE-18149-master-v1.patch |
| JIRA Issue | HBASE-18149 |
| Optional Tests | asflicense javac javadoc unit rubocop ruby_lint |
| uname | Linux 84e0d62030bb 4.8.3-std-1 #1 SMP Fri Oct 21 11:15:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 5491714 |
| Default Java | 1.8.0_131 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7046/artifact/patchprocess/patch-unit-hbase-shell.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/7046/artifact/patchprocess/patch-unit-hbase-shell.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7046/testReport/ |
| modules | C: hbase-shell U: hbase-shell |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7046/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.

> The setting rules for table-scope attributes and family-scope attributes
> should keep consistent
> -------------------------------------------------------------------------
>
>                 Key: HBASE-18149
>                 URL: https://issues.apache.org/jira/browse/HBASE-18149
>             Project: HBase
>          Issue Type: Bug
>          Components: shell
>    Affects Versions: 2.0.0, 1.2.5
>            Reporter: Guangxu Cheng
>            Assignee: Guangxu Cheng
>         Attachments: HBASE-18149-master-v1.patch
>
> I use the following command to create a table.
> {code}
> hbase(main):030:0> create 't3',{NAME => 'f2', BLOCKCACHE => false},
> {COMPACTION_ENABLED => false}
> An argument ignored (unknown or overridden): COMPACTION_ENABLED
> 0 row(s) in 1.1390 seconds
> hbase(main):031:0> describe 't3'
> Table t3 is ENABLED
> t3
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034146#comment-16034146 ]

Chia-Ping Tsai commented on HBASE-18145:
----------------------------------------

bq. In shipped() new call to clearAndClose(scannerForDelayedClose);, We should do this after prevCell clone
bq. May be best would be follow the other list close way of whether boolean true, then only close here.
You are right. I have noticed that. If TestAcid* is fine on my local run (I will run it 1000 times), I will attach the new patch.

bq. We have to change the param name in that case
How about this change? withHeapClose -> closeHeapAndCachedScanners
[jira] [Commented] (HBASE-17678) FilterList with MUST_PASS_ONE lead to redundancy cells returned
[ https://issues.apache.org/jira/browse/HBASE-17678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034145#comment-16034145 ]

Hadoop QA commented on HBASE-17678:
-----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 29s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 52s | master passed |
| +1 | compile | 1m 1s | master passed |
| +1 | checkstyle | 0m 51s | master passed |
| +1 | mvneclipse | 0m 26s | master passed |
| +1 | findbugs | 2m 54s | master passed |
| +1 | javadoc | 0m 49s | master passed |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 5s | the patch passed |
| +1 | compile | 0m 59s | the patch passed |
| +1 | javac | 0m 59s | the patch passed |
| +1 | checkstyle | 0m 51s | the patch passed |
| +1 | mvneclipse | 0m 25s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 30m 31s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 3m 14s | the patch passed |
| +1 | javadoc | 0m 48s | the patch passed |
| +1 | unit | 2m 25s | hbase-client in the patch passed. |
| -1 | unit | 20m 14s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 72m 22s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.locking.TestLockProcedure |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870914/HBASE-17678.v3.patch |
| JIRA Issue | HBASE-17678 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux ae3f1f2ea0c9 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 5491714 |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7047/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/7047/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results |
[jira] [Commented] (HBASE-18138) HBase named read caches
[ https://issues.apache.org/jira/browse/HBASE-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034132#comment-16034132 ]

Biju Nair commented on HBASE-18138:
-----------------------------------

The current {{LruBlockCache/BucketCache}} implementations can be modified to support named caches, with the current behavior becoming a cache named {{default}}. The attachment has some quick changes to {{LruBlockCache}}, made to see how complex the changes would be.

> HBase named read caches
> -----------------------
>
>                 Key: HBASE-18138
>                 URL: https://issues.apache.org/jira/browse/HBASE-18138
>             Project: HBase
>          Issue Type: New Feature
>          Components: BlockCache, BucketCache
>            Reporter: Biju Nair
>         Attachments: HBASE-18138.txt
>
> Instead of a single read (block) cache, if HBase can support the creation of
> named read caches and their use by tables, it will help common scenarios like:
> - Assigning a chunk of the cache to tables whose data is critical to
> performance, so that it doesn't get swapped out due to other, less critical
> table data being read
> - Guaranteeing a percentage of the cache to tenants in a multi-tenant
> environment by assigning a named cache to each tenant
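The named-cache idea above can be sketched with plain Java. This is a hypothetical illustration: the {{LruCache}} below is a toy LinkedHashMap-based stand-in for {{LruBlockCache}}, and the lookup-with-fallback-to-{{default}} policy is an assumption about how unmapped tables would behave, not code from the attached patch:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of named read caches with a "default" cache for unmapped tables. */
public class NamedCacheSketch {

  /** Minimal LRU cache: access-ordered LinkedHashMap with a size bound. */
  static class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;
    LruCache(int maxEntries) {
      super(16, 0.75f, true); // accessOrder=true gives LRU eviction order
      this.maxEntries = maxEntries;
    }
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
      return size() > maxEntries; // evict least-recently-used past the bound
    }
  }

  private final Map<String, LruCache<String, byte[]>> caches = new HashMap<>();

  NamedCacheSketch(int defaultCapacity) {
    caches.put("default", new LruCache<>(defaultCapacity));
  }

  /** Carve out a dedicated cache, e.g. for a critical table or a tenant. */
  void createCache(String name, int capacity) {
    caches.put(name, new LruCache<>(capacity));
  }

  /** Tables not mapped to a named cache fall back to the "default" cache. */
  LruCache<String, byte[]> cacheFor(String name) {
    return caches.getOrDefault(name, caches.get("default"));
  }
}
```

Because each named cache has its own bound, heavy reads against one tenant's cache cannot evict blocks held for another tenant, which is the isolation the proposal is after.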
[jira] [Updated] (HBASE-18138) HBase named read caches
[ https://issues.apache.org/jira/browse/HBASE-18138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Biju Nair updated HBASE-18138:
------------------------------
    Attachment: HBASE-18138.txt
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034120#comment-16034120 ]

Anoop Sam John commented on HBASE-18145:
----------------------------------------

Very important finding.

In shipped(), the new call to clearAndClose(scannerForDelayedClose); We should do this after the prevCell clone. That is better, as then there is no chance at all that prevCell belongs to a scanner which is getting closed now.

The clearAndClose(scannerForDelayedClose); call in close() -> Looks ok to do this whatever the boolean param is. Still, it might be better to double check. Maybe best would be to follow the close pattern of the other list: only close here when the boolean is true. After shipped() there is a close() call anyway. WDYT? We have to change the param name in that case:
bq. close(boolean withHeapClose)
[jira] [Commented] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034101#comment-16034101 ]

Hadoop QA commented on HBASE-18132:
-----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 35s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 4m 28s | master passed |
| +1 | compile | 0m 55s | master passed |
| +1 | checkstyle | 1m 1s | master passed |
| +1 | mvneclipse | 0m 24s | master passed |
| +1 | findbugs | 2m 11s | master passed |
| +1 | javadoc | 0m 33s | master passed |
| +1 | mvninstall | 0m 56s | the patch passed |
| +1 | compile | 0m 48s | the patch passed |
| +1 | javac | 0m 48s | the patch passed |
| +1 | checkstyle | 0m 56s | the patch passed |
| +1 | mvneclipse | 0m 21s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 34m 46s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 2m 32s | the patch passed |
| +1 | javadoc | 0m 33s | the patch passed |
| -1 | unit | 103m 5s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 41s | The patch does not generate ASF License warnings. |
| | | 155m 30s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.TestAcidGuarantees |
| | hadoop.hbase.regionserver.wal.TestAsyncLogRolling |
| Timed out junit tests | org.apache.hadoop.hbase.client.TestFromClientSide |
| | org.apache.hadoop.hbase.client.TestMultiRespectsLimits |
| | org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient |
| | org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat |

|| Subsystem || Report/Notes ||
| Docker | Client=1.10.1 Server=1.10.1 Image:yetus/hbase:757bf37 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870900/HBASE-18132.patch |
| JIRA Issue | HBASE-18132 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux f97cd444b7d8 3.19.0-25-generic #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 5491714 |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7045/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/7045/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7045/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console
[jira] [Commented] (HBASE-17678) FilterList with MUST_PASS_ONE lead to redundancy cells returned
[ https://issues.apache.org/jira/browse/HBASE-17678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034098#comment-16034098 ]

Zheng Hu commented on HBASE-17678:
----------------------------------

bq. If a filter returns SEEK_NEXT_USING_HINT, does this mean that the next passed cell must be the hint cell?
[~zghaobac], for safety, I think we should only pass cells which are greater than or equal to the previous hint cell to a filter in the filter list. So I uploaded patch v3 and added a new UT for the change.

> FilterList with MUST_PASS_ONE lead to redundancy cells returned
> ---------------------------------------------------------------
>
>                 Key: HBASE-17678
>                 URL: https://issues.apache.org/jira/browse/HBASE-17678
>             Project: HBase
>          Issue Type: Bug
>          Components: Filters
>    Affects Versions: 2.0.0, 1.3.0, 1.2.1
>         Environment: RedHat 7.x
>            Reporter: Jason Tokayer
>            Assignee: Zheng Hu
>         Attachments: HBASE-17678.v1.patch, HBASE-17678.v1.rough.patch,
> HBASE-17678.v2.patch, HBASE-17678.v3.patch,
> TestColumnPaginationFilterDemo.java
>
> When combining ColumnPaginationFilter with a single-element FilterList,
> MUST_PASS_ONE and MUST_PASS_ALL give different results when there are
> multiple cells with the same timestamp. This is unexpected, since there is
> only a single filter in the list, and I would believe that MUST_PASS_ALL and
> MUST_PASS_ONE should only affect the behavior of the joined filter and not
> the behavior of any one of the individual filters. If this is not a bug, then
> it would be nice if the documentation were updated to explain this nuanced
> behavior.
> I know that there was a decision made in an earlier HBase version to keep
> multiple cells with the same timestamp. This is generally fine but presents
> an issue when using the aforementioned filter combination.
>
> Steps to reproduce:
> In the shell, create a table and insert some data:
> {code:none}
> create 'ns:tbl',{NAME => 'family',VERSIONS => 100}
> put 'ns:tbl','row','family:name','John',1
> put 'ns:tbl','row','family:name','Jane',1
> put 'ns:tbl','row','family:name','Gil',1
> put 'ns:tbl','row','family:name','Jane',1
> {code}
> Then, use a Scala client as:
> {code:none}
> import org.apache.hadoop.hbase.filter._
> import org.apache.hadoop.hbase.util.Bytes
> import org.apache.hadoop.hbase.client._
> import org.apache.hadoop.hbase.{CellUtil, HBaseConfiguration, TableName}
> import scala.collection.mutable._
>
> val config = HBaseConfiguration.create()
> config.set("hbase.zookeeper.quorum", "localhost")
> config.set("hbase.zookeeper.property.clientPort", "2181")
> val connection = ConnectionFactory.createConnection(config)
>
> val logicalOp = FilterList.Operator.MUST_PASS_ONE
> val limit = 1
> var resultsList = ListBuffer[String]()
> for (offset <- 0 to 20 by limit) {
>   val table = connection.getTable(TableName.valueOf("ns:tbl"))
>   val paginationFilter = new ColumnPaginationFilter(limit, offset)
>   val filterList: FilterList = new FilterList(logicalOp, paginationFilter)
>   println("@ filterList = " + filterList)
>   val results = table.get(new Get(Bytes.toBytes("row")).setFilter(filterList))
>   val cells = results.rawCells()
>   if (cells != null) {
>     for (cell <- cells) {
>       val value = new String(CellUtil.cloneValue(cell))
>       val qualifier = new String(CellUtil.cloneQualifier(cell))
>       val family = new String(CellUtil.cloneFamily(cell))
>       val result = "OFFSET = " + offset + ":" + family + "," + qualifier +
>         "," + value + "," + cell.getTimestamp()
>       resultsList.append(result)
>     }
>   }
> }
> resultsList.foreach(println)
> {code}
> Here are the results for different limit and logicalOp settings:
> {code:none}
> Limit = 1 & logicalOp = MUST_PASS_ALL:
> scala> resultsList.foreach(println)
> OFFSET = 0:family,name,Jane,1
>
> Limit = 1 & logicalOp = MUST_PASS_ONE:
> scala> resultsList.foreach(println)
> OFFSET = 0:family,name,Jane,1
> OFFSET = 1:family,name,Gil,1
> OFFSET = 2:family,name,Jane,1
> OFFSET = 3:family,name,John,1
>
> Limit = 2 & logicalOp = MUST_PASS_ALL:
> scala> resultsList.foreach(println)
> OFFSET = 0:family,name,Jane,1
>
> Limit = 2 & logicalOp = MUST_PASS_ONE:
> scala> resultsList.foreach(println)
> OFFSET = 0:family,name,Jane,1
> OFFSET = 2:family,name,Jane,1
> {code}
> So, it seems that MUST_PASS_ALL gives the expected behavior, but
> MUST_PASS_ONE does not. Furthermore, MUST_PASS_ONE seems to return only a
> single (non-duplicated) cell within a page, but not across pages.
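The guard Zheng Hu describes for patch v3 (only handing a sub-filter cells at or past its previous seek hint, so a MUST_PASS_ONE list cannot make a paginating filter re-count cells it already seeked over) can be sketched abstractly. Cells are modeled as plain longs, and all class and method names here are illustrative stand-ins, not HBase's FilterList API:

```java
import java.util.Arrays;
import java.util.List;

/** Abstract sketch of guarding sub-filters with their previous seek hints. */
public class HintGuardSketch {

  enum ReturnCode { INCLUDE, SKIP, SEEK_NEXT_USING_HINT }

  interface CellFilter {
    ReturnCode filterCell(long cell);
    long getNextHint(long cell);
  }

  /** MUST_PASS_ONE-style list: a cell is included if any sub-filter includes it. */
  static class FilterListOne {
    private final List<CellFilter> filters;
    private final long[] prevHints; // last hint returned by each sub-filter

    FilterListOne(List<CellFilter> filters) {
      this.filters = filters;
      this.prevHints = new long[filters.size()];
      Arrays.fill(prevHints, Long.MIN_VALUE); // no hint seen yet
    }

    ReturnCode filterCell(long cell) {
      boolean include = false;
      for (int i = 0; i < filters.size(); i++) {
        // Guard: a filter that asked to seek must not see cells before its
        // hint, otherwise it can advance its internal state (e.g. a column
        // offset counter) on cells it meant to skip and return duplicates.
        if (cell < prevHints[i]) {
          continue;
        }
        ReturnCode rc = filters.get(i).filterCell(cell);
        if (rc == ReturnCode.INCLUDE) {
          include = true;
        } else if (rc == ReturnCode.SEEK_NEXT_USING_HINT) {
          prevHints[i] = filters.get(i).getNextHint(cell);
        }
      }
      return include ? ReturnCode.INCLUDE : ReturnCode.SKIP;
    }
  }
}
```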
[jira] [Commented] (HBASE-18139) maven-remote-resources-plugin fails with IndexOutOfBoundsException in hbase-assembly
[ https://issues.apache.org/jira/browse/HBASE-18139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034095#comment-16034095 ]

Xiang Li commented on HBASE-18139:
----------------------------------

Downgraded the priority to 'Minor': when I use the public Maven repository, the error cannot be reproduced.

> maven-remote-resources-plugin fails with IndexOutOfBoundsException in
> hbase-assembly
> ---------------------------------------------------------------------
>
>                 Key: HBASE-18139
>                 URL: https://issues.apache.org/jira/browse/HBASE-18139
>             Project: HBase
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 1.3.2
>            Reporter: Xiang Li
>            Priority: Minor
>
> The same as HBASE-14199.
> {code}
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process
> (aggregate-licenses) on project hbase-assembly: Error rendering velocity
> resource.: Error invoking method 'get(java.lang.Integer)' in
> java.util.ArrayList at META-INF/LICENSE.vm[line 1678, column 8]:
> InvocationTargetException: Index: 0, Size: 0 -> [Help 1]
> {code}
> mvn install fails against the latest branch-1 and branch-1.3, with no
> additional change.
[jira] [Updated] (HBASE-18139) maven-remote-resources-plugin fails with IndexOutOfBoundsException in hbase-assembly
[ https://issues.apache.org/jira/browse/HBASE-18139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiang Li updated HBASE-18139:
-----------------------------
    Priority: Minor  (was: Blocker)
[jira] [Updated] (HBASE-17678) FilterList with MUST_PASS_ONE lead to redundancy cells returned
[ https://issues.apache.org/jira/browse/HBASE-17678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Hu updated HBASE-17678: - Attachment: HBASE-17678.v3.patch > FilterList with MUST_PASS_ONE lead to redundancy cells returned > --- > > Key: HBASE-17678 > URL: https://issues.apache.org/jira/browse/HBASE-17678 > Project: HBase > Issue Type: Bug > Components: Filters >Affects Versions: 2.0.0, 1.3.0, 1.2.1 > Environment: RedHat 7.x >Reporter: Jason Tokayer >Assignee: Zheng Hu > Attachments: HBASE-17678.v1.patch, HBASE-17678.v1.rough.patch, > HBASE-17678.v2.patch, HBASE-17678.v3.patch, > TestColumnPaginationFilterDemo.java > > > When combining ColumnPaginationFilter with a single-element filterList, > MUST_PASS_ONE and MUST_PASS_ALL give different results when there are > multiple cells with the same timestamp. This is unexpected since there is > only a single filter in the list, and I would believe that MUST_PASS_ALL and > MUST_PASS_ONE should only affect the behavior of the joined filter and not > the behavior of any one of the individual filters. If this is not a bug then > it would be nice if the documentation is updated to explain this nuanced > behavior. > I know that there was a decision made in an earlier Hbase version to keep > multiple cells with the same timestamp. This is generally fine but presents > an issue when using the aforementioned filter combination. 
> Steps to reproduce: > In the shell create a table and insert some data: > {code:none} > create 'ns:tbl',{NAME => 'family',VERSIONS => 100} > put 'ns:tbl','row','family:name','John',1 > put 'ns:tbl','row','family:name','Jane',1 > put 'ns:tbl','row','family:name','Gil',1 > put 'ns:tbl','row','family:name','Jane',1 > {code} > Then, use a Scala client as: > {code:none} > import org.apache.hadoop.hbase.filter._ > import org.apache.hadoop.hbase.util.Bytes > import org.apache.hadoop.hbase.client._ > import org.apache.hadoop.hbase.{CellUtil, HBaseConfiguration, TableName} > import scala.collection.mutable._ > val config = HBaseConfiguration.create() > config.set("hbase.zookeeper.quorum", "localhost") > config.set("hbase.zookeeper.property.clientPort", "2181") > val connection = ConnectionFactory.createConnection(config) > val logicalOp = FilterList.Operator.MUST_PASS_ONE > val limit = 1 > var resultsList = ListBuffer[String]() > for (offset <- 0 to 20 by limit) { > val table = connection.getTable(TableName.valueOf("ns:tbl")) > val paginationFilter = new ColumnPaginationFilter(limit,offset) > val filterList: FilterList = new FilterList(logicalOp,paginationFilter) > println("@ filterList = "+filterList) > val results = table.get(new > Get(Bytes.toBytes("row")).setFilter(filterList)) > val cells = results.rawCells() > if (cells != null) { > for (cell <- cells) { > val value = new String(CellUtil.cloneValue(cell)) > val qualifier = new String(CellUtil.cloneQualifier(cell)) > val family = new String(CellUtil.cloneFamily(cell)) > val result = "OFFSET = "+offset+":"+family + "," + qualifier > + "," + value + "," + cell.getTimestamp() > resultsList.append(result) > } > } > } > resultsList.foreach(println) > {code} > Here are the results for different limit and logicalOp settings: > {code:none} > Limit = 1 & logicalOp = MUST_PASS_ALL: > scala> resultsList.foreach(println) > OFFSET = 0:family,name,Jane,1 > Limit = 1 & logicalOp = MUST_PASS_ONE: > scala> 
resultsList.foreach(println) > OFFSET = 0:family,name,Jane,1 > OFFSET = 1:family,name,Gil,1 > OFFSET = 2:family,name,Jane,1 > OFFSET = 3:family,name,John,1 > Limit = 2 & logicalOp = MUST_PASS_ALL: > scala> resultsList.foreach(println) > OFFSET = 0:family,name,Jane,1 > Limit = 2 & logicalOp = MUST_PASS_ONE: > scala> resultsList.foreach(println) > OFFSET = 0:family,name,Jane,1 > OFFSET = 2:family,name,Jane,1 > {code} > So, it seems that MUST_PASS_ALL gives the expected behavior, but > MUST_PASS_ONE does not. Furthermore, MUST_PASS_ONE seems to give only a > single (non-duplicated) cell within a page, but not across pages. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18144) Forward-port the old exclusive row lock; there are scenarios where it performs better
[ https://issues.apache.org/jira/browse/HBASE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034091#comment-16034091 ] Allan Yang commented on HBASE-18144: Hi, [~stack], I uploaded a patch (HBASE-17924) with a UT to simulate disordered batch and increment at the same time. Disordered batch puts and increments may be the root cause of your customer's strange thread dump. You can see that without HBASE-17924 the UT will time out and fail, but with HBASE-17924 it runs normally. I haven't looked deeply into it, but it clearly shows there is a problem with lock efficiency. > Forward-port the old exclusive row lock; there are scenarios where it > performs better > - > > Key: HBASE-18144 > URL: https://issues.apache.org/jira/browse/HBASE-18144 > Project: HBase > Issue Type: Bug > Components: Increment >Affects Versions: 1.2.5 >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.3.2, 1.2.7 > > Attachments: DisorderedBatchAndIncrementUT.patch > > > Description to follow. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18144) Forward-port the old exclusive row lock; there are scenarios where it performs better
[ https://issues.apache.org/jira/browse/HBASE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-18144: --- Attachment: DisorderedBatchAndIncrementUT.patch > Forward-port the old exclusive row lock; there are scenarios where it > performs better > - > > Key: HBASE-18144 > URL: https://issues.apache.org/jira/browse/HBASE-18144 > Project: HBase > Issue Type: Bug > Components: Increment >Affects Versions: 1.2.5 >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.3.2, 1.2.7 > > Attachments: DisorderedBatchAndIncrementUT.patch > > > Description to follow. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18111) Replication stuck when cluster connection is closed
[ https://issues.apache.org/jira/browse/HBASE-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034088#comment-16034088 ] Hudson commented on HBASE-18111: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3119 (See [https://builds.apache.org/job/HBase-Trunk_matrix/3119/]) HBASE-18111 Replication stuck when cluster connection is closed (apurtell: rev 549171465db83600dbdf6d0b5af01c3ad30dc550) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/HBaseInterClusterReplicationEndpoint.java > Replication stuck when cluster connection is closed > --- > > Key: HBASE-18111 > URL: https://issues.apache.org/jira/browse/HBASE-18111 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5, 0.98.24, 1.1.10 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18111.patch, HBASE-18111-v1.patch, > HBASE-18111-v2.patch > > > Log: > {code} > 2017-05-24,03:01:25,603 ERROR [regionserver13700-SendThread(hostxxx:11000)] > org.apache.zookeeper.ClientCnxn: SASL authentication with Zookeeper Quorum > member failed: javax.security.sasl.SaslException: An error: > (java.security.PrivilegedActionException: javax.security.sasl.SaslException: > GSS initiate failed [Caused by GSSException: No valid credentials provided > (Mechanism level: Connection reset)]) occurred when evaluating Zookeeper > Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED > state. 
> 2017-05-24,03:01:25,615 FATAL [regionserver13700-EventThread] > org.apache.hadoop.hbase.client.HConnectionImplementation: > hconnection-0x1148dd9b-0x35b6b4d4ca999c6, > quorum=10.108.37.30:11000,10.108.38.30:11000,10.108.39.30:11000,10.108.84.25:11000,10.108.84.32:11000, > baseZNode=/hbase/c3prc-xiaomi98 hconnection-0x1148dd9b-0x35b6b4d4ca999c6 > received auth failed from ZooKeeper, aborting > org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = > AuthFailed > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:425) > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:333) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2017-05-24,03:01:25,615 INFO [regionserver13700-EventThread] > org.apache.hadoop.hbase.client.HConnectionImplementation: Closing zookeeper > sessionid=0x35b6b4d4ca999c6 > 2017-05-24,03:01:25,623 WARN [regionserver13700.replicationSource,800] > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint: > Replicate edites to peer cluster failed. 
> java.io.IOException: Call to hostxxx/10.136.22.6:24600 failed on local > exception: java.io.IOException: Connection closed > {code} > jstack > {code} > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.sleepForRetries(HBaseInterClusterReplicationEndpoint.java:127) > at > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.replicate(HBaseInterClusterReplicationEndpoint.java:199) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:905) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:492) > {code} > The cluster connection was aborted when the ZookeeperWatcher received an > AuthFailed event. Then the HBaseInterClusterReplicationEndpoint's replicate() > method will get stuck in a while loop. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18149) The setting rules for table-scope attributes and family-scope attributes should keep consistent
[ https://issues.apache.org/jira/browse/HBASE-18149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guangxu Cheng updated HBASE-18149: -- Status: Patch Available (was: Open) > The setting rules for table-scope attributes and family-scope attributes > should keep consistent > --- > > Key: HBASE-18149 > URL: https://issues.apache.org/jira/browse/HBASE-18149 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 1.2.5, 2.0.0 >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng > Attachments: HBASE-18149-master-v1.patch > > > I use the following command to create a table. > {code} > hbase(main):030:0> create 't3',{NAME => 'f2', BLOCKCACHE => false}, > {COMPACTION_ENABLED => false} > An argument ignored (unknown or overridden): COMPACTION_ENABLED > 0 row(s) in 1.1390 seconds > hbase(main):031:0> describe 't3' > Table t3 is ENABLED > t3 > > > COLUMN FAMILIES DESCRIPTION > > > {NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', > KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => > 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'false', > BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} > 1 row(s) in 0.0720 seconds > {code} > *BLOCKCACHE* was in effect but *COMPACTION_ENABLED* didn't take effect. > After checking code, I found that if the table-scope attributes value is > false, you need to enclose 'false' in single quotation marks while > family-scope is not required. > so we should keep the consistent logic for table-scope and family-scope. > the command alter also have the same problem. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18149) The setting rules for table-scope attributes and family-scope attributes should keep consistent
[ https://issues.apache.org/jira/browse/HBASE-18149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guangxu Cheng updated HBASE-18149: -- Attachment: HBASE-18149-master-v1.patch > The setting rules for table-scope attributes and family-scope attributes > should keep consistent > --- > > Key: HBASE-18149 > URL: https://issues.apache.org/jira/browse/HBASE-18149 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 2.0.0, 1.2.5 >Reporter: Guangxu Cheng >Assignee: Guangxu Cheng > Attachments: HBASE-18149-master-v1.patch > > > I use the following command to create a table. > {code} > hbase(main):030:0> create 't3',{NAME => 'f2', BLOCKCACHE => false}, > {COMPACTION_ENABLED => false} > An argument ignored (unknown or overridden): COMPACTION_ENABLED > 0 row(s) in 1.1390 seconds > hbase(main):031:0> describe 't3' > Table t3 is ENABLED > t3 > > > COLUMN FAMILIES DESCRIPTION > > > {NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', > KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => > 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'false', > BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} > 1 row(s) in 0.0720 seconds > {code} > *BLOCKCACHE* was in effect but *COMPACTION_ENABLED* didn't take effect. > After checking code, I found that if the table-scope attributes value is > false, you need to enclose 'false' in single quotation marks while > family-scope is not required. > so we should keep the consistent logic for table-scope and family-scope. > the command alter also have the same problem. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18144) Forward-port the old exclusive row lock; there are scenarios where it performs better
[ https://issues.apache.org/jira/browse/HBASE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034056#comment-16034056 ] stack commented on HBASE-18144: --- [~allan163] sorry. Should have been more explicit. Thread dumping showed we were stuck getting locks. I was not able to do a good reproduction of the same scenario. In my measurements current locking seemed to take only 20% longer, but in the production install the read/write lock was plainly taking much longer than this. Doing thousands of increments a second from hundreds of handlers on the same small set of rows seems to have this effect. The cited quotes and article indicate acquisition times can degenerate. Hth > Forward-port the old exclusive row lock; there are scenarios where it > performs better > - > > Key: HBASE-18144 > URL: https://issues.apache.org/jira/browse/HBASE-18144 > Project: HBase > Issue Type: Bug > Components: Increment >Affects Versions: 1.2.5 >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.3.2, 1.2.7 > > > Description to follow. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18137) Replication gets stuck for empty WALs
[ https://issues.apache.org/jira/browse/HBASE-18137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034050#comment-16034050 ] Allan Yang commented on HBASE-18137: Might it be related to HBASE-18132? Are you sure the WAL is totally empty, without any headers? In my observation, a corrupted empty WAL should be at least 80 or 90 bytes (depending on the cell encoding). > Replication gets stuck for empty WALs > - > > Key: HBASE-18137 > URL: https://issues.apache.org/jira/browse/HBASE-18137 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 1.3.1 >Reporter: Ashu Pachauri >Priority: Critical > Fix For: 2.0.0, 1.4.0, 1.3.2, 1.1.11, 1.2.7 > > > Replication assumes that only the last WAL of a recovered queue can be empty. > But, intermittent DFS issues may cause empty WALs being created (without the > PWAL magic), and a roll of WAL to happen without a regionserver crash. This > will cause recovered queues to have empty WALs in the middle. This causes > replication to get stuck: > {code} > TRACE regionserver.ReplicationSource: Opening log > WARN regionserver.ReplicationSource: - Got: > java.io.EOFException > at java.io.DataInputStream.readFully(DataInputStream.java:197) > at java.io.DataInputStream.readFully(DataInputStream.java:169) > at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1915) > at > org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1880) > at > org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1829) > at > org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1843) > at > org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.(SequenceFileLogReader.java:70) > at > org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.reset(SequenceFileLogReader.java:168) > at > org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.initReader(SequenceFileLogReader.java:177) > at > 
org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:66) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:312) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:276) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:264) > at > org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:423) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationWALReaderManager.openReader(ReplicationWALReaderManager.java:70) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource$ReplicationSourceWorkerThread.openReader(ReplicationSource.java:830) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource$ReplicationSourceWorkerThread.run(ReplicationSource.java:572) > {code} > The WAL in question was completely empty but there were other WALs in the > recovered queue which were newer and non-empty. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
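A cheap guard against such files, in the spirit of the size observation in the comment above, can be sketched in plain Java. This is illustrative only: `isLikelyEmptyWal` and the 80-byte threshold are assumptions for the sketch, not HBase code or configuration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class WalSizeCheck {
    // Assumed minimum size of a WAL that holds at least its magic and
    // header (roughly 80-90 bytes per the discussion above; an assumption).
    static final long MIN_WAL_HEADER_BYTES = 80;

    /**
     * Returns true when the file is too short to contain a WAL header,
     * so a reader should skip it rather than retry on EOFException forever.
     */
    public static boolean isLikelyEmptyWal(Path wal) throws IOException {
        return Files.size(wal) < MIN_WAL_HEADER_BYTES;
    }
}
```

A recovered queue walker could consult this check before handing the file to the reader, logging and moving on to the next (newer, non-empty) WAL instead of looping.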
[jira] [Commented] (HBASE-18111) Replication stuck when cluster connection is closed
[ https://issues.apache.org/jira/browse/HBASE-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034047#comment-16034047 ] Hudson commented on HBASE-18111: SUCCESS: Integrated in Jenkins build HBase-1.4 #756 (See [https://builds.apache.org/job/HBase-1.4/756/]) HBASE-18111 Replication stuck when cluster connection is closed (apurtell: rev b66a478e73b59e0335de89038475409d916ea164) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/HBaseInterClusterReplicationEndpoint.java > Replication stuck when cluster connection is closed > --- > > Key: HBASE-18111 > URL: https://issues.apache.org/jira/browse/HBASE-18111 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5, 0.98.24, 1.1.10 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18111.patch, HBASE-18111-v1.patch, > HBASE-18111-v2.patch > > > Log: > {code} > 2017-05-24,03:01:25,603 ERROR [regionserver13700-SendThread(hostxxx:11000)] > org.apache.zookeeper.ClientCnxn: SASL authentication with Zookeeper Quorum > member failed: javax.security.sasl.SaslException: An error: > (java.security.PrivilegedActionException: javax.security.sasl.SaslException: > GSS initiate failed [Caused by GSSException: No valid credentials provided > (Mechanism level: Connection reset)]) occurred when evaluating Zookeeper > Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED > state. 
> 2017-05-24,03:01:25,615 FATAL [regionserver13700-EventThread] > org.apache.hadoop.hbase.client.HConnectionImplementation: > hconnection-0x1148dd9b-0x35b6b4d4ca999c6, > quorum=10.108.37.30:11000,10.108.38.30:11000,10.108.39.30:11000,10.108.84.25:11000,10.108.84.32:11000, > baseZNode=/hbase/c3prc-xiaomi98 hconnection-0x1148dd9b-0x35b6b4d4ca999c6 > received auth failed from ZooKeeper, aborting > org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = > AuthFailed > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:425) > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:333) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2017-05-24,03:01:25,615 INFO [regionserver13700-EventThread] > org.apache.hadoop.hbase.client.HConnectionImplementation: Closing zookeeper > sessionid=0x35b6b4d4ca999c6 > 2017-05-24,03:01:25,623 WARN [regionserver13700.replicationSource,800] > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint: > Replicate edites to peer cluster failed. 
> java.io.IOException: Call to hostxxx/10.136.22.6:24600 failed on local > exception: java.io.IOException: Connection closed > {code} > jstack > {code} > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.sleepForRetries(HBaseInterClusterReplicationEndpoint.java:127) > at > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.replicate(HBaseInterClusterReplicationEndpoint.java:199) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:905) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:492) > {code} > The cluster connection was aborted when the ZookeeperWatcher received an > AuthFailed event. Then the HBaseInterClusterReplicationEndpoint's replicate() > method will get stuck in a while loop. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
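The jstack above shows replicate() sleeping and retrying forever after the cluster connection was aborted. The shape of the fix can be sketched with a hypothetical `Conn` stand-in (not the real HBase Connection API): the retry loop gives up once the connection reports itself aborted or closed, instead of sleeping again.

```java
import java.io.IOException;

public class ReplicateLoop {
    /** Minimal stand-in for the cluster connection; isAborted/isClosed
     *  mirror the real connection's lifecycle flags (an assumption here). */
    interface Conn {
        boolean isAborted();
        boolean isClosed();
    }

    /**
     * Retry loop that fails fast once the connection is dead, rather than
     * sleeping and retrying forever as in the stack trace above.
     * Returns the number of attempts actually made.
     */
    static int replicateWithRetries(Conn conn, int maxRetries) throws IOException {
        int attempts = 0;
        while (attempts < maxRetries) {
            if (conn.isAborted() || conn.isClosed()) {
                throw new IOException("Cluster connection closed; aborting replication");
            }
            attempts++; // a real implementation would try to ship edits here
        }
        return attempts;
    }
}
```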
[jira] [Commented] (HBASE-18144) Forward-port the old exclusive row lock; there are scenarios where it performs better
[ https://issues.apache.org/jira/browse/HBASE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034046#comment-16034046 ] Allan Yang commented on HBASE-18144: Sorry, stack, I still don't get it. As you said, the reentrant read/write lock is about 20% slower than the exclusive lock, but locking time takes only a very small proportion of the whole write path, so why does using the exclusive lock improve this case? > Forward-port the old exclusive row lock; there are scenarios where it > performs better > - > > Key: HBASE-18144 > URL: https://issues.apache.org/jira/browse/HBASE-18144 > Project: HBase > Issue Type: Bug > Components: Increment >Affects Versions: 1.2.5 >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.3.2, 1.2.7 > > > Description to follow. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
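For readers following the 20% discussion, the two locking schemes in question can be exercised side by side. The sketch below is not HBase code; it only verifies that both a plain ReentrantLock (the old exclusive row lock's behavior) and the write half of a ReentrantReadWriteLock give mutual exclusion under many contending threads. It makes no timing claims, which is exactly where the two reportedly diverge under heavy contention.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RowLockDemo {
    /**
     * Increment a shared counter from several threads under the given lock.
     * With correct mutual exclusion the result is exactly threads * iters.
     */
    public static long contend(Lock lock, int threads, int iters)
            throws InterruptedException {
        long[] counter = {0};
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(() -> {
                for (int i = 0; i < iters; i++) {
                    lock.lock();
                    try { counter[0]++; } finally { lock.unlock(); }
                }
            });
            pool[t].start();
        }
        for (Thread th : pool) th.join();
        return counter[0];
    }
}
```

To compare the schemes, pass `new ReentrantLock()` (exclusive) or `new ReentrantReadWriteLock().writeLock()` (current read/write scheme); wrapping the calls in a timer would reproduce the kind of measurement the thread is debating.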
[jira] [Commented] (HBASE-17678) FilterList with MUST_PASS_ONE lead to redundancy cells returned
[ https://issues.apache.org/jira/browse/HBASE-17678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034045#comment-16034045 ] Guanghao Zhang commented on HBASE-17678: There is still one problem, about the SEEK_NEXT_USING_HINT return code. If a filter returns SEEK_NEXT_USING_HINT, does this mean that the next cell passed in must be the hint cell? Can we pass a non-hint cell to the filter? [~stack] any ideas? I am not sure about this. > FilterList with MUST_PASS_ONE lead to redundancy cells returned > --- > > Key: HBASE-17678 > URL: https://issues.apache.org/jira/browse/HBASE-17678 > Project: HBase > Issue Type: Bug > Components: Filters >Affects Versions: 2.0.0, 1.3.0, 1.2.1 > Environment: RedHat 7.x >Reporter: Jason Tokayer >Assignee: Zheng Hu > Attachments: HBASE-17678.v1.patch, HBASE-17678.v1.rough.patch, > HBASE-17678.v2.patch, TestColumnPaginationFilterDemo.java > > > When combining ColumnPaginationFilter with a single-element filterList, > MUST_PASS_ONE and MUST_PASS_ALL give different results when there are > multiple cells with the same timestamp. This is unexpected since there is > only a single filter in the list, and I would believe that MUST_PASS_ALL and > MUST_PASS_ONE should only affect the behavior of the joined filter and not > the behavior of any one of the individual filters. If this is not a bug then > it would be nice if the documentation is updated to explain this nuanced > behavior. > I know that there was a decision made in an earlier Hbase version to keep > multiple cells with the same timestamp. This is generally fine but presents > an issue when using the aforementioned filter combination. 
> Steps to reproduce: > In the shell create a table and insert some data: > {code:none} > create 'ns:tbl',{NAME => 'family',VERSIONS => 100} > put 'ns:tbl','row','family:name','John',1 > put 'ns:tbl','row','family:name','Jane',1 > put 'ns:tbl','row','family:name','Gil',1 > put 'ns:tbl','row','family:name','Jane',1 > {code} > Then, use a Scala client as: > {code:none} > import org.apache.hadoop.hbase.filter._ > import org.apache.hadoop.hbase.util.Bytes > import org.apache.hadoop.hbase.client._ > import org.apache.hadoop.hbase.{CellUtil, HBaseConfiguration, TableName} > import scala.collection.mutable._ > val config = HBaseConfiguration.create() > config.set("hbase.zookeeper.quorum", "localhost") > config.set("hbase.zookeeper.property.clientPort", "2181") > val connection = ConnectionFactory.createConnection(config) > val logicalOp = FilterList.Operator.MUST_PASS_ONE > val limit = 1 > var resultsList = ListBuffer[String]() > for (offset <- 0 to 20 by limit) { > val table = connection.getTable(TableName.valueOf("ns:tbl")) > val paginationFilter = new ColumnPaginationFilter(limit,offset) > val filterList: FilterList = new FilterList(logicalOp,paginationFilter) > println("@ filterList = "+filterList) > val results = table.get(new > Get(Bytes.toBytes("row")).setFilter(filterList)) > val cells = results.rawCells() > if (cells != null) { > for (cell <- cells) { > val value = new String(CellUtil.cloneValue(cell)) > val qualifier = new String(CellUtil.cloneQualifier(cell)) > val family = new String(CellUtil.cloneFamily(cell)) > val result = "OFFSET = "+offset+":"+family + "," + qualifier > + "," + value + "," + cell.getTimestamp() > resultsList.append(result) > } > } > } > resultsList.foreach(println) > {code} > Here are the results for different limit and logicalOp settings: > {code:none} > Limit = 1 & logicalOp = MUST_PASS_ALL: > scala> resultsList.foreach(println) > OFFSET = 0:family,name,Jane,1 > Limit = 1 & logicalOp = MUST_PASS_ONE: > scala> 
resultsList.foreach(println) > OFFSET = 0:family,name,Jane,1 > OFFSET = 1:family,name,Gil,1 > OFFSET = 2:family,name,Jane,1 > OFFSET = 3:family,name,John,1 > Limit = 2 & logicalOp = MUST_PASS_ALL: > scala> resultsList.foreach(println) > OFFSET = 0:family,name,Jane,1 > Limit = 2 & logicalOp = MUST_PASS_ONE: > scala> resultsList.foreach(println) > OFFSET = 0:family,name,Jane,1 > OFFSET = 2:family,name,Jane,1 > {code} > So, it seems that MUST_PASS_ALL gives the expected behavior, but > MUST_PASS_ONE does not. Furthermore, MUST_PASS_ONE seems to give only a > single (non-duplicated) cell within a page, but not across pages. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18149) The setting rules for table-scope attributes and family-scope attributes should keep consistent
Guangxu Cheng created HBASE-18149: - Summary: The setting rules for table-scope attributes and family-scope attributes should keep consistent Key: HBASE-18149 URL: https://issues.apache.org/jira/browse/HBASE-18149 Project: HBase Issue Type: Bug Components: shell Affects Versions: 1.2.5, 2.0.0 Reporter: Guangxu Cheng Assignee: Guangxu Cheng I use the following command to create a table. {code} hbase(main):030:0> create 't3',{NAME => 'f2', BLOCKCACHE => false}, {COMPACTION_ENABLED => false} An argument ignored (unknown or overridden): COMPACTION_ENABLED 0 row(s) in 1.1390 seconds hbase(main):031:0> describe 't3' Table t3 is ENABLED t3 COLUMN FAMILIES DESCRIPTION {NAME => 'f2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'false', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} 1 row(s) in 0.0720 seconds {code} *BLOCKCACHE* was in effect but *COMPACTION_ENABLED* didn't take effect. After checking code, I found that if the table-scope attributes value is false, you need to enclose 'false' in single quotation marks while family-scope is not required. so we should keep the consistent logic for table-scope and family-scope. the command alter also have the same problem. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
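Assuming the behavior described in the report above, the inconsistency is purely one of quoting: a family-scope boolean may be given bare, while a table-scope value must be a quoted string or the attribute is silently ignored. A shell session illustrating the workaround (hypothetical, following the report; not run against a cluster here):

```
# family-scope attribute: a bare boolean is accepted
# table-scope attribute: quote the value, otherwise it is ignored
create 't3', {NAME => 'f2', BLOCKCACHE => false}, {COMPACTION_ENABLED => 'false'}
describe 't3'
```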
[jira] [Updated] (HBASE-17988) get-active-master.rb and draining_servers.rb no longer work
[ https://issues.apache.org/jira/browse/HBASE-17988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinmay Kulkarni updated HBASE-17988: - Attachment: HBASE-17988.patch Modified jruby scripts to pick up _drainingZNode_ and _masterAddressZNode_ correctly, from within _znodePaths_ in _ZooKeeperWatcher._. Tested on a standalone HBase deployment to check that the scripts work as expected. > get-active-master.rb and draining_servers.rb no longer work > --- > > Key: HBASE-17988 > URL: https://issues.apache.org/jira/browse/HBASE-17988 > Project: HBase > Issue Type: Bug > Components: scripts >Affects Versions: 2.0.0 >Reporter: Mike Drob >Assignee: Chinmay Kulkarni >Priority: Critical > Fix For: 2.0.0, 1.4.0, 1.3.2, 1.1.11, 1.2.7 > > Attachments: HBASE-17988.patch > > > The scripts {{bin/get-active-master.rb}} and {{bin/draining_servers.rb}} no > longer work on current master branch. Here is an example error message: > {noformat} > $ bin/hbase-jruby bin/get-active-master.rb > NoMethodError: undefined method `masterAddressZNode' for > # >at bin/get-active-master.rb:35 > {noformat} > My initial probing suggests that this is likely due to movement that happened > in HBASE-16690. Perhaps instead of reworking the ruby, there is similar Java > functionality already existing somewhere. > Putting priority at critical since it's impossible to know whether users rely > on the scripts. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18141) Regionserver fails to shutdown when abort triggered in RegionScannerImpl during RPC call
[ https://issues.apache.org/jira/browse/HBASE-18141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034029#comment-16034029 ] Hadoop QA commented on HBASE-18141: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 43s {color} | {color:green} branch-1.3 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s {color} | {color:green} branch-1.3 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s {color} | {color:green} branch-1.3 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} branch-1.3 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 3m 43s {color} | {color:red} hbase-server in branch-1.3 has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s {color} | {color:green} branch-1.3 passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 34m 53s {color} | {color:green} The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 30m 54s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 24s {color} | {color:red} The patch generated 1 ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 89m 30s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.TestCheckTestClasses | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:e1e11ad | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870877/HBASE-18141.branch-1.3.001.patch | | JIRA Issue | HBASE-18141 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 103bea27b1e1 4.8.3-std-1 #1 SMP Fri Oct 21 11:15:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/hbase.sh | | git revision | branch-1.3 / ae7b631 | | Default Java | 1.8.0_131 | | findbugs | v3.0.0 | | findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/7043/artifact/patchprocess/branch-findbugs-hbase-server-warnings.html | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7043/artifact/patchprocess/patch-unit-hbase-server.txt | | unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/7043/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7043/testReport/ | | asflicense | https://builds.apache.org/job/PreCommit-HBASE-Build/7043/artifact/patchprocess/patch-asflicense-problems.txt | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7043/console | | Powered by | Apache Yetus
[jira] [Commented] (HBASE-16392) Backup delete fault tolerance
[ https://issues.apache.org/jira/browse/HBASE-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034025#comment-16034025 ] Hadoop QA commented on HBASE-16392: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 26s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | 
{color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 31m 32s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 42s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 10s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 63m 44s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.TestCheckTestClasses | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870895/HBASE-16392-v3.patch | | JIRA Issue | HBASE-16392 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux d2612ffb030c 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 5491714 | | Default Java | 1.8.0_131 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7044/artifact/patchprocess/patch-unit-hbase-server.txt | | unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/7044/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7044/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7044/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Backup delete fault tolerance > - > > Key: HBASE-16392 > URL:
[jira] [Updated] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-18132: --- Attachment: HBASE-18132.patch > Low replication should be checked in period in case of datanode rolling > upgrade > --- > > Key: HBASE-18132 > URL: https://issues.apache.org/jira/browse/HBASE-18132 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.1.10 >Reporter: Allan Yang >Assignee: Allan Yang > Attachments: HBASE-18132-branch-1.patch, > HBASE-18132-branch-1.v2.patch, HBASE-18132-branch-1.v3.patch, > HBASE-18132-branch-1.v4.patch, HBASE-18132.patch > > > For now, we only check low replication of WALs when there is a sync operation > (HBASE-2234), rolling the log if the WAL's replica count is less than > configured. But if the WAL has few writes or no writes at all, low > replication will not be detected and thus no log will be rolled. > That is a problem when rolling-upgrading datanodes: all replicas of a WAL with > no writes will be restarted, leaving the WAL file in an abnormal > state, and later attempts to open this file will always fail. > I bring up a patch to check low replication of WALs at a configured period. > When rolling-upgrading datanodes, we just make sure the restart interval > between two nodes is bigger than the low-replication check interval; then the WAL will > be closed and rolled normally. A UT in the patch demonstrates this. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-18132: --- Attachment: (was: HBASE-18132.patch) > Low replication should be checked in period in case of datanode rolling > upgrade > --- > > Key: HBASE-18132 > URL: https://issues.apache.org/jira/browse/HBASE-18132 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.1.10 >Reporter: Allan Yang >Assignee: Allan Yang > Attachments: HBASE-18132-branch-1.patch, > HBASE-18132-branch-1.v2.patch, HBASE-18132-branch-1.v3.patch, > HBASE-18132-branch-1.v4.patch, HBASE-18132.patch > > > For now, we only check low replication of WALs when there is a sync operation > (HBASE-2234), rolling the log if the WAL's replica count is less than > configured. But if the WAL has few writes or no writes at all, low > replication will not be detected and thus no log will be rolled. > That is a problem when rolling-upgrading datanodes: all replicas of a WAL with > no writes will be restarted, leaving the WAL file in an abnormal > state, and later attempts to open this file will always fail. > I bring up a patch to check low replication of WALs at a configured period. > When rolling-upgrading datanodes, we just make sure the restart interval > between two nodes is bigger than the low-replication check interval; then the WAL will > be closed and rolled normally. A UT in the patch demonstrates this. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
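The periodic check HBASE-18132 describes can be sketched as follows. This is a minimal illustration, not the patch itself: `LowReplicationChecker`, the replica supplier, and the roll flag are hypothetical stand-ins for the real WAL roller machinery, and the scheduled-chore wiring is omitted.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntSupplier;

// Sketch: request a WAL roll when the replica count is below the configured
// minimum, driven by a periodic check rather than only by sync operations.
class LowReplicationChecker {
    private final IntSupplier currentReplicas; // e.g. queried from the DFS output stream
    private final int minReplicas;
    private final AtomicBoolean rollRequested = new AtomicBoolean(false);

    LowReplicationChecker(IntSupplier currentReplicas, int minReplicas) {
        this.currentReplicas = currentReplicas;
        this.minReplicas = minReplicas;
    }

    // In the real patch this would run on a configured period; here it is
    // callable directly so the behavior is easy to test.
    void checkLowReplication() {
        if (currentReplicas.getAsInt() < minReplicas) {
            rollRequested.set(true); // in HBase this would trigger a log roll
        }
    }

    boolean isRollRequested() {
        return rollRequested.get();
    }
}
```

With a check interval shorter than the datanode restart interval, even an idle WAL gets rolled before all of its replicas have been bounced.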
[jira] [Updated] (HBASE-18146) ITAcidGuarantees should drive flushes/compactions with a monkey
[ https://issues.apache.org/jira/browse/HBASE-18146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18146: --- Description: IntegrationTestAcidGuarantees works best with minicluster-based testing because it sets up for frequent flushing using site configuration. However we may run against a distributed cluster and so cannot rely on site configuration to drive desired flushing/compaction behavior. Introduce and use a new monkey policy for this purpose. (was: TestAcidGuarantees and IntegrationTestAcidGuarantees both really only work with minicluster based testing and do not run for a long duration. Consider a new integration test that makes similar atomicity checks while running for, potentially, a very long time, determined by test parameters supplied on the command line (perhaps as property definitions). The new integration test should expect to run against a distributed cluster, support specification of desired monkey policy, and not require any special non-default site configuration settings. ) > ITAcidGuarantees should drive flushes/compactions with a monkey > --- > > Key: HBASE-18146 > URL: https://issues.apache.org/jira/browse/HBASE-18146 > Project: HBase > Issue Type: Test > Components: integration tests >Reporter: Andrew Purtell > > IntegrationTestAcidGuarantees works best with minicluster-based testing > because it sets up for frequent flushing using site configuration. However we > may run against a distributed cluster and so cannot rely on site configuration to > drive desired flushing/compaction behavior. Introduce and use a new monkey > policy for this purpose. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
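A monkey action along the lines this description asks for might look like the sketch below. `ClusterOps` is a hypothetical callback interface standing in for the real ChaosMonkey/Admin APIs; the point is only that the test can churn memstores and store files without any special site configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of a chaos action that randomly flushes or compacts regions on the
// cluster under test, driving the behavior the site config used to provide.
class FlushCompactMonkey {
    interface ClusterOps {
        void flush(String region);
        void compact(String region);
    }

    // Performs `steps` random actions; returns an action log for inspection.
    static List<String> perform(ClusterOps ops, List<String> regions, int steps, long seed) {
        Random rnd = new Random(seed);
        List<String> log = new ArrayList<>();
        for (int i = 0; i < steps; i++) {
            String region = regions.get(rnd.nextInt(regions.size()));
            if (rnd.nextBoolean()) {
                ops.flush(region);
                log.add("flush:" + region);
            } else {
                ops.compact(region);
                log.add("compact:" + region);
            }
        }
        return log;
    }
}
```

In a real integration test this loop would run on a timer for the duration of the workload, with the period supplied as a test parameter.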
[jira] [Commented] (HBASE-18146) ITAcidGuarantees should drive flushes/compactions with a monkey
[ https://issues.apache.org/jira/browse/HBASE-18146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033984#comment-16033984 ] Andrew Purtell commented on HBASE-18146: Original description {quote} TestAcidGuarantees and IntegrationTestAcidGuarantees both really only work with minicluster based testing and do not run for a long duration. Consider a new integration test that makes similar atomicity checks while running for, potentially, a very long time, determined by test parameters supplied on the command line (perhaps as property definitions). The new integration test should expect to run against a distributed cluster, support specification of desired monkey policy, and not require any special non-default site configuration settings. {quote} New {quote} IntegrationTestAcidGuarantees best works with minicluster based testing because it sets up for frequent flushing using site configuration. However we may run against a distributed cluster so cannot rely on site configuration to drive desired flushing/compaction behavior. Introduce and use a new monkey policy for this purpose. {quote} > ITAcidGuarantees should drive flushes/compactions with a monkey > --- > > Key: HBASE-18146 > URL: https://issues.apache.org/jira/browse/HBASE-18146 > Project: HBase > Issue Type: Test > Components: integration tests >Reporter: Andrew Purtell > > IntegrationTestAcidGuarantees best works with minicluster based testing > because it sets up for frequent flushing using site configuration. However we > may run against a distributed cluster so cannot rely on site configuration to > drive desired flushing/compaction behavior. Introduce and use a new monkey > policy for this purpose. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18146) ITAcidGuarantees should drive flushes/compactions with a monkey
[ https://issues.apache.org/jira/browse/HBASE-18146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18146: --- Summary: ITAcidGuarantees should drive flushes/compactions with a monkey (was: Long running integration test similar to TestAcidGuarantees) > ITAcidGuarantees should drive flushes/compactions with a monkey > --- > > Key: HBASE-18146 > URL: https://issues.apache.org/jira/browse/HBASE-18146 > Project: HBase > Issue Type: Test > Components: integration tests >Reporter: Andrew Purtell > > TestAcidGuarantees and IntegrationTestAcidGuarantees both really only work > with minicluster based testing and do not run for a long duration. Consider a > new integration test that makes similar atomicity checks while running for, > potentially, a very long time, determined by test parameters supplied on the > command line (perhaps as property definitions). The new integration test > should expect to run against a distributed cluster, support specification of > desired monkey policy, and not require any special non-default site > configuration settings. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18146) Long running integration test similar to TestAcidGuarantees
[ https://issues.apache.org/jira/browse/HBASE-18146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033982#comment-16033982 ] Andrew Purtell commented on HBASE-18146: bq. That said, a monkeyfactory that flushes frequently and causes compactions frequently (and similarly, a split/merge monkey factory) make sense. Let me repurpose this issue. > Long running integration test similar to TestAcidGuarantees > --- > > Key: HBASE-18146 > URL: https://issues.apache.org/jira/browse/HBASE-18146 > Project: HBase > Issue Type: Test > Components: integration tests >Reporter: Andrew Purtell > > TestAcidGuarantees and IntegrationTestAcidGuarantees both really only work > with minicluster based testing and do not run for a long duration. Consider a > new integration test that makes similar atomicity checks while running for, > potentially, a very long time, determined by test parameters supplied on the > command line (perhaps as property definitions). The new integration test > should expect to run against a distributed cluster, support specification of > desired monkey policy, and not require any special non-default site > configuration settings. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18146) Long running integration test similar to TestAcidGuarantees
[ https://issues.apache.org/jira/browse/HBASE-18146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033977#comment-16033977 ] Jonathan Hsieh commented on HBASE-18146: Ugh, I realize that was hard to understand/parse but I think you got the gist. -- An alternative would be to change the flush size config. That said, a monkeyfactory that flushes frequently and causes compactions frequently (and similarly, a split/merge monkey factory) make sense. Would it make sense to just file those as separate tickets? We could also either close this out as info provided or convert it to a docs improvement jira. > Long running integration test similar to TestAcidGuarantees > --- > > Key: HBASE-18146 > URL: https://issues.apache.org/jira/browse/HBASE-18146 > Project: HBase > Issue Type: Test > Components: integration tests >Reporter: Andrew Purtell > > TestAcidGuarantees and IntegrationTestAcidGuarantees both really only work > with minicluster based testing and do not run for a long duration. Consider a > new integration test that makes similar atomicity checks while running for, > potentially, a very long time, determined by test parameters supplied on the > command line (perhaps as property definitions). The new integration test > should expect to run against a distributed cluster, support specification of > desired monkey policy, and not require any special non-default site > configuration settings. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16392) Backup delete fault tolerance
[ https://issues.apache.org/jira/browse/HBASE-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-16392: -- Attachment: HBASE-16392-v3.patch Added new UT. cc: [~te...@apache.org] > Backup delete fault tolerance > - > > Key: HBASE-16392 > URL: https://issues.apache.org/jira/browse/HBASE-16392 > Project: HBase > Issue Type: Sub-task >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Labels: backup > Fix For: 2.0.0 > > Attachments: HBASE-16392-v1.patch, HBASE-16392-v2.patch, > HBASE-16392-v3.patch > > > Backup delete modifies the file system and the backup system table. We have to make > sure that the operation is atomic, durable, and isolated. > Delete operation: > # Start a backup session (this guarantees that the system is blocked for all > backup commands during the delete operation) > # Save the list of tables being deleted to the system table > # Before the delete operation we take a backup system table snapshot > # If we detect any failures during the delete operation, we restore the backup system > table from the snapshot, then finish the backup session > # To guarantee consistency of the data, the delete operation MUST be repeated > # We guarantee that all file delete operations are idempotent and can be > repeated multiple times > # Any backup operations will be blocked until consistency is restored > # To restore consistency, the repair command must be executed > # The repair command checks if there is a failed delete op in the backup system > table, and repeats the delete operation -- This message was sent by Atlassian JIRA (v6.3.15#6346)
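The delete protocol in this description can be modeled as a small state machine. Everything below is a toy illustration under stated assumptions, not the real backup code: the sets stand in for the file system and system table, the "crash" is simulated by a flag, and `repair()` mirrors the repair command that re-drives a failed delete.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Toy model of the protocol: snapshot the system table, persist the delete
// list, run the (idempotent) deletes, and on failure restore the snapshot and
// leave a failed-delete marker that repair() re-drives to completion.
class BackupDelete {
    final Set<String> backupFiles = new HashSet<>();
    Set<String> snapshot;       // snapshot of the system table state
    Set<String> pendingDelete;  // persisted list of items being deleted
    boolean failedDeleteMarker;

    boolean delete(Set<String> targets, boolean failMidway) {
        snapshot = new HashSet<>(backupFiles);   // take snapshot first
        pendingDelete = new HashSet<>(targets);  // persist the delete list
        int done = 0;
        for (Iterator<String> it = pendingDelete.iterator(); it.hasNext();) {
            String f = it.next();
            if (failMidway && done == 1) {       // simulated crash mid-delete
                backupFiles.clear();
                backupFiles.addAll(snapshot);    // restore from snapshot
                failedDeleteMarker = true;       // block until repaired
                return false;
            }
            backupFiles.remove(f);               // idempotent: no-op if absent
            done++;
        }
        failedDeleteMarker = false;
        return true;
    }

    // Repair re-runs the persisted delete; safe because deletes are idempotent.
    boolean repair() {
        return !failedDeleteMarker || delete(pendingDelete, false);
    }
}
```

The idempotence of each file delete is what makes "just repeat the operation" a valid recovery strategy here.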
[jira] [Commented] (HBASE-18141) Regionserver fails to shutdown when abort triggered in RegionScannerImpl during RPC call
[ https://issues.apache.org/jira/browse/HBASE-18141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033973#comment-16033973 ] Ted Yu commented on HBASE-18141: lgtm Please add test category to TestRegionServerAbort. I assume patch for master branch is coming. > Regionserver fails to shutdown when abort triggered in RegionScannerImpl > during RPC call > > > Key: HBASE-18141 > URL: https://issues.apache.org/jira/browse/HBASE-18141 > Project: HBase > Issue Type: Bug > Components: regionserver, security >Affects Versions: 1.3.1 >Reporter: Gary Helmling >Assignee: Gary Helmling >Priority: Critical > Fix For: 1.3.2 > > Attachments: HBASE-18141.branch-1.3.001.patch > > > When an abort is triggered within the RPC call path by > HRegion.RegionScannerImpl, AccessController incorrectly applies the RPC > caller identity in the RegionServerObserver.preStopRegionServer() hook. This > leaves the regionserver in a non-responsive state, where its regions are not > reassigned and it returns exceptions for all requests. > When an abort is triggered on the server side, we should not allow a > coprocessor to reject the abort at all. 
> Here is a sample stack trace: > {noformat} > 17/05/25 06:10:29 FATAL regionserver.HRegionServer: RegionServer abort: > loaded coprocessors are: > [org.apache.hadoop.hbase.security.access.AccessController, > org.apache.hadoop.hbase.security.token.TokenProvider] > 17/05/25 06:10:29 WARN regionserver.HRegionServer: The region server did not > stop > org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient > permissions for user 'rpcuser' (global, action=ADMIN) > at > org.apache.hadoop.hbase.security.access.AccessController.requireGlobalPermission(AccessController.java:548) > at > org.apache.hadoop.hbase.security.access.AccessController.requirePermission(AccessController.java:522) > at > org.apache.hadoop.hbase.security.access.AccessController.preStopRegionServer(AccessController.java:2501) > at > org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost$1.call(RegionServerCoprocessorHost.java:86) > at > org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost.execShutdown(RegionServerCoprocessorHost.java:300) > at > org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost.preStop(RegionServerCoprocessorHost.java:82) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.stop(HRegionServer.java:1905) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:2118) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:2125) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.abortRegionServer(HRegion.java:6326) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.handleFileNotFound(HRegion.java:6319) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5941) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6084) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5858) > at > 
org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2649) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34950) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2320) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168) > {noformat} > I haven't yet evaluated which other release branches this might apply to. > I have a patch currently in progress, which I will post as soon as I complete > a test case. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18141) Regionserver fails to shutdown when abort triggered in RegionScannerImpl during RPC call
[ https://issues.apache.org/jira/browse/HBASE-18141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-18141: -- Status: Patch Available (was: Open) > Regionserver fails to shutdown when abort triggered in RegionScannerImpl > during RPC call > > > Key: HBASE-18141 > URL: https://issues.apache.org/jira/browse/HBASE-18141 > Project: HBase > Issue Type: Bug > Components: regionserver, security >Affects Versions: 1.3.1 >Reporter: Gary Helmling >Assignee: Gary Helmling >Priority: Critical > Fix For: 1.3.2 > > Attachments: HBASE-18141.branch-1.3.001.patch > > > When an abort is triggered within the RPC call path by > HRegion.RegionScannerImpl, AccessController incorrectly applies the RPC > caller identity in the RegionServerObserver.preStopRegionServer() hook. This > leaves the regionserver in a non-responsive state, where its regions are not > reassigned and it returns exceptions for all requests. > When an abort is triggered on the server side, we should not allow a > coprocessor to reject the abort at all. 
> Here is a sample stack trace: > {noformat} > 17/05/25 06:10:29 FATAL regionserver.HRegionServer: RegionServer abort: > loaded coprocessors are: > [org.apache.hadoop.hbase.security.access.AccessController, > org.apache.hadoop.hbase.security.token.TokenProvider] > 17/05/25 06:10:29 WARN regionserver.HRegionServer: The region server did not > stop > org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient > permissions for user 'rpcuser' (global, action=ADMIN) > at > org.apache.hadoop.hbase.security.access.AccessController.requireGlobalPermission(AccessController.java:548) > at > org.apache.hadoop.hbase.security.access.AccessController.requirePermission(AccessController.java:522) > at > org.apache.hadoop.hbase.security.access.AccessController.preStopRegionServer(AccessController.java:2501) > at > org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost$1.call(RegionServerCoprocessorHost.java:86) > at > org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost.execShutdown(RegionServerCoprocessorHost.java:300) > at > org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost.preStop(RegionServerCoprocessorHost.java:82) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.stop(HRegionServer.java:1905) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:2118) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:2125) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.abortRegionServer(HRegion.java:6326) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.handleFileNotFound(HRegion.java:6319) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5941) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6084) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5858) > at > 
org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2649) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34950) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2320) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168) > {noformat} > I haven't yet evaluated which other release branches this might apply to. > I have a patch currently in progress, which I will post as soon as I complete > a test case. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HBASE-18054) log when we add/remove failed servers in client
[ https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033925#comment-16033925 ] Andrew Purtell edited comment on HBASE-18054 at 6/1/17 11:50 PM: - Thanks. Now let's look at the patch. {code}
if (LOG.isDebugEnabled()) {
  LOG.debug("Added failed server with address " + address.toString() + " to list");
}
{code} This doesn't tell us more than we can deduce by a later log indicating the server is in the failed server list. We should add why the server has been added to the failed server list. Perhaps addToFailedServers() should get a new additional argument, a string containing the reason, that all callers can supply? was (Author: apurtell): bq. Thanks. Now let's look at the patch. {code}
if (LOG.isDebugEnabled()) {
  LOG.debug("Added failed server with address " + address.toString() + " to list");
}
{code} This doesn't tell us more than we can deduce by a later log indicating the server is in the failed server list. We should add why the server has been added to the failed server list. Perhaps addToFailedServers() should get a new additional argument, a string containing the reason, that all callers can supply? > log when we add/remove failed servers in client > --- > > Key: HBASE-18054 > URL: https://issues.apache.org/jira/browse/HBASE-18054 > Project: HBase > Issue Type: Bug > Components: Client, Operability >Affects Versions: 1.3.0 >Reporter: Sean Busbey >Assignee: Ali > Attachments: HBASE-18054.patch, HBASE-18054.v2.master.patch, > HBASE-18054.v3.master.patch > > > Currently we log if a server is in the failed server list when we go to > connect to it, but we don't log anything about when the server got into the > list. > This means we have to search the log for errors involving the same server > name that (hopefully) managed to get into the log within > {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18054) log when we add/remove failed servers in client
[ https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033925#comment-16033925 ] Andrew Purtell commented on HBASE-18054: bq. Thanks. Now let's look at the patch. {code}
if (LOG.isDebugEnabled()) {
  LOG.debug("Added failed server with address " + address.toString() + " to list");
}
{code} This doesn't tell us more than we can deduce by a later log indicating the server is in the failed server list. We should add why the server has been added to the failed server list. Perhaps addToFailedServers() should get a new additional argument, a string containing the reason, that all callers can supply? > log when we add/remove failed servers in client > --- > > Key: HBASE-18054 > URL: https://issues.apache.org/jira/browse/HBASE-18054 > Project: HBase > Issue Type: Bug > Components: Client, Operability >Affects Versions: 1.3.0 >Reporter: Sean Busbey >Assignee: Ali > Attachments: HBASE-18054.patch, HBASE-18054.v2.master.patch, > HBASE-18054.v3.master.patch > > > Currently we log if a server is in the failed server list when we go to > connect to it, but we don't log anything about when the server got into the > list. > This means we have to search the log for errors involving the same server > name that (hopefully) managed to get into the log within > {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
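The reason-argument suggestion above might look roughly like this. `FailedServers` here is a simplified stand-in for HBase's class, with time passed explicitly so the expiry behavior is easy to follow; the real logging call is shown only as a comment.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: record *why* a server entered the failed list, so a later
// "server is in failed list" log line can be correlated with its cause.
class FailedServers {
    private final long expiryMillis;
    private final Map<String, Long> deadline = new HashMap<>();
    private final Map<String, String> reasons = new HashMap<>();

    FailedServers(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    // New extra argument: a human-readable reason supplied by every caller.
    synchronized void addToFailedServers(String address, String reason, long nowMillis) {
        deadline.put(address, nowMillis + expiryMillis);
        reasons.put(address, reason);
        // LOG.debug("Added failed server " + address + " to list, reason: " + reason);
    }

    synchronized boolean isFailedServer(String address, long nowMillis) {
        Long until = deadline.get(address);
        return until != null && nowMillis < until;
    }

    synchronized String reasonFor(String address) {
        return reasons.get(address);
    }
}
```

The entry expires after `FAILED_SERVER_EXPIRY_KEY` milliseconds, matching the 2-second default mentioned in the issue description.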
[jira] [Updated] (HBASE-18111) Replication stuck when cluster connection is closed
[ https://issues.apache.org/jira/browse/HBASE-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-18111: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 1.4.0 2.0.0 Status: Resolved (was: Patch Available) > Replication stuck when cluster connection is closed > --- > > Key: HBASE-18111 > URL: https://issues.apache.org/jira/browse/HBASE-18111 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5, 0.98.24, 1.1.10 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-18111.patch, HBASE-18111-v1.patch, > HBASE-18111-v2.patch > > > Log: > {code} > 2017-05-24,03:01:25,603 ERROR [regionserver13700-SendThread(hostxxx:11000)] > org.apache.zookeeper.ClientCnxn: SASL authentication with Zookeeper Quorum > member failed: javax.security.sasl.SaslException: An error: > (java.security.PrivilegedActionException: javax.security.sasl.SaslException: > GSS initiate failed [Caused by GSSException: No valid credentials provided > (Mechanism level: Connection reset)]) occurred when evaluating Zookeeper > Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED > state. 
> 2017-05-24,03:01:25,615 FATAL [regionserver13700-EventThread] > org.apache.hadoop.hbase.client.HConnectionImplementation: > hconnection-0x1148dd9b-0x35b6b4d4ca999c6, > quorum=10.108.37.30:11000,10.108.38.30:11000,10.108.39.30:11000,10.108.84.25:11000,10.108.84.32:11000, > baseZNode=/hbase/c3prc-xiaomi98 hconnection-0x1148dd9b-0x35b6b4d4ca999c6 > received auth failed from ZooKeeper, aborting > org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = > AuthFailed > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:425) > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:333) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2017-05-24,03:01:25,615 INFO [regionserver13700-EventThread] > org.apache.hadoop.hbase.client.HConnectionImplementation: Closing zookeeper > sessionid=0x35b6b4d4ca999c6 > 2017-05-24,03:01:25,623 WARN [regionserver13700.replicationSource,800] > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint: > Replicate edites to peer cluster failed. 
> java.io.IOException: Call to hostxxx/10.136.22.6:24600 failed on local > exception: java.io.IOException: Connection closed > {code} > jstack > {code} > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.sleepForRetries(HBaseInterClusterReplicationEndpoint.java:127) > at > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.replicate(HBaseInterClusterReplicationEndpoint.java:199) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:905) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:492) > {code} > The cluster connection was aborted when the ZooKeeperWatcher received an > AuthFailed event. Then the HBaseInterClusterReplicationEndpoint's replicate() > method got stuck in a while loop. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
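The direction of the fix can be sketched as below, using hypothetical callbacks rather than the real `HBaseInterClusterReplicationEndpoint` API: check whether the shared connection has been aborted before each retry, and bail out instead of sleeping forever on a dead connection.

```java
import java.util.function.BooleanSupplier;

// Sketch of a replicate() retry loop that terminates when the cluster
// connection is aborted. shipOnce and connectionAborted are stand-ins for
// the real endpoint's shipping attempt and connection state check.
class ReplicateLoop {
    // Returns true when the batch was shipped; false when the endpoint gave
    // up, either because the connection was aborted or retries were exhausted.
    static boolean replicate(BooleanSupplier shipOnce, BooleanSupplier connectionAborted,
                             int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (connectionAborted.getAsBoolean()) {
                return false; // do not keep retrying on a dead connection
            }
            if (shipOnce.getAsBoolean()) {
                return true;
            }
            // real code would call sleepForRetries(...) here before retrying
        }
        return false;
    }
}
```

With the jstack above, the unconditional sleep-and-retry is exactly the loop this check would break out of.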
[jira] [Commented] (HBASE-18146) Long running integration test similar to TestAcidGuarantees
[ https://issues.apache.org/jira/browse/HBASE-18146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033919#comment-16033919 ] Andrew Purtell commented on HBASE-18146: bq. An alternative approach that would require config as opposed to code would be to flush size on the cluster under test That's what I was getting at earlier when suggesting the test shouldn't depend on changes to deployed site configuration, and how the current test seems limited. > Long running integration test similar to TestAcidGuarantees > --- > > Key: HBASE-18146 > URL: https://issues.apache.org/jira/browse/HBASE-18146 > Project: HBase > Issue Type: Test > Components: integration tests >Reporter: Andrew Purtell > > TestAcidGuarantees and IntegrationTestAcidGuarantees both really only work > with minicluster based testing and do not run for a long duration. Consider a > new integration test that makes similar atomicity checks while running for, > potentially, a very long time, determined by test parameters supplied on the > command line (perhaps as property definitions). The new integration test > should expect to run against a distributed cluster, support specification of > desired monkey policy, and not require any special non-default site > configuration settings. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18139) maven-remote-resources-plugin fails with IndexOutOfBoundsException in hbase-assembly
[ https://issues.apache.org/jira/browse/HBASE-18139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033914#comment-16033914 ] Xiang Li commented on HBASE-18139: -- Hi Sean, thanks for that! I did not find enough time to follow this JIRA yesterday, but will reply to you today or during the weekend > maven-remote-resources-plugin fails with IndexOutOfBoundsException in > hbase-assembly > > > Key: HBASE-18139 > URL: https://issues.apache.org/jira/browse/HBASE-18139 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 1.3.2 >Reporter: Xiang Li >Priority: Blocker > > The same as HBASE-14199. > {code} > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process > (aggregate-licenses) on project hbase-assembly: Error rendering velocity > resource.: Error invoking method 'get(java.lang.Integer)' in > java.util.ArrayList at META-INF/LICENSE.vm[line 1678, column 8]: > InvocationTargetException: Index: 0, Size: 0 -> [Help 1] > {code} > Fail to run mvn install against the latest branch-1 and branch-1.3, with no > additional change. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18126) Increment class
[ https://issues.apache.org/jira/browse/HBASE-18126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033901#comment-16033901 ] Ted Yu commented on HBASE-18126: Debugging why mutate response conversion doesn't work. Here is debug string for the mutate response received: {code} I0601 23:23:41.431021 11620 response-converter.cc:52] FromMutateResponse:result { associated_cell_count: 1 stale: false } {code} > Increment class > --- > > Key: HBASE-18126 > URL: https://issues.apache.org/jira/browse/HBASE-18126 > Project: HBase > Issue Type: Sub-task >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 18126.v6.txt, 18126.v7.txt > > > These Increment objects are used by the Table implementation to perform > increment operation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HBASE-16543) Separate Create/Modify Table operations from open/reopen regions
[ https://issues.apache.org/jira/browse/HBASE-16543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe resolved HBASE-16543. -- Resolution: Fixed > Separate Create/Modify Table operations from open/reopen regions > > > Key: HBASE-16543 > URL: https://issues.apache.org/jira/browse/HBASE-16543 > Project: HBase > Issue Type: Sub-task > Components: master >Affects Versions: 2.0.0 >Reporter: Matteo Bertozzi >Assignee: Matteo Bertozzi > Fix For: 2.0.0 > > > At the moment create table and modify table operations will trigger an > open/reopen of the regions inside the DDL operation. > we should split the operation in two parts > - create table, enable table regions > - modify table, reopen table regions -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (HBASE-16543) Separate Create/Modify Table operations from open/reopen regions
[ https://issues.apache.org/jira/browse/HBASE-16543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe resolved HBASE-16543. -- Resolution: Fixed Fixed by HBASE-14614. > Separate Create/Modify Table operations from open/reopen regions > > > Key: HBASE-16543 > URL: https://issues.apache.org/jira/browse/HBASE-16543 > Project: HBase > Issue Type: Sub-task > Components: master >Affects Versions: 2.0.0 >Reporter: Matteo Bertozzi >Assignee: Matteo Bertozzi > Fix For: 2.0.0 > > > At the moment create table and modify table operations will trigger an > open/reopen of the regions inside the DDL operation. > we should split the operation in two parts > - create table, enable table regions > - modify table, reopen table regions -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Reopened] (HBASE-16543) Separate Create/Modify Table operations from open/reopen regions
[ https://issues.apache.org/jira/browse/HBASE-16543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe reopened HBASE-16543: -- > Separate Create/Modify Table operations from open/reopen regions > > > Key: HBASE-16543 > URL: https://issues.apache.org/jira/browse/HBASE-16543 > Project: HBase > Issue Type: Sub-task > Components: master >Affects Versions: 2.0.0 >Reporter: Matteo Bertozzi >Assignee: Matteo Bertozzi > Fix For: 2.0.0 > > > At the moment create table and modify table operations will trigger an > open/reopen of the regions inside the DDL operation. > we should split the operation in two parts > - create table, enable table regions > - modify table, reopen table regions -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HBASE-16543) Separate Create/Modify Table operations from open/reopen regions
[ https://issues.apache.org/jira/browse/HBASE-16543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe reassigned HBASE-16543: Assignee: Matteo Bertozzi (was: Umesh Agashe) > Separate Create/Modify Table operations from open/reopen regions > > > Key: HBASE-16543 > URL: https://issues.apache.org/jira/browse/HBASE-16543 > Project: HBase > Issue Type: Sub-task > Components: master >Affects Versions: 2.0.0 >Reporter: Matteo Bertozzi >Assignee: Matteo Bertozzi > Fix For: 2.0.0 > > > At the moment create table and modify table operations will trigger an > open/reopen of the regions inside the DDL operation. > we should split the operation in two parts > - create table, enable table regions > - modify table, reopen table regions -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18146) Long running integration test similar to TestAcidGuarantees
[ https://issues.apache.org/jira/browse/HBASE-18146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033884#comment-16033884 ] Jonathan Hsieh commented on HBASE-18146: There is always room for improvement. :) For your initial specific asks, I think this is sufficient. That suggested monkey sounds useful. An alternative approach that would require config as opposed to code would be to flush size on the cluster under test, which would exercise those flush and compact mechanisms more frequently (would affect a few of my other favorites like ITBLL and ITIngest) The fact that this was filed means we should probably add a section to the docs to make it more obvious that this and other tests are runnable from the command line. (some IT tests use the class's static main function to run, others use the ITDriver to execute some canned runs). > Long running integration test similar to TestAcidGuarantees > --- > > Key: HBASE-18146 > URL: https://issues.apache.org/jira/browse/HBASE-18146 > Project: HBase > Issue Type: Test > Components: integration tests >Reporter: Andrew Purtell > > TestAcidGuarantees and IntegrationTestAcidGuarantees both really only work > with minicluster based testing and do not run for a long duration. Consider a > new integration test that makes similar atomicity checks while running for, > potentially, a very long time, determined by test parameters supplied on the > command line (perhaps as property definitions). The new integration test > should expect to run against a distributed cluster, support specification of > desired monkey policy, and not require any special non-default site > configuration settings. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18141) Regionserver fails to shutdown when abort triggered in RegionScannerImpl during RPC call
[ https://issues.apache.org/jira/browse/HBASE-18141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-18141: -- Attachment: HBASE-18141.branch-1.3.001.patch Attaching a patch against branch-1.3 > Regionserver fails to shutdown when abort triggered in RegionScannerImpl > during RPC call > > > Key: HBASE-18141 > URL: https://issues.apache.org/jira/browse/HBASE-18141 > Project: HBase > Issue Type: Bug > Components: regionserver, security >Affects Versions: 1.3.1 >Reporter: Gary Helmling >Assignee: Gary Helmling >Priority: Critical > Fix For: 1.3.2 > > Attachments: HBASE-18141.branch-1.3.001.patch > > > When an abort is triggered within the RPC call path by > HRegion.RegionScannerImpl, AccessController incorrectly applies the RPC > caller identity in the RegionServerObserver.preStopRegionServer() hook. This > leaves the regionserver in a non-responsive state, where its regions are not > reassigned and it returns exceptions for all requests. > When an abort is triggered on the server side, we should not allow a > coprocessor to reject the abort at all. 
> Here is a sample stack trace: > {noformat} > 17/05/25 06:10:29 FATAL regionserver.HRegionServer: RegionServer abort: > loaded coprocessors are: > [org.apache.hadoop.hbase.security.access.AccessController, > org.apache.hadoop.hbase.security.token.TokenProvider] > 17/05/25 06:10:29 WARN regionserver.HRegionServer: The region server did not > stop > org.apache.hadoop.hbase.security.AccessDeniedException: Insufficient > permissions for user 'rpcuser' (global, action=ADMIN) > at > org.apache.hadoop.hbase.security.access.AccessController.requireGlobalPermission(AccessController.java:548) > at > org.apache.hadoop.hbase.security.access.AccessController.requirePermission(AccessController.java:522) > at > org.apache.hadoop.hbase.security.access.AccessController.preStopRegionServer(AccessController.java:2501) > at > org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost$1.call(RegionServerCoprocessorHost.java:86) > at > org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost.execShutdown(RegionServerCoprocessorHost.java:300) > at > org.apache.hadoop.hbase.regionserver.RegionServerCoprocessorHost.preStop(RegionServerCoprocessorHost.java:82) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.stop(HRegionServer.java:1905) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:2118) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:2125) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.abortRegionServer(HRegion.java:6326) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.handleFileNotFound(HRegion.java:6319) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5941) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6084) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5858) > at > 
org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2649) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34950) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2320) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168) > {noformat} > I haven't yet evaluated which other release branches this might apply to. > I have a patch currently in progress, which I will post as soon as I complete > a test case. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18148) automated tests against non-minicluster based deployment
[ https://issues.apache.org/jira/browse/HBASE-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033881#comment-16033881 ] Alex Leblang commented on HBASE-18148: -- Have you tested this recently? I filed this clusterdock issue b/c I was unable to build the images https://github.com/clusterdock/topology_apache_hbase/issues/6 > automated tests against non-minicluster based deployment > > > Key: HBASE-18148 > URL: https://issues.apache.org/jira/browse/HBASE-18148 > Project: HBase > Issue Type: Test > Components: community, integration tests >Reporter: Sean Busbey > Attachments: HBASE-18148.0.patch > > > we should have ITs that run automatically (i.e. nightly) against a cluster > that isn't a minicluster or standalone instance. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18126) Increment class
[ https://issues.apache.org/jira/browse/HBASE-18126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-18126: --- Attachment: 18126.v7.txt > Increment class > --- > > Key: HBASE-18126 > URL: https://issues.apache.org/jira/browse/HBASE-18126 > Project: HBase > Issue Type: Sub-task >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 18126.v6.txt, 18126.v7.txt > > > These Increment objects are used by the Table implementation to perform > increment operation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-16261) MultiHFileOutputFormat Enhancement
[ https://issues.apache.org/jira/browse/HBASE-16261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033815#comment-16033815 ] Hudson commented on HBASE-16261: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3118 (See [https://builds.apache.org/job/HBase-Trunk_matrix/3118/]) HBASE-16261 MultiHFileOutputFormat Enhancement (Yi Liang) (jerryjch: rev c7a7f880dd99a29183e54f0092c10e7a70186d9d) * (add) hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableHFileOutputFormat.java * (delete) hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiHFileOutputFormat.java * (delete) hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiHFileOutputFormat.java * (add) hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableHFileOutputFormat.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java > MultiHFileOutputFormat Enhancement > > > Key: HBASE-16261 > URL: https://issues.apache.org/jira/browse/HBASE-16261 > Project: HBase > Issue Type: Sub-task > Components: hbase, mapreduce >Affects Versions: 2.0.0 >Reporter: Yi Liang >Assignee: Yi Liang > Fix For: 2.0.0 > > Attachments: HBASE-16261-V1.patch, HBASE-16261-V2.patch, > HBASE-16261-V3.patch, HBASE-16261-V4.patch, HBASE-16261-V5.patch, > HBase-16261-V6.patch, HBase-16261-V7.patch, HBase-16261-V8.patch, > HBase-16261-V9.patch > > > Change MultiHFileOutputFormat to MultiTableHFileOutputFormat, Continuing work > to enhance MultiTableHFileOutputFormat to make it more usable: > MultiTableHFileOutputFormat follows HFileOutputFormat2 > (1) HFileOutputFormat2 can read one table's region split keys, and then > output multiple hfiles for one family, with each hfile mapping to one region. We > can add a partitioner in MultiTableHFileOutputFormat to make it support this > feature. 
> (2) HFileOutputFormat2 supports a customized compression algorithm and > BloomFilter per column family, and also supports customized DataBlockEncoding for the > output hfiles. We can also make MultiTableHFileOutputFormat support these > features. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
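Point (1) above — routing each output row to the hfile for the region that will host it — reduces to a binary search of the row key against the table's sorted region start keys. A stdlib-only sketch of that lookup; the class name and the use of String keys are simplifications (the real partitioner works on byte[] keys and is wired through MapReduce as a TotalOrderPartitioner):

```java
import java.util.Collections;
import java.util.List;

// Illustrative sketch of the partitioning idea: given a table's sorted
// region start keys, find the index of the region whose key range
// contains a row. Plain Strings stand in for byte[] row keys.
public class RegionIndexLookup {
    private final List<String> startKeys;  // sorted; first entry is the empty start key ""

    public RegionIndexLookup(List<String> sortedStartKeys) {
        this.startKeys = sortedStartKeys;
    }

    /** Index of the region whose [startKey, nextStartKey) range holds the row. */
    public int regionFor(String rowKey) {
        int idx = Collections.binarySearch(startKeys, rowKey);
        // Exact match: the row sits on a region boundary and belongs to that region.
        // No match: binarySearch returns -(insertionPoint) - 1, and the owning
        // region is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }
}
```

A per-table instance of such a lookup is essentially what a MultiTableHFileOutputFormat partitioner would consult before choosing the output writer for a cell.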
[jira] [Commented] (HBASE-18143) [AMv2] Backoff on failed report of region transition quickly goes to astronomical time scale
[ https://issues.apache.org/jira/browse/HBASE-18143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033816#comment-16033816 ] Hudson commented on HBASE-18143: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3118 (See [https://builds.apache.org/job/HBase-Trunk_matrix/3118/]) HBASE-18143 [AMv2] Backoff on failed report of region transition quickly (stack: rev e1f3c89b3be10f0b52ea216e12dfe1fdad564ee8) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java > [AMv2] Backoff on failed report of region transition quickly goes to > astronomical time scale > > > Key: HBASE-18143 > URL: https://issues.apache.org/jira/browse/HBASE-18143 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 2.0.0 >Reporter: stack >Assignee: stack >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-18143.master.001.patch, > HBASE-18143.master.002.patch, HBASE-18143.master.002.patch > > > Testing on cluster w/ aggressive killing, if Master is killed serially a few > times such that is offline a good while, regionservers that want to report a > region transition pause too long between retries. > Here is the regionserver reporting failures: > {code} > 1 2017-05-31 20:50:53,840 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#0) after 1008ms delay (Master is coming online...). 
> 2 2017-05-31 20:50:54,853 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#1) after 2026ms delay (Master is coming online...). > 3 2017-05-31 20:50:56,886 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#2) after 6084ms delay (Master is coming online...). > 4 2017-05-31 20:51:02,976 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#3) after 30588ms delay (Master is coming online...). 
> 5 2017-05-31 20:51:33,570 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#4) after 308422ms delay (Master is coming online...). > 6 2017-05-31 20:56:41,997 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234"
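The delays in the log above jump from roughly 1s to roughly 308s within five retries because the retry count feeds an uncapped exponential. The usual shape of a fix is to cap the computed delay (optionally with jitter); a generic stdlib-only sketch, with illustrative constants rather than HBase's actual values:

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative capped exponential backoff: the delay doubles per retry
// but never exceeds maxDelayMs, with up to 10% jitter to avoid
// synchronized retry storms against a recovering master.
public final class CappedBackoff {
    private final long baseDelayMs;
    private final long maxDelayMs;

    public CappedBackoff(long baseDelayMs, long maxDelayMs) {
        this.baseDelayMs = baseDelayMs;
        this.maxDelayMs = maxDelayMs;
    }

    public long delayFor(int retry) {
        // Guard the shift: beyond 31 doublings the raw delay is past any sane cap.
        long exp = retry >= 32 ? maxDelayMs : baseDelayMs << retry;
        long capped = Math.min(exp, maxDelayMs);
        // Add up to 10% jitter, still respecting the cap.
        long jitter = ThreadLocalRandom.current().nextLong(capped / 10 + 1);
        return Math.min(capped + jitter, maxDelayMs);
    }
}
```

With a cap of, say, 60s, retry #4 would wait about a minute instead of the five-plus minutes seen in the log.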
[jira] [Commented] (HBASE-18109) Assign system tables first (priority)
[ https://issues.apache.org/jira/browse/HBASE-18109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033719#comment-16033719 ] Yi Liang commented on HBASE-18109: -- Hi Stack, I have spent some time studying Procedure V2 and the new AM, and I am also interested in this jira; could I take it? > Assign system tables first (priority) > - > > Key: HBASE-18109 > URL: https://issues.apache.org/jira/browse/HBASE-18109 > Project: HBase > Issue Type: Sub-task > Components: Region Assignment >Affects Versions: 2.0.0 >Reporter: stack >Priority: Critical > Fix For: 2.0.0 > > > Need this for stuff like the RSGroup table, etc. Assign these ahead of > user-space regions. > From 'Handle sys table assignment first (e.g. acl, namespace, rsgroup); > currently only hbase:meta is first.' of > https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.oefcyphs0v0x -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17339) Scan-Memory-First Optimization for Get Operations
[ https://issues.apache.org/jira/browse/HBASE-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033707#comment-16033707 ] Eshcar Hillel commented on HBASE-17339: --- After some time away from this Jira, and some additional experiments and digging into the code, here is our current understanding: HBase already implements some optimization which makes the current suggestion less critical. I will try to explain it in a nutshell. As mentioned, a get operation is divided into two main steps: (1) creating and filtering all HFile scanners and memory scanners, (2) applying the next operation which retrieves the result for the operation. HBase defers the seek operation of the scanners as much as possible. In step (1) all scanners are combined in a key-value heap which is sorted by the top key of all scanners. However, if there is more than one scanner, the HFile scanners do not apply a real seek. Instead they set the current cell to be a fake cell which simulates as if a seek to the key was done. In cases where the key can be found both in memory and on disk, the memory segments have higher timestamps, so they reside at the top of the heap. Finally, in step (2) the store scanner gets the result from the scanners heap. It starts querying the scanners at the top. Only at this point, if an HFile scanner is polled from the heap and no real seek was done, does HBase seek the key in the file. This seek might find the blocks in the cache, or it retrieves them from disk. In addition, in step (1) filtering HFile scanners requires reading HFile metadata and bloom filters -- in most cases these can be found in cache. The optimization implemented in this Jira takes a different approach by trying to only look in memory segments as a first step. 
When the data is found in memory this indeed reduces latency, since it avoids the need to read HFile metadata and bloom filters and to manage a bigger scanner heap; but when the data is only on disk it incurs the overhead of scanning the data twice (memory only, then the full scan). The question is, given this understanding, is there a point in having the new optimization, or are we satisfied with the current one? Is there a known scenario where not all bloom filters and metadata blocks are found in the cache? > Scan-Memory-First Optimization for Get Operations > - > > Key: HBASE-17339 > URL: https://issues.apache.org/jira/browse/HBASE-17339 > Project: HBase > Issue Type: Improvement >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Attachments: HBASE-17339-V01.patch, HBASE-17339-V02.patch, > HBASE-17339-V03.patch, HBASE-17339-V03.patch, HBASE-17339-V04.patch, > HBASE-17339-V05.patch, HBASE-17339-V06.patch, read-latency-mixed-workload.jpg > > > The current implementation of a get operation (to retrieve values for a > specific key) scans through all relevant stores of the region; for each store > both memory components (memstore segments) and disk components (hfiles) are > scanned in parallel. > We suggest to apply an optimization that speculatively scans memory-only > components first and only if the result is incomplete scans both memory and > disk. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
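The trade-off under discussion can be modeled with two plain maps standing in for memstore segments and HFiles. This is an illustrative sketch of the speculative strategy only — the real StoreScanner must additionally prove the in-memory answer is complete and newest before it may skip disk:

```java
import java.util.Map;
import java.util.Optional;

// Illustrative memory-first get: consult the in-memory store first and
// fall back to the (expensive) on-disk store only when memory cannot
// answer the read. Maps stand in for memstore segments and HFiles.
public class MemoryFirstStore {
    private final Map<String, String> memstore;
    private final Map<String, String> hfiles;
    int diskReads = 0;  // visible for testing: counts fallback scans

    public MemoryFirstStore(Map<String, String> memstore, Map<String, String> hfiles) {
        this.memstore = memstore;
        this.hfiles = hfiles;
    }

    public Optional<String> get(String key) {
        // Step 1: speculative memory-only lookup -- no HFile metadata,
        // no bloom filters, no big scanner heap to manage.
        String v = memstore.get(key);
        if (v != null) {
            return Optional.of(v);
        }
        // Step 2: result incomplete, pay for the full scan. This is the
        // "scan twice" overhead when the data lives only on disk.
        diskReads++;
        return Optional.ofNullable(hfiles.get(key));
    }
}
```

The question raised in the comment maps directly onto `diskReads`: the optimization only pays off when most gets resolve in step 1 and the step-1 miss penalty stays small.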
[jira] [Commented] (HBASE-18005) read replica: handle the case that region server hosting both primary replica and meta region is down
[ https://issues.apache.org/jira/browse/HBASE-18005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033681#comment-16033681 ] Hadoop QA commented on HBASE-18005: --- (/) *+1 overall*
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 32s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 30s | master passed |
| +1 | compile | 0m 56s | master passed |
| +1 | checkstyle | 0m 49s | master passed |
| +1 | mvneclipse | 0m 25s | master passed |
| +1 | findbugs | 2m 51s | master passed |
| +1 | javadoc | 0m 46s | master passed |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 5s | the patch passed |
| +1 | compile | 0m 56s | the patch passed |
| +1 | javac | 0m 56s | the patch passed |
| +1 | checkstyle | 0m 46s | the patch passed |
| +1 | mvneclipse | 0m 23s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 28m 44s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 2m 56s | the patch passed |
| +1 | javadoc | 0m 43s | the patch passed |
| +1 | unit | 2m 21s | hbase-client in the patch passed. |
| +1 | unit | 108m 31s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 41s | The patch does not generate ASF License warnings. |
| | | 158m 4s | |
|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870835/HBASE-18005-master-006.patch |
| JIRA Issue | HBASE-18005 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 0d2301121a13 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / c7a7f88 |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7041/testReport/ |
| modules | C: hbase-client hbase-server U: . |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7041/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |
This message was automatically generated. > read replica: handle the case that region
[jira] [Commented] (HBASE-18126) Increment class
[ https://issues.apache.org/jira/browse/HBASE-18126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033674#comment-16033674 ] Enis Soztutar commented on HBASE-18126: --- Looks good overall. A couple of comments: - This should be named {{ToInt64()}} and should have a return value of int64_t. {{long}} has no guarantees on the number of bytes that represent the value, while int64_t is exactly 64 bits. Since we are interoperating with Java, whose ints and longs are 32 and 64 bits respectively, we should always use {{int32_t}} or {{int64_t}}, and never int / long for these kinds of APIs. See the patch for HBASE-17220 for examples. {code} +long BytesUtil::ToLong(std::string str) {code} also change the type for the variable {{l}}. - You have to check the length before accessing things like this: {code} + l ^= bytes[i]; {code} I was checking the cost of std::string::c_str(); it seems most implementations just return the internal pointer, so using it should be fine. - You need to do this reverse because of endianness. I think this code should work with both little and big endian machines: +std::reverse(res.begin(), res.end()); Instead of reversing at the end, let's do two loops after detecting the endianness of the machine. There may be a more optimal way using reinterpret_cast or something, but these can be optimized later. - In the delete patch the corresponding method was named {{DeleteToMutateRequest}}, but here it is {{RequestConverter::ToIncrementRequest}}. Let's stick to one naming scheme (either change the name of the Delete method, or change this name). - Increment should return a {{std::shared_ptr}} of the incremented value. 
{code} +folly::Future RawAsyncTable::Increment(const hbase::Increment& incr) { {code} > Increment class > --- > > Key: HBASE-18126 > URL: https://issues.apache.org/jira/browse/HBASE-18126 > Project: HBase > Issue Type: Sub-task >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 18126.v6.txt > > > These Increment objects are used by the Table implementation to perform > increment operation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
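For reference, HBase serializes longs big-endian on the Java side, which is what the C++ {{ToInt64()}} has to match. A minimal Java sketch of those semantics (hypothetical helper, not the actual {{org.apache.hadoop.hbase.util.Bytes}} code) — the length check guards the {{l ^= bytes[i]}} access, and iterating most-significant byte first makes any reverse unnecessary:

```java
public class BytesUtilSketch {
    // Decode 8 big-endian bytes into a signed 64-bit value.
    static long toInt64(byte[] bytes) {
        if (bytes == null || bytes.length < 8) {
            throw new IllegalArgumentException("need at least 8 bytes, got "
                + (bytes == null ? 0 : bytes.length));
        }
        long l = 0;
        for (int i = 0; i < 8; i++) {
            l <<= 8;
            l ^= bytes[i] & 0xFF; // mask: avoid sign-extending negative bytes
        }
        return l;
    }

    public static void main(String[] args) {
        byte[] b = {0, 0, 0, 0, 0, 0, 0, 42};
        System.out.println(toInt64(b)); // 42
    }
}
```

The {{& 0xFF}} mask matters: without it, a negative byte sign-extends and corrupts the already-accumulated high bits.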
[jira] [Commented] (HBASE-18146) Long running integration test similar to TestAcidGuarantees
[ https://issues.apache.org/jira/browse/HBASE-18146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033616#comment-16033616 ] Andrew Purtell commented on HBASE-18146: Looking at the code, IntegrationTestAcidGuarantees wants to apply special configuration to the minicluster and then run three relatively short canned scenarios. It's good to know someone is using it on real clusters and with chaos. Do you think it is sufficient, or could there be improvement? I'm pretty sure there is room for improvement for testing against real clusters. One thing I was thinking of regarding a new IT is a new monkey policy that flushes and compacts very frequently. > Long running integration test similar to TestAcidGuarantees > --- > > Key: HBASE-18146 > URL: https://issues.apache.org/jira/browse/HBASE-18146 > Project: HBase > Issue Type: Test > Components: integration tests >Reporter: Andrew Purtell > > TestAcidGuarantees and IntegrationTestAcidGuarantees both really only work > with minicluster based testing and do not run for a long duration. Consider a > new integration test that makes similar atomicity checks while running for, > potentially, a very long time, determined by test parameters supplied on the > command line (perhaps as property definitions). The new integration test > should expect to run against a distributed cluster, support specification of > desired monkey policy, and not require any special non-default site > configuration settings. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18143) [AMv2] Backoff on failed report of region transition quickly goes to astronomical time scale
[ https://issues.apache.org/jira/browse/HBASE-18143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-18143: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed to master branch. Thanks for the review [~uagashe] > [AMv2] Backoff on failed report of region transition quickly goes to > astronomical time scale > > > Key: HBASE-18143 > URL: https://issues.apache.org/jira/browse/HBASE-18143 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 2.0.0 >Reporter: stack >Assignee: stack >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-18143.master.001.patch, > HBASE-18143.master.002.patch, HBASE-18143.master.002.patch > > > Testing on cluster w/ aggressive killing, if Master is killed serially a few > times such that it is offline for a good while, regionservers that want to report a > region transition pause too long between retries. > Here is the regionserver reporting failures: > {code} > 1 2017-05-31 20:50:53,840 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#0) after 1008ms delay (Master is coming online...). 
> 2 2017-05-31 20:50:54,853 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#1) after 2026ms delay (Master is coming online...). > 3 2017-05-31 20:50:56,886 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#2) after 6084ms delay (Master is coming online...). > 4 2017-05-31 20:51:02,976 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#3) after 30588ms delay (Master is coming online...). 
> 5 2017-05-31 20:51:33,570 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#4) after 308422ms delay (Master is coming online...). > 6 2017-05-31 20:56:41,997 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#5) after 6171203ms delay (Master is coming online...). > {code} > See how by the time we get to the 5th retry, we are waiting 100 minutes > before we'll retry. That is
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033563#comment-16033563 ] Mikhail Antonov commented on HBASE-18145: - +1, nice catch. Yep that needs to go to 1.3.2. > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18145.v0.patch > > > After HBASE-17887, the store scanner closes the memstore scanner when updating > the inner scanners. The chunk which stores the current data may be reclaimed. > So if the chunk is rewritten before we send the data to the client, the client > will receive corrupted data. > This issue also breaks the TestAcid*. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
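A hedged sketch of the delayed-close idea discussed for this fix (names like {{scannersForDelayedClose}} are illustrative, not the actual StoreScanner fields): instead of closing memstore scanners as soon as the store scanner switches to the flushed files, park them and close them only after the current call has shipped its data, so reclaimed chunks cannot be rewritten under a reader that still references them.

```java
import java.util.ArrayList;
import java.util.List;

public class DelayedCloseSketch {
    interface KeyValueScanner { void close(); }

    // Scanners whose close must wait until the in-flight read has shipped.
    final List<KeyValueScanner> scannersForDelayedClose = new ArrayList<>();

    void updateReaders(List<KeyValueScanner> memStoreScannersAfterFlush) {
        // Defer, don't close: an outstanding read may still point into the
        // memstore chunks behind these scanners.
        scannersForDelayedClose.addAll(memStoreScannersAfterFlush);
    }

    // Called once the results of the current call have been shipped to the client.
    void clearAndClose() {
        for (KeyValueScanner s : scannersForDelayedClose) {
            s.close();
        }
        scannersForDelayedClose.clear();
    }
}
```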
[jira] [Commented] (HBASE-18146) Long running integration test similar to TestAcidGuarantees
[ https://issues.apache.org/jira/browse/HBASE-18146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033556#comment-16033556 ] Jonathan Hsieh commented on HBASE-18146: IntegrationTestAcidGuarantees works against real clusters for a configurable amount of time. See: https://github.com/apache/hbase/blob/master/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestAcidGuarantees.java#L113 This can be done via the command line like so: hbase org.apache.hadoop.hbase.IntegrationTestAcidGuarantees -DnumWriters=5 -DnumGetters=1 -DnumScanners=1 -Dmillis=60 -m calm In some internal setups, we've been running it against multiple branches for 10 minute runs with each monkey enabled against various configuration settings (e.g. all writes through multiwal, through mob write path, under kerberos security etc). > Long running integration test similar to TestAcidGuarantees > --- > > Key: HBASE-18146 > URL: https://issues.apache.org/jira/browse/HBASE-18146 > Project: HBase > Issue Type: Test > Components: integration tests >Reporter: Andrew Purtell > > TestAcidGuarantees and IntegrationTestAcidGuarantees both really only work > with minicluster based testing and do not run for a long duration. Consider a > new integration test that makes similar atomicity checks while running for, > potentially, a very long time, determined by test parameters supplied on the > command line (perhaps as property definitions). The new integration test > should expect to run against a distributed cluster, support specification of > desired monkey policy, and not require any special non-default site > configuration settings. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HBASE-16543) Separate Create/Modify Table operations from open/reopen regions
[ https://issues.apache.org/jira/browse/HBASE-16543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe reassigned HBASE-16543: Assignee: Umesh Agashe > Separate Create/Modify Table operations from open/reopen regions > > > Key: HBASE-16543 > URL: https://issues.apache.org/jira/browse/HBASE-16543 > Project: HBase > Issue Type: Sub-task > Components: master >Affects Versions: 2.0.0 >Reporter: Matteo Bertozzi >Assignee: Umesh Agashe > Fix For: 2.0.0 > > > At the moment, create table and modify table operations will trigger an > open/reopen of the regions inside the DDL operation. > We should split the operation into two parts: > - create table, enable table regions > - modify table, reopen table regions -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HBASE-18104) [AMv2] Enable aggregation of RPCs (assigns/unassigns, etc.)
[ https://issues.apache.org/jira/browse/HBASE-18104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe reassigned HBASE-18104: Assignee: Umesh Agashe > [AMv2] Enable aggregation of RPCs (assigns/unassigns, etc.) > --- > > Key: HBASE-18104 > URL: https://issues.apache.org/jira/browse/HBASE-18104 > Project: HBase > Issue Type: Sub-task > Components: Region Assignment >Affects Versions: 2.0.0 >Reporter: stack >Assignee: Umesh Agashe > Fix For: 2.0.0 > > > Machinery is in place to coalesce AMv2 RPCs (Assigns, Unassigns). It needs > enabling and verification. From '6.3 We don’t do the aggregating of Assigns' > of > https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.uuwvci2r2tz4 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18054) log when we add/remove failed servers in client
[ https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033491#comment-16033491 ] Hadoop QA commented on HBASE-18054: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 26s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 13s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 46s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} 
| {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 26m 3s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 32s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 7s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 36m 42s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.13.1 Server=1.13.1 Image:yetus/hbase:757bf37 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870834/HBASE-18054.v3.master.patch | | JIRA Issue | HBASE-18054 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux d61e6c9f9d87 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / c7a7f88 | | Default Java | 1.8.0_131 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7042/testReport/ | | modules | C: hbase-client U: hbase-client | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7042/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > log when we add/remove failed servers in client > --- > > Key: HBASE-18054 > URL: https://issues.apache.org/jira/browse/HBASE-18054 > Project: HBase > Issue Type: Bug > Components: Client, Operability >Affects Versions: 1.3.0 >Reporter: Sean Busbey >Assignee: Ali > Attachments: HBASE-18054.patch, HBASE-18054.v2.master.patch, >
[jira] [Commented] (HBASE-18143) [AMv2] Backoff on failed report of region transition quickly goes to astronomical time scale
[ https://issues.apache.org/jira/browse/HBASE-18143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033437#comment-16033437 ] Umesh Agashe commented on HBASE-18143: -- +1 lgtm > [AMv2] Backoff on failed report of region transition quickly goes to > astronomical time scale > > > Key: HBASE-18143 > URL: https://issues.apache.org/jira/browse/HBASE-18143 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Affects Versions: 2.0.0 >Reporter: stack >Assignee: stack >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-18143.master.001.patch, > HBASE-18143.master.002.patch, HBASE-18143.master.002.patch > > > Testing on cluster w/ aggressive killing, if Master is killed serially a few > times such that it is offline for a good while, regionservers that want to report a > region transition pause too long between retries. > Here is the regionserver reporting failures: > {code} > 1 2017-05-31 20:50:53,840 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#0) after 1008ms delay (Master is coming online...). 
> 2 2017-05-31 20:50:54,853 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#1) after 2026ms delay (Master is coming online...). > 3 2017-05-31 20:50:56,886 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#2) after 6084ms delay (Master is coming online...). > 4 2017-05-31 20:51:02,976 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#3) after 30588ms delay (Master is coming online...). 
> 5 2017-05-31 20:51:33,570 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#4) after 308422ms delay (Master is coming online...). > 6 2017-05-31 20:56:41,997 INFO [RS_CLOSE_REGION-ve0542:16020-2] > regionserver.HRegionServer: Failed report of region transition server { > host_name: "ve0542.halxg.cloudera.com" port: 16020 start_code: 1496279470954 > } transition { transition_code: CLOSED region_info { region_id: 1496284931226 > table_name { namespace: "default" qualifier: > "IntegrationTestBigLinkedList" } start_key: > "\337\377\377\377\377\377\377\362" end_key: > "\352\252\252\252\252\252\252\234" offline: false split: false replica_id: 0 > } }; retry (#5) after 6171203ms delay (Master is coming online...). > {code} > See how by the time we get to the 5th retry, we are waiting 100 minutes > before we'll retry. That is too long. Make retry happen more frequently. Data > is offline until the close is successfully
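In the log above, each retry's pause grows multiplicatively with no ceiling, reaching roughly 103 minutes by retry #5. A minimal sketch of one standard remedy, clamping the exponential pause to a maximum (constants are illustrative, not the values the patch uses):

```java
public class BackoffSketch {
    static final long INIT_PAUSE_MS = 1_000;
    static final long MAX_PAUSE_MS = 60_000; // ceiling: retries never go "astronomical"

    // Exponential backoff with an upper bound: INIT_PAUSE_MS * 2^retryCount, clamped.
    static long getBackoff(int retryCount) {
        long pause = INIT_PAUSE_MS << Math.min(retryCount, 30); // bound the shift too
        return Math.min(pause, MAX_PAUSE_MS);
    }

    public static void main(String[] args) {
        for (int i = 0; i <= 6; i++) {
            System.out.println("retry #" + i + " -> " + getBackoff(i) + "ms");
        }
    }
}
```

Capping both the shift amount and the resulting pause keeps the worst-case retry interval at a fixed, operator-predictable bound.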
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033434#comment-16033434 ] Hadoop QA commented on HBASE-18145: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 23s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 23s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 12s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | 
{color:green} 1m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 57m 28s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 204m 34s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 288m 37s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.client.TestMobSnapshotCloneIndependence | | | hadoop.hbase.client.TestBlockEvictionFromClient | | Timed out junit tests | org.apache.hadoop.hbase.master.TestAssignmentManagerMetrics | | | org.apache.hadoop.hbase.master.TestMasterFailoverBalancerPersistence | | | org.apache.hadoop.hbase.master.TestMaster | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:757bf37 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12870788/HBASE-18145.v0.patch | | JIRA Issue | HBASE-18145 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 0073526ed18a 4.8.3-std-1 #1 SMP Fri Oct 21 11:15:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 123086e | | Default Java | 1.8.0_131 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7037/artifact/patchprocess/patch-unit-hbase-server.txt | | unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/7037/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7037/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output |
[jira] [Updated] (HBASE-18005) read replica: handle the case that region server hosting both primary replica and meta region is down
[ https://issues.apache.org/jira/browse/HBASE-18005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huaxiang sun updated HBASE-18005: - Attachment: HBASE-18005-master-006.patch Update the comments for the new test case. > read replica: handle the case that region server hosting both primary replica > and meta region is down > - > > Key: HBASE-18005 > URL: https://issues.apache.org/jira/browse/HBASE-18005 > Project: HBase > Issue Type: Bug >Reporter: huaxiang sun >Assignee: huaxiang sun > Attachments: HBASE-18005-master-001.patch, > HBASE-18005-master-002.patch, HBASE-18005-master-003.patch, > HBASE-18005-master-004.patch, HBASE-18005-master-005.patch, > HBASE-18005-master-006.patch > > > Identified one corner case in testing: when the region server hosting > both the primary replica and the meta region is down and the client tries to reload > the primary replica location from the meta table, it is supposed to clean up only > the cached location for the specific replicaId, but it clears the caches for all > replicas. Please see > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L813 > Since it takes some time for regions to be reassigned (including the meta > region), the following may throw an exception > https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L173 > This exception needs to be caught, and it needs to get the cached location (in > this case, the primary replica's location is not available). If there are > cached locations for other replicas, it can still go ahead and get stale > values from the secondary replicas. > With meta replicas, it still helps not to clean up the caches for all replicas, > as the info from the primary meta replica is up-to-date. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
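A hedged sketch of the intended cache behavior (simplified types and names; the real code lives in {{ConnectionImplementation}}): evict only the failed replica's cached location, keyed by replicaId, so stale reads can still be served from the other replicas' cached locations.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ReplicaCacheSketch {
    // Simplified stand-in for the meta cache, keyed by (startKey, replicaId).
    final Map<String, String> cache = new ConcurrentHashMap<>();

    String key(String startKey, int replicaId) {
        return startKey + "#" + replicaId;
    }

    void cacheLocation(String startKey, int replicaId, String server) {
        cache.put(key(startKey, replicaId), server);
    }

    // Buggy behavior: drop every replica's location for the region.
    void clearAllReplicas(String startKey, int replicaCount) {
        for (int r = 0; r < replicaCount; r++) {
            cache.remove(key(startKey, r));
        }
    }

    // Intended behavior: drop only the replica that failed, so reads can
    // still fall back to the (possibly stale) secondary replicas' locations.
    void clearReplica(String startKey, int replicaId) {
        cache.remove(key(startKey, replicaId));
    }
}
```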
[jira] [Commented] (HBASE-18054) log when we add/remove failed servers in client
[ https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033420#comment-16033420 ] Ali commented on HBASE-18054: - [~apurtell] ASF license header added in HBASE-18054.v3.master.patch > log when we add/remove failed servers in client > --- > > Key: HBASE-18054 > URL: https://issues.apache.org/jira/browse/HBASE-18054 > Project: HBase > Issue Type: Bug > Components: Client, Operability >Affects Versions: 1.3.0 >Reporter: Sean Busbey >Assignee: Ali > Attachments: HBASE-18054.patch, HBASE-18054.v2.master.patch, > HBASE-18054.v3.master.patch > > > Currently we log if a server is in the failed server list when we go to > connect to it, but we don't log anything about when the server got into the > list. > This means we have to search the log for errors involving the same server > name that (hopefully) managed to get into the log within > {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
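A hedged sketch of what the logging in this patch is after (illustrative names, not the actual {{FailedServers}} class): record when a server enters the failed-server list and when it expires out, so both events can be correlated in the log within the {{FAILED_SERVER_EXPIRY_KEY}} window (default 2 seconds).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FailedServersSketch {
    static final long FAILED_SERVER_EXPIRY_MS = 2000; // default 2 seconds

    // server address -> time (ms) at which the entry expires
    final Map<String, Long> failed = new ConcurrentHashMap<>();

    void addToFailedServers(String server, long nowMs) {
        failed.put(server, nowMs + FAILED_SERVER_EXPIRY_MS);
        System.out.println("Added " + server + " to the failed-server list");
    }

    boolean isFailedServer(String server, long nowMs) {
        Long expiry = failed.get(server);
        if (expiry == null) {
            return false;
        }
        if (nowMs >= expiry) {
            failed.remove(server);
            System.out.println("Removed expired " + server + " from the failed-server list");
            return false;
        }
        return true;
    }
}
```

Logging both the add and the expiry-removal means an operator no longer has to hunt for an earlier connection error against the same server name to explain a "server is in the failed list" message.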
[jira] [Commented] (HBASE-18148) automated tests against non-minicluster based deployment
[ https://issues.apache.org/jira/browse/HBASE-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033421#comment-16033421 ] Sean Busbey commented on HBASE-18148: - sure. short term: * DOCKER_REGISTRY_URL ** hbasejenkinsuser-docker-hbase.bintray.io * CLUSTERDOCK_DOCKER_REGISTRY_USERNAME ** hbasejenkinsuser * DOCKER_REGISTRY_NAMESPACE ** dev > automated tests against non-minicluster based deployment > > > Key: HBASE-18148 > URL: https://issues.apache.org/jira/browse/HBASE-18148 > Project: HBase > Issue Type: Test > Components: community, integration tests >Reporter: Sean Busbey > Attachments: HBASE-18148.0.patch > > > we should have ITs that run automatically (i.e. nightly) against a cluster > that isn't a minicluster or standalone instance. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18148) automated tests against non-minicluster based deployment
[ https://issues.apache.org/jira/browse/HBASE-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033416#comment-16033416 ] Alex Leblang commented on HBASE-18148: -- Could you include the default values for the parameters in the script? This would make it easier for reviewers without jenkins access to follow the script. > automated tests against non-minicluster based deployment > > > Key: HBASE-18148 > URL: https://issues.apache.org/jira/browse/HBASE-18148 > Project: HBase > Issue Type: Test > Components: community, integration tests >Reporter: Sean Busbey > Attachments: HBASE-18148.0.patch > > > we should have ITs that run automatically (i.e. nightly) against a cluster > that isn't a minicluster or standalone instance. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18054) log when we add/remove failed servers in client
[ https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ali updated HBASE-18054: Status: Patch Available (was: Open) > log when we add/remove failed servers in client > --- > > Key: HBASE-18054 > URL: https://issues.apache.org/jira/browse/HBASE-18054 > Project: HBase > Issue Type: Bug > Components: Client, Operability >Affects Versions: 1.3.0 >Reporter: Sean Busbey >Assignee: Ali > Attachments: HBASE-18054.patch, HBASE-18054.v2.master.patch, > HBASE-18054.v3.master.patch > > > Currently we log if a server is in the failed server list when we go to > connect to it, but we don't log anything about when the server got into the > list. > This means we have to search the log for errors involving the same server > name that (hopefully) managed to get into the log within > {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18054) log when we add/remove failed servers in client
[ https://issues.apache.org/jira/browse/HBASE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ali updated HBASE-18054: Attachment: HBASE-18054.v3.master.patch > log when we add/remove failed servers in client > --- > > Key: HBASE-18054 > URL: https://issues.apache.org/jira/browse/HBASE-18054 > Project: HBase > Issue Type: Bug > Components: Client, Operability >Affects Versions: 1.3.0 >Reporter: Sean Busbey >Assignee: Ali > Attachments: HBASE-18054.patch, HBASE-18054.v2.master.patch, > HBASE-18054.v3.master.patch > > > Currently we log if a server is in the failed server list when we go to > connect to it, but we don't log anything about when the server got into the > list. > This means we have to search the log for errors involving the same server > name that (hopefully) managed to get into the log within > {{FAILED_SERVER_EXPIRY_KEY}} milliseconds earlier (default 2 seconds). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
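The patch under review here adds logging to HBase's internal failed-server tracking. As a rough illustration of the behavior the ticket asks for (logging the addition as well as the expiry-based removal, so operators can correlate "server is in failed list" messages with how it got there), here is a minimal hypothetical sketch. The class and method names below are illustrative, not the actual HBase `FailedServers` implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (not the actual HBase code): a failed-server list with
// an expiry window, logging on both add and expiry-based removal.
public class FailedServerList {
    private final long expiryMillis;              // cf. FAILED_SERVER_EXPIRY_KEY, default 2000 ms
    private final Map<String, Long> failed = new HashMap<>();

    public FailedServerList(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    public void addToFailedServers(String server, Throwable cause, long now) {
        failed.put(server, now + expiryMillis);
        // The point of HBASE-18054: log the *addition*, not just later lookups.
        System.out.println("Added failed server " + server + " to list, cause: " + cause);
    }

    public boolean isFailedServer(String server, long now) {
        Long expiry = failed.get(server);
        if (expiry == null) {
            return false;
        }
        if (now >= expiry) {
            failed.remove(server);
            System.out.println("Removed expired failed server " + server);
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        FailedServerList list = new FailedServerList(2000);
        list.addToFailedServers("rs1:16020", new java.io.IOException("connection refused"), 0);
        System.out.println(list.isFailedServer("rs1:16020", 1000));  // within expiry window
        System.out.println(list.isFailedServer("rs1:16020", 2500));  // expired, removed with a log line
    }
}
```

With both log lines in place, a grep for the server name shows when it entered the list, instead of having to search for an earlier error within the expiry window.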
[jira] [Assigned] (HBASE-18103) [AMv2] If Master gives OPEN to another, if original eventually succeeds, Master will kill it
[ https://issues.apache.org/jira/browse/HBASE-18103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe reassigned HBASE-18103: Assignee: Umesh Agashe > [AMv2] If Master gives OPEN to another, if original eventually succeeds, > Master will kill it > > > Key: HBASE-18103 > URL: https://issues.apache.org/jira/browse/HBASE-18103 > Project: HBase > Issue Type: Sub-task > Components: master, proc-v2 >Reporter: stack >Assignee: Umesh Agashe >Priority: Critical > Fix For: 2.0.0 > > > If a RS is slow to open a Region, the Master will give the Region to another > to open it (In this case, was a massive set of edits to process and a load of > StoreFiles to open...). Should the original RS succeed with its open > eventually, on reporting the master the successful open, the Master currently > kills the RS because the region is supposed to be elsewhere. > This is an easy fix. > The RS does not fully open a Region until Master gives it the go so just > close the region if master rejects the open > See '6.1.1 If Master gives Region to another to Open, old RS will be kill > itself on reject by Master; easy fix!' in > https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.qtfojp9774h -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18148) automated tests against non-minicluster based deployment
[ https://issues.apache.org/jira/browse/HBASE-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033400#comment-16033400 ] Sean Busbey commented on HBASE-18148: - these were also aug/sep 2016 I think. > automated tests against non-minicluster based deployment > > > Key: HBASE-18148 > URL: https://issues.apache.org/jira/browse/HBASE-18148 > Project: HBase > Issue Type: Test > Components: community, integration tests >Reporter: Sean Busbey > Attachments: HBASE-18148.0.patch > > > we should have ITs that run automatically (i.e. nightly) against a cluster > that isn't a minicluster or standalone instance. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18148) automated tests against non-minicluster based deployment
[ https://issues.apache.org/jira/browse/HBASE-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-18148: Attachment: HBASE-18148.0.patch attaching WIP with contents of the two jobs from that last time we tried this with clusterdock. these represent: * [job that builds topology images|https://builds.apache.org/view/H-L/view/HBase/job/HBase-Build-clusterdock-Clusters/] * [job that runs ITBLL against the topology for master HEAD|https://builds.apache.org/view/H-L/view/HBase/job/HBase-master-IntegrationTestBigLinkedList/] > automated tests against non-minicluster based deployment > > > Key: HBASE-18148 > URL: https://issues.apache.org/jira/browse/HBASE-18148 > Project: HBase > Issue Type: Test > Components: community, integration tests >Reporter: Sean Busbey > Attachments: HBASE-18148.0.patch > > > we should have ITs that run automatically (i.e. nightly) against a cluster > that isn't a minicluster or standalone instance. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18148) automated tests against non-minicluster based deployment
[ https://issues.apache.org/jira/browse/HBASE-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033395#comment-16033395 ] Sean Busbey commented on HBASE-18148: - one way to do this that we started on in the past was to use [clusterdock|https://github.com/clusterdock/]. > automated tests against non-minicluster based deployment > > > Key: HBASE-18148 > URL: https://issues.apache.org/jira/browse/HBASE-18148 > Project: HBase > Issue Type: Test > Components: community, integration tests >Reporter: Sean Busbey > > we should have ITs that run automatically (i.e. nightly) against a cluster > that isn't a minicluster or standalone instance. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18148) automated tests against non-minicluster based deployment
Sean Busbey created HBASE-18148: --- Summary: automated tests against non-minicluster based deployment Key: HBASE-18148 URL: https://issues.apache.org/jira/browse/HBASE-18148 Project: HBase Issue Type: Test Components: community, integration tests Reporter: Sean Busbey we should have ITs that run automatically (i.e. nightly) against a cluster that isn't a minicluster or standalone instance. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18147) nightly job to check health of active branches
[ https://issues.apache.org/jira/browse/HBASE-18147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-18147: Attachment: HBASE-18147.0.patch WIP patch that I made back in ~September 2016 that checks just master. > nightly job to check health of active branches > -- > > Key: HBASE-18147 > URL: https://issues.apache.org/jira/browse/HBASE-18147 > Project: HBase > Issue Type: Test > Components: community, test >Reporter: Sean Busbey > Attachments: HBASE-18147.0.patch > > > We should set up a job that runs Apache Yetus Test Patch's nightly mode. > Essentially, it produces a report that considers how the branch measures up > against the things we check in our precommit checks. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16261) MultiHFileOutputFormat Enhancement
[ https://issues.apache.org/jira/browse/HBASE-16261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-16261: - Hadoop Flags: Reviewed > MultiHFileOutputFormat Enhancement > > > Key: HBASE-16261 > URL: https://issues.apache.org/jira/browse/HBASE-16261 > Project: HBase > Issue Type: Sub-task > Components: hbase, mapreduce >Affects Versions: 2.0.0 >Reporter: Yi Liang >Assignee: Yi Liang > Fix For: 2.0.0 > > Attachments: HBASE-16261-V1.patch, HBASE-16261-V2.patch, > HBASE-16261-V3.patch, HBASE-16261-V4.patch, HBASE-16261-V5.patch, > HBase-16261-V6.patch, HBase-16261-V7.patch, HBase-16261-V8.patch, > HBase-16261-V9.patch > > > Change MultiHFileOutputFormat to MultiTableHFileOutputFormat, Continuing work > to enhance the MultiTableHFileOutputFormat to make it more usable: > MultiTableHFileOutputFormat follow HFileOutputFormat2 > (1) HFileOutputFormat2 can read one table's region split keys. and then > output multiple hfiles for one family, and each hfile map to one region. We > can add partitioner in MultiTableHFileOutputFormat to make it support this > feature. > (2) HFileOutputFormat2 support Customized Compression algorithm for column > family and BloomFilter, also support customized DataBlockEncoding for the > output hfiles. We can also make MultiTableHFileOutputFormat to support these > features. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16261) MultiHFileOutputFormat Enhancement
[ https://issues.apache.org/jira/browse/HBASE-16261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-16261: - Resolution: Fixed Status: Resolved (was: Patch Available) > MultiHFileOutputFormat Enhancement > > > Key: HBASE-16261 > URL: https://issues.apache.org/jira/browse/HBASE-16261 > Project: HBase > Issue Type: Sub-task > Components: hbase, mapreduce >Affects Versions: 2.0.0 >Reporter: Yi Liang >Assignee: Yi Liang > Fix For: 2.0.0 > > Attachments: HBASE-16261-V1.patch, HBASE-16261-V2.patch, > HBASE-16261-V3.patch, HBASE-16261-V4.patch, HBASE-16261-V5.patch, > HBase-16261-V6.patch, HBase-16261-V7.patch, HBase-16261-V8.patch, > HBase-16261-V9.patch > > > Change MultiHFileOutputFormat to MultiTableHFileOutputFormat, Continuing work > to enhance the MultiTableHFileOutputFormat to make it more usable: > MultiTableHFileOutputFormat follow HFileOutputFormat2 > (1) HFileOutputFormat2 can read one table's region split keys. and then > output multiple hfiles for one family, and each hfile map to one region. We > can add partitioner in MultiTableHFileOutputFormat to make it support this > feature. > (2) HFileOutputFormat2 support Customized Compression algorithm for column > family and BloomFilter, also support customized DataBlockEncoding for the > output hfiles. We can also make MultiTableHFileOutputFormat to support these > features. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18147) nightly job to check health of active branches
[ https://issues.apache.org/jira/browse/HBASE-18147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033365#comment-16033365 ] Sean Busbey commented on HBASE-18147: - [here's an example run output|https://builds.apache.org/view/H-L/view/HBase/job/hbase-qbt-master/417/artifact/out/console-report.html]. we can also do email notifications. eventually, this or an aggregate that includes it can replace the notifications about all of our per-branch test runs. > nightly job to check health of active branches > -- > > Key: HBASE-18147 > URL: https://issues.apache.org/jira/browse/HBASE-18147 > Project: HBase > Issue Type: Test > Components: community, test >Reporter: Sean Busbey > > We should set up a job that runs Apache Yetus Test Patch's nightly mode. > Essentially, it produces a report that considers how the branch measures up > against the things we check in our precommit checks. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18147) nightly job to check health of active branches
Sean Busbey created HBASE-18147: --- Summary: nightly job to check health of active branches Key: HBASE-18147 URL: https://issues.apache.org/jira/browse/HBASE-18147 Project: HBase Issue Type: Test Components: community, test Reporter: Sean Busbey We should set up a job that runs Apache Yetus Test Patch's nightly mode. Essentially, it produces a report that considers how the branch measures up against the things we check in our precommit checks. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18146) Long running integration test similar to TestAcidGuarantees
Andrew Purtell created HBASE-18146: -- Summary: Long running integration test similar to TestAcidGuarantees Key: HBASE-18146 URL: https://issues.apache.org/jira/browse/HBASE-18146 Project: HBase Issue Type: Test Components: integration tests Reporter: Andrew Purtell TestAcidGuarantees and IntegrationTestAcidGuarantees both really only work with minicluster based testing and do not run for a long duration. Consider a new integration test that makes similar atomicity checks while running for, potentially, a very long time, determined by test parameters supplied on the command line (perhaps as property definitions). The new integration test should expect to run against a distributed cluster, support specification of desired monkey policy, and not require any special non-default site configuration settings. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
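The parameterization the ticket describes (test duration supplied on the command line, perhaps as a property definition) can be sketched roughly as follows. The property name and class are made up for illustration; a real integration test would plug the actual atomicity check into the loop body:

```java
// Hypothetical sketch of a deadline-driven long-running test: the duration
// comes from a -D property, with a default so it still runs as a short test.
public class LongRunningAcidCheck {
    // e.g. -Dtest.duration.millis=86400000 for a day-long run
    static long parseDurationMillis(String prop, long defaultMillis) {
        String v = System.getProperty(prop);
        return v == null ? defaultMillis : Long.parseLong(v);
    }

    public static void main(String[] args) {
        long duration = parseDurationMillis("test.duration.millis", 60_000L);
        long deadline = System.currentTimeMillis() + duration;
        while (System.currentTimeMillis() < deadline) {
            // placeholder: one round of ACID verification (read a row, check
            // that all its columns carry the same version, as TestAcidGuarantees does)
            break;  // sketch only; a real test would loop until the deadline
        }
        System.out.println("deadline-driven loop configured for " + duration + " ms");
    }
}
```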
[jira] [Commented] (HBASE-18111) Replication stuck when cluster connection is closed
[ https://issues.apache.org/jira/browse/HBASE-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1602#comment-1602 ] Andrew Purtell commented on HBASE-18111: Failure is unrelated. Unless objection I will commit the v2 patch to master and branch-1 shortly. > Replication stuck when cluster connection is closed > --- > > Key: HBASE-18111 > URL: https://issues.apache.org/jira/browse/HBASE-18111 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5, 0.98.24, 1.1.10 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang > Attachments: HBASE-18111.patch, HBASE-18111-v1.patch, > HBASE-18111-v2.patch > > > Log: > {code} > 2017-05-24,03:01:25,603 ERROR [regionserver13700-SendThread(hostxxx:11000)] > org.apache.zookeeper.ClientCnxn: SASL authentication with Zookeeper Quorum > member failed: javax.security.sasl.SaslException: An error: > (java.security.PrivilegedActionException: javax.security.sasl.SaslException: > GSS initiate failed [Caused by GSSException: No valid credentials provided > (Mechanism level: Connection reset)]) occurred when evaluating Zookeeper > Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED > state. 
> 2017-05-24,03:01:25,615 FATAL [regionserver13700-EventThread] > org.apache.hadoop.hbase.client.HConnectionImplementation: > hconnection-0x1148dd9b-0x35b6b4d4ca999c6, > quorum=10.108.37.30:11000,10.108.38.30:11000,10.108.39.30:11000,10.108.84.25:11000,10.108.84.32:11000, > baseZNode=/hbase/c3prc-xiaomi98 hconnection-0x1148dd9b-0x35b6b4d4ca999c6 > received auth failed from ZooKeeper, aborting > org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = > AuthFailed > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:425) > at > org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:333) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) > 2017-05-24,03:01:25,615 INFO [regionserver13700-EventThread] > org.apache.hadoop.hbase.client.HConnectionImplementation: Closing zookeeper > sessionid=0x35b6b4d4ca999c6 > 2017-05-24,03:01:25,623 WARN [regionserver13700.replicationSource,800] > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint: > Replicate edites to peer cluster failed. 
> java.io.IOException: Call to hostxxx/10.136.22.6:24600 failed on local > exception: java.io.IOException: Connection closed > {code} > jstack > {code} > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.sleepForRetries(HBaseInterClusterReplicationEndpoint.java:127) > at > org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.replicate(HBaseInterClusterReplicationEndpoint.java:199) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:905) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:492) > {code} > The cluster connection was aborted when the ZookeeperWatcher receive a > AuthFailed event. Then the HBaseInterClusterReplicationEndpoint's replicate() > method will stuck in a while loop. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
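The stuck loop in the jstack above can be modeled abstractly. This is not the actual `HBaseInterClusterReplicationEndpoint` code or the attached patch, just a minimal sketch of why a retry loop that never consults the aborted connection spins forever, and one plausible shape of the bail-out:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative model of the failure mode: the pre-fix loop sleeps and retries
// without ever re-checking whether its cluster connection has been aborted,
// so once the connection is closed it can never make progress or exit.
public class RetryLoopSketch {
    static int replicate(AtomicBoolean connectionAborted, int maxRetries) {
        int attempts = 0;
        while (attempts < maxRetries) {
            // The fix direction discussed in the ticket: bail out when the
            // connection is gone instead of sleeping and retrying forever.
            if (connectionAborted.get()) {
                return attempts;  // give up; the caller can tear down the source
            }
            attempts++;
            // sleepForRetries(...) would go here in the real endpoint
        }
        return attempts;
    }

    public static void main(String[] args) {
        AtomicBoolean aborted = new AtomicBoolean(true);
        System.out.println("attempts before bailing out: " + replicate(aborted, 10));
    }
}
```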
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033323#comment-16033323 ] Andrew Purtell commented on HBASE-18145: lgtm Nice new test. > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18145.v0.patch > > > After HBASE-17887, the store scanner closes the memstore scanner in updating > the inner scanners. The chunk which stores the current data may be reclaimed. > So if the chunk is rewrited before we send the data to client, the client > will receive the corrupt data. > This issue also breaks the TestAcid*. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
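The patch under review defers closing the flushed memstore scanners until the data has been shipped to the client, so the backing chunk cannot be reclaimed and rewritten underneath an in-flight read. A toy sketch of that "delayed close" idea, with illustrative names only (not the actual StoreScanner code):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of delayed scanner close: instead of closing memstore
// scanners immediately when the store scanner swaps in post-flush scanners,
// park them on a list and close them only after the current batch of cells
// has been copied out to the RPC response.
public class DelayedScannerClose {
    interface Scanner { void close(); boolean isClosed(); }

    static class MemstoreScanner implements Scanner {
        private boolean closed;
        public void close() { closed = true; }
        public boolean isClosed() { return closed; }
    }

    // a single list of scanners whose close is deferred
    private final List<Scanner> scannersForDelayedClose = new ArrayList<>();

    void updateReaders(Scanner oldMemstoreScanner) {
        // pre-fix behavior would be: oldMemstoreScanner.close();
        scannersForDelayedClose.add(oldMemstoreScanner);
    }

    void afterShippingToClient() {
        for (Scanner s : scannersForDelayedClose) {
            s.close();
        }
        scannersForDelayedClose.clear();
    }

    public static void main(String[] args) {
        DelayedScannerClose store = new DelayedScannerClose();
        MemstoreScanner ms = new MemstoreScanner();
        store.updateReaders(ms);
        System.out.println("closed right after flush? " + ms.isClosed());  // false: chunk stays valid
        store.afterShippingToClient();
        System.out.println("closed after shipping? " + ms.isClosed());     // true: safe to reclaim
    }
}
```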
[jira] [Commented] (HBASE-18144) Forward-port the old exclusive row lock; there are scenarios where it performs better
[ https://issues.apache.org/jira/browse/HBASE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033316#comment-16033316 ] stack commented on HBASE-18144: --- Agree [~davelatham] Should just do-the-right-thing... always! > Forward-port the old exclusive row lock; there are scenarios where it > performs better > - > > Key: HBASE-18144 > URL: https://issues.apache.org/jira/browse/HBASE-18144 > Project: HBase > Issue Type: Bug > Components: Increment >Affects Versions: 1.2.5 >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.3.2, 1.2.7 > > > Description to follow. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18144) Forward-port the old exclusive row lock; there are scenarios where it performs better
[ https://issues.apache.org/jira/browse/HBASE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033286#comment-16033286 ] Dave Latham commented on HBASE-18144: - Fascinating. Thanks for sharing so much of the detail of the sleuthing story. Having exclusive locks as a config option sounds like it would help this one application that is aware of what is going on, but how likely is it that other folks are going to know when to turn this on? It would sure be great if there were a solution that worked without configuration. > Forward-port the old exclusive row lock; there are scenarios where it > performs better > - > > Key: HBASE-18144 > URL: https://issues.apache.org/jira/browse/HBASE-18144 > Project: HBase > Issue Type: Bug > Components: Increment >Affects Versions: 1.2.5 >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.3.2, 1.2.7 > > > Description to follow. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145: --- Description: After HBASE-17887, the store scanner closes the memstore scanner in updating the inner scanners. The chunk which stores the current data may be reclaimed. So if the chunk is rewrited before we send the data to client, the client will receive the corrupt data. This issue also breaks the TestAcid*. was: HBASE-17887 close the memstore scanner after flush. The chunk which stores the current data may be reclaimed. So if the chunk is rewrited before we send the data to client, the client will receive the corrupt data. This issue also breaks the TestAcid*. > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18145.v0.patch > > > After HBASE-17887, the store scanner closes the memstore scanner in updating > the inner scanners. The chunk which stores the current data may be reclaimed. > So if the chunk is rewrited before we send the data to client, the client > will receive the corrupt data. > This issue also breaks the TestAcid*. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18144) Forward-port the old exclusive row lock; there are scenarios where it performs better
[ https://issues.apache.org/jira/browse/HBASE-18144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033273#comment-16033273 ] Esteban Gutierrez commented on HBASE-18144: --- Yeah, per table sounds like a reasonable approach. I've been thinking that the locking mechanism could be pluggable so we can keep multiple implementations, each better suited to certain workloads. I don't think it is overkill to do that, but that would be an option. > Forward-port the old exclusive row lock; there are scenarios where it > performs better > - > > Key: HBASE-18144 > URL: https://issues.apache.org/jira/browse/HBASE-18144 > Project: HBase > Issue Type: Bug > Components: Increment >Affects Versions: 1.2.5 >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.3.2, 1.2.7 > > > Description to follow. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
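The pluggable-locking idea floated in the comment could look roughly like this. Everything below is a hypothetical sketch, not an existing HBase interface: a minimal strategy so a region could be wired with either the read/write row lock or the old exclusive lock, for example per table via configuration:

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of pluggable row locking; none of these names exist in HBase.
public class PluggableRowLock {
    interface RowLockStrategy {
        AutoCloseable lockRow(byte[] row, boolean readLock) throws InterruptedException;
    }

    // Old-style behavior: every caller takes the same exclusive lock.
    static class ExclusiveStrategy implements RowLockStrategy {
        private final ReentrantLock lock = new ReentrantLock();
        public AutoCloseable lockRow(byte[] row, boolean readLock) throws InterruptedException {
            lock.lockInterruptibly();
            return lock::unlock;
        }
    }

    // Current-style behavior: readers share the lock, writers exclude.
    static class ReadWriteStrategy implements RowLockStrategy {
        private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        public AutoCloseable lockRow(byte[] row, boolean readLock) throws InterruptedException {
            Lock l = readLock ? rw.readLock() : rw.writeLock();
            l.lockInterruptibly();
            return l::unlock;
        }
    }

    public static void main(String[] args) throws Exception {
        RowLockStrategy strategy = new ExclusiveStrategy();  // chosen per table in the idea
        try (AutoCloseable ignored = strategy.lockRow("row1".getBytes(), true)) {
            System.out.println("row locked, doing increment");
        }
        System.out.println("row unlocked");
    }
}
```

A per-table config key would then just select which strategy to instantiate, keeping both behaviors available without a code change.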
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145: --- Priority: Blocker (was: Critical) > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18145.v0.patch > > > HBASE-17887 close the memstore scanner after flush. The chunk which stores > the current data may be reclaimed. So if the chunk is rewrited before we send > the data to client, the client will receive the corrupt data. > This issue also breaks the TestAcid*. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145: --- Description: HBASE-17887 close the memstore scanner after flush. The chunk which stores the current data may be reclaimed. So if the chunk is rewrited before we send the data to client, the client will receive the corrupt data. This issue also breaks the TestAcid*. was: HBASE-17887 close the memstore after flush. The chunk which stores the current data may be reclaimed. So if the chunk is rewrited before we send the data to client, the client will receive the corrupt data. This issue also breaks the TestAcid*. > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Critical > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18145.v0.patch > > > HBASE-17887 close the memstore scanner after flush. The chunk which stores > the current data may be reclaimed. So if the chunk is rewrited before we send > the data to client, the client will receive the corrupt data. > This issue also breaks the TestAcid*. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17887) Row-level consistency is broken for read
[ https://issues.apache.org/jira/browse/HBASE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033257#comment-16033257 ] Chia-Ping Tsai commented on HBASE-17887: HBASE-18145 is another bug about the row-level consistency. > Row-level consistency is broken for read > > > Key: HBASE-17887 > URL: https://issues.apache.org/jira/browse/HBASE-17887 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0, 1.3.0 >Reporter: Umesh Agashe >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-17887.branch-1.v0.patch, > HBASE-17887.branch-1.v1.patch, HBASE-17887.branch-1.v1.patch, > HBASE-17887.branch-1.v2.patch, HBASE-17887.branch-1.v2.patch, > HBASE-17887.branch-1.v3.patch, HBASE-17887.branch-1.v4.patch, > HBASE-17887.branch-1.v4.patch, HBASE-17887.branch-1.v4.patch, > HBASE-17887.branch-1.v5.patch, HBASE-17887.branch-1.v6.patch, > HBASE-17887.ut.patch, HBASE-17887.v0.patch, HBASE-17887.v1.patch, > HBASE-17887.v2.patch, HBASE-17887.v3.patch, HBASE-17887.v4.patch, > HBASE-17887.v5.patch, HBASE-17887.v5.patch > > > The scanner of latest memstore may be lost if we make quick flushes. The > following step may help explain this issue. > # put data_A (seq id = 10, active store data_A and snapshots is empty) > # snapshot of 1st flush (active is empty and snapshot stores data_A) > # put data_B (seq id = 11, active store data_B and snapshot store data_A) > # create user scanner (read point = 11, so It should see the data_B) > # commit of 1st flush > #* clear snapshot ((hfile_A has data_A, active store data_B, and snapshot is > empty) > #* update the reader (the user scanner receives the hfile_A) > # snapshot of 2st flush (active is empty and snapshot store data_B) > # commit of 2st flush > #* clear snapshot (hfile_A has data_A, hfile_B has data_B, active is empty, > and snapshot is empty) – this is critical piece. 
> #* -update the reader- (haven't happen) > # user scanner update the kv scanners (it creates scanner of hfile_A but > nothing of memstore) > # user see the older data A – wrong result -- This message was sent by Atlassian JIRA (v6.3.15#6346)
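The numbered steps in the description can be replayed as a toy simulation, using plain Java collections rather than HBase code, to see how data_B drops out of the scanner's view when two flushes happen back to back:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the interleaving described in the steps: two quick flushes clear
// the snapshot before the user scanner refreshes its inner scanners, so the
// refreshed scanner sees only hfile_A and loses data_B, even though its read
// point (11) should include data_B.
public class QuickFlushRace {
    static Set<String> runScenario() {
        Set<String> active = new HashSet<>();
        Set<String> snapshot = new HashSet<>();
        List<Set<String>> hfiles = new ArrayList<>();

        active.add("data_A");                                   // put data_A, seq id = 10
        snapshot.addAll(active); active.clear();                // snapshot of 1st flush
        active.add("data_B");                                   // put data_B, seq id = 11
        // user scanner created here (read point = 11); it should see data_B
        hfiles.add(new HashSet<>(snapshot)); snapshot.clear();  // commit 1st flush -> hfile_A
        snapshot.addAll(active); active.clear();                // snapshot of 2nd flush
        hfiles.add(new HashSet<>(snapshot)); snapshot.clear();  // commit 2nd flush -> hfile_B
        // the scanner now rebuilds its kv scanners, but it was only handed
        // hfile_A (the reader update for hfile_B never happened):
        Set<String> visible = new HashSet<>(hfiles.get(0));
        visible.addAll(active);                                 // memstore is empty by now
        return visible;
    }

    public static void main(String[] args) {
        System.out.println("scanner sees: " + runScenario());  // only data_A; data_B lost
    }
}
```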
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145: --- Description: HBASE-17887 close the memstore after flush. The chunk which stores the current data may be reclaimed. So if the chunk is rewrited before we send the data to client, the client will receive the corrupt data. This issue also breaks the TestAcid*. was: HBASE-17887 close the memstore after flush. The chuck which stores the current data may be reclaimed. So if the chuck is rewrited before we send the data to client, the client will receive the corrupt data. This issue also breaks the TestAcid*. > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Critical > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18145.v0.patch > > > HBASE-17887 close the memstore after flush. The chunk which stores the > current data may be reclaimed. So if the chunk is rewrited before we send the > data to client, the client will receive the corrupt data. > This issue also breaks the TestAcid*. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033253#comment-16033253 ] Chia-Ping Tsai commented on HBASE-18145: HBASE-17887 introduced this critical bug into branch-1 and branch-1.3, so I marked this issue as a blocker. [~mantonov] FYI > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Critical > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18145.v0.patch > > > HBASE-17887 close the memstore after flush. The chuck which stores the > current data may be reclaimed. So if the chuck is rewrited before we send the > data to client, the client will receive the corrupt data. > This issue also breaks the TestAcid*. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145: --- Description: HBASE-17887 close the memstore after flush. The chuck which stores the current data may be reclaimed. So if the chuck is rewrited before we send the data to client, the client will receive the corrupt data. This issue also breaks the TestAcid*. was: HBASE-18019 close the memstore after flush. The chuck which stores the current data may be reclaimed. So if the chuck is rewrited before we send the data to client, the client will receive the corrupt data. This issue also breaks the TestAcid*. > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Critical > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18145.v0.patch > > > HBASE-17887 close the memstore after flush. The chuck which stores the > current data may be reclaimed. So if the chuck is rewrited before we send the > data to client, the client will receive the corrupt data. > This issue also breaks the TestAcid*. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145: --- Fix Version/s: 1.3.2 1.4.0 > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Critical > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18145.v0.patch > > > HBASE-18019 close the memstore after flush. The chuck which stores the > current data may be reclaimed. So if the chuck is rewrited before we send the > data to client, the client will receive the corrupt data. > This issue also breaks the TestAcid*. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033239#comment-16033239 ] ramkrishna.s.vasudevan commented on HBASE-18145: Nice one. Minor nit - it is 'chunk'. +1 otherwise. > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-18145.v0.patch > > > HBASE-18019 close the memstore after flush. The chuck which stores the > current data may be reclaimed. So if the chuck is rewrited before we send the > data to client, the client will receive the corrupt data. > This issue also breaks the TestAcid*. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16196) Update jruby to a newer version.
[ https://issues.apache.org/jira/browse/HBASE-16196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-16196: Attachment: HBASE-16196-branch-1.v9.patch I'm attaching my attempt at backporting v9 to branch-1. TestShell fails, but it's hard to see which things in TestShell are having a problem. [~mdrob], what do you think? Have time to grind on this now, or should we just make a follow-on JIRA for branch-1 backport? > Update jruby to a newer version. > > > Key: HBASE-16196 > URL: https://issues.apache.org/jira/browse/HBASE-16196 > Project: HBase > Issue Type: Bug > Components: dependencies, shell >Reporter: Elliott Clark >Assignee: Mike Drob >Priority: Critical > Fix For: 2.0.0, 1.5.0 > > Attachments: 0001-Update-to-JRuby-9.1.2.0-and-JLine-2.12.patch, > hbase-16196.branch-1.patch, HBASE-16196-branch-1.v9.patch, > hbase-16196.v2.branch-1.patch, hbase-16196.v3.branch-1.patch, > hbase-16196.v4.branch-1.patch, HBASE-16196.v5.patch, HBASE-16196.v6.patch, > HBASE-16196.v7.patch, HBASE-16196.v8.patch, HBASE-16196.v9.patch > > > Ruby 1.8.7 is no longer maintained. > The TTY library in the old jruby is bad. The newer one is less bad. > Since this is only a dependency on the hbase-shell module and not on > hbase-client or hbase-server this should be a pretty simple thing that > doesn't have any backwards compat issues. -- This message was sent by Atlassian JIRA (v6.3.15#6346)