[jira] [Commented] (HBASE-14318) make_rc.sh should purge/re-resolve dependencies from local repository
[ https://issues.apache.org/jira/browse/HBASE-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715332#comment-14715332 ] Nick Dimiduk commented on HBASE-14318: -- Already tried adding {{-U}} to the initial {{make install}}, didn't help. Now looking into {{[dependency:purge-local-repository|http://maven.apache.org/plugins/maven-dependency-plugin/purge-local-repository-mojo.html]}}. make_rc.sh should purge/re-resolve dependencies from local repository - Key: HBASE-14318 URL: https://issues.apache.org/jira/browse/HBASE-14318 Project: HBase Issue Type: Task Components: build Reporter: Nick Dimiduk Assignee: Nick Dimiduk Over on the 1.1.2RC1 VOTE thread, impressively pedantic [~enis] noticed the underlying hadoop version was built locally, not from upstream. Until such time as we can reliably build releases in a clean-room environment, let's have our scripts clean up after us. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14317) Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL
[ https://issues.apache.org/jira/browse/HBASE-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715328#comment-14715328 ] stack commented on HBASE-14317: --- Is the concurrent shutting of regions which are waiting on safe point:
{code}
RS_CLOSE_REGION-r12s16:9104-1 #33639 prio=5 os_prio=0 tid=0x7fbf546fe000 nid=0x563 in Object.wait() [0x7fbf38107000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at org.apache.hadoop.hbase.regionserver.HRegion.waitForFlushesAndCompactions(HRegion.java:1512)
    - locked 0x00056baa4888 (a org.apache.hadoop.hbase.regionserver.HRegion$WriteState)
    at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1371)
    - locked 0x00056baa4888 (a org.apache.hadoop.hbase.regionserver.HRegion$WriteState)
    at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1336)
    - locked 0x00056baaf928 (a java.lang.Object)
    at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:138)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}
... and then the FATAL roll of logs happening at same time the issue? Dig. Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL - Key: HBASE-14317 URL: https://issues.apache.org/jira/browse/HBASE-14317 Project: HBase Issue Type: Bug Affects Versions: 1.1.1 Reporter: stack Attachments: [Java] RS stuck on WAL sync to a dead DN - Pastebin.com.html, raw.php, subset.of.rs.log hbase-1.1.1 and hadoop-2.7.1 We try to roll logs because can't append (See HDFS-8960) but we get stuck. See attached thread dump and associated log.
What is interesting is that syncers are waiting to take syncs to run and at same time we want to flush so we are waiting on a safe point but there seems to be nothing in our ring buffer; did we go to roll log and not add safe point sync to clear out ringbuffer? Needs a bit of study. Try to reproduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14318) make_rc.sh should purge/re-resolve dependencies from local repository
Nick Dimiduk created HBASE-14318: Summary: make_rc.sh should purge/re-resolve dependencies from local repository Key: HBASE-14318 URL: https://issues.apache.org/jira/browse/HBASE-14318 Project: HBase Issue Type: Task Components: build Reporter: Nick Dimiduk Assignee: Nick Dimiduk Over on the 1.1.2RC1 VOTE thread, impressively pedantic [~enis] noticed the underlying hadoop version was built locally, not from upstream. Until such time as we can reliably build releases in a clean-room environment, let's have our scripts clean up after us. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14314) Metrics for block cache should take region replicas into account
[ https://issues.apache.org/jira/browse/HBASE-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715363#comment-14715363 ] Nick Dimiduk commented on HBASE-14314: -- Overall looks good. Two things: # I think we want to keep overall stats too, as primary hit rate + replica hit rate != overall hit rate. # In some places there's a new boolean flag, in other places there's a change in method name. For instance, {{getBlockCachePrimaryHitCount()}} vs. {{getHitCount(boolean)}}. It would be better if this were consistent one way or the other. I think we only care about primary vs. replica (as opposed to primary vs. secondary vs. tertiary), so no need for an enum. My preference is for new method names, so we'd have {{getHitCount}}, {{getPrimaryHitCount}}, and {{getReplicaHitCount}} for overall, primary, and replicas respectively. Could even change {{getHitCount}} to {{getOverallHitCount}}, but that's probably not necessary. Metrics for block cache should take region replicas into account Key: HBASE-14314 URL: https://issues.apache.org/jira/browse/HBASE-14314 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14314-v1.txt Currently metrics for block cache are aggregates in the sense that they don't distinguish primary from secondary / tertiary replicas. This JIRA separates the block cache metrics for primary region replica from the aggregate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
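The naming proposed in the comment above, and the reason an overall figure may still be worth keeping, can be sketched roughly as follows. This is a minimal illustration only: the class {{ReplicaAwareCacheStats}}, its factory method, and the numbers are hypothetical and are not taken from the attached patch or from the real HBase cache statistics code. It assumes "hit rate" means hits divided by requests.

```java
/** Hypothetical sketch; this is not the actual HBase block cache stats class. */
final class ReplicaAwareCacheStats {
    private long primaryHits;
    private long primaryMisses;
    private long replicaHits;
    private long replicaMisses;

    /** Builds a stats object from raw counts (illustrative numbers only). */
    static ReplicaAwareCacheStats of(long primaryHits, long primaryMisses,
                                     long replicaHits, long replicaMisses) {
        ReplicaAwareCacheStats s = new ReplicaAwareCacheStats();
        s.primaryHits = primaryHits;
        s.primaryMisses = primaryMisses;
        s.replicaHits = replicaHits;
        s.replicaMisses = replicaMisses;
        return s;
    }

    // Method names follow the suggestion in the comment: overall, primary, replica.
    long getHitCount() { return primaryHits + replicaHits; }
    long getPrimaryHitCount() { return primaryHits; }
    long getReplicaHitCount() { return replicaHits; }

    // Ratios are computed per population, so they cannot simply be added together.
    double getHitRatio() { return ratio(getHitCount(), primaryMisses + replicaMisses); }
    double getPrimaryHitRatio() { return ratio(primaryHits, primaryMisses); }
    double getReplicaHitRatio() { return ratio(replicaHits, replicaMisses); }

    private static double ratio(long hits, long misses) {
        long requests = hits + misses;
        return requests == 0 ? 0.0 : (double) hits / requests;
    }

    public static void main(String[] args) {
        // 90 of 100 primary requests hit; 50 of 100 replica requests hit.
        ReplicaAwareCacheStats s = ReplicaAwareCacheStats.of(90, 10, 50, 50);
        System.out.println("overall hit count = " + s.getHitCount());
        System.out.println("overall hit ratio = " + s.getHitRatio());
        System.out.println("primary + replica ratios = "
            + (s.getPrimaryHitRatio() + s.getReplicaHitRatio()));
    }
}
```

With these made-up counts the hit counts do add up (90 + 50 = 140), but the ratios do not: the primary ratio is 0.9 and the replica ratio is 0.5, while the overall ratio is 140 / 200 = 0.7, not 1.4.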
[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715377#comment-14715377 ] Jerry He commented on HBASE-14309: -- Hi, Ted. Your v2 patch is fine for the ruby script. You only need to add the default 'nil', like the following:
{code}
def command(force=nil)
  if force.nil?
    admin.balancer() ? true : false
  elsif force == "force"
    admin.balancer(true) ? true : false
{code}
Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt, 14309-v4.txt This issue adds boolean parameter, force, to 'balancer' command so that admin can force region balancing even when there is region in transition - assuming RIT being transient. This enhancement was requested by some customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14314) Metrics for block cache should take region replicas into account
[ https://issues.apache.org/jira/browse/HBASE-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715411#comment-14715411 ] Ted Yu commented on HBASE-14314: bq. primary hit rate + replica hit rate != overall hit rate Can you give a concrete example when the above would happen ? Metrics for block cache should take region replicas into account Key: HBASE-14314 URL: https://issues.apache.org/jira/browse/HBASE-14314 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14314-v1.txt Currently metrics for block cache are aggregates in the sense that they don't distinguish primary from secondary / tertiary replicas. This JIRA separates the block cache metrics for primary region replica from the aggregate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14308) HTableDescriptor WARN is not actionable
[ https://issues.apache.org/jira/browse/HBASE-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715414#comment-14715414 ] Lars Francke commented on HBASE-14308: -- Looks like it was just introduced by HBASE-14224. I see this error message popping up on a freshly started HBase when the hbase:meta table descriptor is loaded. It'll be hard to distinguish that from legitimately wrong usage. I'm happy to submit a patch that removes the warning again; not sure what you prefer? HTableDescriptor WARN is not actionable --- Key: HBASE-14308 URL: https://issues.apache.org/jira/browse/HBASE-14308 Project: HBase Issue Type: Task Components: Usability Affects Versions: 2.0.0 Reporter: Nick Dimiduk Priority: Minor Labels: beginner Noticed this while testing another patch in standalone mode. I see WARN lines like the following {noformat} 2015-08-25 14:19:47,057 WARN [1758008124@qtp-1276709283-0] hbase.HTableDescriptor: Use addCoprocessor* methods to add a coprocessor instead {noformat} This appears to come from {{HTableDescriptor#setValue(Bytes,Bytes)}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715412#comment-14715412 ] Hadoop QA commented on HBASE-14309: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12752552/14309-v5.txt against master branch at commit aca8c3b74b09646c72c4e0fe26a4b2103da0d288. ATTACHMENT ID: 12752552 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color}. The patch appears to cause mvn compile goal to fail with Hadoop version 2.4.0. Compilation errors resume: [ERROR] Error invoking method 'get(java.lang.Integer)' in java.util.ArrayList at META-INF/LICENSE.vm[line 1619, column 22] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (default) on project hbase-assembly: Error rendering velocity resource. Error invoking method 'get(java.lang.Integer)' in java.util.ArrayList at META-INF/LICENSE.vm[line 1619, column 22]: InvocationTargetException: Index: 0, Size: 0 - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn goals -rf :hbase-assembly Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15280//console This message is automatically generated. 
Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt, 14309-v4.txt, 14309-v5.txt This issue adds boolean parameter, force, to 'balancer' command so that admin can force region balancing even when there is region in transition - assuming RIT being transient. This enhancement was requested by some customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12751) Allow RowLock to be reader writer
[ https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715295#comment-14715295 ] stack commented on HBASE-12751: --- v23 is building now over on hadoopqa. It failed TestPerColumnFamilyFlush and this fails local for me too. Looking. Allow RowLock to be reader writer - Key: HBASE-12751 URL: https://issues.apache.org/jira/browse/HBASE-12751 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.3.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.3.0 Attachments: 12751v22.txt, 12751v23.txt, 12751v23.txt, HBASE-12751-v1.patch, HBASE-12751-v10.patch, HBASE-12751-v10.patch, HBASE-12751-v11.patch, HBASE-12751-v12.patch, HBASE-12751-v13.patch, HBASE-12751-v14.patch, HBASE-12751-v15.patch, HBASE-12751-v16.patch, HBASE-12751-v17.patch, HBASE-12751-v18.patch, HBASE-12751-v19 (1).patch, HBASE-12751-v19.patch, HBASE-12751-v2.patch, HBASE-12751-v20.patch, HBASE-12751-v20.patch, HBASE-12751-v21.patch, HBASE-12751-v3.patch, HBASE-12751-v4.patch, HBASE-12751-v5.patch, HBASE-12751-v6.patch, HBASE-12751-v7.patch, HBASE-12751-v8.patch, HBASE-12751-v9.patch, HBASE-12751.patch Right now every write operation grabs a row lock. This is to prevent values from changing during a read modify write operation (increment or check and put). However it limits parallelism in several different scenarios. If there are several puts to the same row but different columns or stores then this is very limiting. If there are puts to the same column then mvcc number should ensure a consistent ordering. So locking is not needed. However locking for check and put or increment is still needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
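The idea in the issue description above (plain puts need not exclude each other, while increment and check-and-put still need exclusivity) can be sketched with a per-row reader/writer lock. This is only an illustration under stated assumptions: {{RowLockSketch}} and its in-memory map are hypothetical, it uses the JDK's {{ReentrantReadWriteLock}} rather than whatever the attached patches implement, and it does not model the mvcc ordering the description relies on for concurrent puts.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of a reader/writer row lock; not the HBASE-12751 patch. */
final class RowLockSketch {
    private final ConcurrentMap<String, ReentrantReadWriteLock> rowLocks =
        new ConcurrentHashMap<>();
    private final ConcurrentMap<String, Long> cells = new ConcurrentHashMap<>();

    private ReentrantReadWriteLock lockFor(String row) {
        return rowLocks.computeIfAbsent(row, r -> new ReentrantReadWriteLock());
    }

    /** Plain put: takes the shared (read) side, so puts to one row can run in parallel. */
    void put(String row, String column, long value) {
        ReentrantReadWriteLock lock = lockFor(row);
        lock.readLock().lock();
        try {
            // Ordering between concurrent puts is left to mvcc in the issue; not modeled here.
            cells.put(row + "/" + column, value);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Read-modify-write: takes the exclusive (write) side so the value cannot change midway. */
    long increment(String row, String column, long delta) {
        ReentrantReadWriteLock lock = lockFor(row);
        lock.writeLock().lock();
        try {
            String key = row + "/" + column;
            long updated = cells.getOrDefault(key, 0L) + delta;
            cells.put(key, updated);
            return updated;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        RowLockSketch region = new RowLockSketch();
        region.put("row1", "a", 5L);
        System.out.println(region.increment("row1", "a", 2L));
    }
}
```

The design choice being illustrated is only the split between shared and exclusive acquisition; a real implementation would also need lock cleanup, timeouts, and the actual write path.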
[jira] [Commented] (HBASE-13212) Procedure V2 - master Create/Modify/Delete namespace
[ https://issues.apache.org/jira/browse/HBASE-13212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715382#comment-14715382 ] Hadoop QA commented on HBASE-13212: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12752523/HBASE-13212.v1-branch-1.patch against branch-1 branch at commit aca8c3b74b09646c72c4e0fe26a4b2103da0d288. ATTACHMENT ID: 12752523 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 3 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15277//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15277//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15277//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15277//console This message is automatically generated. Procedure V2 - master Create/Modify/Delete namespace Key: HBASE-13212 URL: https://issues.apache.org/jira/browse/HBASE-13212 Project: HBase Issue Type: Sub-task Components: master Affects Versions: 2.0.0 Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Labels: reliability Attachments: HBASE-13212.v1-branch-1.patch, HBASE-13212.v1-master.patch, HBASE-13212.v2-master.patch, HBASE-13212.v3-master.patch Original Estimate: 168h Remaining Estimate: 168h master side, part of HBASE-12439 starts up the procedure executor on the master and replaces the create/modify/delete namespace handlers with the procedure version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14312) Forward port some fixes from hbase-6721-0.98 to hbase-6721
[ https://issues.apache.org/jira/browse/HBASE-14312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14312: --- Fix Version/s: hbase-6721 Forward port some fixes from hbase-6721-0.98 to hbase-6721 -- Key: HBASE-14312 URL: https://issues.apache.org/jira/browse/HBASE-14312 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Labels: hbase-6721 Fix For: hbase-6721 Attachments: HBASE-14312_hbase-6721.patch Some fixes were checked into hbase-6721-0.98 to address some testing failures and to resync with the internal Y! implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14309: --- Attachment: 14309-v4.txt Patch v4 has the change in balancer.rb. Also regenerated MasterProtos.java based on master branch. Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt, 14309-v4.txt This issue adds boolean parameter, force, to 'balancer' command so that admin can force region balancing even when there is region in transition - assuming RIT being transient. This enhancement was requested by some customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14098) Allow dropping caches behind compactions
[ https://issues.apache.org/jira/browse/HBASE-14098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715291#comment-14715291 ] stack commented on HBASE-14098: --- This change drops ugly thread dumps in our logs:
{code}
2015-08-26 11:18:38,175 DEBUG [Time-limited test] hfile.HFile$WriterFactory(308): Unable to set drop behind on /Users/stack/checkouts/hbase.git.commit/hbase-server/target/test-data/ed67f436-1d46-43f#
java.lang.UnsupportedOperationException: the wrapped stream does not support setting the drop-behind caching setting.
    at org.apache.hadoop.fs.FSDataOutputStream.setDropBehind(FSDataOutputStream.java:150)
    at org.apache.hadoop.hbase.io.hfile.HFile$WriterFactory.create(HFile.java:306)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:787)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:742)
    at org.apache.hadoop.hbase.regionserver.StoreFile$WriterBuilder.build(StoreFile.java:682)
    at org.apache.hadoop.hbase.regionserver.HStore.createWriterInTmp(HStore.java:1029)
    at org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:66)
    at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:932)
    at org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2069)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2312)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2048)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2010)
    at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1901)
    at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1827)
    at org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush.testSelectiveFlushWhenNotEnabled(TestPerColumnFamilyFlush.java:303)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.lang.Thread.run(Thread.java:744)
{code}
It looks like an issue, and only after looking in the code do I see it is informational. Should I change this to be TRACE level, with DEBUG emitting just that the feature is not available? (This is from testing master with its version of hadoop.) Allow dropping caches behind compactions Key: HBASE-14098 URL: https://issues.apache.org/jira/browse/HBASE-14098 Project: HBase Issue Type: Bug Components: Compaction, hadoop2, HFile Affects Versions: 2.0.0, 1.3.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.2.0 Attachments: HBASE-14098-v1.patch, HBASE-14098-v2.patch, HBASE-14098-v3.patch, HBASE-14098-v4.patch, HBASE-14098-v5.patch, HBASE-14098-v6.patch, HBASE-14098-v7-branch-1.patch, HBASE-14098-v7.patch, HBASE-14098.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
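The logging change floated in the comment above (one short line at DEBUG, the full stack trace only at TRACE) could look something like the sketch below. Everything here is an assumption made for illustration: {{DropBehindLogging}} and the {{DropBehindStream}} interface are hypothetical stand-ins, and {{java.util.logging}} is used only to keep the example self-contained, with FINE playing the role of DEBUG and FINEST the role of TRACE. It is not the code in HFile.java.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of the proposed log levels; not the actual HFile.WriterFactory code. */
final class DropBehindLogging {
    private static final Logger LOG = Logger.getLogger(DropBehindLogging.class.getName());

    /** Stand-in for an output stream that may or may not support drop-behind. */
    interface DropBehindStream {
        void setDropBehind(boolean dropBehind);
    }

    /** Returns true if drop-behind was set, false if the stream does not support it. */
    static boolean trySetDropBehind(DropBehindStream out, String path) {
        try {
            out.setDropBehind(true);
            return true;
        } catch (UnsupportedOperationException e) {
            // DEBUG-equivalent: a single line saying the feature is unavailable.
            LOG.log(Level.FINE, "Drop-behind is not supported for {0}; continuing without it", path);
            // TRACE-equivalent: the full stack trace, only for those who ask for it.
            LOG.log(Level.FINEST, "Unable to set drop behind on " + path, e);
            return false;
        }
    }

    public static void main(String[] args) {
        DropBehindStream unsupported = flag -> {
            throw new UnsupportedOperationException("drop-behind not supported");
        };
        System.out.println(trySetDropBehind(unsupported, "/tmp/example-hfile"));
    }
}
```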
[jira] [Created] (HBASE-14317) Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL
stack created HBASE-14317: - Summary: Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL Key: HBASE-14317 URL: https://issues.apache.org/jira/browse/HBASE-14317 Project: HBase Issue Type: Bug Affects Versions: 1.1.1 Reporter: stack hbase-1.1.1 and hadoop-2.7.1 We try to roll logs because can't append (See HDFS-8960) but we get stuck. See attached thread dump and associated log. What is interesting is that syncers are waiting to take syncs to run and at same time we want to flush so we are waiting on a safe point but there seems to be nothing in our ring buffer; did we go to roll log and not add safe point sync to clear out ringbuffer? Needs a bit of study. Try to reproduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14078) improve error message when HMaster can't bind to port
[ https://issues.apache.org/jira/browse/HBASE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715333#comment-14715333 ] Hudson commented on HBASE-14078: FAILURE: Integrated in HBase-1.2 #138 (See [https://builds.apache.org/job/HBase-1.2/138/]) HBASE-14078 improve error message when HMaster can't bind to port (stack: rev 180e8b8fd68f7d6181bbca17183f55bed2fd844f) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java improve error message when HMaster can't bind to port - Key: HBASE-14078 URL: https://issues.apache.org/jira/browse/HBASE-14078 Project: HBase Issue Type: Improvement Components: master Affects Versions: 2.0.0 Reporter: Sean Busbey Assignee: Matt Warhaftig Labels: beginner Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: hbase-14078_post_stack.txt, hbase-14708-v1.patch, hbase-14708-v2.patch, hbase-14708-v3.patch, hbase-14708-v3.patch When the master fails to start because hbase.master.port is already taken, the log messages could make it easier to tell.
{quote}
2015-07-14 13:10:02,667 INFO [main] regionserver.RSRpcServices: master/master01.example.com/10.20.188.121:16000 server-side HConnection retries=350
2015-07-14 13:10:02,879 INFO [main] ipc.SimpleRpcScheduler: Using deadline as user call queue, count=3
2015-07-14 13:10:02,895 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
    at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2258)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:234)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2272)
Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:444)
    at sun.nio.ch.Net.bind(Net.java:436)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at org.apache.hadoop.hbase.ipc.RpcServer.bind(RpcServer.java:2513)
    at org.apache.hadoop.hbase.ipc.RpcServer$Listener.<init>(RpcServer.java:599)
    at org.apache.hadoop.hbase.ipc.RpcServer.<init>(RpcServer.java:2000)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.<init>(RSRpcServices.java:919)
    at org.apache.hadoop.hbase.master.MasterRpcServices.<init>(MasterRpcServices.java:211)
    at org.apache.hadoop.hbase.master.HMaster.createRpcServices(HMaster.java:509)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:535)
    at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:351)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2253)
    ... 5 more
{quote}
I recognize that the RSRpcServices log message shows port 16000, but I don't know why a new operator would. Additionally, it'd be nice to tell them that the port is controlled by {{hbase.master.port}}. Maybe give a hint on how to see what's using the port. Could be too os-dist specific? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
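One way to act on the request above is to catch the {{BindException}} at the point of binding and rethrow it with the address and the controlling configuration key in the message. The sketch below shows that pattern only; {{BindErrorSketch}}, its method names, and the message wording are hypothetical, and it uses a plain {{ServerSocket}} rather than the HBase RpcServer, so it should not be read as the committed fix.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

/** Hypothetical sketch of a more helpful bind failure; not the committed HBASE-14078 change. */
final class BindErrorSketch {

    /** Wraps the original failure with the address and the config key that controls the port. */
    static BindException describe(InetSocketAddress addr, String portConfigKey,
                                  BindException cause) {
        BindException detailed = new BindException(
            "Failed to bind to " + addr + " (" + cause.getMessage() + "). "
            + "The port is configured by '" + portConfigKey + "'; "
            + "check whether another process is already listening on it.");
        detailed.initCause(cause);
        return detailed;
    }

    /** Binds a server socket, converting "Address already in use" into the detailed message. */
    static ServerSocket bind(InetSocketAddress addr, String portConfigKey) throws IOException {
        ServerSocket socket = new ServerSocket();
        try {
            socket.bind(addr);
            return socket;
        } catch (BindException e) {
            socket.close();
            throw describe(addr, portConfigKey, e);
        }
    }

    public static void main(String[] args) {
        BindException e = describe(new InetSocketAddress(16000), "hbase.master.port",
            new BindException("Address already in use"));
        System.out.println(e.getMessage());
    }
}
```

A hint about how to find the process holding the port is left out on purpose, since the description itself notes that this may be too OS-specific.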
[jira] [Commented] (HBASE-12751) Allow RowLock to be reader writer
[ https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715350#comment-14715350 ] Hadoop QA commented on HBASE-12751: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12752522/12751v23.txt against master branch at commit aca8c3b74b09646c72c4e0fe26a4b2103da0d288. ATTACHMENT ID: 12752522 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 84 new or modified tests. {color:red}-1 Anti-pattern{color}. The patch appears to have anti-pattern where BYTES_COMPARATOR was omitted: -getRegionInfo(), -1, new TreeMap<byte[], List<Path>>());. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}.
The patch introduces the following lines longer than 100:
+ final long now, List<UUID> clusterIds, long nonceGroup, long nonce, MultiVersionConcurrencyControl mvcc) {
+ long logSeqNum, final long now, List<UUID> clusterIds, long nonceGroup, long nonce, MultiVersionConcurrencyControl mvcc) {
+ long txid = log.append(htd, hri, new WALKey(hri.getEncodedNameAsBytes(), hri.getTable(), now, mvcc),
+new WALKey(info.getEncodedNameAsBytes(), htd.getTableName(), System.currentTimeMillis(), mvcc),
+new WALKey(hri.getEncodedNameAsBytes(), htd.getTableName(), System.currentTimeMillis(), mvcc),
+final WALKey logkey = new WALKey(hri.getEncodedNameAsBytes(), hri.getTable(), now, mvcc);
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.TestStochasticBalancerJmxMetrics {color:red}-1 core zombie tests{color}. There are 5 zombie test(s): at org.apache.camel.component.jetty.JettySuspendWhileInProgressTest.testJettySuspendWhileInProgress(JettySuspendWhileInProgressTest.java:55) at org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd.testEndToEnd(TestFuzzyRowFilterEndToEnd.java:143) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15276//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15276//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15276//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15276//console This message is automatically generated.
Allow RowLock to be reader writer - Key: HBASE-12751 URL: https://issues.apache.org/jira/browse/HBASE-12751 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.3.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.3.0 Attachments: 12751v22.txt, 12751v23.txt, 12751v23.txt, HBASE-12751-v1.patch, HBASE-12751-v10.patch, HBASE-12751-v10.patch, HBASE-12751-v11.patch, HBASE-12751-v12.patch, HBASE-12751-v13.patch, HBASE-12751-v14.patch, HBASE-12751-v15.patch, HBASE-12751-v16.patch, HBASE-12751-v17.patch, HBASE-12751-v18.patch, HBASE-12751-v19 (1).patch, HBASE-12751-v19.patch, HBASE-12751-v2.patch, HBASE-12751-v20.patch, HBASE-12751-v20.patch, HBASE-12751-v21.patch, HBASE-12751-v3.patch, HBASE-12751-v4.patch, HBASE-12751-v5.patch, HBASE-12751-v6.patch, HBASE-12751-v7.patch, HBASE-12751-v8.patch, HBASE-12751-v9.patch, HBASE-12751.patch Right now every write operation grabs a row lock. This is to prevent values from changing during a read modify write operation (increment or check and put). However it limits parallelism in several different scenarios. If there are several puts to the same row but different columns or stores then this is very limiting. If there are puts to the
[jira] [Updated] (HBASE-14312) Forward port some fixes from hbase-6721-0.98 to hbase-6721
[ https://issues.apache.org/jira/browse/HBASE-14312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14312: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Applied and pushed to hbase-6721 branch. Forward port some fixes from hbase-6721-0.98 to hbase-6721 -- Key: HBASE-14312 URL: https://issues.apache.org/jira/browse/HBASE-14312 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Labels: hbase-6721 Fix For: hbase-6721 Attachments: HBASE-14312_hbase-6721.patch Some fixes were checked into hbase-6721-0.98 to address some testing failures and to resync with the internal Y! implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-6721) RegionServer Group based Assignment
[ https://issues.apache.org/jira/browse/HBASE-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715435#comment-14715435 ] Andrew Purtell commented on HBASE-6721: --- bq. Let me know if you'd like to rebase the succeeding patches once the one(s) prior are committed. I'm thinking we start carving out work as subtasks, applying patches from the subtasks to hbase-6721 branch, make the 0.98 versions of them and apply them to hbase-6721-0.98 branch. This will make development on these branches almost exactly like dev on a release branch. On rebase vs. merge: 1. I can rebase, fix up for changes, then force push, for both hbase-6721 and hbase-6721-0.98 2. I can merge master into hbase-6721 (and 0.98 into hbase-6721-0.98), fix up for changes, commit the fixups in a merge commit, and then push the merged result. There are pros and cons to either approach. What would work best for you [~toffer]? RegionServer Group based Assignment --- Key: HBASE-6721 URL: https://issues.apache.org/jira/browse/HBASE-6721 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Labels: hbase-6721 Attachments: 6721-master-webUI.patch, HBASE-6721 GroupBasedLoadBalancer Sequence Diagram.xml, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721-DesigDoc.pdf, HBASE-6721_0.98_2.patch, HBASE-6721_10.patch, HBASE-6721_11.patch, HBASE-6721_8.patch, HBASE-6721_9.patch, HBASE-6721_9.patch, HBASE-6721_94.patch, HBASE-6721_94.patch, HBASE-6721_94_2.patch, HBASE-6721_94_3.patch, HBASE-6721_94_3.patch, HBASE-6721_94_4.patch, HBASE-6721_94_5.patch, HBASE-6721_94_6.patch, HBASE-6721_94_7.patch, HBASE-6721_98_1.patch, HBASE-6721_98_2.patch, HBASE-6721_hbase-6721_addendum.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk.patch, HBASE-6721_trunk1.patch, HBASE-6721_trunk2.patch, balanceCluster Sequence Diagram.svg, immediateAssignments Sequence Diagram.svg, randomAssignment Sequence 
Diagram.svg, retainAssignment Sequence Diagram.svg, roundRobinAssignment Sequence Diagram.svg In multi-tenant deployments of HBase, it is likely that a RegionServer will be serving out regions from a number of different tables owned by various client applications. Being able to group a subset of running RegionServers and assign specific tables to it, provides a client application a level of isolation and resource allocation. The proposal essentially is to have an AssignmentManager which is aware of RegionServer groups and assigns tables to region servers based on groupings. Load balancing will occur on a per group basis as well. This is essentially a simplification of the approach taken in HBASE-4120. See attached document. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715268#comment-14715268 ] Ted Yu commented on HBASE-14309: By using nil as default value for force parameter, I was able to get both of the following commands to work: {code} hbase(main):002:0> balancer true true 0 row(s) in 30.5740 seconds {code} {code} hbase(main):001:0> balancer true 0 row(s) in 30.7310 seconds {code} Thanks to Jerry for offline hint. Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt This issue adds boolean parameter, force, to 'balancer' command so that admin can force region balancing even when there is region in transition - assuming RIT being transient. This enhancement was requested by some customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14317) Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL
[ https://issues.apache.org/jira/browse/HBASE-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14317: -- Attachment: raw.php Raw Thread Dump Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL - Key: HBASE-14317 URL: https://issues.apache.org/jira/browse/HBASE-14317 Project: HBase Issue Type: Bug Affects Versions: 1.1.1 Reporter: stack Attachments: [Java] RS stuck on WAL sync to a dead DN - Pastebin.com.html, raw.php, subset.of.rs.log hbase-1.1.1 and hadoop-2.7.1 We try to roll logs because can't append (See HDFS-8960) but we get stuck. See attached thread dump and associated log. What is interesting is that syncers are waiting to take syncs to run and at same time we want to flush so we are waiting on a safe point but there seems to be nothing in our ring buffer; did we go to roll log and not add safe point sync to clear out ringbuffer? Needs a bit of study. Try to reproduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-6617) ReplicationSourceManager should be able to track multiple WAL paths
[ https://issues.apache.org/jira/browse/HBASE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715431#comment-14715431 ] Hadoop QA commented on HBASE-6617: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12752525/HBASE-6617_v4.patch against master branch at commit aca8c3b74b09646c72c4e0fe26a4b2103da0d288. ATTACHMENT ID: 12752525 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 12 zombie test(s): at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithACL.testScanForUserWithFewerLabelAuthsThanLabelsInScanAuthorizations(TestVisibilityLabelsWithACL.java:117) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15278//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15278//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15278//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15278//console This message is automatically generated. ReplicationSourceManager should be able to track multiple WAL paths --- Key: HBASE-6617 URL: https://issues.apache.org/jira/browse/HBASE-6617 Project: HBase Issue Type: Improvement Components: Replication Reporter: Ted Yu Assignee: Yu Li Fix For: 2.0.0, 1.3.0 Attachments: HBASE-6617.patch, HBASE-6617_v2.patch, HBASE-6617_v3.patch, HBASE-6617_v4.patch Currently ReplicationSourceManager uses logRolled() to receive notification about new HLog and remembers it in latestPath. When region server has multiple WAL support, we need to keep track of multiple Path's in ReplicationSourceManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14310) test-patch.sh should handle spurious non-zero exit code from maven
[ https://issues.apache.org/jira/browse/HBASE-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14714451#comment-14714451 ] Stephen Yuan Jiang commented on HBASE-14310: +1 - LGTM test-patch.sh should handle spurious non-zero exit code from maven -- Key: HBASE-14310 URL: https://issues.apache.org/jira/browse/HBASE-14310 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Attachments: 14310-v1.txt Starting last weekend, I saw patch testing abort due to spurious non-zero exit code from maven. Here are recent examples. https://builds.apache.org/job/PreCommit-HBASE-Build/15251/console : {quote} HBASE-14286 patch is being downloaded at Tue Aug 25 18:49:17 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12751767/HBASE-14286.1.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess > /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 2>&1 Trunk compilation is broken? \{code\}\{code\} {quote} https://builds.apache.org/job/PreCommit-HBASE-Build/15250/console : {quote} HBASE-14268 patch is being downloaded at Tue Aug 25 18:19:25 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12752280/14268-V5.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess > /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 2>&1 Trunk compilation is broken? \{code\}\{code\} {quote} The search in mvn output for 'Compilation failure' returned nothing. I verified locally that with 14268-V5.patch, master branch compiled. test-patch.sh should handle the spurious exit code so that patches can be tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14078) improve error message when HMaster can't bind to port
[ https://issues.apache.org/jira/browse/HBASE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14714498#comment-14714498 ] Hudson commented on HBASE-14078: SUCCESS: Integrated in HBase-1.2-IT #114 (See [https://builds.apache.org/job/HBase-1.2-IT/114/]) HBASE-14078 improve error message when HMaster can't bind to port (stack: rev 180e8b8fd68f7d6181bbca17183f55bed2fd844f) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java improve error message when HMaster can't bind to port - Key: HBASE-14078 URL: https://issues.apache.org/jira/browse/HBASE-14078 Project: HBase Issue Type: Improvement Components: master Affects Versions: 2.0.0 Reporter: Sean Busbey Assignee: Matt Warhaftig Labels: beginner Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: hbase-14078_post_stack.txt, hbase-14708-v1.patch, hbase-14708-v2.patch, hbase-14708-v3.patch, hbase-14708-v3.patch When the master fails to start because hbase.master.port is already taken, the log messages could make it easier to tell. 
{quote} 2015-07-14 13:10:02,667 INFO [main] regionserver.RSRpcServices: master/master01.example.com/10.20.188.121:16000 server-side HConnection retries=350 2015-07-14 13:10:02,879 INFO [main] ipc.SimpleRpcScheduler: Using deadline as user call queue, count=3 2015-07-14 13:10:02,895 ERROR [main] master.HMasterCommandLine: Master exiting java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2258) at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:234) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2272) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:444) at sun.nio.ch.Net.bind(Net.java:436) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at org.apache.hadoop.hbase.ipc.RpcServer.bind(RpcServer.java:2513) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.init(RpcServer.java:599) at org.apache.hadoop.hbase.ipc.RpcServer.init(RpcServer.java:2000) at org.apache.hadoop.hbase.regionserver.RSRpcServices.init(RSRpcServices.java:919) at org.apache.hadoop.hbase.master.MasterRpcServices.init(MasterRpcServices.java:211) at org.apache.hadoop.hbase.master.HMaster.createRpcServices(HMaster.java:509) at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:535) at org.apache.hadoop.hbase.master.HMaster.init(HMaster.java:351) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2253) ... 5 more {quote} I recognize that the RSRpcServices log message shows port 16000, but I don't know why a new operator would. Additionally, it'd be nice to tell them that the port is controlled by {{hbase.master.port}}. Maybe give a hint on how to see what's using the port. Could be too os-dist specific? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
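The request above (name the address, point at {{hbase.master.port}}, hint at checking what holds the port) can be illustrated with a minimal sketch. This is not the committed HBASE-14078 patch; the class BindHint, the method bindWithHint and the wording of the message are hypothetical and only show the general idea of wrapping the BindException with more context.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not HBase code: it only illustrates the kind of
// message the issue asks for.
final class BindHint {
    /**
     * Binds a server socket to addr. On failure, rethrows a BindException
     * that names the address and the configuration key controlling the port.
     */
    static ServerSocket bindWithHint(InetSocketAddress addr, String configKey)
            throws IOException {
        ServerSocket socket = new ServerSocket();
        try {
            socket.bind(addr);
            return socket;
        } catch (BindException e) {
            socket.close();
            BindException hinted = new BindException(
                "Failed to bind to " + addr + ": " + e.getMessage()
                + ". The port is set by '" + configKey
                + "'; check whether another process is already using it.");
            // Keep the original exception so the full stack trace is not lost.
            hinted.initCause(e);
            throw hinted;
        }
    }
}
```

Keeping the exception type as BindException means existing catch blocks still match; only the message becomes more useful to a new operator.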
[jira] [Updated] (HBASE-6617) ReplicationSourceManager should be able to track multiple WAL paths
[ https://issues.apache.org/jira/browse/HBASE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated HBASE-6617: - Attachment: HBASE-6617_v4.patch Upload v4 patch to sync-up with rb ReplicationSourceManager should be able to track multiple WAL paths --- Key: HBASE-6617 URL: https://issues.apache.org/jira/browse/HBASE-6617 Project: HBase Issue Type: Improvement Components: Replication Reporter: Ted Yu Assignee: Yu Li Fix For: 2.0.0, 1.3.0 Attachments: HBASE-6617.patch, HBASE-6617_v2.patch, HBASE-6617_v3.patch, HBASE-6617_v4.patch Currently ReplicationSourceManager uses logRolled() to receive notification about new HLog and remembers it in latestPath. When region server has multiple WAL support, we need to keep track of multiple Path's in ReplicationSourceManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
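To make the description above concrete: instead of a single latestPath, the manager would remember one latest path per WAL. The following is only a sketch under stated assumptions, not the HBASE-6617 patch. The class LatestWalPaths, the use of plain strings instead of Path, and the rule that a WAL's group is its file name up to the last '.' are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not HBase code: tracks the most recent WAL name
// per WAL group instead of a single latestPath.
final class LatestWalPaths {
    private final Map<String, String> latestByGroup = new ConcurrentHashMap<>();

    // Assumption for illustration: the group is the file name up to the
    // last '.', with the remainder being a roll timestamp.
    static String groupOf(String walName) {
        int dot = walName.lastIndexOf('.');
        return dot < 0 ? walName : walName.substring(0, dot);
    }

    /** Called when a WAL is rolled; replaces the latest entry for its group only. */
    void logRolled(String newWalName) {
        latestByGroup.put(groupOf(newWalName), newWalName);
    }

    /** Returns the latest WAL name seen for the group, or null if none. */
    String latest(String group) {
        return latestByGroup.get(group);
    }

    int groupCount() {
        return latestByGroup.size();
    }
}
```

Rolling one WAL then leaves the entries for the other WALs on the same region server untouched.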
[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14309: --- Fix Version/s: 1.3.0 2.0.0 Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 2.0.0, 1.3.0 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt, 14309-v4.txt, 14309-v5.txt This issue adds boolean parameter, force, to 'balancer' command so that admin can force region balancing even when there is region in transition - assuming RIT being transient. This enhancement was requested by some customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14312) Forward port some fixes from hbase-6721-0.98 to hbase-6721
[ https://issues.apache.org/jira/browse/HBASE-14312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14312: --- Labels: hbase-6721 (was: ) Forward port some fixes from hbase-6721-0.98 to hbase-6721 -- Key: HBASE-14312 URL: https://issues.apache.org/jira/browse/HBASE-14312 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Francis Liu Labels: hbase-6721 Attachments: HBASE-14312_hbase-6721.patch Some fixes were checked into hbase-6721-0.98 to address some testing failures and to resync with the internal Y! implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715327#comment-14715327 ] Hadoop QA commented on HBASE-14309: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12752513/14309-branch-1.1.txt against branch-1.1 branch at commit ff86749caeb63eafcf10cbfba45334757a791384. ATTACHMENT ID: 12752513 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15274//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15274//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15274//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15274//console This message is automatically generated. Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt, 14309-v4.txt This issue adds boolean parameter, force, to 'balancer' command so that admin can force region balancing even when there is region in transition - assuming RIT being transient. This enhancement was requested by some customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14317) Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL
[ https://issues.apache.org/jira/browse/HBASE-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14317: -- Attachment: subset.of.rs.log [Java] RS stuck on WAL sync to a dead DN - Pastebin.com.html Thread dump and subset of full RS log Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL - Key: HBASE-14317 URL: https://issues.apache.org/jira/browse/HBASE-14317 Project: HBase Issue Type: Bug Affects Versions: 1.1.1 Reporter: stack Attachments: [Java] RS stuck on WAL sync to a dead DN - Pastebin.com.html, subset.of.rs.log hbase-1.1.1 and hadoop-2.7.1 We try to roll logs because can't append (See HDFS-8960) but we get stuck. See attached thread dump and associated log. What is interesting is that syncers are waiting to take syncs to run and at same time we want to flush so we are waiting on a safe point but there seems to be nothing in our ring buffer; did we go to roll log and not add safe point sync to clear out ringbuffer? Needs a bit of study. Try to reproduce. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14309: --- Attachment: 14309-v5.txt See if patch v5 is better. Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt, 14309-v4.txt, 14309-v5.txt This issue adds boolean parameter, force, to 'balancer' command so that admin can force region balancing even when there is region in transition - assuming RIT being transient. This enhancement was requested by some customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14308) HTableDescriptor WARN is not actionable
[ https://issues.apache.org/jira/browse/HBASE-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715426#comment-14715426 ] Lars Francke commented on HBASE-14308: -- [~larsgeorge] [~stack] what do you think? HTableDescriptor WARN is not actionable --- Key: HBASE-14308 URL: https://issues.apache.org/jira/browse/HBASE-14308 Project: HBase Issue Type: Task Components: Usability Affects Versions: 2.0.0 Reporter: Nick Dimiduk Priority: Minor Labels: beginner Noticed this while testing another patch in standalone mode. I see warn lines like the following {noformat} 2015-08-25 14:19:47,057 WARN [1758008124@qtp-1276709283-0] hbase.HTableDescriptor: Use addCoprocessor* methods to add a coprocessor instead {noformat} This appears to come from {{HTableDescriptor#setValue(Bytes,Bytes)}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715433#comment-14715433 ] Ted Yu commented on HBASE-14309: I ran the same mvn command QA used with and without patch v5. Both passed. Will try QA bot later. Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 2.0.0, 1.3.0 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt, 14309-v4.txt, 14309-v5.txt This issue adds boolean parameter, force, to 'balancer' command so that admin can force region balancing even when there is region in transition - assuming RIT being transient. This enhancement was requested by some customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14315) Save one call to KeyValueHeap.peek per row
[ https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-14315: -- Attachment: 14315-0.98.txt Simple patch (for 0.98 for now). The observation is simple: We already peeked the current KV, _and_ already checked whether it's null. We can use that in the loop and peek a new value at the end. It's guaranteed to save one call to peek. Whether it's worth the slight decrease in readability is a different discussion. Save one call to KeyValueHeap.peek per row -- Key: HBASE-14315 URL: https://issues.apache.org/jira/browse/HBASE-14315 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Attachments: 14315-0.98.txt Another one of my micro optimizations. In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, which in my runs of scan heavy loads shows up at top. Based on the run and data this can save between 3 and 10% of runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
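The observation in the comment above can be shown with a small stand-alone sketch. It is not the attached 14315-0.98.txt patch and does not use StoreScanner or KeyValueHeap; PeekSketch, CountingHeap and both drain methods are hypothetical stand-ins that only count how often peek() is called in each loop shape.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in, not HBase code: compares two loop shapes over a
// heap-like source and counts the peek() calls each one makes.
final class PeekSketch {
    /** Minimal heap stand-in that records how many times peek() is called. */
    static final class CountingHeap {
        private final Deque<String> cells;
        int peeks = 0;

        CountingHeap(List<String> sortedCells) {
            this.cells = new ArrayDeque<>(sortedCells);
        }

        String peek() {
            peeks++;
            return cells.peekFirst();
        }

        void next() {
            cells.pollFirst();
        }
    }

    /** Peeks once for the null check, then peeks again at the top of every iteration. */
    static int drainPeekTwice(CountingHeap heap) {
        if (heap.peek() == null) {
            return 0;
        }
        int processed = 0;
        while (heap.peek() != null) {
            processed++;
            heap.next();
        }
        return processed;
    }

    /** Reuses the already-checked first peek and peeks a new value at the end of the loop. */
    static int drainReusePeek(CountingHeap heap) {
        String cell = heap.peek();
        if (cell == null) {
            return 0;
        }
        int processed = 0;
        do {
            processed++;
            heap.next();
            cell = heap.peek();
        } while (cell != null);
        return processed;
    }
}
```

For a non-empty source the second shape makes exactly one peek() call fewer than the first, which matches the "guaranteed to save one call" claim; the do/while form is where the readability cost comes from.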
[jira] [Updated] (HBASE-14305) TestHRegion.testWritesWhileGetting hangs during Unit Testing
[ https://issues.apache.org/jira/browse/HBASE-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated HBASE-14305: -- Summary: TestHRegion.testWritesWhileGetting hangs during Unit Testing (was: Deadlock observed in MVCC during Unit Testing) TestHRegion.testWritesWhileGetting hangs during Unit Testing Key: HBASE-14305 URL: https://issues.apache.org/jira/browse/HBASE-14305 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Yu Li As titled, this failure was reported in a UT check by HadoopQA; below is part of the jstack output: {noformat} main prio=10 tid=0x7fb77000a800 nid=0x5004 in Object.wait() [0x7fb778799000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x0007ee9a5260 (a java.util.LinkedList) at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.waitForPreviousTransactionsComplete(MultiVersionConcurrencyControl.java:224) - locked 0x0007ee9a5260 (a java.util.LinkedList) at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2254) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2061) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2026) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2016) at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1423) at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344) - locked 0x0007ee9c85e8 (a java.lang.Object) at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1295) at org.apache.hadoop.hbase.HBaseTestingUtility.closeRegionAndWAL(HBaseTestingUtility.java:352) at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:3999) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {noformat} It seems the thread waiting on waitQueue never got notified, which turned the test case into a zombie.
For the full jstack output, please refer to [this link|https://builds.apache.org/job/PreCommit-HBASE-Build/15244//consoleFull] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14315) Save one call to KeyValueHeap.peek per row
Lars Hofhansl created HBASE-14315: - Summary: Save one call to KeyValueHeap.peek per row Key: HBASE-14315 URL: https://issues.apache.org/jira/browse/HBASE-14315 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Another one of my micro optimizations. In StoreScanner.next(...) we can actually safe a call to KeyValueHeap.peek, which in my runs of scan heavy loads shows up at top. Based on the run and data this can safe between 3 and 10% of runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14315) Save one call to KeyValueHeap.peek per row
[ https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-14315: -- Description: Another one of my micro optimizations. In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, which in my runs of scan heavy loads shows up at top. Based on the run and data this can safe between 3 and 10% of runtime. was: Another one of my micro optimizations. In StoreScanner.next(...) we can actually safe a call to KeyValueHeap.peek, which in my runs of scan heavy loads shows up at top. Based on the run and data this can safe between 3 and 10% of runtime. Save one call to KeyValueHeap.peek per row -- Key: HBASE-14315 URL: https://issues.apache.org/jira/browse/HBASE-14315 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Another one of my micro optimizations. In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, which in my runs of scan heavy loads shows up at top. Based on the run and data this can safe between 3 and 10% of runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14305) TestHRegion.testWritesWhileGetting hangs during Unit Testing
[ https://issues.apache.org/jira/browse/HBASE-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712575#comment-14712575 ] Yu Li commented on HBASE-14305: --- Ok, agree, but it seems a dead wait and cause the case a zombie, not sure whether it's simply an occasional case or any potential bug there. Have updated the description accordingly. TestHRegion.testWritesWhileGetting hangs during Unit Testing Key: HBASE-14305 URL: https://issues.apache.org/jira/browse/HBASE-14305 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Yu Li As titled, this failure is reported in a UT check by HadoopQA, below is part of the jstack output: {noformat} main prio=10 tid=0x7fb77000a800 nid=0x5004 in Object.wait() [0x7fb778799000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x0007ee9a5260 (a java.util.LinkedList) at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.waitForPreviousTransactionsComplete(MultiVersionConcurrencyControl.java:224) - locked 0x0007ee9a5260 (a java.util.LinkedList) at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2254) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2061) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2026) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2016) at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1423) at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344) - locked 0x0007ee9c85e8 (a java.lang.Object) at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1295) at org.apache.hadoop.hbase.HBaseTestingUtility.closeRegionAndWAL(HBaseTestingUtility.java:352) at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:3999) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
{noformat} It seems the thread waiting on waitQueue never got notified, which turned the test case into a zombie. For the full jstack output, please refer to [this link|https://builds.apache.org/job/PreCommit-HBASE-Build/15244//consoleFull] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13212) Procedure V2 - master Create/Modify/Delete namespace
[ https://issues.apache.org/jira/browse/HBASE-13212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712576#comment-14712576 ] Hadoop QA commented on HBASE-13212: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12752380/HBASE-13212.v3-master.patch against master branch at commit 506726ed2832b069602c6b7e2ccd5ec9a81013a6. ATTACHMENT ID: 12752380 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 12 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 9 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15266//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15266//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15266//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15266//console This message is automatically generated. Procedure V2 - master Create/Modify/Delete namespace Key: HBASE-13212 URL: https://issues.apache.org/jira/browse/HBASE-13212 Project: HBase Issue Type: Sub-task Components: master Affects Versions: 2.0.0 Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Labels: reliability Attachments: HBASE-13212.v1-master.patch, HBASE-13212.v2-master.patch, HBASE-13212.v3-master.patch Original Estimate: 168h Remaining Estimate: 168h master side, part of HBASE-12439 starts up the procedure executor on the master and replaces the create/modify/delete namespace handlers with the procedure version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12298) Support BB usage in PrefixTree
[ https://issues.apache.org/jira/browse/HBASE-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712573#comment-14712573 ] stack commented on HBASE-12298: --- Maybe this run? https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15247/console Didn't apply. We are not reporting back to JIRA when the compile fails. Support BB usage in PrefixTree -- Key: HBASE-12298 URL: https://issues.apache.org/jira/browse/HBASE-12298 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan Attachments: HBASE-12298.patch, HBASE-12298_1.patch, HBASE-12298_2.patch, HBASE-12298_3.patch, HBASE-12298_4.patch, HBASE-12298_4.patch, HBASE-12298_4.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14269) FuzzyRowFilter omits certain rows when multiple fuzzy key exist
[ https://issues.apache.org/jira/browse/HBASE-14269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712582#comment-14712582 ] hongbin ma commented on HBASE-14269: As [~vrodionov] suggested, I performed another test with TestFuzzyRowFilterEndToEnd:

50 fuzzykeys:
|| ||Pre HBASE-13761||HBASE-14269||
|runTest1's first run|180ms|204ms|
|runTest1's second run|82ms|62ms|
|runTest2's first run|209ms|230ms|
|runTest2's second run|92ms|109ms|

100 fuzzykeys:
|| ||Pre HBASE-13761||HBASE-14269||
|runTest1's first run|183ms|177ms|
|runTest1's second run|82ms|56ms|
|runTest2's first run|218ms|214ms|
|runTest2's second run|98ms|107ms|

500 fuzzykeys:
|| ||Pre HBASE-13761||HBASE-14269||
|runTest1's first run|184ms|192ms|
|runTest1's second run|72ms|61ms|
|runTest2's first run|260ms|226ms|
|runTest2's second run|127ms|101ms|

Unfortunately, I don't think the post HBASE-13761 optimizations have boosted performance very much. I am not in a position to profile a very large dataset; [~vrodionov], will you please share your numbers? Despite the bad news, the bug in HBASE-13761 still needs to be fixed. The new approach does not seem to degrade performance compared with pre HBASE-13761. And as the number of fuzzykeys increases, HBASE-14269 tends to be faster than pre HBASE-13761. So I think it is okay to commit this patch.

Suggestions for FuzzyRowFilter users: FuzzyRowFilter is good when you have a handful of fuzzy filters. When the number of fuzzy filters grows out of control (in Apache Kylin we have seen user queries that ended up using more than 10 fuzzy filters), it will normally bring more performance issues than benefits.
FuzzyRowFilter omits certain rows when multiple fuzzy key exist --- Key: HBASE-14269 URL: https://issues.apache.org/jira/browse/HBASE-14269 Project: HBase Issue Type: Bug Components: Filters Reporter: hongbin ma Assignee: hongbin ma Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3 Attachments: HBASE-14269-v1.patch, HBASE-14269-v2.patch, HBASE-14269.patch https://issues.apache.org/jira/browse/HBASE-13761 introduced a RowTracker in FuzzyRowFilter to avoid performing getNextForFuzzyRule() for each fuzzy key on each getNextCellHint() by maintaining a list of possible row matches for each fuzzy key. The implementation assumes that the prepared rows will be matched one by one, so it removes the first row in the list as soon as it is used. However, this approach may lead to omitting rows in some cases: Consider a case where we have two fuzzy keys: 1?1 2?2 and the data is like: 000 111 112 121 122 211 212 when the first row 000 fails to match, RowTracker will update possible row matches with cell 000 and fuzzy keys 1?1,2?2. This will populate RowTracker with 101 and 202. Then 101 is popped out of RowTracker, hint the scanner to go to row 101. The scanner will get 111 and find it is a match, and continued to find that 112 is not a match, getNextCellHint will be called again. Then comes the bug: Row 101 has been removed out of RowTracker, so RowTracker will jump to 202. As you see row 121 will be omitted, but it is actually a match for fuzzy key 1?1. I will illustrate the bug by adding a new test case in TestFuzzyRowFilterEndToEnd. Also I will provide the bug fix in my patch. The idea of the new solution is to maintain a priority queue for all the possible match rows for each fuzzy key, and whenever getNextCellHint is called, the elements in the queue that are smaller than the parameter currentCell will be updated(and re-insert into the queue). The head of queue will always be the Next cell hint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
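The priority-queue idea in the description above can be sketched in a few lines. This is a simplified model and not the patch itself: rows are assumed to be fixed-length digit strings, a fuzzy key is a string with '?' as the wildcard, and the names FuzzyHintSketch, Tracker, nextMatchAtOrAfter and scan are hypothetical. The real FuzzyRowFilter works on byte arrays and masks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableSet;
import java.util.PriorityQueue;
import java.util.TreeSet;

// Simplified model of the approach described in HBASE-14269 (all names are hypothetical).
class FuzzyHintSketch {

    // True if row agrees with pattern at every non-wildcard position.
    static boolean matches(String row, String pattern) {
        for (int i = 0; i < pattern.length(); i++) {
            char p = pattern.charAt(i);
            if (p != '?' && p != row.charAt(i)) return false;
        }
        return true;
    }

    // Smallest digit string >= row that matches pattern, or null if there is none.
    static String nextMatchAtOrAfter(String row, String pattern) {
        int n = pattern.length();
        int ok = 0; // length of the longest prefix of row that is consistent with pattern
        while (ok < n && (pattern.charAt(ok) == '?' || pattern.charAt(ok) == row.charAt(ok))) ok++;
        if (ok == n) return row;
        // Keep the longest possible prefix of row, bump one position, minimize the rest.
        for (int i = ok; i >= 0; i--) {
            char p = pattern.charAt(i), r = row.charAt(i), c;
            if (p == '?') {
                if (r == '9') continue;
                c = (char) (r + 1);
            } else {
                if (p <= r) continue;
                c = p;
            }
            StringBuilder sb = new StringBuilder(row.substring(0, i)).append(c);
            for (int j = i + 1; j < n; j++) sb.append(pattern.charAt(j) == '?' ? '0' : pattern.charAt(j));
            return sb.toString();
        }
        return null;
    }

    // One queue entry per fuzzy key: {candidate row, pattern}, ordered by candidate row.
    static final class Tracker {
        private final PriorityQueue<String[]> queue =
                new PriorityQueue<>(Comparator.comparing((String[] e) -> e[0]));

        Tracker(List<String> patterns, String start) {
            for (String p : patterns) {
                String c = nextMatchAtOrAfter(start, p);
                if (c != null) queue.add(new String[] {c, p});
            }
        }

        // Entries smaller than currentRow are refreshed and re-inserted, never just dropped.
        String nextHint(String currentRow) {
            while (!queue.isEmpty() && queue.peek()[0].compareTo(currentRow) < 0) {
                String[] stale = queue.poll();
                String c = nextMatchAtOrAfter(currentRow, stale[1]);
                if (c != null) queue.add(new String[] {c, stale[1]});
            }
            return queue.isEmpty() ? null : queue.peek()[0];
        }
    }

    // Simulates a scanner that seeks to the hinted row whenever the current row fails to match.
    static List<String> scan(NavigableSet<String> rows, List<String> patterns) {
        List<String> out = new ArrayList<>();
        if (rows.isEmpty()) return out;
        Tracker tracker = new Tracker(patterns, rows.first());
        String row = rows.first();
        while (row != null) {
            boolean hit = false;
            for (String p : patterns) hit |= matches(row, p);
            if (hit) {
                out.add(row);
                row = rows.higher(row);
            } else {
                String hint = tracker.nextHint(row);
                row = hint == null ? null : rows.ceiling(hint);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableSet<String> rows =
                new TreeSet<>(Arrays.asList("000", "111", "112", "121", "122", "211", "212"));
        System.out.println(scan(rows, Arrays.asList("1?1", "2?2"))); // prints [111, 121, 212]
    }
}
```

With the data from the description, the entry for 1?1 is refreshed from 101 to 121 when the scanner reaches 112, so row 121 is no longer skipped.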
[jira] [Updated] (HBASE-14305) Deadlock observed in MVCC during Unit Testing
[ https://issues.apache.org/jira/browse/HBASE-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated HBASE-14305: -- Description: As titled, this failure is reported in a UT check by HadoopQA, below is part of the jstack output: {noformat} main prio=10 tid=0x7fb77000a800 nid=0x5004 in Object.wait() [0x7fb778799000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x0007ee9a5260 (a java.util.LinkedList) at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.waitForPreviousTransactionsComplete(MultiVersionConcurrencyControl.java:224) - locked 0x0007ee9a5260 (a java.util.LinkedList) at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2254) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2061) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2026) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2016) at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1423) at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344) - locked 0x0007ee9c85e8 (a java.lang.Object) at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1295) at org.apache.hadoop.hbase.HBaseTestingUtility.closeRegionAndWAL(HBaseTestingUtility.java:352) at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:3999) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {noformat} It seems waiting on waitQueue never got notified and cause the case a zombie Full jstack output please refer to [this link|https://builds.apache.org/job/PreCommit-HBASE-Build/15244//consoleFull] was: As titled, this failure is reported in a UT check by HadoopQA, below is part of the jstack output: {noformat} main prio=10 tid=0x7fb77000a800 nid=0x5004 in Object.wait() [0x7fb778799000] java.lang.Thread.State: WAITING (on 
object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x0007ee9a5260 (a java.util.LinkedList) at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.waitForPreviousTransactionsComplete(MultiVersionConcurrencyControl.java:224) - locked 0x0007ee9a5260 (a java.util.LinkedList) at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2254) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2061) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2026) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2016) at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1423) at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1344) - locked 0x0007ee9c85e8 (a java.lang.Object) at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1295) at org.apache.hadoop.hbase.HBaseTestingUtility.closeRegionAndWAL(HBaseTestingUtility.java:352) at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:3999) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {noformat} It seems we are waiting on the same waitQueue object after got its lock. 
Full jstack output please refer to [this link|https://builds.apache.org/job/PreCommit-HBASE-Build/15244//consoleFull] Deadlock observed in MVCC during Unit Testing - Key: HBASE-14305 URL: https://issues.apache.org/jira/browse/HBASE-14305 Project: HBase Issue Type: Bug Affects Versions: 2.0.0 Reporter: Yu Li As titled, this failure is reported in a UT check by HadoopQA, below is part of the jstack output: {noformat} main prio=10 tid=0x7fb77000a800 nid=0x5004 in Object.wait() [0x7fb778799000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x0007ee9a5260 (a java.util.LinkedList) at org.apache.hadoop.hbase.regionserver.MultiVersionConcurrencyControl.waitForPreviousTransactionsComplete(MultiVersionConcurrencyControl.java:224) - locked 0x0007ee9a5260 (a java.util.LinkedList) at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2254) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2061) at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2026) at
[jira] [Commented] (HBASE-14307) Incorrect use of positional read api in HFileBlock
[ https://issues.apache.org/jira/browse/HBASE-14307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712608#comment-14712608 ] Shradha Revankar commented on HBASE-14307: -- Yes, but the positional read api is not guaranteed to read as many as 'size' bytes in one call. Shouldn't there be a loop that keeps reading until at least 'size' bytes have been read? We tried running hbase with WebhdfsFilesystem (with a server implementation that sets the http Transfer-encoding header to chunked encoding, so no content-length is present), and the positional read api reads only the first chunk, which is far smaller than the size. Unless there is a loop, the rest of the bytes are not read. We ended up getting errors like this: Caused by: java.io.IOException: Positional read of 16425 bytes failed at offset 4132767 (returned 26) at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1322) Incorrect use of positional read api in HFileBlock -- Key: HBASE-14307 URL: https://issues.apache.org/jira/browse/HBASE-14307 Project: HBase Issue Type: Bug Reporter: Shradha Revankar Priority: Minor Considering that {{read()}} is not guaranteed to read all bytes, I'm interested to understand this particular piece of code and why is partial read treated as an error : https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java#L1446-L1450 Particularly, if hbase were to use a different filesystem, say WebhdfsFileSystem, this would not work, please also see https://issues.apache.org/jira/browse/HDFS-8943 for discussion around this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14306) Refine RegionGroupingProvider: fix issues and make it more scalable
[ https://issues.apache.org/jira/browse/HBASE-14306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715159#comment-14715159 ] Yu Li commented on HBASE-14306: --- Sorry, wrong link above, should be [this one|https://reviews.apache.org/r/37762/] Refine RegionGroupingProvider: fix issues and make it more scalable --- Key: HBASE-14306 URL: https://issues.apache.org/jira/browse/HBASE-14306 Project: HBase Issue Type: Improvement Components: wal Reporter: Yu Li Assignee: Yu Li Attachments: HBASE-14306.patch, HBASE-14306_v2.patch There're multiple issues in RegionGroupingProvider, including: * The provider cache in it is using byte array as the key of ConcurrentHashMap, which is not right (the reason is [here|http://stackoverflow.com/questions/1058149/using-a-byte-array-as-hashmap-key-java]) * It's using IdentityGroupingStrategy to get group and use it as key of the cache, which means the cache will include an entry for each region. This is especially unnecessary when using BoundedRegionGroupingProvider Besides fixing the above issues, I suggest to change BoundedRegionGroupingProvider from a *provider* to a pluggable *strategy*, which will make the whole picture much more clear. For more details, please refer to the patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
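The first issue in the list above (a byte array used as a ConcurrentHashMap key) can be reproduced with plain JDK classes. The sketch below is only an illustration: the class name ByteArrayKeySketch and the values "region-a" and "wal-1" are made up, and wrapping the key in a ByteBuffer is just one possible workaround, not necessarily what the patch does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only: byte[] inherits identity-based equals/hashCode from Object.
class ByteArrayKeySketch {

    // Looks up a cache keyed by raw byte[] using an equal-content but distinct array.
    static boolean foundWithRawKey() {
        ConcurrentHashMap<byte[], String> cache = new ConcurrentHashMap<>();
        cache.put("region-a".getBytes(StandardCharsets.UTF_8), "wal-1");
        return cache.containsKey("region-a".getBytes(StandardCharsets.UTF_8));
    }

    // Same lookup, but the key is wrapped so equals/hashCode compare the content.
    static boolean foundWithWrappedKey() {
        ConcurrentHashMap<ByteBuffer, String> cache = new ConcurrentHashMap<>();
        cache.put(ByteBuffer.wrap("region-a".getBytes(StandardCharsets.UTF_8)), "wal-1");
        return cache.containsKey(ByteBuffer.wrap("region-a".getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        System.out.println(foundWithRawKey());     // prints false
        System.out.println(foundWithWrappedKey()); // prints true
    }
}
```

With raw byte[] keys every lookup that uses a freshly built array misses, so a cache keyed this way keeps adding entries instead of reusing them.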
[jira] [Commented] (HBASE-14307) Incorrect use of positional read api in HFileBlock
[ https://issues.apache.org/jira/browse/HBASE-14307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715223#comment-14715223 ] Chris Nauroth commented on HBASE-14307: --- [~ram_krish] and [~anoop.hbase], thank you for your replies. Just to add a little more context on Shradha's information, the contract of [{{FSDataInputStream}}|http://hadoop.apache.org/docs/r2.7.1/api/org/apache/hadoop/fs/PositionedReadable.html] guarantees that it will read up to the specified number of bytes. It does not guarantee that it will read exactly that number of bytes. Even if the backing file contains that many bytes remaining, a file system implementation may choose to return immediately after reading any bytes that are already available in buffer space, without forcing the caller to block waiting for the exact number of bytes to be available. This is similar to the contract of [{{java.io.InputStream}}|http://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html], which states that a smaller number may be read. bq. In many places we use IOUtils.readFully API which in turn will use loop. I dont think there is some reason why we should not use that here as well. That sounds like exactly the kind of change we had in mind. It sounds like the intent of this logic in HBase is to detect premature EOF, not necessarily demand that a single read call completes the entire operation. Thank you both! 
Incorrect use of positional read api in HFileBlock -- Key: HBASE-14307 URL: https://issues.apache.org/jira/browse/HBASE-14307 Project: HBase Issue Type: Bug Reporter: Shradha Revankar Priority: Minor Considering that {{read()}} is not guaranteed to read all bytes, I'm interested to understand this particular piece of code and why is partial read treated as an error : https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java#L1446-L1450 Particularly, if hbase were to use a different filesystem, say WebhdfsFileSystem, this would not work, please also see https://issues.apache.org/jira/browse/HDFS-8943 for discussion around this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
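The loop discussed in this thread can be sketched as follows. This is a self-contained model, not HBase or Hadoop code: PositionedSource is a hypothetical stand-in whose read method mirrors the shape of Hadoop's PositionedReadable.read(long, byte[], int, int), and chunked() fakes a server that returns at most 26 bytes per call, like the error quoted earlier in the thread.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical sketch of "loop until all requested bytes are read, fail only on EOF".
class PositionalReadSketch {

    // Stand-in for a positional read API: may return fewer bytes than requested, -1 at EOF.
    interface PositionedSource {
        int read(long position, byte[] buf, int off, int len) throws IOException;
    }

    // Keeps issuing positional reads until len bytes have arrived; a short read is not an error.
    static void readFully(PositionedSource in, long position, byte[] buf, int off, int len)
            throws IOException {
        int done = 0;
        while (done < len) {
            int n = in.read(position + done, buf, off + done, len - done);
            if (n < 0) {
                throw new EOFException("Premature EOF: wanted " + len + " bytes at offset "
                        + position + ", got " + done);
            }
            done += n;
        }
    }

    // Test double that never returns more than maxChunk bytes per call.
    static PositionedSource chunked(byte[] data, int maxChunk) {
        return (position, buf, off, len) -> {
            if (position >= data.length) return -1;
            int n = Math.min(Math.min(len, maxChunk), data.length - (int) position);
            System.arraycopy(data, (int) position, buf, off, n);
            return n;
        };
    }

    // Reads 50 bytes at offset 10 from a 100-byte source that serves 26 bytes at a time.
    static byte[] demo() {
        byte[] data = new byte[100];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        byte[] out = new byte[50];
        try {
            readFully(chunked(data, 26), 10, out, 0, out.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

Only a negative return value is treated as a failure here, which matches the stated intent of detecting premature EOF rather than requiring a single call to return everything.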
[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14309: --- Attachment: (was: 14309-branch-1.txt) Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-v1.txt, 14309-v2.txt, 14309-v3.txt This issue adds boolean parameter, force, to 'balancer' command so that admin can force region balancing even when there is region in transition - assuming RIT being transient. This enhancement was requested by some customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14310) test-patch.sh should handle spurious non-zero exit code from maven
[ https://issues.apache.org/jira/browse/HBASE-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14714490#comment-14714490 ] stack commented on HBASE-14310: --- bq. Starting last weekend, I saw patch testing abort due to spurious non-zero exit code from maven. What changed that brought on this behavior? test-patch.sh should handle spurious non-zero exit code from maven -- Key: HBASE-14310 URL: https://issues.apache.org/jira/browse/HBASE-14310 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 2.0.0 Attachments: 14310-v1.txt Starting last weekend, I saw patch testing abort due to spurious non-zero exit code from maven. Here are recent examples. https://builds.apache.org/job/PreCommit-HBASE-Build/15251/console : {quote} HBASE-14286 patch is being downloaded at Tue Aug 25 18:49:17 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12751767/HBASE-14286.1.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 21 Trunk compilation is broken? \{code\}\{code\} {quote} https://builds.apache.org/job/PreCommit-HBASE-Build/15250/console : {quote} HBASE-14268 patch is being downloaded at Tue Aug 25 18:19:25 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12752280/14268-V5.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 21 Trunk compilation is broken? \{code\}\{code\} {quote} The search in mvn output for 'Compilation failure' returned nothing. I verified locally that with 14268-V5.patch, master branch compiled. test-patch.sh should handle the spurious exit code so that patches can be tested. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14309: --- Attachment: 14309-branch-1.txt Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-branch-1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt This issue adds boolean parameter, force, to 'balancer' command so that admin can force region balancing even when there is region in transition - assuming RIT being transient. This enhancement was requested by some customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12751) Allow RowLock to be reader writer
[ https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12751: -- Attachment: 12751v23.txt v23 failed compile here: https://builds.apache.org/view/H-L/view/HBase/job/PreCommit-HBASE-Build/15270/console I tried the script local and it 'works'. Retry Allow RowLock to be reader writer - Key: HBASE-12751 URL: https://issues.apache.org/jira/browse/HBASE-12751 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 2.0.0, 1.3.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.3.0 Attachments: 12751v22.txt, 12751v23.txt, 12751v23.txt, HBASE-12751-v1.patch, HBASE-12751-v10.patch, HBASE-12751-v10.patch, HBASE-12751-v11.patch, HBASE-12751-v12.patch, HBASE-12751-v13.patch, HBASE-12751-v14.patch, HBASE-12751-v15.patch, HBASE-12751-v16.patch, HBASE-12751-v17.patch, HBASE-12751-v18.patch, HBASE-12751-v19 (1).patch, HBASE-12751-v19.patch, HBASE-12751-v2.patch, HBASE-12751-v20.patch, HBASE-12751-v20.patch, HBASE-12751-v21.patch, HBASE-12751-v3.patch, HBASE-12751-v4.patch, HBASE-12751-v5.patch, HBASE-12751-v6.patch, HBASE-12751-v7.patch, HBASE-12751-v8.patch, HBASE-12751-v9.patch, HBASE-12751.patch Right now every write operation grabs a row lock. This is to prevent values from changing during a read modify write operation (increment or check and put). However it limits parallelism in several different scenarios. If there are several puts to the same row but different columns or stores then this is very limiting. If there are puts to the same column then mvcc number should ensure a consistent ordering. So locking is not needed. However locking for check and put or increment is still needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
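One way to picture a reader/writer row lock is sketched below. It is an assumption-laden toy, not the attached patch: RowLockSketch and its fields are hypothetical, plain puts take the shared side of a per-row ReentrantReadWriteLock, and read-modify-write operations such as increment take the exclusive side. In HBase, ordering between concurrent puts would come from mvcc, which this toy does not model.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: shared lock for plain puts, exclusive lock for read-modify-write.
class RowLockSketch {
    private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Long> cells = new ConcurrentHashMap<>();

    private ReentrantReadWriteLock lockFor(String row) {
        return locks.computeIfAbsent(row, r -> new ReentrantReadWriteLock());
    }

    // Plain put: many puts to the same row may hold the shared lock at once.
    void put(String row, String column, long value) {
        Lock shared = lockFor(row).readLock();
        shared.lock();
        try {
            cells.put(row + "/" + column, value);
        } finally {
            shared.unlock();
        }
    }

    // Increment: needs the row to itself so the value cannot change between read and write.
    long increment(String row, String column, long delta) {
        Lock exclusive = lockFor(row).writeLock();
        exclusive.lock();
        try {
            String key = row + "/" + column;
            long next = cells.getOrDefault(key, 0L) + delta;
            cells.put(key, next);
            return next;
        } finally {
            exclusive.unlock();
        }
    }

    public static void main(String[] args) {
        RowLockSketch region = new RowLockSketch();
        region.put("row1", "a", 5L);
        System.out.println(region.increment("row1", "a", 2L)); // prints 7
    }
}
```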
[jira] [Commented] (HBASE-6617) ReplicationSourceManager should be able to track multiple WAL paths
[ https://issues.apache.org/jira/browse/HBASE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14714485#comment-14714485 ] Yu Li commented on HBASE-6617: -- Hi [~zjushch], Thanks for the review. I've considered your point carefully, but I still think one replication source per wal group is the better way, for the reasons below:
1. W.r.t. the semantics of ReplicationSource, I believe the relationship between source and peer is many-to-one rather than one-to-one. One replication source stands for one kind of source, and no matter how many kinds of source there are, we need to replicate them all to the specified peer. Before multi wal, having only one kind of source was just a special case. Think about the heterogeneous storage implementation in HDFS: after it started supporting different kinds of disks, the block report granularity changed from node-level to disk-level. I think multiple wal is quite similar to that.
2. W.r.t. the business point of view, one wal group may stand for one business. In our scenario we created a grouping strategy based on namespace, which lets regions of the same business write into the same log group. In this case one source per group allows us to know the replication latency of each business, at the per regionserver/cluster level.
3. W.r.t. deleting the ReplicationSource instance, you can find the logic in ReplicationSourceManager#removePeer, where the source is terminated first and then removed from the source list.
4. W.r.t. source metrics, we will use peerId@groupId as the id, and when reporting, the metrics name will look like source.peerId@groupId.ageOfLastShippedOp; you can find the whole logic in the constructor of MetricsSource. If you'd still prefer a metrics collection that tracks, say, per regionserver level latency to one peer, we could add a MetricsReplicationPeerSourceSource similar to MetricsReplicationGlobalSourceSource for use with a strategy like the randomly bounded region group.
Feel free to let me know your thoughts. ReplicationSourceManager should be able to track multiple WAL paths --- Key: HBASE-6617 URL: https://issues.apache.org/jira/browse/HBASE-6617 Project: HBase Issue Type: Improvement Components: Replication Reporter: Ted Yu Assignee: Yu Li Fix For: 2.0.0, 1.3.0 Attachments: HBASE-6617.patch, HBASE-6617_v2.patch, HBASE-6617_v3.patch Currently ReplicationSourceManager uses logRolled() to receive notification about new HLog and remembers it in latestPath. When region server has multiple WAL support, we need to keep track of multiple Path's in ReplicationSourceManager -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14310) test-patch.sh should handle spurious non-zero exit code from maven
[ https://issues.apache.org/jira/browse/HBASE-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14310: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.0.0 Status: Resolved (was: Patch Available) test-patch.sh should handle spurious non-zero exit code from maven -- Key: HBASE-14310 URL: https://issues.apache.org/jira/browse/HBASE-14310 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 2.0.0 Attachments: 14310-v1.txt Starting last weekend, I saw patch testing abort due to spurious non-zero exit code from maven. Here are recent examples. https://builds.apache.org/job/PreCommit-HBASE-Build/15251/console : {quote} HBASE-14286 patch is being downloaded at Tue Aug 25 18:49:17 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12751767/HBASE-14286.1.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 21 Trunk compilation is broken? \{code\}\{code\} {quote} https://builds.apache.org/job/PreCommit-HBASE-Build/15250/console : {quote} HBASE-14268 patch is being downloaded at Tue Aug 25 18:19:25 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12752280/14268-V5.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 21 Trunk compilation is broken? \{code\}\{code\} {quote} The search in mvn output for 'Compilation failure' returned nothing. I verified locally that with 14268-V5.patch, master branch compiled. test-patch.sh should handle the spurious exit code so that patches can be tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14283) Reverse scan doesn’t work with HFile inline index/bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715147#comment-14715147 ] Ben Lau commented on HBASE-14283: - Hi Anoop. The problem isn't that we read a previous block and see that the block is not the expected type. prevBlockOffset guarantees that we can seek to the previous block of the same type as the current one. See the comments on HFileBlock.getPrevBlockOffset(). We are always seeking to the previous data block, we are simply not calculating how much to read correctly once we have seeked to that previous data block because our prev data block size calculation can include other blocks because of the layout of scannable section in HFileV2+. We need a way of knowing apriori what the size of the previous data block is. The method you describe is used in HFileReaderImpl.readNextDataBlock(). Note that the reason this method works is because this method can use the method curBlock.getNextBlockOnDiskSizeWithHeader(). We need something similar to that when seeking backwards in order to achieve optimal performance. Let me know if I misunderstood what you meant. Reverse scan doesn’t work with HFile inline index/bloom blocks -- Key: HBASE-14283 URL: https://issues.apache.org/jira/browse/HBASE-14283 Project: HBase Issue Type: Bug Reporter: Ben Lau Assignee: Ben Lau Attachments: HBASE-14283.patch, hfile-seek-before.patch Reverse scans do not work if an HFile contains inline bloom blocks or leaf level index blocks. The reason is because the seekBefore() call calculates the previous data block’s size by assuming data blocks are contiguous which is not the case in HFile V2 and beyond. 
Attached is a first cut patch (targeting bcef28eefaf192b0ad48c8011f98b8e944340da5 on trunk) which includes: (1) a unit test which exposes the bug and demonstrates failures for both inline bloom blocks and inline index blocks (2) a proposed fix for inline index blocks that does not require a new HFile version change, but is only performant for 1 and 2-level indexes and not 3+. 3+ requires an HFile format update for optimal performance. This patch does not fix the bloom filter blocks bug. But the fix should be similar to the case of inline index blocks. The reason I haven’t made the change yet is I want to confirm that you guys would be fine with me revising the HFile.Reader interface. Specifically, these 2 functions (getGeneralBloomFilterMetadata and getDeleteBloomFilterMetadata) need to return the BloomFilter. Right now the HFileReader class doesn’t have a reference to the bloom filters (and hence their indices) and only constructs the IO streams and hence has no way to know where the bloom blocks are in the HFile. It seems that the HFile.Reader bloom method comments state that they “know nothing about how that metadata is structured” but I do not know if that is a requirement of the abstraction (why?) or just an incidental current property. We would like to do 3 things with community approval: (1) Update the HFile.Reader interface and implementation to contain and return BloomFilters directly rather than unstructured IO streams (2) Merge the fixes for index blocks and bloom blocks into open source (3) Create a new Jira ticket for open source HBase to add a ‘prevBlockSize’ field in the block header in the next HFile version, so that seekBefore() calls can not only be correct but performant in all cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14283) Reverse scan doesn’t work with HFile inline index/bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715179#comment-14715179 ] ramkrishna.s.vasudevan commented on HBASE-14283: [~benlau] Let me take a look at this tomorrow morning my time. Reverse scan doesn’t work with HFile inline index/bloom blocks -- Key: HBASE-14283 URL: https://issues.apache.org/jira/browse/HBASE-14283 Project: HBase Issue Type: Bug Reporter: Ben Lau Assignee: Ben Lau Attachments: HBASE-14283.patch, hfile-seek-before.patch Reverse scans do not work if an HFile contains inline bloom blocks or leaf level index blocks. The reason is because the seekBefore() call calculates the previous data block’s size by assuming data blocks are contiguous which is not the case in HFile V2 and beyond. Attached is a first cut patch (targeting bcef28eefaf192b0ad48c8011f98b8e944340da5 on trunk) which includes: (1) a unit test which exposes the bug and demonstrates failures for both inline bloom blocks and inline index blocks (2) a proposed fix for inline index blocks that does not require a new HFile version change, but is only performant for 1 and 2-level indexes and not 3+. 3+ requires an HFile format update for optimal performance. This patch does not fix the bloom filter blocks bug. But the fix should be similar to the case of inline index blocks. The reason I haven’t made the change yet is I want to confirm that you guys would be fine with me revising the HFile.Reader interface. Specifically, these 2 functions (getGeneralBloomFilterMetadata and getDeleteBloomFilterMetadata) need to return the BloomFilter. Right now the HFileReader class doesn’t have a reference to the bloom filters (and hence their indices) and only constructs the IO streams and hence has no way to know where the bloom blocks are in the HFile. 
It seems that the HFile.Reader bloom method comments state that they “know nothing about how that metadata is structured” but I do not know if that is a requirement of the abstraction (why?) or just an incidental current property. We would like to do 3 things with community approval: (1) Update the HFile.Reader interface and implementation to contain and return BloomFilters directly rather than unstructured IO streams (2) Merge the fixes for index blocks and bloom blocks into open source (3) Create a new Jira ticket for open source HBase to add a ‘prevBlockSize’ field in the block header in the next HFile version, so that seekBefore() calls can not only be correct but performant in all cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14307) Incorrect use of positional read api in HFileBlock
[ https://issues.apache.org/jira/browse/HBASE-14307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715180#comment-14715180 ] Anoop Sam John commented on HBASE-14307: In many places we use the IOUtils.readFully API, which in turn uses a loop. I don't think there is any reason why we should not use that here as well. Incorrect use of positional read api in HFileBlock -- Key: HBASE-14307 URL: https://issues.apache.org/jira/browse/HBASE-14307 Project: HBase Issue Type: Bug Reporter: Shradha Revankar Priority: Minor Considering that {{read()}} is not guaranteed to read all bytes, I'd like to understand this particular piece of code and why a partial read is treated as an error: https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java#L1446-L1450 In particular, if HBase were to use a different filesystem, say WebhdfsFileSystem, this would not work; please also see https://issues.apache.org/jira/browse/HDFS-8943 for discussion around this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
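Editorial sketch for HBASE-14307: the "loop until all bytes arrive" idea from the comment above could look roughly like this. `PositionalSource` is a hypothetical stand-in whose `read` has the same shape as a positional read that may return fewer bytes than requested; this is not the HFileBlock code and not Hadoop's IOUtils implementation.

```java
import java.io.EOFException;
import java.io.IOException;

final class PositionalReadSketch {
    /** Hypothetical positional read source; a single call may return fewer bytes than requested. */
    interface PositionalSource {
        /** Returns the number of bytes read, or -1 at end of stream. */
        int read(long position, byte[] buffer, int offset, int length) throws IOException;
    }

    /**
     * Keeps issuing positional reads until exactly {@code length} bytes have arrived.
     * A short read is simply continued; only a premature end of stream is an error.
     * Note: a source that keeps returning 0 would make this loop spin, so a real
     * implementation may want a guard for that case.
     */
    static void readFully(PositionalSource in, long position, byte[] buffer, int offset, int length)
            throws IOException {
        int done = 0;
        while (done < length) {
            int n = in.read(position + done, buffer, offset + done, length - done);
            if (n < 0) {
                throw new EOFException("End of stream after " + done + " of " + length + " bytes");
            }
            done += n;
        }
    }
}
```

The design point is that a partial read is treated as progress, not failure, which is what makes the caller independent of how a particular filesystem chunks its responses.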
[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715229#comment-14715229 ] Ted Yu commented on HBASE-14309: Trying out on cluster, I got: {code} hbase(main):001:0> balancer ERROR: undefined method `length' for nil:NilClass Here is some help for this command: {code} I tried various approaches found through Google search, none of which worked. If I cannot figure out how to keep the balancer command backward compatible, I plan to introduce a new command. Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt This issue adds a boolean parameter, force, to the 'balancer' command so that an admin can force region balancing even when there are regions in transition, assuming the RIT is transient. This enhancement was requested by a customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14078) improve error message when HMaster can't bind to port
[ https://issues.apache.org/jira/browse/HBASE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14078: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 1.3.0 1.2.0 Status: Resolved (was: Patch Available) That test fails for me w/ the patch applied and without the patch applied. Pushed to master, branch-1 and branch-1.2 (since you filed original issue [~busbey]) Thanks for patch [~mwarhaftig] improve error message when HMaster can't bind to port - Key: HBASE-14078 URL: https://issues.apache.org/jira/browse/HBASE-14078 Project: HBase Issue Type: Improvement Components: master Affects Versions: 2.0.0 Reporter: Sean Busbey Assignee: Matt Warhaftig Labels: beginner Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: hbase-14078_post_stack.txt, hbase-14708-v1.patch, hbase-14708-v2.patch, hbase-14708-v3.patch, hbase-14708-v3.patch When the master fails to start because hbase.master.port is already taken, the log messages could make it easier to tell. 
{quote} 2015-07-14 13:10:02,667 INFO [main] regionserver.RSRpcServices: master/master01.example.com/10.20.188.121:16000 server-side HConnection retries=350 2015-07-14 13:10:02,879 INFO [main] ipc.SimpleRpcScheduler: Using deadline as user call queue, count=3 2015-07-14 13:10:02,895 ERROR [main] master.HMasterCommandLine: Master exiting java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2258) at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:234) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2272) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:444) at sun.nio.ch.Net.bind(Net.java:436) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at org.apache.hadoop.hbase.ipc.RpcServer.bind(RpcServer.java:2513) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.init(RpcServer.java:599) at org.apache.hadoop.hbase.ipc.RpcServer.init(RpcServer.java:2000) at org.apache.hadoop.hbase.regionserver.RSRpcServices.init(RSRpcServices.java:919) at org.apache.hadoop.hbase.master.MasterRpcServices.init(MasterRpcServices.java:211) at org.apache.hadoop.hbase.master.HMaster.createRpcServices(HMaster.java:509) at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:535) at org.apache.hadoop.hbase.master.HMaster.init(HMaster.java:351) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2253) ... 5 more {quote} I recognize that the RSRpcServices log message shows port 16000, but I don't know why a new operator would. Additionally, it'd be nice to tell them that the port is controlled by {{hbase.master.port}}. Maybe give a hint on how to see what's using the port. Could be too os-dist specific? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
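Editorial sketch for HBASE-14078: one way to make a bind failure self-explanatory is to catch the `BindException` and rethrow it with the address and the controlling configuration key in the message. The class and method below are hypothetical and are not the committed patch; the only name taken from the issue is the {{hbase.master.port}} key, which the caller passes in as a plain string.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

final class BindErrorSketch {
    /**
     * Binds a server socket and, on failure, rethrows a BindException whose message
     * names the address and the configuration key that controls the port.
     */
    static ServerSocket bind(InetSocketAddress address, String portConfigKey) throws IOException {
        ServerSocket socket = new ServerSocket();
        try {
            socket.bind(address);
            return socket;
        } catch (BindException e) {
            socket.close();
            BindException wrapped = new BindException(
                "Could not bind to " + address + ": " + e.getMessage()
                    + ". The port is configured by '" + portConfigKey
                    + "'; check whether another process is already using it.");
            // Keep the original exception so the low-level stack trace is not lost.
            wrapped.initCause(e);
            throw wrapped;
        }
    }
}
```

Keeping the exception type as `BindException` means existing callers that catch it keep working, while an operator reading the log sees both the port and the setting to change. A hint on how to find the process holding the port is left out here because, as the report notes, that would be OS specific.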
[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14309: --- Attachment: 14309-branch-1.1.txt Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt This issue adds a boolean parameter, force, to the 'balancer' command so that an admin can force region balancing even when there are regions in transition, assuming the RIT is transient. This enhancement was requested by a customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14310) test-patch.sh should handle spurious non-zero exit code from maven
[ https://issues.apache.org/jira/browse/HBASE-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14714493#comment-14714493 ] Ted Yu commented on HBASE-14310: I haven't had time to dig further. Previously the exit code was not logged, giving us fewer clues. test-patch.sh should handle spurious non-zero exit code from maven -- Key: HBASE-14310 URL: https://issues.apache.org/jira/browse/HBASE-14310 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 2.0.0 Attachments: 14310-v1.txt Starting last weekend, I saw patch testing abort due to spurious non-zero exit code from maven. Here are recent examples. https://builds.apache.org/job/PreCommit-HBASE-Build/15251/console : {quote} HBASE-14286 patch is being downloaded at Tue Aug 25 18:49:17 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12751767/HBASE-14286.1.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess > /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 2>&1 Trunk compilation is broken? \{code\}\{code\} {quote} https://builds.apache.org/job/PreCommit-HBASE-Build/15250/console : {quote} HBASE-14268 patch is being downloaded at Tue Aug 25 18:19:25 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12752280/14268-V5.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess > /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 2>&1 Trunk compilation is broken? \{code\}\{code\} {quote} The search in mvn output for 'Compilation failure' returned nothing. I verified locally that with 14268-V5.patch, master branch compiled. test-patch.sh should handle the spurious exit code so that patches can be tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14310) test-patch.sh should handle spurious non-zero exit code from maven
[ https://issues.apache.org/jira/browse/HBASE-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14714484#comment-14714484 ] Ted Yu commented on HBASE-14310: Thanks for the review, Stephen. test-patch.sh should handle spurious non-zero exit code from maven -- Key: HBASE-14310 URL: https://issues.apache.org/jira/browse/HBASE-14310 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 2.0.0 Attachments: 14310-v1.txt Starting last weekend, I saw patch testing abort due to spurious non-zero exit code from maven. Here are recent examples. https://builds.apache.org/job/PreCommit-HBASE-Build/15251/console : {quote} HBASE-14286 patch is being downloaded at Tue Aug 25 18:49:17 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12751767/HBASE-14286.1.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess > /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 2>&1 Trunk compilation is broken? \{code\}\{code\} {quote} https://builds.apache.org/job/PreCommit-HBASE-Build/15250/console : {quote} HBASE-14268 patch is being downloaded at Tue Aug 25 18:19:25 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12752280/14268-V5.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess > /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 2>&1 Trunk compilation is broken? \{code\}\{code\} {quote} The search in mvn output for 'Compilation failure' returned nothing. I verified locally that with 14268-V5.patch, master branch compiled. test-patch.sh should handle the spurious exit code so that patches can be tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13212) Procedure V2 - master Create/Modify/Delete namespace
[ https://issues.apache.org/jira/browse/HBASE-13212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-13212: --- Attachment: HBASE-13212.v1-branch-1.patch Procedure V2 - master Create/Modify/Delete namespace Key: HBASE-13212 URL: https://issues.apache.org/jira/browse/HBASE-13212 Project: HBase Issue Type: Sub-task Components: master Affects Versions: 2.0.0 Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Labels: reliability Attachments: HBASE-13212.v1-branch-1.patch, HBASE-13212.v1-master.patch, HBASE-13212.v2-master.patch, HBASE-13212.v3-master.patch Original Estimate: 168h Remaining Estimate: 168h master side, part of HBASE-12439 starts up the procedure executor on the master and replaces the create/modify/delete namespace handlers with the procedure version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14269) FuzzyRowFilter omits certain rows when multiple fuzzy keys exist
[ https://issues.apache.org/jira/browse/HBASE-14269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715178#comment-14715178 ] Hudson commented on HBASE-14269: FAILURE: Integrated in HBase-TRUNK #6757 (See [https://builds.apache.org/job/HBase-TRUNK/6757/]) HBASE-14269 FuzzyRowFilter omits certain rows when multiple fuzzy keys exist (hongbin ma) (tedyu: rev 6661f2d0254f1da9d8cbbd717274421a2ddcb95f) * hbase-client/src/main/java/org/apache/hadoop/hbase/filter/FuzzyRowFilter.java * hbase-server/src/test/java/org/apache/hadoop/hbase/filter/TestFuzzyRowFilterEndToEnd.java FuzzyRowFilter omits certain rows when multiple fuzzy keys exist Key: HBASE-14269 URL: https://issues.apache.org/jira/browse/HBASE-14269 Project: HBase Issue Type: Bug Components: Filters Reporter: hongbin ma Assignee: hongbin ma Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3 Attachments: HBASE-14269-v1.patch, HBASE-14269-v2.patch, HBASE-14269.patch https://issues.apache.org/jira/browse/HBASE-13761 introduced a RowTracker in FuzzyRowFilter to avoid performing getNextForFuzzyRule() for each fuzzy key on each getNextCellHint() by maintaining a list of possible row matches for each fuzzy key. The implementation assumes that the prepared rows will be matched one by one, so it removes the first row in the list as soon as it is used. However, this approach may lead to omitting rows in some cases: Consider a case where we have two fuzzy keys: 1?1 2?2 and the data is like: 000 111 112 121 122 211 212 When the first row 000 fails to match, RowTracker will update the possible row matches with cell 000 and fuzzy keys 1?1,2?2. This will populate RowTracker with 101 and 202. Then 101 is popped out of RowTracker, hinting the scanner to go to row 101. The scanner will get 111 and find it is a match, then find that 112 is not a match, so getNextCellHint will be called again. Then comes the bug: Row 101 has already been removed from RowTracker, so RowTracker will jump to 202. 
As you can see, row 121 will be omitted, but it is actually a match for fuzzy key 1?1. I will illustrate the bug by adding a new test case in TestFuzzyRowFilterEndToEnd. I will also provide the bug fix in my patch. The idea of the new solution is to maintain a priority queue of all the possible match rows for each fuzzy key; whenever getNextCellHint is called, the elements in the queue that are smaller than the parameter currentCell will be updated (and re-inserted into the queue). The head of the queue will always be the next cell hint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
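Editorial sketch for HBASE-14269: the priority-queue idea described above can be tried out on the example from the report. This is a simplified model, not the FuzzyRowFilter patch: rows and fuzzy keys are assumed to be equal-length strings of decimal digits with '?' as the wildcard, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class FuzzyHintSketch {
    /** A candidate row together with the fuzzy key that produced it. */
    private static final class Candidate implements Comparable<Candidate> {
        final String row;
        final String key;
        Candidate(String row, String key) { this.row = row; this.key = key; }
        @Override public int compareTo(Candidate other) { return row.compareTo(other.row); }
    }

    private final PriorityQueue<Candidate> queue = new PriorityQueue<>();

    FuzzyHintSketch(List<String> fuzzyKeys, String firstRow) {
        for (String key : fuzzyKeys) {
            offer(key, firstRow);
        }
    }

    private void offer(String key, String from) {
        String next = nextMatchAtOrAfter(key, from);
        if (next != null) {
            queue.add(new Candidate(next, key));
        }
    }

    /** Refreshes candidates that fell behind currentRow, then returns the smallest one (null if none). */
    String nextHint(String currentRow) {
        while (!queue.isEmpty() && queue.peek().row.compareTo(currentRow) < 0) {
            offer(queue.poll().key, currentRow);
        }
        return queue.isEmpty() ? null : queue.peek().row;
    }

    /** Smallest row >= {@code row} matching {@code key}, or null; digits only, '?' is the wildcard. */
    static String nextMatchAtOrAfter(String key, String row) {
        char[] out = row.toCharArray();
        for (int i = 0; i < key.length(); i++) {
            char k = key.charAt(i);
            if (k == '?' || k == out[i]) {
                continue;
            }
            int fillFrom = i;
            if (k < out[i]) {
                // The fixed digit is too small: bump the nearest earlier wildcard that can still grow.
                int j = i - 1;
                while (j >= 0 && (key.charAt(j) != '?' || out[j] == '9')) {
                    j--;
                }
                if (j < 0) {
                    return null;
                }
                out[j]++;
                fillFrom = j + 1;
            }
            // Fill the remainder with the smallest values the key allows.
            for (int p = fillFrom; p < out.length; p++) {
                out[p] = key.charAt(p) == '?' ? '0' : key.charAt(p);
            }
            return new String(out);
        }
        return row;
    }

    /** Simulates a scan over sorted rows: a row matches exactly when the hint equals the row itself. */
    static List<String> scan(List<String> sortedRows, List<String> fuzzyKeys) {
        List<String> matched = new ArrayList<>();
        if (sortedRows.isEmpty()) {
            return matched;
        }
        FuzzyHintSketch tracker = new FuzzyHintSketch(fuzzyKeys, sortedRows.get(0));
        int i = 0;
        while (i < sortedRows.size()) {
            String row = sortedRows.get(i);
            String hint = tracker.nextHint(row);
            if (hint == null) {
                break;
            }
            if (hint.equals(row)) {
                matched.add(row);
                i++;
            } else {
                while (i < sortedRows.size() && sortedRows.get(i).compareTo(hint) < 0) {
                    i++;
                }
            }
        }
        return matched;
    }
}
```

Calling `FuzzyHintSketch.scan` with the rows 000 111 112 121 122 211 212 and the keys 1?1 and 2?2 yields 111, 121 and 212: the candidate for 1?1 is recomputed from 112 to 121 instead of being discarded, so row 121 is no longer skipped.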
[jira] [Commented] (HBASE-14313) After a Connection sees ConnectionClosingException it never recovers
[ https://issues.apache.org/jira/browse/HBASE-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715683#comment-14715683 ] Hudson commented on HBASE-14313: FAILURE: Integrated in HBase-1.0 #1031 (See [https://builds.apache.org/job/HBase-1.0/1031/]) HBASE-14313 After a Connection sees ConnectionClosingException on a connection it never recovers (eclark: rev ea018af2ea1737291916240d054c5c7871bb57c0) * hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java After a Connection sees ConnectionClosingException it never recovers Key: HBASE-14313 URL: https://issues.apache.org/jira/browse/HBASE-14313 Project: HBase Issue Type: Bug Affects Versions: 1.2.0, 1.1.0.1 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Critical Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3 Attachments: HBASE-14313.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14232) Backwards compatiblity support for new MasterObserver APIs
[ https://issues.apache.org/jira/browse/HBASE-14232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14232: --- Attachment: HBASE-14232_hbase-6721_0.98.patch Backwards compatiblity support for new MasterObserver APIs -- Key: HBASE-14232 URL: https://issues.apache.org/jira/browse/HBASE-14232 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Francis Liu Labels: hbase-6721 Fix For: hbase-6721 Attachments: HBASE-14232_hbase-6721.patch, HBASE-14232_hbase-6721_0.98.patch The group assignment changes introduce new methods to the MasterObserver interface. This is a concern for things like Apache Phoenix. (See their IndexMasterObserver, etc.) We can handle this by using compatibility helpers that won't attempt to invoke the new APIs on MasterObservers that do not implement them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
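Editorial sketch for HBASE-14232: a compatibility helper of the kind described above could check, via reflection, whether an observer class actually provides a concrete implementation of a newly added hook before invoking it. The `Observer` interface, the `preGroupChange` hook and the helper names are all hypothetical; this is not HBase's MasterObserver and not the attached patch.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class ObserverCompatSketch {
    /** Hypothetical observer interface that gained a new hook; not HBase's MasterObserver. */
    interface Observer {
        void preGroupChange(String targetGroup);
    }

    /** Hypothetical observer that implements the new hook. */
    static final class GroupAwareObserver implements Observer {
        String lastGroup;
        @Override public void preGroupChange(String targetGroup) { lastGroup = targetGroup; }
    }

    /**
     * Returns true only if {@code type} exposes a public, non-abstract method with this signature.
     * A class compiled against an older interface would resolve to the interface's abstract method
     * and is therefore reported as not implementing the hook.
     */
    static boolean implementsHook(Class<?> type, String name, Class<?>... paramTypes) {
        try {
            Method m = type.getMethod(name, paramTypes);
            return !Modifier.isAbstract(m.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** Invokes the new hook only when the observer implements it; returns whether it was invoked. */
    static boolean callPreGroupChange(Observer observer, String targetGroup) {
        if (!implementsHook(observer.getClass(), "preGroupChange", String.class)) {
            return false;
        }
        observer.preGroupChange(targetGroup);
        return true;
    }
}
```

A real helper would probably cache the reflection result per observer class rather than look it up on every call.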
[jira] [Updated] (HBASE-14232) Backwards compatiblity support for new MasterObserver APIs
[ https://issues.apache.org/jira/browse/HBASE-14232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14232: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed to hbase-6721 and hbase-6721-0.98 Backwards compatiblity support for new MasterObserver APIs -- Key: HBASE-14232 URL: https://issues.apache.org/jira/browse/HBASE-14232 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Francis Liu Labels: hbase-6721 Fix For: hbase-6721 Attachments: HBASE-14232_hbase-6721.patch, HBASE-14232_hbase-6721_0.98.patch The group assignment changes introduce new methods to the MasterObserver interface. This is a concern for things like Apache Phoenix. (See their IndexMasterObserver, etc.) We can handle this by using compatibility helpers that won't attempt to invoke the new APIs on MasterObservers that do not implement them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13212) Procedure V2 - master Create/Modify/Delete namespace
[ https://issues.apache.org/jira/browse/HBASE-13212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715495#comment-14715495 ] Stephen Yuan Jiang commented on HBASE-13212: https://builds.apache.org/job/PreCommit-HBASE-Build/15277//testReport/ says {{0 failures (-3) , 28 skipped (+3) on 4,261 tests (+1471)}} Procedure V2 - master Create/Modify/Delete namespace Key: HBASE-13212 URL: https://issues.apache.org/jira/browse/HBASE-13212 Project: HBase Issue Type: Sub-task Components: master Affects Versions: 2.0.0 Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Labels: reliability Attachments: HBASE-13212.v1-branch-1.patch, HBASE-13212.v1-master.patch, HBASE-13212.v2-master.patch, HBASE-13212.v3-master.patch Original Estimate: 168h Remaining Estimate: 168h master side, part of HBASE-12439 starts up the procedure executor on the master and replaces the create/modify/delete namespace handlers with the procedure version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14268) Improve KeyLocker
[ https://issues.apache.org/jira/browse/HBASE-14268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715647#comment-14715647 ] Ted Yu commented on HBASE-14268: Please add javadoc for the following: {code} + public interface ObjectFactory<K, V> { {code} How were these default values chosen? {code} + * Creates a new pool with the default initial capacity (16) + * and the default concurrency level (16). {code} Some tests don't have a timeout parameter: {code} + @Test + public void testKeys() { {code} Please add a timeout parameter. Improve KeyLocker - Key: HBASE-14268 URL: https://issues.apache.org/jira/browse/HBASE-14268 Project: HBase Issue Type: Improvement Components: util Reporter: Hiroshi Ikeda Assignee: Hiroshi Ikeda Priority: Minor Attachments: 14268-V5.patch, HBASE-14268-V2.patch, HBASE-14268-V3.patch, HBASE-14268-V4.patch, HBASE-14268-V5.patch, HBASE-14268-V5.patch, HBASE-14268.patch, KeyLockerPerformance.java 1. The implementation of {{KeyLocker}} uses atomic variables inside a synchronized block, which doesn't make sense. Moreover, the logic inside the synchronized block is not trivial, which hurts performance in a heavily multi-threaded environment. 2. {{KeyLocker}} gives out an instance of {{ReentrantLock}} which is already locked, but it doesn't follow the contract of {{ReentrantLock}} because you are not allowed to freely invoke lock/unlock methods under that contract. That introduces a potential risk: whenever you see a variable of the type {{ReentrantLock}}, you have to pay attention to where the instance came from. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14313) After a Connection sees ConnectionClosingException it never recovers
[ https://issues.apache.org/jira/browse/HBASE-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715685#comment-14715685 ] Hudson commented on HBASE-14313: FAILURE: Integrated in HBase-1.2 #139 (See [https://builds.apache.org/job/HBase-1.2/139/]) HBASE-14313 After a Connection sees ConnectionClosingException on a connection it never recovers (eclark: rev 0a1f0cd66a0f090782726246d44d7e9c611abc68) * hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/RpcClientImpl.java After a Connection sees ConnectionClosingException it never recovers Key: HBASE-14313 URL: https://issues.apache.org/jira/browse/HBASE-14313 Project: HBase Issue Type: Bug Affects Versions: 1.2.0, 1.1.0.1 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Critical Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3 Attachments: HBASE-14313.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14309: --- Attachment: 14309-v5-branch-1.txt Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 2.0.0, 1.3.0 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt This issue adds a boolean parameter, force, to the 'balancer' command so that an admin can force region balancing even when there are regions in transition, assuming the RIT is transient. This enhancement was requested by a customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14169) API to refreshSuperUserGroupsConfiguration
[ https://issues.apache.org/jira/browse/HBASE-14169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715596#comment-14715596 ] Francis Liu commented on HBASE-14169: - I think we're on the same page, just different sentences. :-) {quote} Nope that's the public api of ProxyUsers. It's in the name of the method, in the parameters, and in the static un-extendable code. {quote} Agreed. Though I was talking about the provider: there's no guarantee it'll read from config. That the provider itself is read from config is clear. {quote} If there's some company specific ImpersonationProvider that does different things then having ProxyUsers.refreshSuperUserGroupsConfiguration tied to reloading config won't be harmful at all. {quote} Agreed. My point is that it is clunky; even HDFS has a separate cli and client api to refresh the super user configuration. Having said that, would you still like the patch to be changed to be part of a refresh configuration call? How do you suggest we do this for 1.x? Are we backporting the refresh framework? [~mbertozzi] [~apurtell] Just confirming these changes are ok with you guys as well? API to refreshSuperUserGroupsConfiguration -- Key: HBASE-14169 URL: https://issues.apache.org/jira/browse/HBASE-14169 Project: HBase Issue Type: New Feature Reporter: Francis Liu Assignee: Francis Liu Attachments: HBASE-14169.patch, HBASE-14169_2.patch, HBASE-14169_3.patch For deployments that use security, user impersonation (AKA doAs()) is needed for some services (i.e. Stargate, thriftserver, Oozie, etc). Impersonation definitions are defined in an xml config file and read and cached by the ProxyUsers class. Calling this api will refresh cached information, eliminating the need to restart the master/regionserver whenever the configuration is changed. The implementation just adds another method to AccessControlService. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14232) Backwards compatiblity support for new MasterObserver APIs
[ https://issues.apache.org/jira/browse/HBASE-14232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715445#comment-14715445 ] Andrew Purtell commented on HBASE-14232: +1 Small change, I'll apply to both branches, no need to provide a 0.98 patch. Backwards compatiblity support for new MasterObserver APIs -- Key: HBASE-14232 URL: https://issues.apache.org/jira/browse/HBASE-14232 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Francis Liu Labels: hbase-6721 Fix For: hbase-6721 Attachments: HBASE-14232_hbase-6721.patch The group assignment changes introduce new methods to the MasterObserver interface. This is a concern for things like Apache Phoenix. (See their IndexMasterObserver, etc.) We can handle this by using compatibility helpers that won't attempt to invoke the new APIs on MasterObservers that do not implement them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14232) Backwards compatiblity support for new MasterObserver APIs
[ https://issues.apache.org/jira/browse/HBASE-14232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-14232: --- Fix Version/s: hbase-6721 Backwards compatiblity support for new MasterObserver APIs -- Key: HBASE-14232 URL: https://issues.apache.org/jira/browse/HBASE-14232 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Francis Liu Labels: hbase-6721 Fix For: hbase-6721 Attachments: HBASE-14232_hbase-6721.patch The group assignment changes introduce new methods to the MasterObserver interface. This is a concern for things like Apache Phoenix. (See their IndexMasterObserver, etc.) We can handle this by using compatibility helpers that won't attempt to invoke the new APIs on MasterObservers that do not implement them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14318) make_rc.sh should purge/re-resolve dependencies from local repository
[ https://issues.apache.org/jira/browse/HBASE-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715443#comment-14715443 ] Nick Dimiduk commented on HBASE-14318: -- Purge doesn't work, due to MDEP-405. make_rc.sh should purge/re-resolve dependencies from local repository - Key: HBASE-14318 URL: https://issues.apache.org/jira/browse/HBASE-14318 Project: HBase Issue Type: Task Components: build Reporter: Nick Dimiduk Assignee: Nick Dimiduk Over on the 1.1.2RC1 VOTE thread, impressively pedantic [~enis] noticed the underlying hadoop version was built locally, not from upstream. Until such time as we can reliably build releases in a clean-room environment, let's have our scripts clean up after us. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14078) improve error message when HMaster can't bind to port
[ https://issues.apache.org/jira/browse/HBASE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715464#comment-14715464 ] Hudson commented on HBASE-14078: FAILURE: Integrated in HBase-TRUNK #6758 (See [https://builds.apache.org/job/HBase-TRUNK/6758/]) HBASE-14078 improve error message when HMaster can't bind to port (stack: rev ff86749caeb63eafcf10cbfba45334757a791384) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java improve error message when HMaster can't bind to port - Key: HBASE-14078 URL: https://issues.apache.org/jira/browse/HBASE-14078 Project: HBase Issue Type: Improvement Components: master Affects Versions: 2.0.0 Reporter: Sean Busbey Assignee: Matt Warhaftig Labels: beginner Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: hbase-14078_post_stack.txt, hbase-14708-v1.patch, hbase-14708-v2.patch, hbase-14708-v3.patch, hbase-14708-v3.patch When the master fails to start because hbase.master.port is already taken, the log messages could make it easier to tell. 
{quote} 2015-07-14 13:10:02,667 INFO [main] regionserver.RSRpcServices: master/master01.example.com/10.20.188.121:16000 server-side HConnection retries=350 2015-07-14 13:10:02,879 INFO [main] ipc.SimpleRpcScheduler: Using deadline as user call queue, count=3 2015-07-14 13:10:02,895 ERROR [main] master.HMasterCommandLine: Master exiting java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2258) at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:234) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2272) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:444) at sun.nio.ch.Net.bind(Net.java:436) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at org.apache.hadoop.hbase.ipc.RpcServer.bind(RpcServer.java:2513) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.init(RpcServer.java:599) at org.apache.hadoop.hbase.ipc.RpcServer.init(RpcServer.java:2000) at org.apache.hadoop.hbase.regionserver.RSRpcServices.init(RSRpcServices.java:919) at org.apache.hadoop.hbase.master.MasterRpcServices.init(MasterRpcServices.java:211) at org.apache.hadoop.hbase.master.HMaster.createRpcServices(HMaster.java:509) at org.apache.hadoop.hbase.regionserver.HRegionServer.init(HRegionServer.java:535) at org.apache.hadoop.hbase.master.HMaster.init(HMaster.java:351) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) 
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2253) ... 5 more {quote} I recognize that the RSRpcServices log message shows port 16000, but I don't know why a new operator would. Additionally, it'd be nice to tell them that the port is controlled by {{hbase.master.port}}. Maybe give a hint on how to see what's using the port. Could be too os-dist specific? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13212) Procedure V2 - master Create/Modify/Delete namespace
[ https://issues.apache.org/jira/browse/HBASE-13212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715466#comment-14715466 ] Hudson commented on HBASE-13212: FAILURE: Integrated in HBase-TRUNK #6758 (See [https://builds.apache.org/job/HBase-TRUNK/6758/]) HBASE-13212: Procedure V2 - master Create/Modify/Delete namespace (Stephen Yuan Jiang) (syuanjiangdev: rev dc79b3c5c91b7bc0c230199fe60eb51324770084) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableNamespaceManager.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteNamespaceProcedure.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestModifyNamespaceProcedure.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/MasterProcedureProtos.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/MasterProtos.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ZKNamespaceManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureTestingUtility.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java * hbase-protocol/src/main/protobuf/MasterProcedure.proto * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateNamespaceProcedure.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestCreateNamespaceProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyNamespaceProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestDeleteNamespaceProcedure.java * hbase-protocol/src/main/protobuf/Master.proto Procedure V2 - master Create/Modify/Delete namespace 
Key: HBASE-13212 URL: https://issues.apache.org/jira/browse/HBASE-13212 Project: HBase Issue Type: Sub-task Components: master Affects Versions: 2.0.0 Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Labels: reliability Attachments: HBASE-13212.v1-branch-1.patch, HBASE-13212.v1-master.patch, HBASE-13212.v2-master.patch, HBASE-13212.v3-master.patch Original Estimate: 168h Remaining Estimate: 168h master side, part of HBASE-12439 starts up the procedure executor on the master and replaces the create/modify/delete namespace handlers with the procedure version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14310) test-patch.sh should handle spurious non-zero exit code from maven
[ https://issues.apache.org/jira/browse/HBASE-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715465#comment-14715465 ] Hudson commented on HBASE-14310: FAILURE: Integrated in HBase-TRUNK #6758 (See [https://builds.apache.org/job/HBase-TRUNK/6758/]) HBASE-14310 test-patch.sh should handle spurious non-zero exit code from maven (tedyu: rev aca8c3b74b09646c72c4e0fe26a4b2103da0d288) * dev-support/test-patch.sh test-patch.sh should handle spurious non-zero exit code from maven -- Key: HBASE-14310 URL: https://issues.apache.org/jira/browse/HBASE-14310 Project: HBase Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Priority: Minor Fix For: 2.0.0 Attachments: 14310-v1.txt Starting last weekend, I saw patch testing abort due to spurious non-zero exit code from maven. Here are recent examples. https://builds.apache.org/job/PreCommit-HBASE-Build/15251/console : {quote} HBASE-14286 patch is being downloaded at Tue Aug 25 18:49:17 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12751767/HBASE-14286.1.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 21 Trunk compilation is broken? \{code\}\{code\} {quote} https://builds.apache.org/job/PreCommit-HBASE-Build/15250/console : {quote} HBASE-14268 patch is being downloaded at Tue Aug 25 18:19:25 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12752280/14268-V5.patch ... /home/jenkins/tools/maven/latest/bin/mvn clean package checkstyle:checkstyle-aggregate findbugs:findbugs -DskipTests -DHBasePatchProcess /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/patchprocess/trunkJavacWarnings.txt 21 Trunk compilation is broken? \{code\}\{code\} {quote} The search in mvn output for 'Compilation failure' returned nothing. 
I verified locally that with 14268-V5.patch, master branch compiled. test-patch.sh should handle the spurious exit code so that patches can be tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14313) After a Connection sees ConnectionClosingException it never recovers
[ https://issues.apache.org/jira/browse/HBASE-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715479#comment-14715479 ] Elliott Clark commented on HBASE-14313: --- It seems to fix the issue on the cluster I am testing it on. Committing it now. After a Connection sees ConnectionClosingException it never recovers Key: HBASE-14313 URL: https://issues.apache.org/jira/browse/HBASE-14313 Project: HBase Issue Type: Bug Affects Versions: 1.2.0, 1.1.0.1 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Critical Fix For: 2.0.0, 1.2.0, 1.3.0 Attachments: HBASE-14313.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14230) replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication()
[ https://issues.apache.org/jira/browse/HBASE-14230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715657#comment-14715657 ] Ted Yu commented on HBASE-14230: +1 replace reflection in FSHlog with HdfsDataOutputStream#getCurrentBlockReplication() --- Key: HBASE-14230 URL: https://issues.apache.org/jira/browse/HBASE-14230 Project: HBase Issue Type: Improvement Components: wal Reporter: Heng Chen Assignee: Heng Chen Priority: Minor Attachments: HBASE-14230.patch As the TODO comment says, use {{HdfsDataOutputStream#getCurrentBlockReplication}} and {{DFSOutputStream.getPipeLine}} to replace reflection in FSHlog. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13212) Procedure V2 - master Create/Modify/Delete namespace
[ https://issues.apache.org/jira/browse/HBASE-13212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-13212: --- Resolution: Fixed Fix Version/s: 1.3.0 2.0.0 Status: Resolved (was: Patch Available) Procedure V2 - master Create/Modify/Delete namespace Key: HBASE-13212 URL: https://issues.apache.org/jira/browse/HBASE-13212 Project: HBase Issue Type: Sub-task Components: master Affects Versions: 2.0.0 Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Labels: reliability Fix For: 2.0.0, 1.3.0 Attachments: HBASE-13212.v1-branch-1.patch, HBASE-13212.v1-master.patch, HBASE-13212.v2-master.patch, HBASE-13212.v3-master.patch Original Estimate: 168h Remaining Estimate: 168h master side, part of HBASE-12439 starts up the procedure executor on the master and replaces the create/modify/delete namespace handlers with the procedure version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14319) TestAtomicOperation.testMultiRowMutationMultiThreads is flaky
Dima Spivak created HBASE-14319: --- Summary: TestAtomicOperation.testMultiRowMutationMultiThreads is flaky Key: HBASE-14319 URL: https://issues.apache.org/jira/browse/HBASE-14319 Project: HBase Issue Type: Bug Components: test Affects Versions: 2.0.0 Reporter: Dima Spivak org.apache.hadoop.hbase.regionserver.TestAtomicOperation.testMultiRowMutationMultiThreads has been failing sporadically for a while on at least trunk. This might also be reproducible on other branches, but it's hard to tell the state since our b.a.o Jenkins matrix for different Java versions that we test against hasn't been set up to display test results in a pretty way (separate JIRA forthcoming). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14315) Save one call to KeyValueHeap.peek per row
[ https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712635#comment-14712635 ] ramkrishna.s.vasudevan commented on HBASE-14315: I think the patch makes sense. +1. Save one call to KeyValueHeap.peek per row -- Key: HBASE-14315 URL: https://issues.apache.org/jira/browse/HBASE-14315 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Attachments: 14315-0.98.txt Another one of my micro optimizations. In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, which in my runs of scan-heavy loads shows up at the top. Depending on the run and data, this can save between 3 and 10% of runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12298) Support BB usage in PrefixTree
[ https://issues.apache.org/jira/browse/HBASE-12298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712749#comment-14712749 ] Hadoop QA commented on HBASE-12298: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12752397/HBASE-12298_4.patch against master branch at commit 506726ed2832b069602c6b7e2ccd5ec9a81013a6. ATTACHMENT ID: 12752397 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 28 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 7 zombie test(s): at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithACL.testLabelsTableOpsWithDifferentUsers(TestVisibilityLabelsWithACL.java:233) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15272//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15272//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15272//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15272//console This message is automatically generated. Support BB usage in PrefixTree -- Key: HBASE-12298 URL: https://issues.apache.org/jira/browse/HBASE-12298 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Reporter: Anoop Sam John Assignee: ramkrishna.s.vasudevan Attachments: HBASE-12298.patch, HBASE-12298_1.patch, HBASE-12298_2.patch, HBASE-12298_3.patch, HBASE-12298_4.patch, HBASE-12298_4.patch, HBASE-12298_4.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14269) FuzzyRowFilter omits certain rows when multiple fuzzy key exist
[ https://issues.apache.org/jira/browse/HBASE-14269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712788#comment-14712788 ] Ted Yu commented on HBASE-14269: I think the performance with Hongbin's fix is acceptable. +1 on patch. FuzzyRowFilter omits certain rows when multiple fuzzy key exist --- Key: HBASE-14269 URL: https://issues.apache.org/jira/browse/HBASE-14269 Project: HBase Issue Type: Bug Components: Filters Reporter: hongbin ma Assignee: hongbin ma Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3 Attachments: HBASE-14269-v1.patch, HBASE-14269-v2.patch, HBASE-14269.patch https://issues.apache.org/jira/browse/HBASE-13761 introduced a RowTracker in FuzzyRowFilter to avoid performing getNextForFuzzyRule() for each fuzzy key on each getNextCellHint() by maintaining a list of possible row matches for each fuzzy key. The implementation assumes that the prepared rows will be matched one by one, so it removes the first row in the list as soon as it is used. However, this approach may lead to omitting rows in some cases: Consider a case where we have two fuzzy keys: 1?1 2?2 and the data is like: 000 111 112 121 122 211 212 when the first row 000 fails to match, RowTracker will update possible row matches with cell 000 and fuzzy keys 1?1,2?2. This will populate RowTracker with 101 and 202. Then 101 is popped out of RowTracker, hint the scanner to go to row 101. The scanner will get 111 and find it is a match, and continued to find that 112 is not a match, getNextCellHint will be called again. Then comes the bug: Row 101 has been removed out of RowTracker, so RowTracker will jump to 202. As you see row 121 will be omitted, but it is actually a match for fuzzy key 1?1. I will illustrate the bug by adding a new test case in TestFuzzyRowFilterEndToEnd. Also I will provide the bug fix in my patch. 
The idea of the new solution is to maintain a priority queue of all the possible match rows for each fuzzy key. Whenever getNextCellHint is called, the elements in the queue that are smaller than the parameter currentCell are updated (and re-inserted into the queue). The head of the queue will always be the next cell hint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
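The priority-queue idea described above can be sketched roughly as follows. This is a simplified illustration, not the code from the attached patch: rows and fuzzy keys are modelled as equal-length digit strings with '?' as the wildcard, and the class and method names (FuzzyHintSketch, nextMatchAtOrAfter, scan) are made up for this example. Stale queue entries, meaning those smaller than the current row, are recomputed from the current row and re-inserted, so the head of the queue is always the next hint.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch only; this is NOT the HBase FuzzyRowFilter implementation.
// Assumption: rows and patterns are digit strings of the same length.
public class FuzzyHintSketch {

    // True if row satisfies the pattern ('?' matches any character).
    static boolean matches(String row, String pattern) {
        if (row.length() != pattern.length()) return false;
        for (int i = 0; i < row.length(); i++) {
            char p = pattern.charAt(i);
            if (p != '?' && p != row.charAt(i)) return false;
        }
        return true;
    }

    // Smallest digit string >= row that satisfies the pattern, or null if none exists.
    static String nextMatchAtOrAfter(String row, String pattern) {
        if (matches(row, pattern)) return row;
        // Bump the rightmost position that can grow while the prefix still matches.
        for (int i = row.length() - 1; i >= 0; i--) {
            if (!matches(row.substring(0, i), pattern.substring(0, i))) continue;
            char r = row.charAt(i);
            char p = pattern.charAt(i);
            char bumped;
            if (p == '?') {
                if (r >= '9') continue;
                bumped = (char) (r + 1);
            } else {
                if (p <= r) continue;
                bumped = p;
            }
            StringBuilder sb = new StringBuilder(row.substring(0, i)).append(bumped);
            // Fill the tail with the smallest characters the pattern allows.
            for (int j = i + 1; j < pattern.length(); j++) {
                sb.append(pattern.charAt(j) == '?' ? '0' : pattern.charAt(j));
            }
            return sb.toString();
        }
        return null;
    }

    // Simulates a scan over sorted rows, seeking forward with hints from the queue.
    static List<String> scan(List<String> sortedRows, List<String> patterns) {
        // Each entry is {candidateRow, pattern}; "" sorts before every row, so it is stale.
        PriorityQueue<String[]> queue =
            new PriorityQueue<>(Comparator.comparing((String[] e) -> e[0]));
        for (String p : patterns) queue.add(new String[] {"", p});

        List<String> result = new ArrayList<>();
        int pos = 0;
        while (pos < sortedRows.size()) {
            String row = sortedRows.get(pos);
            boolean hit = false;
            for (String p : patterns) hit |= matches(row, p);
            if (hit) {
                result.add(row);
                pos++;
                continue;
            }
            // Refresh every entry that is smaller than the current row.
            while (!queue.isEmpty() && queue.peek()[0].compareTo(row) < 0) {
                String[] stale = queue.poll();
                String next = nextMatchAtOrAfter(row, stale[1]);
                if (next != null) queue.add(new String[] {next, stale[1]});
            }
            if (queue.isEmpty()) break; // no fuzzy key can match any later row
            String hint = queue.peek()[0];
            // Seek to the first row >= hint.
            while (pos < sortedRows.size() && sortedRows.get(pos).compareTo(hint) < 0) pos++;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("000", "111", "112", "121", "122", "211", "212");
        // Prints [111, 121, 212]; row 121 is no longer skipped.
        System.out.println(scan(rows, Arrays.asList("1?1", "2?2")));
    }
}
```

With the data from the description, the hint after row 112 is recomputed as 121 rather than jumping straight to 202, which is the behaviour the fix is after.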
[jira] [Created] (HBASE-14316) On truncate table command, the hbase doesn't maintain the pre-defined splits
debarshi basak created HBASE-14316: -- Summary: On truncate table command, the hbase doesn't maintain the pre-defined splits Key: HBASE-14316 URL: https://issues.apache.org/jira/browse/HBASE-14316 Project: HBase Issue Type: Bug Reporter: debarshi basak On truncate table command, the hbase doesn't maintain the pre-defined splits. It simply drops and re-creates the table. It should have some mechanism to maintain the predefined splits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14291) NPE On StochasticLoadBalancer Balance Involving RS With No Regions
[ https://issues.apache.org/jira/browse/HBASE-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712764#comment-14712764 ] Hudson commented on HBASE-14291: FAILURE: Integrated in HBase-0.98 #1101 (See [https://builds.apache.org/job/HBase-0.98/1101/]) Revert HBASE-14291 NPE On StochasticLoadBalancer Balance Involving RS With No Regions (apurtell: rev 10388f6a141f1c68b7eeed110f5c9cd7e1cb3f7e) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java NPE On StochasticLoadBalancer Balance Involving RS With No Regions -- Key: HBASE-14291 URL: https://issues.apache.org/jira/browse/HBASE-14291 Project: HBase Issue Type: Bug Components: Balancer Affects Versions: 2.0.0 Environment: Pseudo-distributed (2 local RegionServers), Hadoop 2.5.1, Java 1.7.0_71 Reporter: Matt Warhaftig Assignee: Ted Yu Priority: Minor Fix For: 2.0.0, 1.3.0 Attachments: 14291-v1.txt, hbase-mwarhaftig-master-Matts-MBP.log When StochasticLoadBalancer attempts to balance a local RS with multiple regions with another local RS that had no regions the HBase shell call of 'balancer' gets the following NPE: {noformat} ERROR: java.io.IOException at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:106) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.getLeastLoadedTopServerForRegion(BaseLoadBalancer.java:863) at org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer$LocalityBasedCandidateGenerator.generate(StochasticLoadBalancer.java:724) at org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:325) at 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:263) at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1264) at org.apache.hadoop.hbase.master.MasterRpcServices.balance(MasterRpcServices.java:413) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:52450) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2133) ... 4 more {noformat} Issue only occurs when one of the RSs has no regions before balancing. Also, unsure if distributed RSs would also have same issue. Attached 'hbase-mwarhaftig-master-Matts-MBP.log' is master's log of the error occurring. SimpleLoadBalancer rebalances correctly when used in the same situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13376) Improvements to Stochastic load balancer
[ https://issues.apache.org/jira/browse/HBASE-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712765#comment-14712765 ] Hudson commented on HBASE-13376: FAILURE: Integrated in HBase-0.98 #1101 (See [https://builds.apache.org/job/HBase-0.98/1101/]) Revert HBASE-13376 Improvements to Stochastic load balancer (Vandana Ayyalasomayajula) (apurtell: rev 4aa14b6c9074002abb92181d99bf65e7f0e46a22) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/RegionLocationFinder.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestStochasticLoadBalancer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java Improvements to Stochastic load balancer Key: HBASE-13376 URL: https://issues.apache.org/jira/browse/HBASE-13376 Project: HBase Issue Type: Improvement Components: Balancer Affects Versions: 1.0.0, 0.98.12 Reporter: Vandana Ayyalasomayajula Assignee: Vandana Ayyalasomayajula Priority: Minor Fix For: 2.0.0, 1.3.0 Attachments: 13376-v2.txt, 13376-v5.patch, 13376_4.patch, HBASE-13376.patch, HBASE-13376_0.98.txt, HBASE-13376_0.98_v2.patch, HBASE-13376_0.txt, HBASE-13376_1.txt, HBASE-13376_1_1.txt, HBASE-13376_2.patch, HBASE-13376_2_branch-1.patch, HBASE-13376_3.patch, HBASE-13376_3.patch, HBASE-13376_4.patch, HBASE-13376_5_branch-1.patch, HBASE-13376_6_branch-1.patch, HBASE-13376_98.patch, HBASE-13376_branch-1.patch, HBASE-13376_v3_0.98.patch, HBASE-13376_v4_0.98.patch There are two things this jira tries to address: 1. The locality picker in the stochastic balancer does not pick regions with least locality as candidates for swap/move. So when any user configures locality cost in the configs, the balancer does not always seems to move regions with bad locality. 2. When a cluster has equal number of loaded regions, it always picks the first one. 
It should pick a random region on one of the equally loaded servers. This improves a chance of finding a good candidate, when load picker is invoked several times. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
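Point 2 above, picking randomly among equally loaded servers instead of always taking the first, can be illustrated with a small sketch. This is not the StochasticLoadBalancer code; the class name TieBreakSketch, the method pickMostLoaded, and the plain int[] of per-server region counts are assumptions made for the example. It uses single-pass reservoir sampling so that each tied server is chosen with equal probability.

```java
import java.util.Random;

// Illustrative sketch only; this is NOT the HBase balancer implementation.
public class TieBreakSketch {

    // Returns the index of a most-loaded server, choosing uniformly at random
    // among servers tied for the maximum. Returns -1 for an empty array.
    static int pickMostLoaded(int[] regionsPerServer, Random rnd) {
        int best = -1;
        int ties = 0;
        for (int i = 0; i < regionsPerServer.length; i++) {
            if (best == -1 || regionsPerServer[i] > regionsPerServer[best]) {
                best = i;   // strictly larger load: restart the tie count
                ties = 1;
            } else if (regionsPerServer[i] == regionsPerServer[best]) {
                ties++;
                // Replace the current pick with probability 1/ties.
                if (rnd.nextInt(ties) == 0) best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] load = {3, 5, 5, 2, 5};
        Random rnd = new Random();
        // Repeated calls return 1, 2, or 4, never always the first tied index.
        for (int t = 0; t < 5; t++) {
            System.out.println(pickMostLoaded(load, rnd));
        }
    }
}
```

Because the pick varies between invocations, repeated calls to the load picker explore different candidates, which is the effect the description is after.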
[jira] [Commented] (HBASE-13376) Improvements to Stochastic load balancer
[ https://issues.apache.org/jira/browse/HBASE-13376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712890#comment-14712890 ] Hudson commented on HBASE-13376: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1055 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1055/]) Revert HBASE-13376 Improvements to Stochastic load balancer (Vandana Ayyalasomayajula) (apurtell: rev 4aa14b6c9074002abb92181d99bf65e7f0e46a22) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/RegionLocationFinder.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestStochasticLoadBalancer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java Improvements to Stochastic load balancer Key: HBASE-13376 URL: https://issues.apache.org/jira/browse/HBASE-13376 Project: HBase Issue Type: Improvement Components: Balancer Affects Versions: 1.0.0, 0.98.12 Reporter: Vandana Ayyalasomayajula Assignee: Vandana Ayyalasomayajula Priority: Minor Fix For: 2.0.0, 1.3.0 Attachments: 13376-v2.txt, 13376-v5.patch, 13376_4.patch, HBASE-13376.patch, HBASE-13376_0.98.txt, HBASE-13376_0.98_v2.patch, HBASE-13376_0.txt, HBASE-13376_1.txt, HBASE-13376_1_1.txt, HBASE-13376_2.patch, HBASE-13376_2_branch-1.patch, HBASE-13376_3.patch, HBASE-13376_3.patch, HBASE-13376_4.patch, HBASE-13376_5_branch-1.patch, HBASE-13376_6_branch-1.patch, HBASE-13376_98.patch, HBASE-13376_branch-1.patch, HBASE-13376_v3_0.98.patch, HBASE-13376_v4_0.98.patch There are two things this jira tries to address: 1. The locality picker in the stochastic balancer does not pick regions with least locality as candidates for swap/move. So when any user configures locality cost in the configs, the balancer does not always seems to move regions with bad locality. 2. 
When a cluster has equal number of loaded regions, it always picks the first one. It should pick a random region on one of the equally loaded servers. This improves a chance of finding a good candidate, when load picker is invoked several times. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14291) NPE On StochasticLoadBalancer Balance Involving RS With No Regions
[ https://issues.apache.org/jira/browse/HBASE-14291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712889#comment-14712889 ] Hudson commented on HBASE-14291: FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #1055 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/1055/]) Revert HBASE-14291 NPE On StochasticLoadBalancer Balance Involving RS With No Regions (apurtell: rev 10388f6a141f1c68b7eeed110f5c9cd7e1cb3f7e) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java NPE On StochasticLoadBalancer Balance Involving RS With No Regions -- Key: HBASE-14291 URL: https://issues.apache.org/jira/browse/HBASE-14291 Project: HBase Issue Type: Bug Components: Balancer Affects Versions: 2.0.0 Environment: Pseudo-distributed (2 local RegionServers), Hadoop 2.5.1, Java 1.7.0_71 Reporter: Matt Warhaftig Assignee: Ted Yu Priority: Minor Fix For: 2.0.0, 1.3.0 Attachments: 14291-v1.txt, hbase-mwarhaftig-master-Matts-MBP.log When StochasticLoadBalancer attempts to balance a local RS with multiple regions with another local RS that had no regions the HBase shell call of 'balancer' gets the following NPE: {noformat} ERROR: java.io.IOException at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:106) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.getLeastLoadedTopServerForRegion(BaseLoadBalancer.java:863) at org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer$LocalityBasedCandidateGenerator.generate(StochasticLoadBalancer.java:724) at org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:325) at 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:263) at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1264) at org.apache.hadoop.hbase.master.MasterRpcServices.balance(MasterRpcServices.java:413) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:52450) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2133) ... 4 more {noformat} Issue only occurs when one of the RSs has no regions before balancing. Also, unsure if distributed RSs would also have same issue. Attached 'hbase-mwarhaftig-master-Matts-MBP.log' is master's log of the error occurring. SimpleLoadBalancer rebalances correctly when used in the same situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14269) FuzzyRowFilter omits certain rows when multiple fuzzy keys exist
[ https://issues.apache.org/jira/browse/HBASE-14269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14713513#comment-14713513 ] Ted Yu commented on HBASE-14269: Integrated to master branch. For branch-1, I got: {code} 1 out of 3 hunks FAILED -- saving rejects to file hbase-client/src/main/java/org/apache/hadoop/hbase/filter/FuzzyRowFilter.java.rej {code} Mind attaching patch for branch-1 ? Thanks FuzzyRowFilter omits certain rows when multiple fuzzy keys exist Key: HBASE-14269 URL: https://issues.apache.org/jira/browse/HBASE-14269 Project: HBase Issue Type: Bug Components: Filters Reporter: hongbin ma Assignee: hongbin ma Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3 Attachments: HBASE-14269-v1.patch, HBASE-14269-v2.patch, HBASE-14269.patch https://issues.apache.org/jira/browse/HBASE-13761 introduced a RowTracker in FuzzyRowFilter to avoid performing getNextForFuzzyRule() for each fuzzy key on each getNextCellHint() by maintaining a list of possible row matches for each fuzzy key. The implementation assumes that the prepared rows will be matched one by one, so it removes the first row in the list as soon as it is used. However, this approach may lead to omitting rows in some cases: Consider a case where we have two fuzzy keys: 1?1 2?2 and the data is like: 000 111 112 121 122 211 212 when the first row 000 fails to match, RowTracker will update possible row matches with cell 000 and fuzzy keys 1?1,2?2. This will populate RowTracker with 101 and 202. Then 101 is popped out of RowTracker, hint the scanner to go to row 101. The scanner will get 111 and find it is a match, and continued to find that 112 is not a match, getNextCellHint will be called again. Then comes the bug: Row 101 has been removed out of RowTracker, so RowTracker will jump to 202. As you see row 121 will be omitted, but it is actually a match for fuzzy key 1?1. I will illustrate the bug by adding a new test case in TestFuzzyRowFilterEndToEnd. 
Also I will provide the bug fix in my patch. The idea of the new solution is to maintain a priority queue of all the possible match rows for each fuzzy key. Whenever getNextCellHint is called, the elements in the queue that are smaller than the parameter currentCell are updated (and re-inserted into the queue). The head of the queue will always be the next cell hint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14315) Save one call to KeyValueHeap.peek per row
[ https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14713499#comment-14713499 ] Anoop Sam John commented on HBASE-14315: +1 Save one call to KeyValueHeap.peek per row -- Key: HBASE-14315 URL: https://issues.apache.org/jira/browse/HBASE-14315 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Attachments: 14315-0.98.txt Another one of my micro optimizations. In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, which in my runs of scan-heavy loads shows up at the top. Depending on the run and data, this can save between 3 and 10% of runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14269) FuzzyRowFilter omits certain rows when multiple fuzzy keys exist
[ https://issues.apache.org/jira/browse/HBASE-14269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14269: --- Summary: FuzzyRowFilter omits certain rows when multiple fuzzy keys exist (was: FuzzyRowFilter omits certain rows when multiple fuzzy key exist) FuzzyRowFilter omits certain rows when multiple fuzzy keys exist Key: HBASE-14269 URL: https://issues.apache.org/jira/browse/HBASE-14269 Project: HBase Issue Type: Bug Components: Filters Reporter: hongbin ma Assignee: hongbin ma Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3 Attachments: HBASE-14269-v1.patch, HBASE-14269-v2.patch, HBASE-14269.patch https://issues.apache.org/jira/browse/HBASE-13761 introduced a RowTracker in FuzzyRowFilter to avoid performing getNextForFuzzyRule() for each fuzzy key on each getNextCellHint() by maintaining a list of possible row matches for each fuzzy key. The implementation assumes that the prepared rows will be matched one by one, so it removes the first row in the list as soon as it is used. However, this approach may lead to omitting rows in some cases: Consider a case where we have two fuzzy keys: 1?1 2?2 and the data is like: 000 111 112 121 122 211 212 when the first row 000 fails to match, RowTracker will update possible row matches with cell 000 and fuzzy keys 1?1,2?2. This will populate RowTracker with 101 and 202. Then 101 is popped out of RowTracker, hint the scanner to go to row 101. The scanner will get 111 and find it is a match, and continued to find that 112 is not a match, getNextCellHint will be called again. Then comes the bug: Row 101 has been removed out of RowTracker, so RowTracker will jump to 202. As you see row 121 will be omitted, but it is actually a match for fuzzy key 1?1. I will illustrate the bug by adding a new test case in TestFuzzyRowFilterEndToEnd. Also I will provide the bug fix in my patch. 
The idea of the new solution is to maintain a priority queue of all the possible match rows for each fuzzy key. Whenever getNextCellHint is called, the elements in the queue that are smaller than the parameter currentCell are updated (and re-inserted into the queue). The head of the queue will always be the next cell hint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14078) improve error message when HMaster can't bind to port
[ https://issues.apache.org/jira/browse/HBASE-14078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712708#comment-14712708 ] Hadoop QA commented on HBASE-14078: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12752396/hbase-14708-v3.patch against master branch at commit 506726ed2832b069602c6b7e2ccd5ec9a81013a6. ATTACHMENT ID: 12752396 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 10 zombie test(s): at org.apache.hadoop.hbase.security.access.TestAccessController.testAccessControlClientGrantRevoke(TestAccessController.java:2188) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15271//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15271//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15271//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15271//console This message is automatically generated. improve error message when HMaster can't bind to port - Key: HBASE-14078 URL: https://issues.apache.org/jira/browse/HBASE-14078 Project: HBase Issue Type: Improvement Components: master Affects Versions: 2.0.0 Reporter: Sean Busbey Assignee: Matt Warhaftig Labels: beginner Fix For: 2.0.0 Attachments: hbase-14078_post_stack.txt, hbase-14708-v1.patch, hbase-14708-v2.patch, hbase-14708-v3.patch, hbase-14708-v3.patch When the master fails to start because hbase.master.port is already taken, the log messages could make it easier to tell.
{quote} 2015-07-14 13:10:02,667 INFO [main] regionserver.RSRpcServices: master/master01.example.com/10.20.188.121:16000 server-side HConnection retries=350 2015-07-14 13:10:02,879 INFO [main] ipc.SimpleRpcScheduler: Using deadline as user call queue, count=3 2015-07-14 13:10:02,895 ERROR [main] master.HMasterCommandLine: Master exiting java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2258) at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:234) at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126) at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2272) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:444) at sun.nio.ch.Net.bind(Net.java:436) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at org.apache.hadoop.hbase.ipc.RpcServer.bind(RpcServer.java:2513) at org.apache.hadoop.hbase.ipc.RpcServer$Listener.init(RpcServer.java:599) at
[jira] [Commented] (HBASE-14316) On truncate table command, the hbase doesn't maintain the pre-defined splits
[ https://issues.apache.org/jira/browse/HBASE-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712714#comment-14712714 ] Liu Shaohui commented on HBASE-14316: - Please use the truncate_preserve command, which maintains the predefined splits. Note that the table's ACLs will be removed. See: HBASE-5525. On truncate table command, the hbase doesn't maintain the pre-defined splits Key: HBASE-14316 URL: https://issues.apache.org/jira/browse/HBASE-14316 Project: HBase Issue Type: Bug Reporter: debarshi basak On the truncate table command, HBase doesn't maintain the pre-defined splits. It simply drops and re-creates the table. It should have some mechanism to maintain the predefined splits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-14316) On truncate table command, the hbase doesn't maintain the pre-defined splits
[ https://issues.apache.org/jira/browse/HBASE-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Marc Spaggiari resolved HBASE-14316. - Resolution: Not A Problem As [~liushaohui] said, please use truncate_preserve to preserve the table splits. On truncate table command, the hbase doesn't maintain the pre-defined splits Key: HBASE-14316 URL: https://issues.apache.org/jira/browse/HBASE-14316 Project: HBase Issue Type: Bug Reporter: debarshi basak On the truncate table command, HBase doesn't maintain the pre-defined splits. It simply drops and re-creates the table. It should have some mechanism to maintain the predefined splits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14269) FuzzyRowFilter omits certain rows when multiple fuzzy keys exist
[ https://issues.apache.org/jira/browse/HBASE-14269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14713623#comment-14713623 ] Vladimir Rodionov commented on HBASE-14269: --- {code} Unfortunately, I don't think the post-HBASE-13761 optimizations have boosted performance very much. I don't have the means to profile a very large dataset; Vladimir Rodionov, will you please share your numbers? {code} My performance numbers are irrelevant due to a bug in the original HBASE-13761. Unfortunately, performance optimizations in the filter itself are not enough to get an overall performance boost. Scanner overhead is high. [~mahongbin], can you attach your test program? FuzzyRowFilter omits certain rows when multiple fuzzy keys exist Key: HBASE-14269 URL: https://issues.apache.org/jira/browse/HBASE-14269 Project: HBase Issue Type: Bug Components: Filters Reporter: hongbin ma Assignee: hongbin ma Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3 Attachments: HBASE-14269-v1.patch, HBASE-14269-v2.patch, HBASE-14269.patch https://issues.apache.org/jira/browse/HBASE-13761 introduced a RowTracker in FuzzyRowFilter to avoid performing getNextForFuzzyRule() for each fuzzy key on each getNextCellHint() by maintaining a list of possible row matches for each fuzzy key. The implementation assumes that the prepared rows will be matched one by one, so it removes the first row in the list as soon as it is used. However, this approach may lead to omitting rows in some cases. Consider a case where we have two fuzzy keys: 1?1 2?2 and the data is: 000 111 112 121 122 211 212. When the first row 000 fails to match, RowTracker will update the possible row matches with cell 000 and fuzzy keys 1?1, 2?2. This will populate RowTracker with 101 and 202. Then 101 is popped out of RowTracker, hinting the scanner to go to row 101. The scanner will get 111 and find it is a match, then continue and find that 112 is not a match, so getNextCellHint will be called again. 
Then comes the bug: row 101 has already been removed from RowTracker, so RowTracker will jump to 202. As you can see, row 121 will be omitted, but it is actually a match for fuzzy key 1?1. I will illustrate the bug by adding a new test case in TestFuzzyRowFilterEndToEnd. I will also provide the bug fix in my patch. The idea of the new solution is to maintain a priority queue of all the possible match rows for each fuzzy key; whenever getNextCellHint is called, the elements in the queue that are smaller than the parameter currentCell will be updated (and re-inserted into the queue). The head of the queue will always be the next cell hint. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
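The HBASE-14269 walkthrough above can be replayed with a small self-contained sketch. This is a toy model under stated assumptions, not the FuzzyRowFilter code or the attached patch: rows are 3-digit decimal strings, '?' matches any digit, a TreeSet stands in for the scanner, and every name (FuzzyHintSketch, nextMatch, scanWithOneShotList, scanWithQueue) is hypothetical. The first scan discards each prepared hint once it is used; the second keeps one candidate per fuzzy key in a priority queue and refreshes the entries that fall behind the current row.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.Locale;
import java.util.PriorityQueue;
import java.util.TreeSet;

// Toy model, not HBase code: rows are 3-digit decimal strings and '?' matches any digit.
class FuzzyHintSketch {

  static boolean matches(String key, String row) {
    for (int i = 0; i < key.length(); i++) {
      if (key.charAt(i) != '?' && key.charAt(i) != row.charAt(i)) return false;
    }
    return true;
  }

  static boolean matchesAny(List<String> keys, String row) {
    for (String key : keys) {
      if (matches(key, row)) return true;
    }
    return false;
  }

  // Smallest 3-digit row >= from that matches key, or null. Brute force keeps the sketch short.
  static String nextMatch(String key, String from) {
    for (int i = Integer.parseInt(from); i <= 999; i++) {
      String candidate = String.format(Locale.ROOT, "%03d", i);
      if (matches(key, candidate)) return candidate;
    }
    return null;
  }

  // One-shot list: each prepared hint is discarded as soon as it is used.
  static List<String> scanWithOneShotList(TreeSet<String> data, List<String> keys) {
    List<String> found = new ArrayList<>();
    Deque<String> prepared = new ArrayDeque<>();
    String row = data.isEmpty() ? null : data.first();
    while (row != null) {
      if (matchesAny(keys, row)) {
        found.add(row);
        row = data.higher(row);
        continue;
      }
      if (prepared.isEmpty()) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String key : keys) {
          String next = nextMatch(key, row);
          if (next != null) sorted.add(next);
        }
        if (sorted.isEmpty()) break; // no key can match any later row
        prepared.addAll(sorted);
      }
      String hint = prepared.pollFirst();
      // Seek to the hint, but never backward.
      row = data.ceiling(hint.compareTo(row) > 0 ? hint : row);
    }
    return found;
  }

  // Priority queue: one {candidateRow, fuzzyKey} entry per key, ordered by candidate row.
  static List<String> scanWithQueue(TreeSet<String> data, List<String> keys) {
    List<String> found = new ArrayList<>();
    PriorityQueue<String[]> queue =
        new PriorityQueue<>(Comparator.comparing((String[] e) -> e[0]));
    String row = data.isEmpty() ? null : data.first();
    if (row != null) {
      for (String key : keys) {
        String next = nextMatch(key, row);
        if (next != null) queue.add(new String[] {next, key});
      }
    }
    while (row != null) {
      if (matchesAny(keys, row)) {
        found.add(row);
        row = data.higher(row);
        continue;
      }
      // Refresh every entry that fell behind the current row, then re-insert it.
      while (!queue.isEmpty() && queue.peek()[0].compareTo(row) < 0) {
        String[] stale = queue.poll();
        String next = nextMatch(stale[1], row);
        if (next != null) queue.add(new String[] {next, stale[1]});
      }
      if (queue.isEmpty()) break;
      row = data.ceiling(queue.peek()[0]); // head of the queue is the next hint
    }
    return found;
  }

  public static void main(String[] args) {
    TreeSet<String> data =
        new TreeSet<>(Arrays.asList("000", "111", "112", "121", "122", "211", "212"));
    List<String> keys = Arrays.asList("1?1", "2?2");
    System.out.println("one-shot list:  " + scanWithOneShotList(data, keys));
    System.out.println("priority queue: " + scanWithQueue(data, keys));
  }
}
```

On the data from the report, the one-shot scan returns 111 and 212 and skips 121, while the queue-based scan returns 111, 121 and 212. The brute-force nextMatch only keeps the sketch short; it says nothing about how getNextForFuzzyRule() computes its candidates.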
[jira] [Updated] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14309: --- Attachment: 14309-v3.txt Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Attachments: 14309-v1.txt, 14309-v2.txt, 14309-v3.txt This issue adds a boolean parameter, force, to the 'balancer' command so that an admin can force region balancing even when there are regions in transition, assuming the RIT state is transient. This enhancement was requested by a customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14283) Reverse scan doesn’t work with HFile inline index/bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-14283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14713600#comment-14713600 ] Anoop Sam John commented on HBASE-14283: We use the method below to get the previous block: public HFileBlock readBlock(long dataBlockOffset, long onDiskBlockSize, final boolean cacheBlock, boolean pread, final boolean isCompaction, boolean updateCacheMetrics, BlockType expectedBlockType, DataBlockEncoding expectedDataBlockEncoding) So is there no BlockType check and looping? Maybe it could read a block, see that the block is not the expected one, then go to the next block and check its type. In the seek-before case, instead of going forward we should go backward in case the expected block type matches the current block type. Would that kind of solution work? Reverse scan doesn’t work with HFile inline index/bloom blocks -- Key: HBASE-14283 URL: https://issues.apache.org/jira/browse/HBASE-14283 Project: HBase Issue Type: Bug Reporter: Ben Lau Assignee: Ben Lau Attachments: HBASE-14283.patch, hfile-seek-before.patch Reverse scans do not work if an HFile contains inline bloom blocks or leaf-level index blocks. The reason is that the seekBefore() call calculates the previous data block’s size by assuming data blocks are contiguous, which is not the case in HFile V2 and beyond. Attached is a first-cut patch (targeting bcef28eefaf192b0ad48c8011f98b8e944340da5 on trunk) which includes: (1) a unit test which exposes the bug and demonstrates failures for both inline bloom blocks and inline index blocks (2) a proposed fix for inline index blocks that does not require a new HFile version change, but is only performant for 1 and 2-level indexes and not 3+. 3+ requires an HFile format update for optimal performance. This patch does not fix the bloom filter blocks bug. But the fix should be similar to the case of inline index blocks. 
The reason I haven’t made the change yet is I want to confirm that you guys would be fine with me revising the HFile.Reader interface. Specifically, these 2 functions (getGeneralBloomFilterMetadata and getDeleteBloomFilterMetadata) need to return the BloomFilter. Right now the HFileReader class doesn’t have a reference to the bloom filters (and hence their indices) and only constructs the IO streams and hence has no way to know where the bloom blocks are in the HFile. It seems that the HFile.Reader bloom method comments state that they “know nothing about how that metadata is structured” but I do not know if that is a requirement of the abstraction (why?) or just an incidental current property. We would like to do 3 things with community approval: (1) Update the HFile.Reader interface and implementation to contain and return BloomFilters directly rather than unstructured IO streams (2) Merge the fixes for index blocks and bloom blocks into open source (3) Create a new Jira ticket for open source HBase to add a ‘prevBlockSize’ field in the block header in the next HFile version, so that seekBefore() calls can not only be correct but performant in all cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
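The contiguity problem described in HBASE-14283 can be shown with a toy block layout. This is only a sketch under stated assumptions, not HBase's HFileBlock or the attached patch: the Block class, the layout() helper and the prevBlockSize field are hypothetical, with prevBlockSize standing in for the header field proposed in item (3). It contrasts the size obtained by assuming data blocks are contiguous with a backward walk that checks each block's type, along the lines suggested in the comment above.

```java
import java.util.HashMap;
import java.util.Map;

// Toy layout model, not HBase's HFileBlock; every name here is hypothetical.
class SeekBeforeSketch {
  enum Type { DATA, LEAF_INDEX, BLOOM_CHUNK }

  static final class Block {
    final Type type;
    final long offset;
    final long size;
    final long prevBlockSize; // hypothetical header field: size of the block written just before

    Block(Type type, long offset, long size, long prevBlockSize) {
      this.type = type;
      this.offset = offset;
      this.size = size;
      this.prevBlockSize = prevBlockSize;
    }
  }

  // Lays blocks out back to back and indexes them by starting offset.
  static Map<Long, Block> layout(Type[] types, long[] sizes) {
    Map<Long, Block> file = new HashMap<>();
    long offset = 0;
    long prevSize = 0;
    for (int i = 0; i < types.length; i++) {
      file.put(offset, new Block(types[i], offset, sizes[i], prevSize));
      offset += sizes[i];
      prevSize = sizes[i];
    }
    return file;
  }

  // Contiguity assumption: the previous data block is taken to end where the current one starts.
  static long naivePrevDataSize(long prevDataOffset, long curOffset) {
    return curOffset - prevDataOffset;
  }

  // Walks backward block by block until a DATA block is found; null if there is none.
  static Block prevDataBlock(Map<Long, Block> file, Block cur) {
    Block b = cur;
    while (b.offset > 0) {
      b = file.get(b.offset - b.prevBlockSize);
      if (b.type == Type.DATA) return b;
    }
    return null;
  }

  public static void main(String[] args) {
    Type[] types = {Type.DATA, Type.LEAF_INDEX, Type.DATA, Type.BLOOM_CHUNK, Type.DATA};
    long[] sizes = {100, 30, 120, 40, 90};
    Map<Long, Block> file = layout(types, sizes);
    Block cur = file.get(130L); // the second data block
    Block prev = prevDataBlock(file, cur);
    System.out.println("size under contiguity assumption: "
        + naivePrevDataSize(prev.offset, cur.offset));
    System.out.println("actual previous data block size:  " + prev.size);
  }
}
```

With a 30-byte leaf index block between the first two data blocks, the contiguity assumption yields 130 bytes for a block that is really 100 bytes long. The backward walk costs one extra read per inline block that sits between the two data blocks.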
[jira] [Updated] (HBASE-14158) Add documentation for Initial Release for HBase-Spark Module integration
[ https://issues.apache.org/jira/browse/HBASE-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Malaska updated HBASE-14158: Attachment: HBASE-14158.1.patch First draft of documentation Add documentation for Initial Release for HBase-Spark Module integration - Key: HBASE-14158 URL: https://issues.apache.org/jira/browse/HBASE-14158 Project: HBase Issue Type: Improvement Components: documentation, spark Reporter: Ted Malaska Assignee: Ted Malaska Fix For: 2.0.0 Attachments: HBASE-14158.1.patch Add documentation for Initial Release for HBase-Spark Module integration -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13212) Procedure V2 - master Create/Modify/Delete namespace
[ https://issues.apache.org/jira/browse/HBASE-13212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14713644#comment-14713644 ] Stephen Yuan Jiang commented on HBASE-13212: The failure of {{TEST-org.apache.hadoop.hbase.master.TestDistributedLogSplitting.xml.init}} comes from a known unstable test and has nothing to do with the change in the patch. Procedure V2 - master Create/Modify/Delete namespace Key: HBASE-13212 URL: https://issues.apache.org/jira/browse/HBASE-13212 Project: HBase Issue Type: Sub-task Components: master Affects Versions: 2.0.0 Reporter: Stephen Yuan Jiang Assignee: Stephen Yuan Jiang Labels: reliability Attachments: HBASE-13212.v1-master.patch, HBASE-13212.v2-master.patch, HBASE-13212.v3-master.patch Original Estimate: 168h Remaining Estimate: 168h master side, part of HBASE-12439 starts up the procedure executor on the master and replaces the create/modify/delete namespace handlers with the procedure version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715460#comment-14715460 ] Jerry He commented on HBASE-14309: -- Patch v5 looks good. Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 2.0.0, 1.3.0 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt This issue adds a boolean parameter, force, to the 'balancer' command so that an admin can force region balancing even when there are regions in transition, assuming the RIT state is transient. This enhancement was requested by a customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14309) Allow load balancer to operate when there is region in transition by adding force flag
[ https://issues.apache.org/jira/browse/HBASE-14309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715463#comment-14715463 ] Hadoop QA commented on HBASE-14309: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12752561/14309-v5-branch-1.txt against branch-1 branch at commit aca8c3b74b09646c72c4e0fe26a4b2103da0d288. ATTACHMENT ID: 12752561 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color}. The patch appears to cause mvn compile goal to fail with Hadoop version 2.4.0. Compilation errors resume: [ERROR] Error invoking method 'get(java.lang.Integer)' in java.util.ArrayList at META-INF/LICENSE.vm[line 1619, column 22] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (default) on project hbase-assembly: Error rendering velocity resource. Error invoking method 'get(java.lang.Integer)' in java.util.ArrayList at META-INF/LICENSE.vm[line 1619, column 22]: InvocationTargetException: Index: 0, Size: 0 - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn goals -rf :hbase-assembly Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15281//console This message is automatically generated. 
Allow load balancer to operate when there is region in transition by adding force flag -- Key: HBASE-14309 URL: https://issues.apache.org/jira/browse/HBASE-14309 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu Fix For: 2.0.0, 1.3.0 Attachments: 14309-branch-1.1.txt, 14309-v1.txt, 14309-v2.txt, 14309-v3.txt, 14309-v4.txt, 14309-v5-branch-1.txt, 14309-v5.txt This issue adds a boolean parameter, force, to the 'balancer' command so that an admin can force region balancing even when there are regions in transition, assuming the RIT state is transient. This enhancement was requested by a customer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14313) After a Connection sees ConnectionClosingException it never recovers
[ https://issues.apache.org/jira/browse/HBASE-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-14313: -- Fix Version/s: 1.1.3 1.0.3 Release Note: HConnection could get stuck when talking to a host that went down and then returned. This has been fixed by closing the connection in all paths. Committed to every branch-1+ After a Connection sees ConnectionClosingException it never recovers Key: HBASE-14313 URL: https://issues.apache.org/jira/browse/HBASE-14313 Project: HBase Issue Type: Bug Affects Versions: 1.2.0, 1.1.0.1 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Critical Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3 Attachments: HBASE-14313.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12751) Allow RowLock to be reader writer
[ https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715473#comment-14715473 ] Hadoop QA commented on HBASE-12751: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12752522/12751v23.txt against master branch at commit aca8c3b74b09646c72c4e0fe26a4b2103da0d288. ATTACHMENT ID: 12752522 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 84 new or modified tests. {color:red}-1 Anti-pattern{color}. The patch appears to have anti-pattern where BYTES_COMPARATOR was omitted: -getRegionInfo(), -1, new TreeMap<byte[], List<Path>>());. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. 
The patch introduces the following lines longer than 100: + final long now, List<UUID> clusterIds, long nonceGroup, long nonce, MultiVersionConcurrencyControl mvcc) { + long logSeqNum, final long now, List<UUID> clusterIds, long nonceGroup, long nonce, MultiVersionConcurrencyControl mvcc) { + long txid = log.append(htd, hri, new WALKey(hri.getEncodedNameAsBytes(), hri.getTable(), now, mvcc), +new WALKey(info.getEncodedNameAsBytes(), htd.getTableName(), System.currentTimeMillis(), mvcc), +new WALKey(hri.getEncodedNameAsBytes(), htd.getTableName(), System.currentTimeMillis(), mvcc), +final WALKey logkey = new WALKey(hri.getEncodedNameAsBytes(), hri.getTable(), now, mvcc); {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.mob.TestDefaultMobStoreFlusher org.apache.hadoop.hbase.replication.TestReplicationKillMasterRS org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint org.apache.hadoop.hbase.replication.regionserver.TestReplicationSink org.apache.hadoop.hbase.replication.TestReplicationChangingPeerRegionservers org.apache.hadoop.hbase.zookeeper.TestZooKeeperACL org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper org.apache.hadoop.hbase.mob.compactions.TestPartitionedMobCompactor org.apache.hadoop.hbase.TestZooKeeper org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush org.apache.hadoop.hbase.mob.TestExpiredMobFileCleaner org.apache.hadoop.hbase.io.encoding.TestChangingEncoding org.apache.hadoop.hbase.TestServerSideScanMetricsFromClientSide org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpointNoMaster org.apache.hadoop.hbase.replication.TestMultiSlaveReplication org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager {color:red}-1 core zombie 
tests{color}. There are 28 zombie test(s): at org.apache.hadoop.hbase.TestIOFencing.testFencingAroundCompaction(TestIOFencing.java:229) at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithACL.testLabelsTableOpsWithDifferentUsers(TestVisibilityLabelsWithACL.java:233) at org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDefaultVisLabelService.testListLabelsWithRegEx(TestVisibilityLabelsWithDefaultVisLabelService.java:220) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15275//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15275//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: