[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503923#comment-14503923 ] Hudson commented on HBASE-13514: SUCCESS: Integrated in HBase-1.1 #414 (See [https://builds.apache.org/job/HBase-1.1/414/]) HBASE-13514 Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout (Jonathan Lawlor) (tedyu: rev b9eac01704586488683c34a591ed52712f21e292) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
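The failure mode boils down to a minimum-timeout floor: in branch-1 and branch-1.1, RpcRetryingCaller silently raises any configured hbase.rpc.timeout below its MIN_RPC_TIMEOUT. A minimal self-contained sketch of that clamp (class and method names here are illustrative, not the actual HBase code):

```java
// Hypothetical sketch of the MIN_RPC_TIMEOUT floor described above.
// A configured timeout below the floor is silently raised, so the
// test's 0.5 second setting never takes effect on these branches.
public class RpcTimeoutFloor {
    static final int MIN_RPC_TIMEOUT = 2000; // ms, illustrative value

    static int effectiveTimeout(int configuredMs) {
        return Math.max(configuredMs, MIN_RPC_TIMEOUT);
    }

    public static void main(String[] args) {
        // 500 ms configured -> 2000 ms effective
        System.out.println(effectiveTimeout(500));
    }
}
```

This is why the fix lands only in the test on master (where the floor no longer exists) while branch patches must work around the clamp.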
[jira] [Updated] (HBASE-13471) Deadlock closing a region
[ https://issues.apache.org/jira/browse/HBASE-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Nishtala updated HBASE-13471: Attachment: HBASE-13471.patch Deadlock closing a region - Key: HBASE-13471 URL: https://issues.apache.org/jira/browse/HBASE-13471 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Rajesh Nishtala Attachments: HBASE-13471.patch
{code}
Thread 4139 (regionserver/hbase412.example.com/10.158.6.53:60020-splits-1429003183537):
  State: WAITING
  Blocked count: 131
  Waited count: 228
  Waiting on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@50714dc3
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
    org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1371)
    org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1325)
    org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:352)
    org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:252)
    org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:509)
    org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    java.lang.Thread.run(Thread.java:745)
{code}
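The trace shows the split thread parked in HRegion.doClose() waiting on a ReentrantReadWriteLock write lock that some read-lock holder never releases. The blocking behaviour itself is easy to reproduce standalone (ReentrantReadWriteLock does not allow upgrading a held read lock to a write lock, and a writer cannot proceed while any reader holds the lock):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Minimal illustration of the lock ordering behind the deadlock above:
// while a read lock is held, an attempt to take the write lock
// (as doClose() does) cannot succeed.
public class RwLockDemo {
    static boolean writerBlocked() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();          // e.g. an in-flight operation on the region
        try {
            // doClose()'s write-lock acquisition would park here forever;
            // tryLock() lets us observe the failure without hanging.
            return !lock.writeLock().tryLock();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("writer blocked: " + writerBlocked());
    }
}
```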
[jira] [Created] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
Enis Soztutar created HBASE-13515: - Summary: Handle FileNotFoundException in region replica replay for flush/compaction events Key: HBASE-13515 URL: https://issues.apache.org/jira/browse/HBASE-13515 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 I had this patch laying around that somehow dropped from my plate. We should skip replaying compaction / flush and region open event markers if the files (from flush or compaction) can no longer be found from the secondary. If we do not skip, the replay will be retried forever, effectively blocking the replication further. Bulk load already does this, we just need to do it for flush / compaction and region open events as well.
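The proposed behaviour is simple to state: a FileNotFoundException during event replay on the secondary should cause the marker to be skipped rather than retried forever. A hedged sketch of that control flow (the Replay interface and method names are illustrative, not HBase API):

```java
import java.io.FileNotFoundException;

// Hypothetical sketch of the skip-on-missing-files behaviour proposed above.
public class ReplicaReplaySketch {
    interface Replay { void apply() throws FileNotFoundException; }

    // Returns true if the flush/compaction/region-open marker was applied,
    // false if it was skipped because its files are gone on the secondary.
    static boolean replayOrSkip(Replay event) {
        try {
            event.apply();
            return true;
        } catch (FileNotFoundException e) {
            // Retrying would block replication indefinitely, so drop the event.
            return false;
        }
    }
}
```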
[jira] [Updated] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
[ https://issues.apache.org/jira/browse/HBASE-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13515: -- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
[ https://issues.apache.org/jira/browse/HBASE-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-13515: -- Attachment: hbase-13515_v1.patch Attaching straightforward patch.
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting for
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Summary: Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting for (was: Fix test failures in TestScannerHeartbeatMessages in branch-1.1 and branch-1)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Description: The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. (was: The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds.)
[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503636#comment-14503636 ] Ted Yu commented on HBASE-13514: Test now passes. +1
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-13514: --- Hadoop Flags: Reviewed
[jira] [Updated] (HBASE-10070) HBase read high-availability using timeline-consistent region replicas
[ https://issues.apache.org/jira/browse/HBASE-10070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-10070: -- Fix Version/s: 1.1.0 2.0.0 HBase read high-availability using timeline-consistent region replicas -- Key: HBASE-10070 URL: https://issues.apache.org/jira/browse/HBASE-10070 Project: HBase Issue Type: New Feature Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 Attachments: HighAvailabilityDesignforreadsApachedoc.pdf In the present HBase architecture, it is hard, probably impossible, to satisfy constraints like 99th percentile of the reads will be served under 10 ms. One of the major factors that affects this is the MTTR for regions. There are three phases in the MTTR process - detection, assignment, and recovery. Of these, the detection is usually the longest and is presently in the order of 20-30 seconds. During this time, the clients would not be able to read the region data. However, some clients will be better served if regions will be available for reads during recovery for doing eventually consistent reads. This will help with satisfying low latency guarantees for some class of applications which can work with stale reads. For improving read availability, we propose a replicated read-only region serving design, also referred to as secondary regions, or region shadows. Extending the current model of a region being opened for reads and writes in a single region server, the region will also be opened for reading in other region servers. The region server which hosts the region for reads and writes (as in current case) will be declared as PRIMARY, while 0 or more region servers might be hosting the region as SECONDARY. There may be more than one secondary (replica count > 2). Will attach a design doc shortly which contains most of the details and some thoughts about development approaches. Reviews are more than welcome. 
We also have a proof of concept patch, which includes the master and region server side of changes. Client side changes will be coming soon as well.
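The read model described above (fresh reads from the PRIMARY, possibly stale reads from a SECONDARY during recovery) can be sketched in miniature. All names here are illustrative, not the eventual client API:

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hedged sketch of a timeline-consistent read: try the primary replica
// first; if it cannot answer (e.g. during recovery), fall back to a
// secondary and flag the result as potentially stale.
public class TimelineReadSketch {
    static final class Result {
        final String value;
        final boolean stale;
        Result(String value, boolean stale) { this.value = value; this.stale = stale; }
    }

    static Result timelineGet(Supplier<Optional<String>> primary,
                              Supplier<String> secondary) {
        return primary.get()
            .map(v -> new Result(v, false))                      // fresh read
            .orElseGet(() -> new Result(secondary.get(), true)); // stale read
    }
}
```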
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503806#comment-14503806 ] Lars Hofhansl commented on HBASE-13389: --- Thanks [~jeffreyz], just discussed a bit with [~stack]... If we kept the in-order compactions, we won't need MVCC stamps in the HFile beyond the oldest scanner, right? I feel like I am missing something. Could you show an example of when we need MVCC stamps in the HFile beyond the oldest scanner when you have some time? The issue has to do with Puts/Deletes happening in the same millisecond, right? [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389 Project: HBase Issue Type: Sub-task Components: Performance Reporter: stack Attachments: 13389.txt HBASE-12600 moved the edit sequenceid from tags to instead exploit the mvcc/sequenceid slot in a key. Now Cells near-always have an associated mvcc/sequenceid where previously it was rare or the mvcc was kept up at the file level. This is sort of how it should be many of us would argue but as a side-effect of this change, read-time optimizations that helped speed scans were undone by this change. In this issue, lets see if we can get the optimizations back -- or just remove the optimizations altogether. The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291. The optimizations undone by this change are (to quote the optimizer himself, Mr [~lhofhansl]): {quote} Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166. We're always storing the mvcc readpoints, and we never compare them against the actual smallestReadpoint, and hence we're always performing all the checks, tests, and comparisons that these jiras removed in addition to actually storing the data - which with up to 8 bytes per Cell is not trivial. 
{quote} This is the 'breaking' change: https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96
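The optimization being argued over reduces to one comparison: if the largest mvcc/sequenceid stored in an HFile is no newer than the oldest concurrent scanner's read point, every cell in the file is visible to all readers and the per-cell mvcc need not be parsed or compared at scan time. A hedged sketch of that check (method and parameter names are illustrative):

```java
// Hypothetical form of the skip-mvcc read-time check discussed above.
public class MvccSkipCheck {
    // True when per-cell mvcc parsing can be skipped for this file:
    // no live scanner could observe a difference.
    static boolean canSkipMvcc(long fileMaxMvcc, long smallestReadPoint) {
        return fileMaxMvcc <= smallestReadPoint;
    }
}
```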
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503889#comment-14503889 ] Jonathan Lawlor commented on HBASE-13082: - I'm a little late to the party but this versioned data structure sounds neat. If I'm understanding correctly, it sounds like this versioned data structure would also allow us to remove the lingering lock in updateReaders (and potentially remove updateReaders completely?). Instead of having to update the readers, the compaction/flush would occur in the background and be made visible to new readers via a new latest version in the data structure, is that correct? In other words, would the introduction of this new versioned data structure make StoreScanner single threaded (and thus remove any need for synchronization)? Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left off. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() and required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compaction with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able to finalize the set of files. 
I'll have a patch soon.
[jira] [Commented] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
[ https://issues.apache.org/jira/browse/HBASE-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503934#comment-14503934 ] Hadoop QA commented on HBASE-13515: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726685/hbase-13515_v1.patch against master branch at commit eb82b8b3098d6a9ac62aa50189f9d4b289f38472. ATTACHMENT ID: 12726685 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13746//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13746//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13746//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13746//console This message is automatically generated.
[jira] [Commented] (HBASE-13502) Deprecate/remove getRowComparator() in TableName
[ https://issues.apache.org/jira/browse/HBASE-13502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503993#comment-14503993 ] stack commented on HBASE-13502: --- Ok, if two places then, yeah, sounds like KVComparator is going to be sticking around a while (if deprecated). HRI having a getComparator makes more sense but it should not be public so, deprecate here... I think you can set the IA.Private annotation on a method? Could do that too for these two getComparator calls. There are only two comparator types (four if you include reverse comparators) and even then, the comparators only differ in how they compare rows... The switch is table name (meta and user table name -- later, if we bring back root, it will be a 3rd dimension on comparators...). Would be good to shutdown the places we go when comparator is not plain (i.e. we didn't read the comparator to use from hfile, etc.)... say have a static or a factory on CellComparator that took a TableName instance... and use that in place of these methods. Deprecate/remove getRowComparator() in TableName Key: HBASE-13502 URL: https://issues.apache.org/jira/browse/HBASE-13502 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 Attachments: HBASE-13502.patch
[jira] [Commented] (HBASE-13516) Increase PermSize to 128MB
[ https://issues.apache.org/jira/browse/HBASE-13516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503995#comment-14503995 ] stack commented on HBASE-13516: --- Yes, given you've done the research. Only needed in jdk8. Increase PermSize to 128MB -- Key: HBASE-13516 URL: https://issues.apache.org/jira/browse/HBASE-13516 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 HBase uses ~40MB, and with Phoenix we use ~56MB of Perm space out of 64MB by default. Every Filter and Coprocessor increases that. Running out of perm space triggers a stop the world full GC of the entire heap. We have seen this in a misconfigured cluster. Should we default to {{-XX:PermSize=128m -XX:MaxPermSize=128m}} out of the box as a convenience for users?
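The proposal above amounts to a one-line change in the JVM options HBase ships with. A hedged sketch of how it might look in hbase-env.sh (exact variable and placement may differ from the eventual patch):

```shell
# Proposed default perm-gen sizing from HBASE-13516 (illustrative).
# Note: only meaningful on JDK 7 and earlier -- JDK 8 removed the
# permanent generation and warns that these flags are ignored.
export HBASE_OPTS="$HBASE_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
```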
[jira] [Updated] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13517: -- Description: Guava's moved on. Hadoop has not. Jackson moves whenever it feels like it. Protobuf moves with breaking point changes. While shading all of the time would break people that require the transitive dependencies for MR or other things. Lets provide an artifact with our dependencies shaded. Then users can have the choice to use the shaded version or the non-shaded version. Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Reporter: Elliott Clark
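A shaded artifact of this kind is typically built by relocating the troublesome dependency packages at package time. A hedged sketch using maven-shade-plugin (the relocated package names and module layout here are illustrative, not the eventual HBase configuration):

```xml
<!-- Illustrative shading config: relocate Guava so the client's copy
     cannot clash with Hadoop's or the user's. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.hadoop.hbase.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Publishing this alongside the unshaded jar gives users the choice the description asks for: depend on the shaded client when transitive-dependency versions conflict, or the plain one when MR and friends need the real transitive deps.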
[jira] [Commented] (HBASE-13516) Increase PermSize to 128MB
[ https://issues.apache.org/jira/browse/HBASE-13516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504070#comment-14504070 ] Andrew Purtell commented on HBASE-13516: bq. Only needed in jdk8. I think Stack meant not needed in jdk8 since perm gen went away and using these options will cause the JVM to throw up warnings.
[jira] [Commented] (HBASE-13516) Increase PermSize to 128MB
[ https://issues.apache.org/jira/browse/HBASE-13516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504069#comment-14504069 ] Andrew Purtell commented on HBASE-13516: +1
[jira] [Assigned] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor reassigned HBASE-13514: --- Assignee: Jonathan Lawlor
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Attachment: HBASE-13514-branch-1.patch HBASE-13514-branch-1.1.patch HBASE-13514.patch Attaching a patch for each branch to get a QA run on each. The patch addresses the test failure and also adds a deleteTable in test cleanup. [~tedyu] got some time to take a quick looksee?
[jira] [Commented] (HBASE-13469) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503824#comment-14503824 ] stack commented on HBASE-13469: --- [~syuanjiang] bq. I think we should spend our energy to clean up handler code in 1.2 and make procedure robust. Ok. Sounds reasonable. Took a look at the last patch and not much code and it has a test (only nit comment is why not have the enum name same as the configuration value that turns on the state: i.e. name enums unused, disable, enabled... then you could compare the configuration and the enum toString'd... No biggie) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1 Key: HBASE-13469 URL: https://issues.apache.org/jira/browse/HBASE-13469 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-13469.v1-branch-1.1.patch In branch-1, I think we want proc v2 to be configurable, so that if any non-recoverable issue is found, at least there is a workaround. We already have the handlers and code laying around. It will be just introducing the config to enable / disable. We can even make it dynamically configurable via the new framework.
[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
[ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503943#comment-14503943 ] stack commented on HBASE-13389: --- bq. This may be hard to achieve because out of order puts can be flushed at different time. Do 'out of order' puts happen at DLR time only [~jeffreyz]? i.e. WALs can be replayed in any order since they are farmed out over the cluster. We also cannot guarantee when a region that is receiving DLR edits will flush hfiles; e.g. we could get row1/logSeqId=2 during DLR and flush because we had memory pressure, but then later row1/logSeqId=1 might arrive and be flushed into a newer hfile. The fix for this is to not let compactions happen when region is in recovery -- this is probably the case already (or let compactions go on but preserve mvcc while in recovery)? So, the Lars fix would be to drop mvcc if no scanner outstanding with a span that includes mvcc in current hfile AND we are not in DLR recovery mode? Are there other places where we might have out-of-order puts? (Flushes are single threaded and edits go into FSHLog and MemStore in order caveat Elliott and Nate's recent find: https://issues.apache.org/jira/browse/HBASE-12751?focusedCommentId=14377157&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14377157). bq. ...and only keep mvcc around during region recovery time so that we can still keep HBASE-12600 goal Yes. On keeping seqid in the KV in hfiles so we can do ...out of order in minor compactions. ...don't we mean compacting non-adjacent files rather than out-of-order here? So, yeah, if we preserved mvcc always, we could do any order and non-adjacent. Would be nice. Otherwise, as I see it, if we want to do non-adjacent compactions (which as [~lhofhansl] says above, we do not currently have), then we could do it if all files under a Store have zero for mvcc and we just order the edits by the hfile meta data mvcc number. 
When there are files with an mvcc per KV, then we should probably merge those first... Would have to think it through more. It gets a little complicated though if the Store has some files with a hfile meta data mvcc number but other files have an mvcc per KV. We could not include a file that has an mvcc per KV in a non-adjacent compaction. But for files with zero mvcc, if we have the Lars optimization, we could do non-adjacent compactions as long as we respected the hfile seqid order. It gets tricky if a file has mvcc in the KV and all the rest do not. Files with mvccs in their KVs need to be compacted together ahead of the rest. [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations - Key: HBASE-13389 URL: https://issues.apache.org/jira/browse/HBASE-13389 Project: HBase Issue Type: Sub-task Components: Performance Reporter: stack Attachments: 13389.txt HBASE-12600 moved the edit sequenceid from tags to instead exploit the mvcc/sequenceid slot in a key. Now Cells near-always have an associated mvcc/sequenceid where previously it was rare or the mvcc was kept up at the file level. This is sort of how it should be, many of us would argue, but as a side-effect of this change, read-time optimizations that helped speed scans were undone. In this issue, let's see if we can get the optimizations back -- or just remove the optimizations altogether. The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291. The optimizations undone by this change are (to quote the optimizer himself, Mr [~lhofhansl]): {quote} Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166. We're always storing the mvcc readpoints, and we never compare them against the actual smallestReadpoint, and hence we're always performing all the checks, tests, and comparisons that these jiras removed in addition to actually storing the data - which with up to 8 bytes per Cell is not trivial. 
{quote} This is the 'breaking' change: https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503941#comment-14503941 ] zhangduo commented on HBASE-13259: -- I can pick this up and address the 'ugly ByteBufferArray'. But we do not have enough time to test it on a large dataset if we want to catch up with the first RC of 1.1, I think. It is tuning work; the time we need is unpredictable. We can file a new issue to hold the tuning work and resolve this issue before the first RC of 1.1. What do you think? [~ndimiduk] Thanks. mmap() based BucketCache IOEngine - Key: HBASE-13259 URL: https://issues.apache.org/jira/browse/HBASE-13259 Project: HBase Issue Type: New Feature Components: BlockCache Affects Versions: 0.98.10 Reporter: Zee Chen Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, ioread-1.svg, mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data from kernel space to user space. This is a good choice when the total working set size is much bigger than the available RAM and the latency is dominated by IO access. However, when the entire working set is small enough to fit in the RAM, using mmap() (and subsequent memcpy()) to move data from kernel space to user space is faster. I have run some short keyvalue get tests and the results indicate a reduction of 2%-7% of kernel CPU on my system, depending on the load. On the gets, the latency histograms from mmap() are identical to those from pread(), but peak throughput is close to 40% higher. This patch modifies ByteBufferArray to allow it to specify a backing file. Example for using this feature: set hbase.bucketcache.ioengine to mmap:/dev/shm/bucketcache.0 in hbase-site.xml. Attached flame graphs show the perf-measured CPU usage breakdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4462) Properly treating SocketTimeoutException
[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4462: -- Affects Version/s: (was: 0.90.4) Properly treating SocketTimeoutException Key: HBASE-4462 URL: https://issues.apache.org/jira/browse/HBASE-4462 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Fix For: 0.90.8 Attachments: HBASE-4462_0.90.x.patch, unittest_that_shows_us_retrying_sockettimeout.txt SocketTimeoutException is currently treated like any IOE inside of HCM.getRegionServerWithRetries and I think this is a problem. This method should only do retries in cases where we are pretty sure the operation will complete, but with STE we already waited for (by default) 60 seconds and nothing happened. I found this while debugging Douglas Campbell's problem on the mailing list where it seemed like he was using the same scanner from multiple threads, but actually it was just the same client doing retries while the first run didn't even finish yet (that's another problem). You could see the first scanner, then up to two other handlers waiting for it to finish in order to run (because of the synchronization on RegionScanner). So what should we do? We could treat STE as a DoNotRetryException and let the client deal with it, or we could retry only once. There's also the option of having a different behavior for get/put/icv/scan, the issue with operations that modify a cell is that you don't know if the operation completed or not (same when a RS dies hard after completing let's say a Put but just before returning to the client). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-5110) code enhancement - remove unnecessary if-checks in every loop in HLog class
[ https://issues.apache.org/jira/browse/HBASE-5110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-5110: -- Resolution: Not A Problem Status: Resolved (was: Patch Available) code enhancement - remove unnecessary if-checks in every loop in HLog class --- Key: HBASE-5110 URL: https://issues.apache.org/jira/browse/HBASE-5110 Project: HBase Issue Type: Improvement Components: wal Affects Versions: 0.90.1, 0.90.2, 0.90.4, 0.92.0 Reporter: Mikael Sitruk Priority: Minor Attachments: HBASE-5110_1.patch The HLog class (method findMemstoresWithEditsEqualOrOlderThan) has an unnecessary if-check in a loop.
{code}
static byte[][] findMemstoresWithEditsEqualOrOlderThan(final long oldestWALseqid,
    final Map<byte[], Long> regionsToSeqids) {
  // This method is static so it can be unit tested the easier.
  List<byte[]> regions = null;
  for (Map.Entry<byte[], Long> e : regionsToSeqids.entrySet()) {
    if (e.getValue().longValue() <= oldestWALseqid) {
      if (regions == null) regions = new ArrayList<byte[]>();
      regions.add(e.getKey());
    }
  }
  return regions == null ? null : regions.toArray(new byte[][] {HConstants.EMPTY_BYTE_ARRAY});
}
{code}
The following change is suggested:
{code}
static byte[][] findMemstoresWithEditsEqualOrOlderThan(final long oldestWALseqid,
    final Map<byte[], Long> regionsToSeqids) {
  // This method is static so it can be unit tested the easier.
  List<byte[]> regions = new ArrayList<byte[]>();
  for (Map.Entry<byte[], Long> e : regionsToSeqids.entrySet()) {
    if (e.getValue().longValue() <= oldestWALseqid) {
      regions.add(e.getKey());
    }
  }
  return regions.size() == 0 ? null : regions.toArray(new byte[][] {HConstants.EMPTY_BYTE_ARRAY});
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-8720) Only one snapshot region task can run at a time
[ https://issues.apache.org/jira/browse/HBASE-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8720: -- Resolution: Not A Problem Status: Resolved (was: Patch Available) Only one snapshot region task can run at a time - Key: HBASE-8720 URL: https://issues.apache.org/jira/browse/HBASE-8720 Project: HBase Issue Type: Bug Components: snapshots Affects Versions: 0.94.8, 0.95.0 Reporter: binlijin Attachments: 8720-v2.txt, HBASE-8720.patch
{code}
SnapshotSubprocedurePool(String name, Configuration conf) {
  // configure the executor service
  long keepAlive = conf.getLong(
      RegionServerSnapshotManager.SNAPSHOT_TIMEOUT_MILLIS_KEY,
      RegionServerSnapshotManager.SNAPSHOT_TIMEOUT_MILLIS_DEFAULT);
  int threads = conf.getInt(CONCURENT_SNAPSHOT_TASKS_KEY, DEFAULT_CONCURRENT_SNAPSHOT_TASKS);
  this.name = name;
  executor = new ThreadPoolExecutor(1, threads, keepAlive, TimeUnit.MILLISECONDS,
      new LinkedBlockingQueue<Runnable>(),
      new DaemonThreadFactory("rs(" + name + ")-snapshot-pool"));
  taskPool = new ExecutorCompletionService<Void>(executor);
}
{code}
ThreadPoolExecutor: corePoolSize: 1, maximumPoolSize: 3, workQueue: LinkedBlockingQueue (unbounded). So when a new task is submitted to the ThreadPoolExecutor while another task is running, the new task is queued, and all snapshot region tasks execute one by one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
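The behavior described above follows from how ThreadPoolExecutor works: extra threads beyond corePoolSize are only created when the work queue rejects a task, and an unbounded LinkedBlockingQueue never rejects. A minimal standalone sketch (not HBase code; class and names are illustrative) demonstrating that the pool never grows past one thread:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolDemo {
    public static void main(String[] args) throws Exception {
        // Same shape as SnapshotSubprocedurePool: core=1, max=3, unbounded queue.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 3, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        AtomicInteger concurrent = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        for (int i = 0; i < 6; i++) {
            pool.execute(() -> {
                // Track how many tasks ever run at the same time.
                int now = concurrent.incrementAndGet();
                maxSeen.accumulateAndGet(now, Math::max);
                try { Thread.sleep(100); } catch (InterruptedException ignored) { }
                concurrent.decrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // The unbounded queue absorbs every submission, so maximumPoolSize=3
        // is never reached and tasks run strictly one at a time.
        System.out.println(maxSeen.get()); // prints 1
    }
}
```

Setting corePoolSize equal to the desired concurrency (or using a bounded queue) is the usual way to make maximumPoolSize take effect.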
[jira] [Updated] (HBASE-7218) Rename Snapshot
[ https://issues.apache.org/jira/browse/HBASE-7218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-7218: -- Fix Version/s: (was: hbase-6055) Assignee: (was: Matteo Bertozzi) Affects Version/s: (was: hbase-6055) Status: Open (was: Patch Available) Cancelling stale patch. Close this? Rename Snapshot --- Key: HBASE-7218 URL: https://issues.apache.org/jira/browse/HBASE-7218 Project: HBase Issue Type: New Feature Components: snapshots Reporter: Matteo Bertozzi Priority: Minor Attachments: HBASE-7218-v0.patch, HBASE-7218-v1.patch Add the ability to rename a snapshot. HBaseAdmin.renameSnapshot(oldName, newName) shell: snapshot_rename 'oldName', 'newName' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-8064) hbase connection could not reuse
[ https://issues.apache.org/jira/browse/HBASE-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8064: -- Resolution: Cannot Reproduce Assignee: (was: Yuan Kang) Release Note: (was: hbase connection manager can't resuse the connection for this code,the patch resolve it) Status: Resolved (was: Patch Available) hbase connection could not reuse Key: HBASE-8064 URL: https://issues.apache.org/jira/browse/HBASE-8064 Project: HBase Issue Type: Bug Components: Client Affects Versions: 0.94.0 Environment: hadoop-1.0.2 hbase-0.94.0 Reporter: Yuan Kang Labels: patch Attachments: HConnectionManager-connection-could-not-reuse.patch When an HConnection is used by one machine, the connection returns to the pool. If another machine gets the connection again, it should be reusable, but in the code the caching map is not managed correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503549#comment-14503549 ] Eshcar Hillel commented on HBASE-13071: --- Done rebase. Thanks to HBASE-13090 next and loadCache methods are separated so this rebase wasn't too painful (thanks [~jonathan.lawlor]). I also changed some new scanner tests to account for the change in scanner cache interface (it is now a Queue). Hbase Streaming Scan Feature Key: HBASE-13071 URL: https://issues.apache.org/jira/browse/HBASE-13071 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: 99.eshcar.png, HBASE-13071_98_1.patch, HBASE-13071_trunk_1.patch, HBASE-13071_trunk_10.patch, HBASE-13071_trunk_2.patch, HBASE-13071_trunk_3.patch, HBASE-13071_trunk_4.patch, HBASE-13071_trunk_5.patch, HBASE-13071_trunk_6.patch, HBASE-13071_trunk_7.patch, HBASE-13071_trunk_8.patch, HBASE-13071_trunk_9.patch, HBASE-13071_trunk_rebase_1.0.patch, HBaseStreamingScanDesign.pdf, HbaseStreamingScanEvaluation.pdf, HbaseStreamingScanEvaluationwithMultipleClients.pdf, gc.delay.png, gc.eshcar.png, gc.png, hits.delay.png, hits.eshcar.png, hits.png, latency.delay.png, latency.png, network.png A scan operation iterates over all rows of a table or a subrange of the table. The synchronous nature in which the data is served at the client side hinders the speed the application traverses the data: it increases the overall processing time, and may cause a great variance in the times the application waits for the next piece of data. The scanner next() method at the client side invokes an RPC to the regionserver and then stores the results in a cache. The application can specify how many rows will be transmitted per RPC; by default this is set to 100 rows. The cache can be considered as a producer-consumer queue, where the hbase client pushes the data to the queue and the application consumes it. Currently this queue is synchronous, i.e., blocking. 
More specifically, when the application consumed all the data from the cache --- so the cache is empty --- the hbase client retrieves additional data from the server and re-fills the cache with new data. During this time the application is blocked. Under the assumption that the application processing time can be balanced by the time it takes to retrieve the data, an asynchronous approach can reduce the time the application is waiting for data. We attach a design document. We also have a patch that is based on a private branch, and some evaluation results of this code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
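The producer-consumer idea described above can be sketched with a background prefetch thread that refills a bounded client-side cache while the application drains it. This is an illustrative standalone sketch, not the actual HBase patch; the class, the `fetchFromServer` stand-in, and the batch sizes are all hypothetical:

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a fetcher thread keeps the scanner cache full
// so the consumer rarely blocks waiting on an RPC round trip.
class PrefetchScanner {
    private static final List<String> POISON = List.of(); // end-of-scan sentinel
    private final BlockingQueue<List<String>> cache = new ArrayBlockingQueue<>(2);
    private final Thread fetcher;

    PrefetchScanner(int batches) {
        fetcher = new Thread(() -> {
            try {
                for (int b = 0; b < batches; b++) {
                    cache.put(fetchFromServer(b)); // blocks when the cache is full
                }
                cache.put(POISON);
            } catch (InterruptedException ignored) { }
        });
        fetcher.start();
    }

    // Stand-in for the scanner RPC to the region server.
    private List<String> fetchFromServer(int batch) {
        return List.of("row-" + (2 * batch), "row-" + (2 * batch + 1));
    }

    // Returns the next prefetched batch, or null when the scan is done.
    List<String> nextBatch() throws InterruptedException {
        List<String> batch = cache.take();
        return batch == POISON ? null : batch;
    }

    public static void main(String[] args) throws Exception {
        PrefetchScanner s = new PrefetchScanner(3);
        int rows = 0;
        for (List<String> b; (b = s.nextBatch()) != null; ) {
            rows += b.size();
        }
        System.out.println(rows); // prints 6
    }
}
```

The bounded queue gives back-pressure: the fetcher stops issuing RPCs once the cache holds enough unconsumed batches.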
[jira] [Updated] (HBASE-13469) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-13469: --- Attachment: HBASE-13469.v1-branch-1.1.patch [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1 Key: HBASE-13469 URL: https://issues.apache.org/jira/browse/HBASE-13469 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-13469.v1-branch-1.1.patch In branch-1, I think we want proc v2 to be configurable, so that if any non-recoverable issue is found, at least there is a workaround. We already have the handlers and code laying around. It will be just introducing the config to enable / disable. We can even make it dynamically configurable via the new framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13516) Increase PermSize to 128MB
Enis Soztutar created HBASE-13516: - Summary: Increase PermSize to 128MB Key: HBASE-13516 URL: https://issues.apache.org/jira/browse/HBASE-13516 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 HBase uses ~40MB, and with Phoenix we use ~56MB of Perm space out of 64MB by default. Every Filter and Coprocessor increases that. Running out of perm space triggers a stop-the-world full GC of the entire heap. We have seen this in misconfigured clusters. Should we default to {{-XX:PermSize=128m -XX:MaxPermSize=128m}} out of the box as a convenience for users? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503985#comment-14503985 ] Hadoop QA commented on HBASE-13514: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726661/HBASE-13514-branch-1.patch against branch-1 branch at commit 702aea5b38ed6ad0942b0c59c3accca476b46873. ATTACHMENT ID: 12726661 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1, 2.5.2, 2.6.0). {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100. {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13745//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13745//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13745//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13745//console This message is automatically generated. Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504049#comment-14504049 ] Nick Dimiduk commented on HBASE-13259: -- Right. Sounds good. mmap() based BucketCache IOEngine - Key: HBASE-13259 URL: https://issues.apache.org/jira/browse/HBASE-13259 Project: HBase Issue Type: New Feature Components: BlockCache Affects Versions: 0.98.10 Reporter: Zee Chen Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, ioread-1.svg, mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data from kernel space to user space. This is a good choice when the total working set size is much bigger than the available RAM and the latency is dominated by IO access. However, when the entire working set is small enough to fit in the RAM, using mmap() (and subsequent memcpy()) to move data from kernel space to user space is faster. I have run some short keyvalue get tests and the results indicate a reduction of 2%-7% of kernel CPU on my system, depending on the load. On the gets, the latency histograms from mmap() are identical to those from pread(), but peak throughput is close to 40% higher. This patch modifies ByteBufferArray to allow it to specify a backing file. Example for using this feature: set hbase.bucketcache.ioengine to mmap:/dev/shm/bucketcache.0 in hbase-site.xml. Attached flame graphs show the perf-measured CPU usage breakdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-7750) We should throw IOE when calling HRegionServer#replicateLogEntries if ReplicationSink is null
[ https://issues.apache.org/jira/browse/HBASE-7750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-7750: -- Resolution: Incomplete Assignee: (was: Jieshan Bean) Status: Resolved (was: Patch Available) We should throw IOE when calling HRegionServer#replicateLogEntries if ReplicationSink is null - Key: HBASE-7750 URL: https://issues.apache.org/jira/browse/HBASE-7750 Project: HBase Issue Type: Bug Components: Replication Affects Versions: 0.94.4, 0.95.2 Reporter: Jieshan Bean Attachments: HBASE-7750-94.patch, HBASE-7750-trunk.patch It may be expected behavior, but I think it's better to do something. We configured hbase.replication as true in the master cluster, and added a peer, but forgot to configure hbase.replication on the slave cluster side. ReplicationSource read the HLog, shipped log edits, and logged the position. Everything seemed alright, but the data was not present in the slave cluster. So I think the slave cluster should throw an exception back to the master cluster instead of returning directly:
{code}
public void replicateLogEntries(final HLog.Entry[] entries) throws IOException {
  checkOpen();
  if (this.replicationSinkHandler == null) return;
  this.replicationSinkHandler.replicateLogEntries(entries);
}
{code}
I would like to hear your comments on this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
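The proposal above amounts to failing loudly instead of dropping the edits. A minimal standalone sketch of that behavior (not the HBase patch; the stub class and message are illustrative):

```java
import java.io.IOException;

// Hypothetical stub: fail with an IOException when the replication sink
// is not configured, instead of silently returning to the caller.
class ReplicationSinkStub {
    private final Object replicationSinkHandler; // null when hbase.replication is off

    ReplicationSinkStub(Object handler) { this.replicationSinkHandler = handler; }

    void replicateLogEntries(Object[] entries) throws IOException {
        if (replicationSinkHandler == null) {
            // The master-side ReplicationSource sees this error and can retry
            // or alert, rather than advancing its log position past lost edits.
            throw new IOException("Replication sink is not enabled; cannot apply "
                + entries.length + " edits");
        }
        // ... hand the entries to the sink here ...
    }

    public static void main(String[] args) {
        try {
            new ReplicationSinkStub(null).replicateLogEntries(new Object[3]);
            System.out.println("no exception");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```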
[jira] [Updated] (HBASE-8378) add 'force' option for drop table
[ https://issues.apache.org/jira/browse/HBASE-8378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8378: -- Status: Open (was: Patch Available) Cancelling stale patch. Close? add 'force' option for drop table - Key: HBASE-8378 URL: https://issues.apache.org/jira/browse/HBASE-8378 Project: HBase Issue Type: Improvement Components: shell, Usability Affects Versions: 0.95.0, 0.94.6.1 Reporter: Nick Dimiduk Assignee: Nick Dimiduk Attachments: 0001-HBASE-8378-shell-add-force-option-to-drop.patch, 0001-HBASE-8378-shell-add-force-option-to-drop.patch Does this logic look familiar?
{noformat}
def drop_table(name)
  return unless admin.table_exists?(name)
  admin.disable_table(name) if admin.enabled?(name)
  admin.drop_table(name)
end
{noformat}
Let's add a force option to 'drop' that does exactly this. We'll save 6 lines of code for thousands of developers in millions of scripts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-13259: - Fix Version/s: (was: 1.1.0) 1.2.0 mmap() based BucketCache IOEngine - Key: HBASE-13259 URL: https://issues.apache.org/jira/browse/HBASE-13259 Project: HBase Issue Type: New Feature Components: BlockCache Affects Versions: 0.98.10 Reporter: Zee Chen Fix For: 2.0.0, 1.2.0 Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, ioread-1.svg, mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data from kernel space to user space. This is a good choice when the total working set size is much bigger than the available RAM and the latency is dominated by IO access. However, when the entire working set is small enough to fit in the RAM, using mmap() (and subsequent memcpy()) to move data from kernel space to user space is faster. I have run some short keyvalue get tests and the results indicate a reduction of 2%-7% of kernel CPU on my system, depending on the load. On the gets, the latency histograms from mmap() are identical to those from pread(), but peak throughput is close to 40% higher. This patch modifies ByteBufferArray to allow it to specify a backing file. Example for using this feature: set hbase.bucketcache.ioengine to mmap:/dev/shm/bucketcache.0 in hbase-site.xml. Attached flame graphs show the perf-measured CPU usage breakdown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13518) Typo in hbase.hconnection.meta.lookup.threads.core parameter
Enis Soztutar created HBASE-13518: - Summary: Typo in hbase.hconnection.meta.lookup.threads.core parameter Key: HBASE-13518 URL: https://issues.apache.org/jira/browse/HBASE-13518 Project: HBase Issue Type: Improvement Reporter: Enis Soztutar Assignee: Devaraj Das Fix For: 2.0.0, 1.1.0 A possible typo coming from the patch in HBASE-13036. I think we want {{hbase.hconnection.meta.lookup.threads.core}}, not {{hbase.hconnection.meta.lookup.threads.max.core}}, to be in line with the regular thread pool configuration.
{code}
// To start with, threads.max.core threads can hit the meta (including replicas).
// After that, requests will get queued up in the passed queue, and only after
// the queue is full, a new thread will be started
this.metaLookupPool = getThreadPool(
    conf.getInt("hbase.hconnection.meta.lookup.threads.max", 128),
    conf.getInt("hbase.hconnection.meta.lookup.threads.max.core", 10),
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-7937) Retry log rolling to support HA NN scenario
[ https://issues.apache.org/jira/browse/HBASE-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-7937: -- Resolution: Incomplete Assignee: (was: Himanshu Vashishtha) Status: Resolved (was: Patch Available) Retry log rolling to support HA NN scenario --- Key: HBASE-7937 URL: https://issues.apache.org/jira/browse/HBASE-7937 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.5 Reporter: Himanshu Vashishtha Attachments: HBASE-7937-trunk.patch, HBASE-7937-v1.patch, HBase-7937-0.94.txt, HBase-7937-trunk.txt A failure in log rolling causes regionserver abort. In case of HA NN, it will be good if there is a retry mechanism to roll the logs. A corresponding jira for MemStore retries is HBASE-7507. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4535) hbase-env.sh in hbase rpm does not set HBASE_CONF_DIR
[ https://issues.apache.org/jira/browse/HBASE-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4535: -- Resolution: Incomplete Assignee: (was: Eric Yang) Status: Resolved (was: Patch Available) hbase-env.sh in hbase rpm does not set HBASE_CONF_DIR - Key: HBASE-4535 URL: https://issues.apache.org/jira/browse/HBASE-4535 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.90.3 Reporter: Ramya Sunil Attachments: HBASE-4535.patch After a hbase rpm install, hbase-env.sh does not define HBASE_CONF_DIR. This needs to be fixed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13090) Progress heartbeats for long running scanners
[ https://issues.apache.org/jira/browse/HBASE-13090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503574#comment-14503574 ] Jonathan Lawlor commented on HBASE-13090: - Filed HBASE-13514 to address the test failures in branch-1 and branch-1.1 Progress heartbeats for long running scanners - Key: HBASE-13090 URL: https://issues.apache.org/jira/browse/HBASE-13090 Project: HBase Issue Type: New Feature Reporter: Andrew Purtell Assignee: Jonathan Lawlor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: 13090-branch-1.addendum, HBASE-13090-v1.patch, HBASE-13090-v2.patch, HBASE-13090-v3.patch, HBASE-13090-v3.patch, HBASE-13090-v4.patch, HBASE-13090-v6.patch, HBASE-13090-v7.patch It can be necessary to set very long timeouts for clients that issue scans over large regions when all data in the region might be filtered out depending on scan criteria. This is a usability concern because it can be hard to identify what worst case timeout to use until scans are occasionally/intermittently failing in production, depending on variable scan criteria. It would be better if the client-server scan protocol can send back periodic progress heartbeats to clients as long as server scanners are alive and making progress. This is related but orthogonal to streaming scan (HBASE-13071). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages in branch-1.1 and branch-1
Jonathan Lawlor created HBASE-13514: --- Summary: Fix test failures in TestScannerHeartbeatMessages in branch-1.1 and branch-1 Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Priority: Minor The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
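The failure mechanism described above can be illustrated with a small sketch. In branch-1, the issue text says {{RpcRetryingCaller}} has a MIN_RPC_TIMEOUT field that acts as a floor on the configured timeout; the clamp below is an illustrative stand-in for that behavior, not the actual HBase code:

```java
// Hypothetical sketch of the timeout floor described in this issue:
// the configured hbase.rpc.timeout is clamped up to a minimum, so the
// test's 0.5 s setting silently becomes 2 s in branch-1/branch-1.1.
class RpcTimeoutClamp {
    static final int MIN_RPC_TIMEOUT = 2000; // ms; the 2-second floor per the issue

    static int effectiveTimeout(int configuredMs) {
        return Math.max(configuredMs, MIN_RPC_TIMEOUT);
    }

    public static void main(String[] args) {
        // The test sets hbase.rpc.timeout to 500 ms, but the caller uses 2000 ms.
        System.out.println(effectiveTimeout(500));  // prints 2000
        // Values above the floor pass through unchanged.
        System.out.println(effectiveTimeout(3000)); // prints 3000
    }
}
```

This is why the test's timing assumptions only hold on master, where the floor was removed.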
[jira] [Updated] (HBASE-13482) Phoenix is failing to scan tables on secure environments.
[ https://issues.apache.org/jira/browse/HBASE-13482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13482: --- Fix Version/s: 0.98.13 Cherry picked to 0.98, thanks for the heads up! Phoenix is failing to scan tables on secure environments. -- Key: HBASE-13482 URL: https://issues.apache.org/jira/browse/HBASE-13482 Project: HBase Issue Type: Bug Reporter: Alicia Ying Shu Assignee: Alicia Ying Shu Fix For: 1.1.0, 0.98.13 Attachments: Hbase-13482-v1.patch, Hbase-13482.patch When executed on secure environments, phoenix query is getting the following exception message: java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.security.AccessDeniedException: org.apache.hadoop.hbase.security.AccessDeniedException: User 'null' is not the scanner owner! org.apache.hadoop.hbase.security.access.AccessController.requireScannerOwner(AccessController.java:2048) org.apache.hadoop.hbase.security.access.AccessController.preScannerNext(AccessController.java:2022) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$53.call(RegionCoprocessorHost.java:1336) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1671) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1746) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperationWithResult(RegionCoprocessorHost.java:1720) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preScannerNext(RegionCoprocessorHost.java:1331) org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2227) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503740#comment-14503740 ] Ted Yu commented on HBASE-13514: Thanks for the patch, Jonathan. Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-13514: --- Resolution: Fixed Status: Resolved (was: Patch Available) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
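The root cause described in these messages — a configured hbase.rpc.timeout being silently raised to a floor in branch-1 and branch-1.1 — can be sketched as follows. This is an illustrative model only: the helper method and class name are invented here, and the 2000 ms constant is taken from the issue text ("cannot be less than 2 seconds"), not read from the actual RpcRetryingCaller source.

```java
// Illustrative sketch: per the issue, branch-1's RpcRetryingCaller enforces a
// minimum rpc timeout (MIN_RPC_TIMEOUT), so a test that configures 500 ms
// actually runs with a 2000 ms timeout and its heartbeat expectations fail.
public class RpcTimeoutClampSketch {
    // Assumed value; the issue states the effective minimum is 2 seconds.
    static final int MIN_RPC_TIMEOUT = 2000;

    static int effectiveTimeout(int configuredMillis) {
        // The configured value is clamped up to the minimum.
        return Math.max(configuredMillis, MIN_RPC_TIMEOUT);
    }

    public static void main(String[] args) {
        // The test sets hbase.rpc.timeout to 0.5 s, but the caller enforces 2 s.
        System.out.println(effectiveTimeout(500));   // 2000
        System.out.println(effectiveTimeout(5000));  // 5000
    }
}
```

On master the clamp no longer exists, which is why the test only fails on the older branches.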
[jira] [Updated] (HBASE-6639) Class.newInstance() can throw any checked exceptions and must be encapsulated with catching Exception
[ https://issues.apache.org/jira/browse/HBASE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6639: -- Resolution: Incomplete Assignee: (was: Hiroshi Ikeda) Status: Resolved (was: Patch Available) Class.newInstance() can throw any checked exceptions and must be encapsulated with catching Exception - Key: HBASE-6639 URL: https://issues.apache.org/jira/browse/HBASE-6639 Project: HBase Issue Type: Bug Affects Versions: 0.94.1 Reporter: Hiroshi Ikeda Priority: Minor Attachments: HBASE-6639-V2.patch, HBASE-6639-V3.patch, HBASE-6639.patch There are some logics to call Class.newInstance() without catching Exception, for example, in the method CoprocessorHost.loadInstance(). Class.newInstance() is declared to throw InstantiationException and IllegalAccessException but indeed the method can throw any checked exceptions without declaration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-6289) ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires.
[ https://issues.apache.org/jira/browse/HBASE-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6289: -- Resolution: Cannot Reproduce Assignee: (was: Maryann Xue) Status: Resolved (was: Patch Available) Reopen if reproducible with current release code. ROOT region doesn't get re-assigned in ServerShutdownHandler if the RS is still working but only the RS's ZK node expires. -- Key: HBASE-6289 URL: https://issues.apache.org/jira/browse/HBASE-6289 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.6, 0.94.0 Reporter: Maryann Xue Priority: Critical Attachments: HBASE-6289-v2.patch, HBASE-6289-v2.patch, HBASE-6289.patch The ROOT RS has a network problem and its ZK node expires first, which kicks off the ServerShutdownHandler. It calls verifyAndAssignRoot() to try to re-assign ROOT. At that time, the RS is actually still working and passes the verifyRootRegionLocation() check, so the ROOT region is skipped from re-assignment. {code} private void verifyAndAssignRoot() throws InterruptedException, IOException, KeeperException { long timeout = this.server.getConfiguration(). getLong("hbase.catalog.verification.timeout", 1000); if (!this.server.getCatalogTracker().verifyRootRegionLocation(timeout)) { this.services.getAssignmentManager().assignRoot(); } } {code} After a few moments, this RS encounters a DFS write problem and decides to abort. 
The RS then soon gets restarted from commandline, and constantly report: {code} 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,627 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,628 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 2012-06-27 23:13:08,630 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation
[ https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-5790: -- Status: Open (was: Patch Available) Cancelling stale patch. We're up to minimum necessary ZK now I'd say. Revisit? Or close. ZKUtil deleteRecursively should be a recoverable operation -- Key: HBASE-5790 URL: https://issues.apache.org/jira/browse/HBASE-5790 Project: HBase Issue Type: Improvement Reporter: Jesse Yates Assignee: Jesse Yates Labels: zookeeper Attachments: java_HBASE-5790-v1.patch, java_HBASE-5790.patch As of 3.4.3 Zookeeper now has full, multi-operation transaction. This means we can wholesale delete chunks of the zk tree and ensure that we don't have any pesky recursive delete issues where we delete the children of a node, but then a child joins before deletion of the parent. Even without transactions, this should be the behavior, but it is possible to make it much cleaner now that we have this new feature in zk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
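The race the description worries about — a child node appearing between deleting a node's children and deleting the node itself — requires deletes to happen child-first and, ideally, atomically via the multi-operation transactions ZooKeeper gained in 3.4. A minimal sketch of the child-first ordering, using an in-memory TreeMap as a stand-in for a live ZooKeeper tree (the class and method names are illustrative, not ZKUtil's actual API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch of the ordering behind a recursive znode delete: children must be
// deleted before their parents. A real implementation would issue these as
// ZooKeeper delete ops — and, per the issue, could batch them into a single
// multi() transaction so that a child created mid-delete fails the whole
// batch atomically instead of leaving an orphan.
public class RecursiveDeleteSketch {
    // Returns the subtree rooted at `root` in deepest-first (safe deletion) order.
    static List<String> deletionOrder(NavigableMap<String, String> tree, String root) {
        List<String> order = new ArrayList<>();
        // A child's path always extends its parent's path, so descending
        // lexicographic order yields children before parents.
        for (String path : tree.descendingKeySet()) {
            if (path.equals(root) || path.startsWith(root + "/")) {
                order.add(path);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> zk = new TreeMap<>();
        zk.put("/hbase", "");
        zk.put("/hbase/splitlog", "");
        zk.put("/hbase/splitlog/task1", "");
        System.out.println(deletionOrder(zk, "/hbase/splitlog"));
        // [/hbase/splitlog/task1, /hbase/splitlog]
    }
}
```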
[jira] [Commented] (HBASE-13471) Deadlock closing a region
[ https://issues.apache.org/jira/browse/HBASE-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504047#comment-14504047 ] Rajesh Nishtala commented on HBASE-13471: - In fairness I think there are two bugs here. (1) the client has a row / region mismatch under some circumstances that are yet TBD and (2) when that occurs there's a possible infinite loop. This addresses the latter by propagating up the wrong region information to the client. With this fix in, we can hopefully find the cause of (1) with the extra debugging information that results from the fix for (2). Deadlock closing a region - Key: HBASE-13471 URL: https://issues.apache.org/jira/browse/HBASE-13471 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Rajesh Nishtala Attachments: HBASE-13471.patch {code} Thread 4139 (regionserver/hbase412.example.com/10.158.6.53:60020-splits-1429003183537): State: WAITING Blocked count: 131 Waited count: 228 Waiting on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@50714dc3 Stack: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1371) org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1325) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:352) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:252) 
org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:509) org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-8489) Fix HBASE-8482 on trunk
[ https://issues.apache.org/jira/browse/HBASE-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-8489: -- Resolution: Incomplete Status: Resolved (was: Patch Available) Fix HBASE-8482 on trunk --- Key: HBASE-8489 URL: https://issues.apache.org/jira/browse/HBASE-8489 Project: HBase Issue Type: Bug Reporter: Nicolas Liochon Attachments: 8482.v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4462) Properly treating SocketTimeoutException
[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4462: -- Resolution: Incomplete Assignee: (was: ramkrishna.s.vasudevan) Status: Resolved (was: Patch Available) Properly treating SocketTimeoutException Key: HBASE-4462 URL: https://issues.apache.org/jira/browse/HBASE-4462 Project: HBase Issue Type: Improvement Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Attachments: HBASE-4462_0.90.x.patch, unittest_that_shows_us_retrying_sockettimeout.txt SocketTimeoutException is currently treated like any IOE inside of HCM.getRegionServerWithRetries and I think this is a problem. This method should only do retries in cases where we are pretty sure the operation will complete, but with STE we already waited for (by default) 60 seconds and nothing happened. I found this while debugging Douglas Campbell's problem on the mailing list where it seemed like he was using the same scanner from multiple threads, but actually it was just the same client doing retries while the first run didn't even finish yet (that's another problem). You could see the first scanner, then up to two other handlers waiting for it to finish in order to run (because of the synchronization on RegionScanner). So what should we do? We could treat STE as a DoNotRetryException and let the client deal with it, or we could retry only once. There's also the option of having a different behavior for get/put/icv/scan, the issue with operations that modify a cell is that you don't know if the operation completed or not (same when a RS dies hard after completing let's say a Put but just before returning to the client). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
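One of the options the description floats — treating SocketTimeoutException as non-retriable (the caller already waited the full timeout) while other IOExceptions keep retrying — can be sketched like this. The names are illustrative; this is not the actual HCM.getRegionServerWithRetries code.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Sketch: rethrow SocketTimeoutException immediately (we already waited the
// full hbase.rpc.timeout), but retry other transient IOExceptions.
public class RetrySketch {
    static <T> T callWithRetries(Callable<T> op, int maxAttempts) throws Exception {
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (SocketTimeoutException ste) {
                throw ste; // do not retry: the operation may well have completed server-side
            } catch (IOException ioe) {
                last = ioe; // transient failure: try again
            }
        }
        if (last == null) throw new IOException("no attempts made");
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] attempts = {0};
        // A flaky call that fails twice with a plain IOException, then succeeds.
        Object ok = callWithRetries(() -> {
            attempts[0]++;
            if (attempts[0] < 3) throw new IOException("flaky");
            return "done";
        }, 5);
        System.out.println(ok + " after " + attempts[0] + " attempts"); // done after 3 attempts
    }
}
```

Note this sketch sidesteps the harder question raised at the end of the description: for mutations, a timeout leaves the client unsure whether the write actually landed.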
[jira] [Updated] (HBASE-3577) enables Thrift client to get the Region location
[ https://issues.apache.org/jira/browse/HBASE-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-3577: -- Resolution: Not A Problem Status: Resolved (was: Patch Available) enables Thrift client to get the Region location Key: HBASE-3577 URL: https://issues.apache.org/jira/browse/HBASE-3577 Project: HBase Issue Type: Improvement Components: Thrift Reporter: Kazuki Ohta Attachments: HBASE3577-1.patch, HBASE3577-2.patch The current thrift interface has the getTableRegions() interface like below. {code}
list<TRegionInfo> getTableRegions(
  /** table name */
  1:Text tableName) throws (1:IOError io)
{code} {code}
struct TRegionInfo {
  1:Text startKey,
  2:Text endKey,
  3:i64 id,
  4:Text name,
  5:byte version
}
{code} But the method doesn't have the region location information (where the region is located). I want to add the Thrift interfaces like below in HTable.java. {code}
public Map<HRegionInfo, HServerAddress> getRegionsInfo() throws IOException
{code} {code}
public HRegionLocation getRegionLocation(final String row)
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Summary: Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting of hbase.rpc.timeout (was: Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting for ) Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Lawlor updated HBASE-13514: Summary: Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout (was: Fix test failures in TestScannerHeartbeatMessages caused by a too restrictive setting of hbase.rpc.timeout) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13481) Master should respect master (old) DNS/bind related configurations
[ https://issues.apache.org/jira/browse/HBASE-13481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503667#comment-14503667 ] Enis Soztutar commented on HBASE-13481: --- yes. A test is failing since this went in org.apache.hadoop.hbase.regionserver.TestRegionServerHostname.testRegionServerHostname Sorry my b. v2 patch passed hadoopqa, but I committed v3 without waiting for another, because I was in a hurry to spin 1.0.1 RC. Anyway, Ted's addendum fixed the test already. Master should respect master (old) DNS/bind related configurations -- Key: HBASE-13481 URL: https://issues.apache.org/jira/browse/HBASE-13481 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.0.1, 1.1.0 Attachments: 13481-addendum.txt, hbase-13481_v1.patch, hbase-13481_v2.patch, hbase-13481_v3-branch-1.0.patch, hbase-13481_v3.patch This is a continuation of parent HBASE-13453. We should continue respecting the following parameters that 1.0.0 does not: {code} hbase.master.dns.interface hbase.master.dns.nameserver hbase.master.ipc.address {code} Credit goes to [~jerryhe] for pointing that out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13259) mmap() based BucketCache IOEngine
[ https://issues.apache.org/jira/browse/HBASE-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503950#comment-14503950 ] stack commented on HBASE-13259: --- I suggest we kick it out of 1.1 then. It should be finished with a definitive story before it gets committed IMO. mmap() based BucketCache IOEngine - Key: HBASE-13259 URL: https://issues.apache.org/jira/browse/HBASE-13259 Project: HBase Issue Type: New Feature Components: BlockCache Affects Versions: 0.98.10 Reporter: Zee Chen Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13259-v2.patch, HBASE-13259.patch, ioread-1.svg, mmap-0.98-v1.patch, mmap-1.svg, mmap-trunk-v1.patch Of the existing BucketCache IOEngines, FileIOEngine uses pread() to copy data from kernel space to user space. This is a good choice when the total working set size is much bigger than the available RAM and the latency is dominated by IO access. However, when the entire working set is small enough to fit in the RAM, using mmap() (and subsequent memcpy()) to move data from kernel space to user space is faster. I have run some short keyvalue get tests and the results indicate a reduction of 2%-7% of kernel CPU on my system, depending on the load. On the gets, the latency histograms from mmap() are identical to those from pread(), but peak throughput is close to 40% higher. This patch modifies ByteBufferArray to allow it to specify a backing file. Example for using this feature: set hbase.bucketcache.ioengine to mmap:/dev/shm/bucketcache.0 in hbase-site.xml. Attached perf-measured CPU usage breakdown as a flame graph. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
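The one-line configuration example from the description, written out as an hbase-site.xml fragment. The mmap: path is the one given in the issue; in practice it should point at a tmpfs or other memory-backed file sized to hold the cache.

```xml
<!-- hbase-site.xml: select the mmap()-backed BucketCache IOEngine -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>mmap:/dev/shm/bucketcache.0</value>
</property>
```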
[jira] [Created] (HBASE-13517) Publish a client artifact with shaded dependencies
Elliott Clark created HBASE-13517: - Summary: Publish a client artifact with shaded dependencies Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Reporter: Elliott Clark -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4916) LoadTest MR Job
[ https://issues.apache.org/jira/browse/HBASE-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4916: -- Resolution: Incomplete Assignee: (was: Karthik Ranganathan) Status: Resolved (was: Patch Available) LoadTest MR Job --- Key: HBASE-4916 URL: https://issues.apache.org/jira/browse/HBASE-4916 Project: HBase Issue Type: Sub-task Components: Client, regionserver Reporter: Nicolas Spiegelberg Attachments: ASF.LICENSE.NOT.GRANTED--HBASE-4916.D741.1.patch Add a script to start a streaming map-reduce job where each map tasks runs an instance of the load tester for a partition of the key-space. Ensure that the load tester takes a parameter indicating the start key for write operations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-6480) If callQueueSize exceed maxQueueSize, all call will be rejected, do not reject priorityCall
[ https://issues.apache.org/jira/browse/HBASE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6480: -- Resolution: Not A Problem Status: Resolved (was: Patch Available) If callQueueSize exceed maxQueueSize, all call will be rejected, do not reject priorityCall Key: HBASE-6480 URL: https://issues.apache.org/jira/browse/HBASE-6480 Project: HBase Issue Type: Improvement Reporter: binlijin Attachments: HBASE-6480-94.patch, HBASE-6480-trunk.patch Currently, if the callQueueSize exceeds maxQueueSize, all calls are rejected. Should we let priority calls pass through? Current: {code}
if ((callSize + callQueueSize.get()) > maxQueueSize) {
  Call callTooBig = xxx
  return;
}
if (priorityCallQueue != null && getQosLevel(param) > highPriorityLevel) {
  priorityCallQueue.put(call);
  updateCallQueueLenMetrics(priorityCallQueue);
} else {
  callQueue.put(call); // queue the call; maybe blocked here
  updateCallQueueLenMetrics(callQueue);
}
{code} Should we change it to: {code}
if (priorityCallQueue != null && getQosLevel(param) > highPriorityLevel) {
  priorityCallQueue.put(call);
  updateCallQueueLenMetrics(priorityCallQueue);
} else {
  if ((callSize + callQueueSize.get()) > maxQueueSize) {
    Call callTooBig = xxx
    return;
  }
  callQueue.put(call); // queue the call; maybe blocked here
  updateCallQueueLenMetrics(callQueue);
}
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13071) Hbase Streaming Scan Feature
[ https://issues.apache.org/jira/browse/HBASE-13071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503897#comment-14503897 ] Hadoop QA commented on HBASE-13071: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726649/HBASE-13071_trunk_rebase_1.0.patch against master branch at commit 702aea5b38ed6ad0942b0c59c3accca476b46873. ATTACHMENT ID: 12726649 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 16 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.1 2.5.2 2.6.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1902 checkstyle errors (more than the master's current 1898 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13744//testReport/ Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13744//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13744//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13744//console This message is automatically generated. Hbase Streaming Scan Feature Key: HBASE-13071 URL: https://issues.apache.org/jira/browse/HBASE-13071 Project: HBase Issue Type: New Feature Reporter: Eshcar Hillel Attachments: 99.eshcar.png, HBASE-13071_98_1.patch, HBASE-13071_trunk_1.patch, HBASE-13071_trunk_10.patch, HBASE-13071_trunk_2.patch, HBASE-13071_trunk_3.patch, HBASE-13071_trunk_4.patch, HBASE-13071_trunk_5.patch, HBASE-13071_trunk_6.patch, HBASE-13071_trunk_7.patch, HBASE-13071_trunk_8.patch, HBASE-13071_trunk_9.patch, HBASE-13071_trunk_rebase_1.0.patch, HBaseStreamingScanDesign.pdf, HbaseStreamingScanEvaluation.pdf, HbaseStreamingScanEvaluationwithMultipleClients.pdf, gc.delay.png, gc.eshcar.png, gc.png, hits.delay.png, hits.eshcar.png, hits.png, latency.delay.png, latency.png, network.png A scan operation iterates over all rows of a table or a subrange of the table. The synchronous nature in which the data is served at the client side hinders the speed at which the application traverses the data: it increases the overall processing time, and may cause a great variance in the times the application waits for the next piece of data. The scanner next() method at the client side invokes an RPC to the regionserver and then stores the results in a cache. The application can specify how many rows will be transmitted per RPC; by default this is set to 100 rows. The cache can be considered as a producer-consumer queue, where the hbase client pushes the data to the queue and the application consumes it. 
Currently this queue is synchronous, i.e., blocking. More specifically, when the application consumed all the data from the cache --- so the cache is empty --- the hbase client retrieves additional data from the server and re-fills the cache with new data. During this time the application is blocked. Under the assumption that the application processing time can be balanced by the time it takes to retrieve the data, an asynchronous approach can reduce the time the application is waiting for data. We attach a design document. We also have a patch that is based on a private branch, and some evaluation results of this code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
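The asynchronous cache proposed above can be sketched as a bounded blocking queue with a background prefetcher, so the application's next() only blocks when it outruns the fetcher. This is a self-contained toy under assumed names: integers stand in for scan Results, and nothing here is the patch's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy model of a streaming scan: a fetcher thread plays the role of the HBase
// client issuing scanner RPCs and refilling the cache, while the main thread
// plays the application draining it batch by batch.
public class PrefetchScanSketch {
    static int scanAll(int totalBatches, int batchSize) throws InterruptedException {
        // Bounded cache: at most two batches are buffered ahead of the consumer.
        BlockingQueue<List<Integer>> cache = new ArrayBlockingQueue<>(2);

        Thread fetcher = new Thread(() -> {
            try {
                for (int b = 0; b < totalBatches; b++) {
                    List<Integer> batch = new ArrayList<>(batchSize);
                    for (int r = 0; r < batchSize; r++) batch.add(b * batchSize + r);
                    cache.put(batch); // blocks only if the application lags far behind
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.start();

        int consumed = 0;
        for (int b = 0; b < totalBatches; b++) {
            consumed += cache.take().size(); // the application-side next() path
        }
        fetcher.join();
        return consumed;
    }

    public static void main(String[] args) throws InterruptedException {
        // 3 RPC batches of 100 rows each — 100 mirrors the default caching value.
        System.out.println(scanAll(3, 100)); // 300
    }
}
```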
[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14503932#comment-14503932 ] Hudson commented on HBASE-13514: FAILURE: Integrated in HBase-1.2 #9 (See [https://builds.apache.org/job/HBase-1.2/9/]) HBASE-13514 Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout (Jonathan Lawlor) (tedyu: rev cac134c14af9df7d4219bd77abf817a84c975499) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-13519) Support coupled compactions with secondary index
tristartom created HBASE-13519: -- Summary: Support coupled compactions with secondary index Key: HBASE-13519 URL: https://issues.apache.org/jira/browse/HBASE-13519 Project: HBase Issue Type: New Feature Reporter: tristartom Hi, DELI (DEferred Lightweight Indexing) is our research prototype from Syracuse University, in collaboration with Georgia Tech and IBM Research. In DELI, we propose that when supporting a secondary index on HBase, the index-to-base-table sync-up should be coupled with compaction. The benefit is that online Puts stay append-only and write performance is preserved. The code of DELI is shared on GitHub: https://github.com/tristartom/nosql-indexing Details can be found in the following research paper published in CCGrid 2015: http://tristartom.github.io/docs/ccgrid15.pdf We are grateful to the HBase community, and any comments/suggestions are appreciated. Yuzhe -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13471) Deadlock closing a region
[ https://issues.apache.org/jira/browse/HBASE-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504036#comment-14504036 ] Rajesh Nishtala commented on HBASE-13471: - The fix is up at https://reviews.facebook.net/D37437. Looks like there's a possible infinite loop that can occur in doMiniBatchMutation with the readLock held causing the doClose() to never be able to grab its lock. Deadlock closing a region - Key: HBASE-13471 URL: https://issues.apache.org/jira/browse/HBASE-13471 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Rajesh Nishtala Attachments: HBASE-13471.patch {code} Thread 4139 (regionserver/hbase412.example.com/10.158.6.53:60020-splits-1429003183537): State: WAITING Blocked count: 131 Waited count: 228 Waiting on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@50714dc3 Stack: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1371) org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1325) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:352) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:252) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:509) org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13514) Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout
[ https://issues.apache.org/jira/browse/HBASE-13514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504038#comment-14504038 ] Hudson commented on HBASE-13514: SUCCESS: Integrated in HBase-TRUNK #6394 (See [https://builds.apache.org/job/HBase-TRUNK/6394/]) HBASE-13514 Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout (Jonathan Lawlor) (tedyu: rev eb82b8b3098d6a9ac62aa50189f9d4b289f38472) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestScannerHeartbeatMessages.java Fix test failures in TestScannerHeartbeatMessages caused by incorrect setting of hbase.rpc.timeout -- Key: HBASE-13514 URL: https://issues.apache.org/jira/browse/HBASE-13514 Project: HBase Issue Type: Sub-task Affects Versions: 1.1.0, 1.2.0 Reporter: Jonathan Lawlor Assignee: Jonathan Lawlor Priority: Minor Fix For: 2.0.0, 1.1.0, 1.2.0 Attachments: HBASE-13514-branch-1.1.patch, HBASE-13514-branch-1.patch, HBASE-13514.patch The test inside TestScannerHeartbeatMessages is failing because the configured value of hbase.rpc.timeout cannot be less than 2 seconds in branch-1 and branch-1.1 but the test expects that it can be set to 0.5 seconds. This is because of the field MIN_RPC_TIMEOUT in {{RpcRetryingCaller}} which exists in branch-1 and branch-1.1 but is no longer in master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4635) Remove dependency of java for rpm/deb packaging
[ https://issues.apache.org/jira/browse/HBASE-4635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4635: -- Resolution: Not A Problem Assignee: (was: Eric Yang) Status: Resolved (was: Patch Available) Remove dependency of java for rpm/deb packaging --- Key: HBASE-4635 URL: https://issues.apache.org/jira/browse/HBASE-4635 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.92.0 Environment: Java, Ubuntu, RHEL Reporter: Eric Yang Attachments: HBASE-4635-1.patch, HBASE-4635.patch Comment from HBASE-3606: Eric, it looks like hbase rpm spec file sets dependency on jdk. Can we remove the jdk dependency ? As everyone will not be installing jdk through rpm. There are multiple ways to install Java on Linux. It would be better to remove Java dependency declaration for packaging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4523) dfs.support.append config should be present in the hadoop configs, we should remove them from hbase so the user is not confused when they see the config in 2 places
[ https://issues.apache.org/jira/browse/HBASE-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4523: -- Resolution: Not A Problem Assignee: (was: Eric Yang) Hadoop Flags: (was: Reviewed) Status: Resolved (was: Patch Available) dfs.support.append config should be present in the hadoop configs, we should remove them from hbase so the user is not confused when they see the config in 2 places Key: HBASE-4523 URL: https://issues.apache.org/jira/browse/HBASE-4523 Project: HBase Issue Type: Bug Affects Versions: 0.90.4, 0.92.0 Reporter: Arpit Gupta Attachments: HBASE-4523.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4337) Update HBase directory structure layout to be aligned with Hadoop
[ https://issues.apache.org/jira/browse/HBASE-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4337: -- Resolution: Not A Problem Assignee: (was: Eric Yang) Release Note: (was: Added binary only profile for building binary only tar ball.) Status: Resolved (was: Patch Available) Update HBase directory structure layout to be aligned with Hadoop - Key: HBASE-4337 URL: https://issues.apache.org/jira/browse/HBASE-4337 Project: HBase Issue Type: Bug Components: build Affects Versions: 0.92.0 Reporter: Eric Yang Attachments: HBASE-4337-1.patch, HBASE-4337-2.patch, HBASE-4337-3.patch, HBASE-4337-4.patch, HBASE-4337-5.patch, HBASE-4337-6.patch, HBASE-4337.patch, hbase-4337-7.patch In HADOOP-6255, a proposal was made for common directory layout for Hadoop ecosystem. This jira is to track the necessary work for making HBase directory structure aligned with Hadoop for better integration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-4415) Add configuration script for setup HBase (hbase-setup-conf.sh)
[ https://issues.apache.org/jira/browse/HBASE-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-4415: -- Resolution: Later Assignee: (was: Eric Yang) Status: Resolved (was: Patch Available) Add configuration script for setup HBase (hbase-setup-conf.sh) -- Key: HBASE-4415 URL: https://issues.apache.org/jira/browse/HBASE-4415 Project: HBase Issue Type: New Feature Components: scripts Affects Versions: 0.90.4, 0.92.0 Environment: Java 6, Linux Reporter: Eric Yang Attachments: HBASE-4415-1.patch, HBASE-4415-2.patch, HBASE-4415-3.patch, HBASE-4415-4.patch, HBASE-4415-5.patch, HBASE-4415-6.patch, HBASE-4415-7.patch, HBASE-4415-8.patch, HBASE-4415-9.patch, HBASE-4415.patch The goal of this jira is to provide an installation script for configuring the HBase environment and configuration, using the same *-setup-conf.sh pattern as other Hadoop-related projects. For HBase, the usage of the script looks like this:
{noformat}
usage: ./hbase-setup-conf.sh parameters
Optional parameters:
  --hadoop-conf=/etc/hadoop                Set Hadoop configuration directory location
  --hadoop-home=/usr                       Set Hadoop directory location
  --hadoop-namenode=localhost              Set Hadoop namenode hostname
  --hadoop-replication=3                   Set HDFS replication
  --hbase-home=/usr                        Set HBase directory location
  --hbase-conf=/etc/hbase                  Set HBase configuration directory location
  --hbase-log=/var/log/hbase               Set HBase log directory location
  --hbase-pid=/var/run/hbase               Set HBase pid directory location
  --hbase-user=hbase                       Set HBase user
  --java-home=/usr/java/default            Set JAVA_HOME directory location
  --kerberos-realm=KERBEROS.EXAMPLE.COM    Set Kerberos realm
  --kerberos-principal-id=_HOST            Set Kerberos principal ID
  --keytab-dir=/etc/security/keytabs       Set keytab directory
  --regionservers=localhost                Set regionservers hostnames
  --zookeeper-home=/usr                    Set ZooKeeper directory location
  --zookeeper-quorum=localhost             Set ZooKeeper Quorum
  --zookeeper-snapshot=/var/lib/zookeeper  Set ZooKeeper snapshot location
{noformat} -- 
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13515) Handle FileNotFoundException in region replica replay for flush/compaction events
[ https://issues.apache.org/jira/browse/HBASE-13515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504088#comment-14504088 ] Devaraj Das commented on HBASE-13515: - LGTM Handle FileNotFoundException in region replica replay for flush/compaction events - Key: HBASE-13515 URL: https://issues.apache.org/jira/browse/HBASE-13515 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 2.0.0, 1.1.0 Attachments: hbase-13515_v1.patch I had this patch laying around that somehow dropped from my plate. We should skip replaying compaction / flush and region open event markers if the files (from flush or compaction) can no longer be found from the secondary. If we do not skip, the replay will be retried forever, effectively blocking the replication further. Bulk load already does this, we just need to do it for flush / compaction and region open events as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
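The skip-on-missing-files behavior described in HBASE-13515 can be sketched as follows. This is a hypothetical illustration of the control flow only; the class and method names are made up and are not the actual HBase replay code.

```java
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: when replaying a flush/compaction event on a secondary
// replica, a missing store file should cause the event to be skipped rather
// than retried forever (which would block replication).
public class ReplaySketch {
    // Hypothetical event carrying the store files it references.
    static class CompactionEvent {
        final List<String> files;
        CompactionEvent(List<String> files) { this.files = files; }
    }

    // Stand-in for a real file-system existence check.
    static boolean fileExists(String name) {
        return !name.contains("missing");
    }

    /** Returns true if the event was replayed, false if it was skipped. */
    static boolean replay(CompactionEvent event) {
        try {
            for (String f : event.files) {
                if (!fileExists(f)) {
                    throw new FileNotFoundException(f);
                }
            }
            // ... apply the event to the secondary region here ...
            return true;
        } catch (FileNotFoundException e) {
            // Skip instead of rethrowing: rethrowing would make the caller
            // retry the same event forever, blocking further replication.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(replay(new CompactionEvent(Arrays.asList("hfile-1"))));
        System.out.println(replay(new CompactionEvent(Arrays.asList("hfile-missing"))));
    }
}
```

As the description notes, bulk load already behaves this way; the patch extends the same treatment to flush, compaction, and region open events.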
[jira] [Assigned] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virag Kothari reassigned HBASE-13517: - Assignee: Virag Kothari Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Virag Kothari Guava's moved on. Hadoop has not. Jackson moves whenever it feels like it. Protobuf moves with breaking point changes. Shading all of the time would break people who require the transitive dependencies for MR or other things, so let's provide an artifact with our dependencies shaded. Then users can have the choice to use the shaded version or the non-shaded version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504328#comment-14504328 ] Josh Elser commented on HBASE-13520: I thought about this for a little bit. I'm undecided on whether or not it's a good idea to avoid returning null. On one hand, we made the conscious decision which states the underlying cell's tags should never be accessed again by this object. This implies that it would be an error if the caller tries to access this array when it is null (leads me to think something like {{assert null != this.tags}} could be added). On the other hand, we might avoid a future bug if we fail gracefully to an empty byte array. I couldn't make up my mind if one was better than the other, so I didn't make a change. I'm happy to make a change if there are those who are more strongly opinionated than me :) NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520-v1.patch, HBASE-13520.patch Found via running {{IntegrationTestIngestWithVisibilityLabels}} with Kerberos enabled. 
{noformat} 2015-04-20 18:54:36,712 ERROR [B.defaultRpcServer.handler=17,queue=2,port=16020] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException at org.apache.hadoop.hbase.TagRewriteCell.getTagsLength(TagRewriteCell.java:157) at org.apache.hadoop.hbase.TagRewriteCell.heapSize(TagRewriteCell.java:186) at org.apache.hadoop.hbase.CellUtil.estimatedHeapSizeOf(CellUtil.java:568) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.heapSizeChange(DefaultMemStore.java:1024) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:259) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:567) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:541) at org.apache.hadoop.hbase.regionserver.HStore.upsert(HStore.java:2154) at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7127) at org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:504) at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2020) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31967) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2106) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} HBASE-11870 tried to be tricky when only the tags of a {{Cell}} need to be altered in the write-pipeline by creating a {{TagRewriteCell}} which avoided copying all components of the original {{Cell}}. In an attempt to help free the tags on the old cell that we wouldn't be referencing anymore, {{TagRewriteCell}} nulls out the original {{byte[] tags}}. 
This causes a problem in the implementation of {{heapSize()}}, as it calls {{getTagsLength()}} on the original {{Cell}} instead of on {{this}}. Because the tags on the passed-in {{Cell}} (which was also a {{TagRewriteCell}}) were nulled out in the constructor, this results in an NPE because the byte array is null. I believe this isn't observed in normal, unsecured deployments because there is only one RegionObserver/Coprocessor loaded that gets invoked via {{postMutationBeforeWAL}}. When there is only one RegionObserver, the TagRewriteCell isn't passed another TagRewriteCell, but instead a cell from the wire/protobuf, which means the optimization isn't performed. When a TagRewriteCell passes through two (or more) observers (so a new TagRewriteCell is created and the old TagRewriteCell's tags array is nulled), this enables the NPE described above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
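The two-observer failure shape described above can be sketched in a few lines. This is a minimal, hypothetical reconstruction: the class and method names are illustrative, not the real HBase {{TagRewriteCell}} code.

```java
// A wrapping cell nulls out its delegate's tags, so any method that consults
// the delegate instead of `this` hits the nulled array once two wraps occur.
public class TagWrapSketch {
    interface SimpleCell {
        byte[] getTagsArray();
        int getTagsLength();
    }

    static class BaseCell implements SimpleCell {
        byte[] tags;
        BaseCell(byte[] tags) { this.tags = tags; }
        public byte[] getTagsArray() { return tags; }
        public int getTagsLength() { return tags.length; }
    }

    static class RewriteCell implements SimpleCell {
        private final SimpleCell cell;
        private byte[] tags;
        RewriteCell(SimpleCell cell, byte[] newTags) {
            this.cell = cell;
            this.tags = newTags;
            // "Free" the old tags when wrapping another rewrite cell,
            // mirroring the optimization the issue describes.
            if (cell instanceof RewriteCell) {
                ((RewriteCell) cell).tags = null;
            }
        }
        public byte[] getTagsArray() { return tags; }
        public int getTagsLength() { return tags.length; }

        // Buggy: asks the delegate, whose tags a later wrap may have nulled.
        int heapSizeBuggy() { return 16 + cell.getTagsLength(); }
        // Fixed: asks this object for its own tags length.
        int heapSizeFixed() { return 16 + this.getTagsLength(); }
    }

    public static void main(String[] args) {
        BaseCell base = new BaseCell(new byte[4]);
        RewriteCell first = new RewriteCell(base, new byte[8]);
        System.out.println(first.heapSizeBuggy()); // one wrap: no NPE
        RewriteCell second = new RewriteCell(first, new byte[2]); // nulls first.tags
        try {
            second.heapSizeBuggy(); // delegate's tags are now null
        } catch (NullPointerException expected) {
            System.out.println("NPE, as described");
        }
        System.out.println(second.heapSizeFixed()); // uses its own tags: fine
    }
}
```

With a single wrap the buggy method happens to work, which matches the observation that only multi-observer pipelines trigger the NPE.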
[jira] [Commented] (HBASE-13501) Deprecate/Remove getComparator() in HRegionInfo.
[ https://issues.apache.org/jira/browse/HBASE-13501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504327#comment-14504327 ] ramkrishna.s.vasudevan commented on HBASE-13501: When we talk about removing getComparator in HRegionInfo, which is marked public, ideally HRegionInfo should not have been public in the first place. The only place where we expose it is in Admin.java {code} /** * Close a region. For expert-admins Runs close on the regionserver. The master will not be * informed of the close. * * @param sn * @param hri * @throws IOException */ void closeRegion(final ServerName sn, final HRegionInfo hri) throws IOException; {code} Here we really don't need an HRegionInfo; it could always have been created from a TableName. I would say we could deprecate/remove these methods so that HRegionInfo can move to LimitedPrivate, so that at least CPs can use it and it is not a direct client-facing interface. Thoughts? Deprecate/Remove getComparator() in HRegionInfo. Key: HBASE-13501 URL: https://issues.apache.org/jira/browse/HBASE-13501 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13471) Deadlock closing a region
[ https://issues.apache.org/jira/browse/HBASE-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504347#comment-14504347 ] Lars Hofhansl commented on HBASE-13471: --- +1 on patch. Deadlock closing a region - Key: HBASE-13471 URL: https://issues.apache.org/jira/browse/HBASE-13471 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Rajesh Nishtala Attachments: HBASE-13471.patch {code} Thread 4139 (regionserver/hbase412.example.com/10.158.6.53:60020-splits-1429003183537): State: WAITING Blocked count: 131 Waited count: 228 Waiting on java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@50714dc3 Stack: sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1371) org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1325) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:352) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:252) org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:509) org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:84) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13471) Deadlock closing a region
[ https://issues.apache.org/jira/browse/HBASE-13471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-13471: -- Affects Version/s: 1.1.0 2.0.0 Deadlock closing a region - Key: HBASE-13471 URL: https://issues.apache.org/jira/browse/HBASE-13471 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Elliott Clark Assignee: Rajesh Nishtala Attachments: HBASE-13471.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-13520: --- Status: Patch Available (was: Open) NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504316#comment-14504316 ] Anoop Sam John commented on HBASE-13520: Thanks for the find and fix, Josh. My bad.. I missed the null check in this place (added in another place, I guess). +1. Nit on the test case: need to add the SmallTests category. NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504308#comment-14504308 ] Anoop Sam John commented on HBASE-13375: bq. what's the difference between system and super users? Super users are those that come from the xml configuration, whereas the system user is the user who started the server process. All of these are considered super users of HBase. Make sense? But we can make the name simply getSuperUsers() and just add a doc note on which users it includes. Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.0 Attachments: HBASE-13375-v0.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v2.patch HBASE-13351 annotates Master RPCs so that RegionServer RPCs are treated with a higher priority compared to user RPCs (and they are handled by a separate set of handlers, etc.). It may be good to stretch this to users too - hbase superuser (configured via hbase.superuser) gets higher priority over other users in the RPC handling. That way the superuser can always perform administrative operations on the cluster even if all the normal priority handlers are occupied (for example, we had a situation where all the master's handlers were tied up with many simultaneous createTable RPC calls from multiple users and the master wasn't able to perform any operations initiated by the admin). (Discussed this some with [~enis] and [~elserj]). Does this make sense to others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
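The superuser/system-user distinction discussed in the comment above can be sketched like this. It is a hedged illustration of the idea, not HBase's actual RPC scheduler API; the class name and QOS constants are made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: treat both the configured superusers and the
// process-starting system user as "super", and give their calls a higher
// RPC priority so admin operations are never starved by normal handlers.
public class PrioritySketch {
    static final int NORMAL_QOS = 0;
    static final int ADMIN_QOS = 200;

    private final Set<String> superUsers;

    PrioritySketch(String systemUser, String... configuredSuperUsers) {
        this.superUsers = new HashSet<>(Arrays.asList(configuredSuperUsers));
        // The system user counts as a superuser alongside the xml-configured ones.
        this.superUsers.add(systemUser);
    }

    int getPriority(String caller) {
        return superUsers.contains(caller) ? ADMIN_QOS : NORMAL_QOS;
    }

    public static void main(String[] args) {
        PrioritySketch p = new PrioritySketch("hbase", "admin");
        System.out.println(p.getPriority("admin")); // elevated
        System.out.println(p.getPriority("alice")); // normal
    }
}
```

A single getSuperUsers()-style collection covering both sources, as suggested in the comment, keeps the priority check to one set lookup per call.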
[jira] [Commented] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504309#comment-14504309 ] Mikhail Antonov commented on HBASE-13375: - Right. Will update the patch later today, incorporating the feedback. Going to make these collection static fields of User class with lazy loading from conf (as that was the idea in other jira - makes sense to me). Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.0 Attachments: HBASE-13375-v0.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504325#comment-14504325 ] ramkrishna.s.vasudevan commented on HBASE-13520: {code} @Override public byte[] getTagsArray() { return this.tags; } {code} Just asking: for any caller calling getTagsArray() when tags == null, is it better to return an EMPTY_BYTE_ARRAY? +1 on patch. NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520-v1.patch, HBASE-13520.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11758) Meta region location should be cached
[ https://issues.apache.org/jira/browse/HBASE-11758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virag Kothari updated HBASE-11758: -- Assignee: (was: Virag Kothari) Meta region location should be cached - Key: HBASE-11758 URL: https://issues.apache.org/jira/browse/HBASE-11758 Project: HBase Issue Type: Sub-task Reporter: Virag Kothari ZK-less assignment involves only the master updating meta, and this can be faster if we cache the meta location instead of reading the meta znode every time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
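The caching idea in HBASE-11758 reduces to keeping the last-known meta location and taking the slow ZooKeeper read only on a cache miss. A minimal sketch, with all names (including the server address) purely illustrative:

```java
// Hypothetical sketch: cache the meta region location; re-read the meta
// znode only on a miss or after an explicit invalidation (e.g. a failed RPC
// revealing that meta moved).
public class MetaCacheSketch {
    private volatile String cachedLocation;
    private int zkReads; // counts slow-path reads, for illustration only

    // Stand-in for reading the meta znode from ZooKeeper.
    private String readFromZooKeeper() {
        zkReads++;
        return "server-1:16020";
    }

    String getMetaLocation() {
        String loc = cachedLocation;
        if (loc == null) {
            loc = readFromZooKeeper();
            cachedLocation = loc;
        }
        return loc;
    }

    // Called when a meta move is detected.
    void invalidate() { cachedLocation = null; }

    int zkReadCount() { return zkReads; }

    public static void main(String[] args) {
        MetaCacheSketch cache = new MetaCacheSketch();
        cache.getMetaLocation();
        cache.getMetaLocation(); // served from cache, no second ZK read
        cache.invalidate();
        cache.getMetaLocation(); // re-reads after invalidation
        System.out.println(cache.zkReadCount());
    }
}
```

Since only the master updates meta under ZK-less assignment, invalidation points are few, which is what makes the cache worthwhile.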
[jira] [Updated] (HBASE-11289) Speedup balance
[ https://issues.apache.org/jira/browse/HBASE-11289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virag Kothari updated HBASE-11289: -- Assignee: (was: Virag Kothari) Speedup balance --- Key: HBASE-11289 URL: https://issues.apache.org/jira/browse/HBASE-11289 Project: HBase Issue Type: Sub-task Reporter: Francis Liu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504354#comment-14504354 ] Elliott Clark commented on HBASE-13517: --- Here's a patch that adds in hbase-shaded, hbase-shaded-client, and hbase-shaded-server. When using the shaded versions there is a trade-off: you can't use HBaseTestingUtil because of all of the jsp, jersey, and servlet classloading. If someone has a fix for this I'd be all ears. It's just beyond me. Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Affects Versions: 2.0.0, 1.1.0 Reporter: Elliott Clark Assignee: Elliott Clark Fix For: 2.0.0, 1.1.0 Attachments: HBASE-13517.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13501) Deprecate/Remove getComparator() in HRegionInfo.
[ https://issues.apache.org/jira/browse/HBASE-13501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504389#comment-14504389 ] stack commented on HBASE-13501: --- That seems like the right direction. HRI is all over the code base but I can't think of a good reason why it should be popping up in public client methods. Not sure though about calling a close region and passing a table. How are we for sure going to close the right region? On the other hand, HRI is 'wrong'. How is it used internally? To find the 'name' or region id? Deprecate/Remove getComparator() in HRegionInfo. Key: HBASE-13501 URL: https://issues.apache.org/jira/browse/HBASE-13501 Project: HBase Issue Type: Sub-task Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 2.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13078) IntegrationTestSendTraceRequests is a noop
[ https://issues.apache.org/jira/browse/HBASE-13078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504142#comment-14504142 ] Josh Elser commented on HBASE-13078: Friends, any chance we can get this committed in all but 0.98 for now? We can deal with whether or not this makes it into 0.98 after HBASE-12938 like Andrew stated. IntegrationTestSendTraceRequests is a noop -- Key: HBASE-13078 URL: https://issues.apache.org/jira/browse/HBASE-13078 Project: HBase Issue Type: Test Components: integration tests Reporter: Nick Dimiduk Assignee: Josh Elser Priority: Critical Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: HBASE-13078-0.98-removal.patch, HBASE-13078-0.98-v1.patch, HBASE-13078-v1.patch, HBASE-13078.patch While pair-debugging with [~jeffreyz] on HBASE-13077, we noticed that IntegrationTestSendTraceRequests doesn't actually assert anything. This test should be converted to use a mini cluster, setup a POJOSpanReceiver, and then verify the spans collected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HBASE-13520: --- Attachment: HBASE-13520.patch Grossly simple patch for what was a rather convoluted bug to track down. Applies cleanly to master, branch-1, branch-1.1, and branch-1.0. NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520.patch Found via running {{IntegrationTestIngestWithVisibilityLabels}} with Kerberos enabled. {noformat} 2015-04-20 18:54:36,712 ERROR [B.defaultRpcServer.handler=17,queue=2,port=16020] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException at org.apache.hadoop.hbase.TagRewriteCell.getTagsLength(TagRewriteCell.java:157) at org.apache.hadoop.hbase.TagRewriteCell.heapSize(TagRewriteCell.java:186) at org.apache.hadoop.hbase.CellUtil.estimatedHeapSizeOf(CellUtil.java:568) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.heapSizeChange(DefaultMemStore.java:1024) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:259) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:567) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:541) at org.apache.hadoop.hbase.regionserver.HStore.upsert(HStore.java:2154) at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7127) at org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:504) at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2020) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31967) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2106) at 
org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} HBASE-11870 tried to be tricky when only the tags of a {{Cell}} need to be altered in the write-pipeline by creating a {{TagRewriteCell}} which avoided copying all components of the original {{Cell}}. In an attempt to help free the tags on the old cell that we wouldn't be referencing anymore, {{TagRewriteCell}} nulls out the original {{byte[] tags}}. This causes a problem in the implementation of {{heapSize()}}, as it calls {{getTagsLength()}} on the original {{Cell}} instead of on {{this}}. Because the tags on the passed-in {{Cell}} (which was also a {{TagRewriteCell}}) were null'ed out in the constructor, this results in an NPE because the byte array is null. I believe this isn't observed in normal, unsecured deployments because there is only one RegionObserver/Coprocessor loaded that gets invoked via {{postMutationBeforeWAL}}. When there is only one RegionObserver, the TagRewriteCell isn't passed another TagRewriteCell, but instead a cell from the wire/protobuf. This means that the optimization isn't performed. When we have two (or more) observers that a TagRewriteCell passes through (and a new TagRewriteCell is created and the old TagRewriteCell's tags array is nulled), this triggers the NPE described above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
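The wrapper-cell bug described above can be sketched in a few lines. This is a hypothetical, stripped-down model (SimpleCell, RewriteCell, and the byte counts are illustrative, not the real HBase classes): nulling the wrapped cell's tags is harmless as long as the wrapped cell is an ordinary cell, but once the wrapped cell is itself a rewrite wrapper, sizing via the wrapped cell dereferences the nulled array.

```java
// Hypothetical model of the TagRewriteCell NPE (not the real HBase classes).
class SimpleCell {
    byte[] tags;
    SimpleCell(byte[] tags) { this.tags = tags; }
    // Ordinary cells tolerate null tags.
    int getTagsLength() { return tags == null ? 0 : tags.length; }
    // Toy heap-size accounting: fixed overhead plus tag bytes.
    long heapSize() { return 16 + getTagsLength(); }
}

class RewriteCell extends SimpleCell {
    final SimpleCell wrapped;
    RewriteCell(SimpleCell cell, byte[] newTags) {
        super(newTags);
        this.wrapped = cell;
        if (cell instanceof RewriteCell) {
            cell.tags = null; // "free" the old wrapper's tags, as described above
        }
    }
    @Override
    int getTagsLength() { return tags.length; } // no null check, as in the buggy version
    @Override
    long heapSize() {
        // BUG: sizes the *wrapped* cell's tags instead of this cell's own tags,
        // so a doubly-wrapped cell hits the nulled array.
        return 16 + wrapped.getTagsLength();
    }
    long heapSizeFixed() { return 16 + this.getTagsLength(); } // fix: read own field
}
```

With one level of wrapping heapSize() still returns a value; with two levels it throws, matching the "two or more observers" condition in the report.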
[jira] [Updated] (HBASE-13469) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-13469: --- Attachment: HBASE-13469.v1-branch-1.1.patch [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1 Key: HBASE-13469 URL: https://issues.apache.org/jira/browse/HBASE-13469 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-13469.v1-branch-1.1.patch In branch-1, I think we want proc v2 to be configurable, so that if any non-recoverable issue is found, at least there is a workaround. We already have the handlers and code laying around. It will be just introducing the config to enable / disable. We can even make it dynamically configurable via the new framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13375) Provide HBase superuser higher priority over other users in the RPC handling
[ https://issues.apache.org/jira/browse/HBASE-13375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504304#comment-14504304 ] Anoop Sam John commented on HBASE-13375: IMO it can be done now, as using VisibilityUtil in core server areas like QoS looks a bit strange. Yes, as it is becoming used by many areas of code, we can now improve the method name and signature. Provide HBase superuser higher priority over other users in the RPC handling Key: HBASE-13375 URL: https://issues.apache.org/jira/browse/HBASE-13375 Project: HBase Issue Type: Improvement Components: rpc Reporter: Devaraj Das Assignee: Mikhail Antonov Fix For: 1.1.0 Attachments: HBASE-13375-v0.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v1.patch, HBASE-13375-v2.patch HBASE-13351 annotates Master RPCs so that RegionServer RPCs are treated with a higher priority compared to user RPCs (and they are handled by a separate set of handlers, etc.). It may be good to stretch this to users too - hbase superuser (configured via hbase.superuser) gets higher priority over other users in the RPC handling. That way the superuser can always perform administrative operations on the cluster even if all the normal priority handlers are occupied (for example, we had a situation where all the master's handlers were tied up with many simultaneous createTable RPC calls from multiple users and the master wasn't able to perform any operations initiated by the admin). (Discussed this some with [~enis] and [~elserj]). Does this make sense to others? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
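The idea in the description above can be sketched as a priority-assignment step at RPC dispatch time. This is an illustrative sketch, not HBase's actual QoS code: the class name, `ADMIN_QOS` constant, and method signature are assumptions; the real implementation hangs off the annotation-driven priority function mentioned in HBASE-13351.

```java
import java.util.Set;

// Hypothetical sketch: bump RPC priority for configured superusers so admin
// calls are served even when all normal-priority handlers are occupied.
class PriorityAssigner {
    static final int NORMAL_QOS = 0;
    static final int ADMIN_QOS = 200; // above user traffic; value is illustrative

    private final Set<String> superusers; // e.g. from hbase.superuser

    PriorityAssigner(Set<String> superusers) { this.superusers = superusers; }

    int getPriority(String requestUser, int annotatedPriority) {
        if (superusers.contains(requestUser)) {
            // Never lower a priority that a method annotation already raised.
            return Math.max(annotatedPriority, ADMIN_QOS);
        }
        return annotatedPriority;
    }
}
```

A superuser request then lands in the high-priority handler pool, while ordinary users keep whatever priority the method annotation assigned.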
[jira] [Updated] (HBASE-13469) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1
[ https://issues.apache.org/jira/browse/HBASE-13469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-13469: --- Attachment: (was: HBASE-13469.v1-branch-1.1.patch) [branch-1.1] Procedure V2 - Make procedure v2 configurable in branch-1.1 Key: HBASE-13469 URL: https://issues.apache.org/jira/browse/HBASE-13469 Project: HBase Issue Type: Sub-task Components: master Reporter: Enis Soztutar Assignee: Stephen Yuan Jiang Fix For: 1.1.0 Attachments: HBASE-13469.v1-branch-1.1.patch In branch-1, I think we want proc v2 to be configurable, so that if any non-recoverable issue is found, at least there is a workaround. We already have the handlers and code laying around. It will be just introducing the config to enable / disable. We can even make it dynamically configurable via the new framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504350#comment-14504350 ] Lars Hofhansl commented on HBASE-13082: --- Correct. No more locking other than to fix the current version of the access data structure at the beginning of the scan, and StoreScanner would indeed be single threaded (which it is 99% of the time already :) ). That would be a bigger change. Coarsen StoreScanner locks to RegionScanner --- Key: HBASE-13082 URL: https://issues.apache.org/jira/browse/HBASE-13082 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, next.png, next.png Continuing where HBASE-10015 left off. We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner. In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches). There are some drawbacks too: * All calls to RegionScanner need to remain synchronized * Implementors of coprocessors need to be diligent in following the locking contract. For example Phoenix does not lock RegionScanner.nextRaw() as required in the documentation (not picking on Phoenix, this one is my fault as I told them it's OK) * possible starving of flushes and compactions with heavy read load. RegionScanner operations would keep getting the locks and the flushes/compactions would not be able to finalize the set of files. I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11290) Unlock RegionStates
[ https://issues.apache.org/jira/browse/HBASE-11290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504392#comment-14504392 ] Mikhail Antonov commented on HBASE-11290: - Yeah, I think that would just work. After all, the lock names are encoded region names, so even for a cluster with 1M regions with the edge case of bulk re-assignment the cache shouldn't take more than a few hundred MB of RAM; HMaster should be able to handle it. Alternatively, I guess we could just have a wrapper class around CHM<name, lock> with lock(), unlock() methods, but the current patch would work, too (maybe as a further improvement we can limit the size of the cache and make getLock() block if the cache is waiting for GC?) What's funny, the length of this thread (http://stackoverflow.com/questions/5639870/simple-java-name-based-locks/) suggests that simple named locks aren't that simple ;) Unlock RegionStates --- Key: HBASE-11290 URL: https://issues.apache.org/jira/browse/HBASE-11290 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Virag Kothari Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: HBASE-11290-0.98.patch, HBASE-11290-0.98_v2.patch, HBASE-11290.draft.patch Even though RegionStates is a highly accessed data structure in HMaster, most of its methods are synchronized, which limits concurrency. Even simply making some of the getters non-synchronized by using concurrent data structures has helped with region assignments. We can go as simple as this approach or create locks per region or a bucket lock per region bucket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
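The per-region named locking discussed in this thread can be sketched with a `ConcurrentHashMap` keyed by encoded region name. This is a minimal illustration, not the patch's LockCache: unlike the soft-reference cache in the patch, this simple version never evicts, so with ~1M regions the map would hold ~1M small lock objects (the "few hundred MB" worst case mentioned above).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

// Minimal named-lock sketch: one lock per encoded region name, created on
// demand. computeIfAbsent guarantees all callers for the same name share
// exactly one lock instance, even under concurrent first access.
class NamedLocks {
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    ReentrantLock getLock(String name) {
        return locks.computeIfAbsent(name, n -> new ReentrantLock());
    }

    int size() { return locks.size(); }
}
```

A caller would do `locks.getLock(encodedName).lock()` around a per-region state change; since the lock is a `ReentrantLock`, re-acquisition by the same thread is safe, which is exactly what the non-reentrant IdLock (mentioned later in the thread) cannot offer.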
[jira] [Updated] (HBASE-13420) RegionEnvironment.offerExecutionLatency Blocks Threads under Heavy Load
[ https://issues.apache.org/jira/browse/HBASE-13420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-13420: --- Attachment: 1M-0.98.13-SNAPSHOT.svg 1M-0.98.12.svg I did a quick comparison using LoadTestTool on an all-localhost HDFS+HBase cluster between 0.98.12 and an 0.98.13-SNAPSHOT which was .12 plus this patch. The server has 32 GB of RAM and 12 cores, Xeon E5-1660s running at 3.70GHz. All JVMs except the regionserver were given 1 GB heap. The regionserver ran with 8 GB. (No particular reason for that heap size, just reusing a setting from another test.) I installed the AccessController with hbase.security.authorization set to false so every region would run with a coprocessor (largely inert) so we'd exercise this change. CMS GC. LoadTestTool arguments: -read 100:10 -write 1:1024:10 -update 20:10 -num_keys 100 *0.98.12* ||read|| ||update|| ||write|| || ||keys_sec||latency_ms||keys_sec||latency_ms||keys_sec||latency_ms|| |19831.5102|0|786.3265306|5.285714286|3929.142857|2.102040816| *0.98.13-SNAPSHOT* ||read|| ||update|| ||write|| || ||keys_sec||latency_ms||keys_sec||latency_ms||keys_sec||latency_ms|| |19377.10204|0|783.755102|5.265306122|3924.530612|2.102040816| Profiles attached. They look almost identical with a quick glance. I will run a longer comparison tomorrow with 25M keys. RegionEnvironment.offerExecutionLatency Blocks Threads under Heavy Load --- Key: HBASE-13420 URL: https://issues.apache.org/jira/browse/HBASE-13420 Project: HBase Issue Type: Improvement Reporter: John Leach Assignee: Andrew Purtell Fix For: 2.0.0, 1.1.0, 0.98.13, 1.0.2 Attachments: 1M-0.98.12.svg, 1M-0.98.13-SNAPSHOT.svg, HBASE-13420.patch, HBASE-13420.txt, hbase-13420.tar.gz, offerExecutionLatency.tiff Original Estimate: 3h Remaining Estimate: 3h The ArrayBlockingQueue blocks threads for 20s during a performance run focusing on creating numerous small scans. 
I see a buffer size of (100): private final BlockingQueue<Long> coprocessorTimeNanos = new ArrayBlockingQueue<Long>(LATENCY_BUFFER_SIZE); and then I see a drain coming from MetricsRegionWrapperImpl with a 45 second executor (HRegionMetricsWrapperRunable): RegionCoprocessorHost#getCoprocessorExecutionStatistics() RegionCoprocessorHost#getExecutionLatenciesNanos() Am I missing something? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
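The latency buffer in the report can be sketched as below. This is an illustrative model, not the HBase source: `offer()` on an `ArrayBlockingQueue` never waits for space (it returns false when the buffer is full), but every call still serializes on the queue's single internal lock, which is where many handler threads pile up under heavy load with only a 100-slot buffer drained every 45 seconds.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of the coprocessor latency buffer described above (illustrative).
class LatencyBuffer {
    private static final int LATENCY_BUFFER_SIZE = 100;
    private final BlockingQueue<Long> coprocessorTimeNanos =
        new ArrayBlockingQueue<>(LATENCY_BUFFER_SIZE);

    boolean offerExecutionLatency(long nanos) {
        // Non-blocking: drops the sample (returns false) when the buffer is full,
        // but still contends on the queue's one internal lock.
        return coprocessorTimeNanos.offer(nanos);
    }

    int drain() {
        // Periodic drain, as the 45s metrics executor does in the report.
        int n = 0;
        while (coprocessorTimeNanos.poll() != null) n++;
        return n;
    }
}
```

Between drains, at most 100 samples survive and the rest are dropped; the contention on that single lock, rather than waiting for space, is the plausible source of the observed thread pile-ups.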
[jira] [Created] (HBASE-13520) NullPointerException in TagRewriteCell
Josh Elser created HBASE-13520: -- Summary: NullPointerException in TagRewriteCell Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Found via running {{IntegrationTestIngestWithVisibilityLabels}} with Kerberos enabled. {noformat} 2015-04-20 18:54:36,712 ERROR [B.defaultRpcServer.handler=17,queue=2,port=16020] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException at org.apache.hadoop.hbase.TagRewriteCell.getTagsLength(TagRewriteCell.java:157) at org.apache.hadoop.hbase.TagRewriteCell.heapSize(TagRewriteCell.java:186) at org.apache.hadoop.hbase.CellUtil.estimatedHeapSizeOf(CellUtil.java:568) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.heapSizeChange(DefaultMemStore.java:1024) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:259) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:567) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:541) at org.apache.hadoop.hbase.regionserver.HStore.upsert(HStore.java:2154) at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7127) at org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:504) at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2020) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31967) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2106) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} HBASE-11870 tried to be tricky when only the tags of a {{Cell}} need to be altered in the write-pipeline by 
creating a {{TagRewriteCell}} which avoided copying all components of the original {{Cell}}. In an attempt to help free the tags on the old cell that we wouldn't be referencing anymore, {{TagRewriteCell}} nulls out the original {{byte[] tags}}. This causes a problem in the implementation of {{heapSize()}}, as it calls {{getTagsLength()}} on the original {{Cell}} instead of on {{this}}. Because the tags on the passed-in {{Cell}} (which was also a {{TagRewriteCell}}) were null'ed out in the constructor, this results in an NPE because the byte array is null. I believe this isn't observed in normal, unsecured deployments because there is only one RegionObserver/Coprocessor loaded that gets invoked via {{postMutationBeforeWAL}}. When there is only one RegionObserver, the TagRewriteCell isn't passed another TagRewriteCell, but instead a cell from the wire/protobuf. This means that the optimization isn't performed. When we have two (or more) observers that a TagRewriteCell passes through (and a new TagRewriteCell is created and the old TagRewriteCell's tags array is nulled), this triggers the NPE described above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virag Kothari updated HBASE-13517: -- Assignee: (was: Virag Kothari) Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Reporter: Elliott Clark Guava's moved on. Hadoop has not. Jackson moves whenever it feels like it. Protobuf moves with breaking point changes. While shading all of the time would break people that require the transitive dependencies for MR or other things. Lets provide an artifact with our dependencies shaded. Then users can have the choice to use the shaded version or the non-shaded version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13482) Phoenix is failing to scan tables on secure environments.
[ https://issues.apache.org/jira/browse/HBASE-13482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504337#comment-14504337 ] Hudson commented on HBASE-13482: SUCCESS: Integrated in HBase-0.98 #955 (See [https://builds.apache.org/job/HBase-0.98/955/]) HBASE-13482. Phoenix is failing to scan tables on secure environments. (Alicia Shu) (apurtell: rev 50010ca31ed0587e3bf112a5789ec42185a9b939) * hbase-server/src/main/java/org/apache/hadoop/hbase/security/visibility/VisibilityController.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java * hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java Phoenix is failing to scan tables on secure environments. -- Key: HBASE-13482 URL: https://issues.apache.org/jira/browse/HBASE-13482 Project: HBase Issue Type: Bug Reporter: Alicia Ying Shu Assignee: Alicia Ying Shu Fix For: 1.1.0, 0.98.13 Attachments: Hbase-13482-v1.patch, Hbase-13482.patch When executed on secure environments, phoenix query is getting the following exception message: java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.security.AccessDeniedException: org.apache.hadoop.hbase.security.AccessDeniedException: User 'null' is not the scanner owner! 
org.apache.hadoop.hbase.security.access.AccessController.requireScannerOwner(AccessController.java:2048) org.apache.hadoop.hbase.security.access.AccessController.preScannerNext(AccessController.java:2022) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$53.call(RegionCoprocessorHost.java:1336) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1671) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1746) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperationWithResult(RegionCoprocessorHost.java:1720) org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preScannerNext(RegionCoprocessorHost.java:1331) org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2227) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11290) Unlock RegionStates
[ https://issues.apache.org/jira/browse/HBASE-11290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504339#comment-14504339 ] Virag Kothari commented on HBASE-11290: --- IdLock is not reentrant, so we won't be able to use that. So probably we have to go with the LockCache impl as in the patch. For eviction, currently values are wrapped using soft references, so garbage collection will be triggered on them on demand. It's not a great eviction policy and will create memory pressure for a large number of regions. As LockCache uses Guava's cache builder, it can support quite a few eviction schemes (https://code.google.com/p/guava-libraries/wiki/CachesExplained). I think they can be investigated and added later on in a new jira. Thoughts? Unlock RegionStates --- Key: HBASE-11290 URL: https://issues.apache.org/jira/browse/HBASE-11290 Project: HBase Issue Type: Sub-task Reporter: Francis Liu Assignee: Virag Kothari Fix For: 2.0.0, 1.1.0, 0.98.13 Attachments: HBASE-11290-0.98.patch, HBASE-11290-0.98_v2.patch, HBASE-11290.draft.patch Even though RegionStates is a highly accessed data structure in HMaster, most of its methods are synchronized, which limits concurrency. Even simply making some of the getters non-synchronized by using concurrent data structures has helped with region assignments. We can go as simple as this approach or create locks per region or a bucket lock per region bucket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13517) Publish a client artifact with shaded dependencies
[ https://issues.apache.org/jira/browse/HBASE-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504286#comment-14504286 ] stack commented on HBASE-13517: --- Guava first! Publish a client artifact with shaded dependencies -- Key: HBASE-13517 URL: https://issues.apache.org/jira/browse/HBASE-13517 Project: HBase Issue Type: Bug Reporter: Elliott Clark Guava's moved on. Hadoop has not. Jackson moves whenever it feels like it. Protobuf moves with breaking point changes. While shading all of the time would break people that require the transitive dependencies for MR or other things. Lets provide an artifact with our dependencies shaded. Then users can have the choice to use the shaded version or the non-shaded version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504318#comment-14504318 ] Josh Elser commented on HBASE-13520: bq. nit on test case, need add SmallTests Category Ack! Forgot about that. Thanks. Will post a new version shortly. bq. I missed the null check in this place. (added in another I place I guess) That was the confusing part. The heapSize implementation made it seem like it wasn't being handled correctly, but the fact that it only appeared with 1 RegionObserver was very misleading :) NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520.patch Found via running {{IntegrationTestIngestWithVisibilityLabels}} with Kerberos enabled. {noformat} 2015-04-20 18:54:36,712 ERROR [B.defaultRpcServer.handler=17,queue=2,port=16020] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException at org.apache.hadoop.hbase.TagRewriteCell.getTagsLength(TagRewriteCell.java:157) at org.apache.hadoop.hbase.TagRewriteCell.heapSize(TagRewriteCell.java:186) at org.apache.hadoop.hbase.CellUtil.estimatedHeapSizeOf(CellUtil.java:568) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.heapSizeChange(DefaultMemStore.java:1024) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:259) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:567) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:541) at org.apache.hadoop.hbase.regionserver.HStore.upsert(HStore.java:2154) at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7127) at org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:504) at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2020) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31967) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2106) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} HBASE-11870 tried to be tricky when only the tags of a {{Cell}} need to be altered in the write-pipeline by creating a {{TagRewriteCell}} which avoided copying all components of the original {{Cell}}. In an attempt to help free the tags on the old cell that we wouldn't be referencing anymore, {{TagRewriteCell}} nulls out the original {{byte[] tags}}. This causes a problem in the implementation of {{heapSize()}}, as it calls {{getTagsLength()}} on the original {{Cell}} instead of on {{this}}. Because the tags on the passed-in {{Cell}} (which was also a {{TagRewriteCell}}) were null'ed out in the constructor, this results in an NPE because the byte array is null. I believe this isn't observed in normal, unsecured deployments because there is only one RegionObserver/Coprocessor loaded that gets invoked via {{postMutationBeforeWAL}}. When there is only one RegionObserver, the TagRewriteCell isn't passed another TagRewriteCell, but instead a cell from the wire/protobuf. This means that the optimization isn't performed. When we have two (or more) observers that a TagRewriteCell passes through (and a new TagRewriteCell is created and the old TagRewriteCell's tags array is nulled), this triggers the NPE described above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-13520) NullPointerException in TagRewriteCell
[ https://issues.apache.org/jira/browse/HBASE-13520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HBASE-13520: --- Attachment: HBASE-13520-v1.patch NullPointerException in TagRewriteCell -- Key: HBASE-13520 URL: https://issues.apache.org/jira/browse/HBASE-13520 Project: HBase Issue Type: Bug Affects Versions: 1.0.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.0.0, 1.1.0, 1.0.2 Attachments: HBASE-13520-v1.patch, HBASE-13520.patch Found via running {{IntegrationTestIngestWithVisibilityLabels}} with Kerberos enabled. {noformat} 2015-04-20 18:54:36,712 ERROR [B.defaultRpcServer.handler=17,queue=2,port=16020] ipc.RpcServer: Unexpected throwable object java.lang.NullPointerException at org.apache.hadoop.hbase.TagRewriteCell.getTagsLength(TagRewriteCell.java:157) at org.apache.hadoop.hbase.TagRewriteCell.heapSize(TagRewriteCell.java:186) at org.apache.hadoop.hbase.CellUtil.estimatedHeapSizeOf(CellUtil.java:568) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.heapSizeChange(DefaultMemStore.java:1024) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.internalAdd(DefaultMemStore.java:259) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:567) at org.apache.hadoop.hbase.regionserver.DefaultMemStore.upsert(DefaultMemStore.java:541) at org.apache.hadoop.hbase.regionserver.HStore.upsert(HStore.java:2154) at org.apache.hadoop.hbase.regionserver.HRegion.increment(HRegion.java:7127) at org.apache.hadoop.hbase.regionserver.RSRpcServices.increment(RSRpcServices.java:504) at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2020) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31967) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2106) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) at 
org.apache.hadoop.hbase.ipc.RpcExecutor$2.run(RpcExecutor.java:107) at java.lang.Thread.run(Thread.java:745) {noformat} HBASE-11870 tried to be tricky when only the tags of a {{Cell}} need to be altered in the write-pipeline by creating a {{TagRewriteCell}} which avoided copying all components of the original {{Cell}}. In an attempt to help free the tags on the old cell that we wouldn't be referencing anymore, {{TagRewriteCell}} nulls out the original {{byte[] tags}}. This causes a problem in the implementation of {{heapSize()}}, as it calls {{getTagsLength()}} on the original {{Cell}} instead of on {{this}}. Because the tags on the passed-in {{Cell}} (which was also a {{TagRewriteCell}}) were null'ed out in the constructor, this results in an NPE because the byte array is null. I believe this isn't observed in normal, unsecured deployments because there is only one RegionObserver/Coprocessor loaded that gets invoked via {{postMutationBeforeWAL}}. When there is only one RegionObserver, the TagRewriteCell isn't passed another TagRewriteCell, but instead a cell from the wire/protobuf. This means that the optimization isn't performed. When we have two (or more) observers that a TagRewriteCell passes through (and a new TagRewriteCell is created and the old TagRewriteCell's tags array is nulled), this triggers the NPE described above. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13320) 'hbase.bucketcache.size' configuration value is not correct in hbase-default.xml
[ https://issues.apache.org/jira/browse/HBASE-13320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504319#comment-14504319 ] ramkrishna.s.vasudevan commented on HBASE-13320: bq. Looks like we explain it in the book but failed to do it in hbase-default HBASE-13281 did it but was not aware that the default was there in hbase-default. Anyway, the value is not right, seeing the code and its calculation. bq. If remove it, why not remove all to do w/ bucketcache since all but one value are unset (hbase.bucketcache.sizes description should list default values too?) +1 on doing this.
{code}
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value></value>
  <description>Where to store the contents of the bucketcache. One of: onheap, offheap, or file. If a file, set it to file:PATH_TO_FILE. See https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html for more information.</description>
</property>
<property>
  <name>hbase.bucketcache.combinedcache.enabled</name>
  <value>true</value>
  <description>Whether or not the bucketcache is used in league with the LRU on-heap block cache. In this mode, indices and blooms are kept in the LRU blockcache and the data blocks are kept in the bucketcache.</description>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>65536</value>
  <description>The size of the buckets for the bucketcache if you only use a single size. Defaults to the default blocksize, which is 64 * 1024.</description>
</property>
<property>
  <name>hbase.bucketcache.sizes</name>
  <value></value>
  <description>A comma-separated list of sizes for buckets for the bucketcache if you use multiple sizes. Should be a list of block sizes in order from smallest to largest. The sizes you use will depend on your data access patterns.</description>
</property>
{code}
Currently even hbase.bucketcache.sizes is not set. We can just describe the default value.
'hbase.bucketcache.size' configuration value is not correct in hbase-default.xml - Key: HBASE-13320 URL: https://issues.apache.org/jira/browse/HBASE-13320 Project: HBase Issue Type: Bug Components: hbase Affects Versions: 2.0.0 Reporter: Y. SREENIVASULU REDDY Assignee: Y. SREENIVASULU REDDY Fix For: 2.0.0 Attachments: HBASE-13320.patch, HBASE-v2-13320.patch In the hbase-default.xml file, 'hbase.bucketcache.size' is not correct: we either specify it as a float or in MBs, and the default value that is mentioned is never used.
{code}
<property>
  <name>hbase.bucketcache.size</name>
  <value>65536</value>
  <source>hbase-default.xml</source>
</property>
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-11286) BulkDisabler should use a bulk RPC call for opening regions (just like BulkAssigner)
[ https://issues.apache.org/jira/browse/HBASE-11286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Virag Kothari updated HBASE-11286: -- Assignee: (was: Virag Kothari) BulkDisabler should use a bulk RPC call for opening regions (just like BulkAssigner) Key: HBASE-11286 URL: https://issues.apache.org/jira/browse/HBASE-11286 Project: HBase Issue Type: Sub-task Reporter: Francis Liu -- This message was sent by Atlassian JIRA (v6.3.4#6332)