[jira] [Commented] (HBASE-6463) Support multiple memstore snapshots in order to support small/large flushes of cache.
[ https://issues.apache.org/jira/browse/HBASE-6463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480663#comment-13480663 ] Phabricator commented on HBASE-6463: Kannan has resigned from the revision [jira] [HBASE-6463] [89-fb] Multiple Snapshot Buffering. REVISION DETAIL https://reviews.facebook.net/D4389 To: mbautin, Liyin, Karthik, JIRA, nixon Cc: FBHBase Support multiple memstore snapshots in order to support small/large flushes of cache. -- Key: HBASE-6463 URL: https://issues.apache.org/jira/browse/HBASE-6463 Project: HBase Issue Type: Improvement Components: regionserver, util Affects Versions: 0.89-fb Reporter: Brian Nixon If cache is underutilized due to log size triggered flushes, should be able to buffer multiple snapshots in memory and flush all together into one file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows
[ https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480667#comment-13480667 ]
ramkrishna.s.vasudevan commented on HBASE-6942:
{code}
OperationStatus codes[] = region.batchMutate(putsWithLocks);
for (i = 0; i < codes.length; i++) {
  if (codes[i].getOperationStatusCode() != OperationStatusCode.SUCCESS) {
    return i;
  }
}
{code}
This will need a change anyway, but currently no code is calling it. All clients will call only batchMutate. The above problem lies in put(); in trunk it has been removed.
Endpoint implementation for bulk delete rows
Key: HBASE-6942
URL: https://issues.apache.org/jira/browse/HBASE-6942
Project: HBase
Issue Type: Improvement
Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Fix For: 0.94.3, 0.96.0
Attachments: HBASE-6942_DeleteTemplate.patch, HBASE-6942.patch, HBASE-6942_V2.patch, HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch, HBASE-6942_V6.patch
We can provide an endpoint implementation for doing a bulk deletion of rows (based on a scan) at the server side. This can reduce the time taken for such an operation, as right now the client needs to run a scan and then issue delete(s) using the row keys. It covers queries like "delete from table1 where...".
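For readers following the loop quoted in the comment above, here is a minimal, self-contained sketch of the same check: scan the per-operation result codes and report the index of the first one that did not succeed. `StatusCode`, `BatchStatusCheck`, and `firstFailure` are hypothetical stand-ins and not HBase classes, and returning -1 when every operation succeeded is an assumption of this sketch rather than something the thread specifies.

```java
import java.util.List;

// Stand-in for a per-operation result code; NOT the HBase OperationStatusCode class.
enum StatusCode { SUCCESS, FAILURE, NOT_RUN }

final class BatchStatusCheck {
    // Returns the index of the first operation that did not succeed,
    // or -1 (an assumption of this sketch) if all operations succeeded.
    static int firstFailure(List<StatusCode> codes) {
        for (int i = 0; i < codes.size(); i++) {
            if (codes.get(i) != StatusCode.SUCCESS) {
                return i;
            }
        }
        return -1;
    }
}
```

A caller would typically stop processing at the returned index and report that row, which mirrors the `return i;` in the quoted snippet.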
[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows
[ https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480668#comment-13480668 ]
ramkrishna.s.vasudevan commented on HBASE-6942:
@Anoop Nice work; it covers all the cases.
@Ted/Lars Also, I feel that this is an endpoint impl; it gives users an idea of what they can do. The rowBatchSize can be optional too, right? Maybe users can choose the rowBatchSize they want.
[jira] [Commented] (HBASE-6611) Forcing region state offline cause double assignment
[ https://issues.apache.org/jira/browse/HBASE-6611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480709#comment-13480709 ]
Hudson commented on HBASE-6611:
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #229 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/229/])
HBASE-6611 Forcing region state offline cause double assignment (Revision 1400358)
Result = FAILURE
jxiang :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/RegionTransition.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignCallable.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/GeneralBulkAssigner.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/OfflineCallback.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionState.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ResponseConverter.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/KeyLocker.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java
* /hbase/trunk/hbase-server/src/main/protobuf/Admin.proto
* /hbase/trunk/hbase-server/src/main/protobuf/ZooKeeper.proto
* /hbase/trunk/hbase-server/src/main/ruby/shell/commands/assign.rb
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestCloseRegionHandler.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java
Forcing region state offline cause double assignment
Key: HBASE-6611
URL: https://issues.apache.org/jira/browse/HBASE-6611
Project: HBase
Issue Type: Bug
Components: master
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Fix For: 0.96.0
Attachments: trunk-6611_v2.patch, trunk-6611_v5.patch
When assigning a region, the assignment manager forces the region state to Offline if it is not already. This could cause double assignment: for example, if the region is already assigned and in the Open state, you should not just change its state to Offline and assign it again. I think this could be the root cause of all double assignments IF the region state is reliable. After this loophole is closed, TestHBaseFsck should come up with a different way to create some assignment inconsistencies, for example, calling the region server to open a region directly.
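The issue description above argues that a region already in the Open state should not simply be flipped to Offline and assigned again. A minimal sketch of that kind of guard is shown below. Everything in it is hypothetical: `RegionStateSketch`, `AssignGuard`, and `mayForceOffline` are illustrative names, the state list is a simplification, and the choice of which states count as safe is an assumption of this sketch, not taken from the actual patch.

```java
// Hypothetical, simplified region states; the real master may track others.
enum RegionStateSketch { OFFLINE, PENDING_OPEN, OPENING, OPEN, CLOSED }

final class AssignGuard {
    // Returns true only when marking the region OFFLINE and assigning it
    // cannot collide with an existing or in-progress assignment.
    // Assumption of this sketch: only OFFLINE and CLOSED are safe.
    static boolean mayForceOffline(RegionStateSketch current) {
        return current == RegionStateSketch.OFFLINE
            || current == RegionStateSketch.CLOSED;
    }
}
```

With a check like this, a caller that finds the region in the Open state would skip the assignment instead of forcing the state, which is the double-assignment scenario the description warns about.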
[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows
[ https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480756#comment-13480756 ]
Anoop Sam John commented on HBASE-6942:
bq. For passing the delete type bytes. What I meant was to use a enum backed by bytes
Lars, we cannot make the signature of the endpoint like delete(Scan, DeleteType..). I also initially tried an enum backed by a byte, but when a customer calls the endpoint as delete(Scan, deleteType.getBytes()... ), I thought that would only cause confusion. :( We do not support an enum type as a parameter to an Endpoint.
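The "enum backed by bytes" idea discussed in the comment above could look roughly like the following sketch, in which the enum carries a one-byte code so that the endpoint signature only ever sees a byte. The name `DeleteTypeSketch`, the four constants, their byte values, and the `toByte`/`fromByte` helpers are all assumptions made for illustration; the thread does not show the actual definition.

```java
// Hypothetical delete-type enum; constants and byte codes are illustrative only.
enum DeleteTypeSketch {
    ROW((byte) 0), FAMILY((byte) 1), COLUMN((byte) 2), VERSION((byte) 3);

    private final byte code;

    DeleteTypeSketch(byte code) {
        this.code = code;
    }

    // Value the client would pass across the endpoint call.
    byte toByte() {
        return code;
    }

    // Server-side decoding; rejects codes that map to no constant.
    static DeleteTypeSketch fromByte(byte code) {
        for (DeleteTypeSketch t : values()) {
            if (t.code == code) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown delete type code: " + code);
    }
}
```

The trade-off raised in the comment still applies: the caller has to write something like `DeleteTypeSketch.COLUMN.toByte()` at the call site, which is the extra step described there as confusing.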
[jira] [Commented] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480758#comment-13480758 ]
Ted Yu commented on HBASE-6728:
{code}
+ * This test can fail from time to time and it is ok.
+ * It tests some race conditions that can happen
+ * occasionally, but not every time.
+ */
+public class TestSizeBasedThrottler {
{code}
I ran the test in trunk and didn't see a failure. I wonder whether the above comment should be kept for trunk.
[89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
Key: HBASE-6728
URL: https://issues.apache.org/jira/browse/HBASE-6728
Project: HBase
Issue Type: Bug
Reporter: Kannan Muthukkaruppan
Assignee: Michal Gregorczyk
Fix For: 0.96.0
Attachments: 5337-trunk.txt
The per connection responseQueue is an unbounded queue. The request handler threads today try to send the response inline, but if things start to back up, the response is sent via a per connection responder thread. This intermediate queue, because it has no bounds, can be another source of OOMs. [Have not looked at this issue in trunk. So it may or may not be applicable there.]
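To make the problem statement above concrete, here is a minimal sketch of one way to bound such a queue by total size: producers block while the accounted bytes exceed a threshold, and the responder releases bytes as responses are written. The class name `SizeThrottlerSketch`, its methods, and the blocking policy are assumptions of this sketch; the patch attached to the issue may differ.

```java
// Hypothetical sketch: caps the total bytes of responses waiting to be sent.
final class SizeThrottlerSketch {
    private final long threshold;  // bytes allowed before callers start to block
    private long currentSize = 0;  // bytes currently accounted for

    SizeThrottlerSketch(long threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
    }

    // Waits while the accounted size is above the threshold, then adds delta.
    // A single admission may therefore overshoot the threshold by one response.
    synchronized long increase(long delta) {
        while (currentSize > threshold) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while throttled", e);
            }
        }
        currentSize += delta;
        return currentSize;
    }

    // Releases delta bytes and wakes blocked callers once under the threshold.
    synchronized long decrease(long delta) {
        currentSize -= delta;
        if (currentSize <= threshold) {
            notifyAll();
        }
        return currentSize;
    }

    synchronized long getCurrentSize() {
        return currentSize;
    }
}
```

In this sketch a handler thread would call `increase(responseSize)` before queueing a response and the responder thread would call `decrease(responseSize)` after writing it. Because blocking depends on thread timing, tests of such a class can be sensitive to races, which is the concern raised about the test comment quoted above.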
[jira] [Updated] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6728: Attachment: 5337-trunk.txt. Patch for trunk, ported from fix in 0.89-fb.
[jira] [Updated] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6728: Fix Version/s: 0.96.0. Status: Patch Available (was: Open).
[jira] [Updated] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6728: Attachment: (was: 5337-trunk.txt)
[jira] [Updated] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6728: Attachment: 6728-trunk.txt
[jira] [Updated] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6728: Attachment: (was: 6728-trunk.txt)
[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows
[ https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480766#comment-13480766 ] Lars Hofhansl commented on HBASE-6942: Let's commit something very close to V6 soon.
[jira] [Commented] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480768#comment-13480768 ]
Hadoop QA commented on HBASE-6728:
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550141/6728-trunk.txt against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
-1 javadoc. The javadoc tool appears to have generated 82 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestSplitTransaction
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3104//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3104//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3104//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3104//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3104//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3104//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3104//console
This message is automatically generated.
[jira] [Updated] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6728: Attachment: (was: 6728-trunk.txt)
[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows
[ https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480769#comment-13480769 ]
Ted Yu commented on HBASE-6942:
This is an improvement which should go to trunk first, right? @Anoop: please provide a patch for trunk. Thanks
[jira] [Commented] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480778#comment-13480778 ]
Hadoop QA commented on HBASE-6728:
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550142/6728-trunk.txt against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
-1 javadoc. The javadoc tool appears to have generated 82 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3105//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3105//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3105//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3105//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3105//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3105//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3105//console
This message is automatically generated.
[jira] [Updated] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6728: Attachment: 6728-trunk.txt. Fixes some comments.
[jira] [Commented] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480795#comment-13480795 ]
Hadoop QA commented on HBASE-6728:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12550146/6728-trunk.txt against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 82 warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3106//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3106//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3106//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3106//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3106//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3106//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3106//console
This message is automatically generated.
[89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
Key: HBASE-6728 URL: https://issues.apache.org/jira/browse/HBASE-6728 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan Assignee: Michal Gregorczyk Fix For: 0.96.0 Attachments: 6728-trunk.txt
The per connection responseQueue is an unbounded queue. The request handler threads today try to send the response in line, but if things start to backup, the response is sent via a per connection responder thread. This intermediate queue, because it has no bounds, can be another source of OOMs. [Have not looked at this issue in trunk. So it may or may not be applicable there.]
[jira] [Commented] (HBASE-6728) [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded
[ https://issues.apache.org/jira/browse/HBASE-6728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480796#comment-13480796 ] Ted Yu commented on HBASE-6728: Will hold integration till Monday so that other people can take a look at the patch. [89-fb] prevent OOM possibility due to per connection responseQueue being unbounded Key: HBASE-6728 URL: https://issues.apache.org/jira/browse/HBASE-6728 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan Assignee: Michal Gregorczyk Fix For: 0.96.0 Attachments: 6728-trunk.txt The per-connection responseQueue is an unbounded queue. The request handler threads today try to send the response inline, but if things start to back up, the response is sent via a per-connection responder thread. This intermediate queue, because it has no bounds, can be another source of OOMs. [Have not looked at this issue in trunk, so it may or may not be applicable there.]
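For readers following the HBASE-6728 thread, here is a minimal sketch of what putting a bound on a per-connection response queue could look like. It is not taken from 6728-trunk.txt: the class name BoundedResponseQueue, the byte-based cap, and the reject-when-full policy are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical bounded per-connection response queue; not HBase's actual class. */
class BoundedResponseQueue {
    private final Queue<byte[]> responses = new ArrayDeque<>();
    private final long maxBytes;      // assumed cap on queued response bytes
    private long queuedBytes = 0;

    BoundedResponseQueue(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Handler side: refuses the response instead of letting the queue grow without bound. */
    synchronized boolean offer(byte[] response) {
        if (queuedBytes + response.length > maxBytes) {
            return false;             // caller decides: retry, block, or drop the connection
        }
        responses.add(response);
        queuedBytes += response.length;
        return true;
    }

    /** Responder side: removes the oldest queued response, or returns null if none. */
    synchronized byte[] poll() {
        byte[] response = responses.poll();
        if (response != null) {
            queuedBytes -= response.length;
        }
        return response;
    }

    synchronized long queuedBytes() {
        return queuedBytes;
    }
}
```

What a handler should do when offer returns false (wait, retry, or close the connection) is a policy question the sketch deliberately leaves open.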
[jira] [Updated] (HBASE-6727) [89-fb] allow HBaseServers's callqueue to be better configurable to avoid OOMs
[ https://issues.apache.org/jira/browse/HBASE-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6727: Attachment: 6727.txt Patch from 0.89-fb. There is one queue in the 0.89-fb branch: callQueue. In trunk, we need to handle three queues: priorityCallQueue, replicationQueue and callQueue. I think one callQueueThrottler should be used for the three queues. [89-fb] allow HBaseServers's callqueue to be better configurable to avoid OOMs Key: HBASE-6727 URL: https://issues.apache.org/jira/browse/HBASE-6727 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan Assignee: Adela Maznikar Fix For: 0.89-fb Attachments: 6727.txt The callQueue (where requests get queued up if all handlers are busy) is a LinkedBlockingQueue of size 100 * number_of_handlers. So, with say 300 handler threads, the call queue can have up to 30k entries queued up. If the requests are large enough, this can result in OOM or severe GC pauses. Ideally, we should allow this param to be separately configurable, independent of the number of handlers; perhaps an even better approach would be to specify a memory-size-based limit instead of a number-of-entries-based limit. [I have not looked at the trunk version for this issue, so it may or may not be relevant there.]
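To make the numbers and the proposal concrete, the following is a rough sketch of a memory-size-based throttle, alongside the entry-count arithmetic from the description. It is not the code in 6727.txt: the class CallQueueThrottle and its methods are hypothetical, and the rule that an oversized call is admitted when nothing else is queued is an assumption added so such a call cannot stall forever. A single instance could in principle be shared by several queues, in the spirit of the comment above.

```java
/** Hypothetical byte-budget throttle for call queues; a sketch, not the 0.89-fb patch. */
class CallQueueThrottle {
    private final long maxBytes;   // assumed configurable memory budget
    private long usedBytes = 0;

    CallQueueThrottle(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Entry-count limit as described in the issue: 100 * number_of_handlers. */
    static int entryLimit(int handlers) {
        return 100 * handlers;
    }

    /** Blocks the caller until the call fits in the budget (or nothing else is queued). */
    synchronized void acquire(long callBytes) throws InterruptedException {
        while (usedBytes > 0 && usedBytes + callBytes > maxBytes) {
            wait();
        }
        usedBytes += callBytes;
    }

    /** Non-blocking variant: returns false if the call does not fit right now. */
    synchronized boolean tryAcquire(long callBytes) {
        if (usedBytes > 0 && usedBytes + callBytes > maxBytes) {
            return false;
        }
        usedBytes += callBytes;
        return true;
    }

    /** Called once a handler has taken the call off the queue. */
    synchronized void release(long callBytes) {
        usedBytes -= callBytes;
        notifyAll();
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```

With 300 handlers, entryLimit(300) gives the 30k entries mentioned in the description; the byte budget caps memory regardless of how large individual requests are.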
[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows
[ https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480828#comment-13480828 ] Lars Hofhansl commented on HBASE-6942: No. It's an example endpoint; it absolutely does not have to go into trunk first. (Of course it should also go into trunk.) Endpoint implementation for bulk delete rows Key: HBASE-6942 URL: https://issues.apache.org/jira/browse/HBASE-6942 Project: HBase Issue Type: Improvement Components: Coprocessors, Performance Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 0.94.3, 0.96.0 Attachments: HBASE-6942_DeleteTemplate.patch, HBASE-6942.patch, HBASE-6942_V2.patch, HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch, HBASE-6942_V6.patch We can provide an endpoint implementation for doing a bulk deletion of rows (based on a scan) at the server side. This can reduce the time taken for such an operation, as right now it needs to scan to the client and issue delete(s) using rowkeys. Query like delete from table1 where...
[jira] [Commented] (HBASE-6611) Forcing region state offline cause double assignment
[ https://issues.apache.org/jira/browse/HBASE-6611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480852#comment-13480852 ] stack commented on HBASE-6611: Yah!!! Forcing region state offline cause double assignment Key: HBASE-6611 URL: https://issues.apache.org/jira/browse/HBASE-6611 Project: HBase Issue Type: Bug Components: master Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: trunk-6611_v2.patch, trunk-6611_v5.patch In assigning a region, the assignment manager forces the region state offline if it is not. This could cause double assignment; for example, if the region is already assigned and in the Open state, you should not just change its state to Offline and assign it again. I think this could be the root cause for all double assignments IF the region state is reliable. After this loophole is closed, TestHBaseFsck should come up with a different way to create some assignment inconsistencies, for example, calling the region server to open a region directly.
[jira] [Commented] (HBASE-6727) [89-fb] allow HBaseServers's callqueue to be better configurable to avoid OOMs
[ https://issues.apache.org/jira/browse/HBASE-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480856#comment-13480856 ] stack commented on HBASE-6727: Created HBASE-7023 to forward-port this facility, which is nicer than what we have in trunk. [89-fb] allow HBaseServers's callqueue to be better configurable to avoid OOMs Key: HBASE-6727 URL: https://issues.apache.org/jira/browse/HBASE-6727 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan Assignee: Adela Maznikar Fix For: 0.89-fb Attachments: 6727.txt The callQueue (where requests get queued up if all handlers are busy) is a LinkedBlockingQueue of size 100 * number_of_handlers. So, with say 300 handler threads, the call queue can have up to 30k entries queued up. If the requests are large enough, this can result in OOM or severe GC pauses. Ideally, we should allow this param to be separately configurable, independent of the number of handlers; perhaps an even better approach would be to specify a memory-size-based limit instead of a number-of-entries-based limit. [I have not looked at the trunk version for this issue, so it may or may not be relevant there.]
[jira] [Created] (HBASE-7023) Forward-port size-based HBaseServer callQueue throttle from 0.89fb branch
stack created HBASE-7023: Summary: Forward-port size-based HBaseServer callQueue throttle from 0.89fb branch Key: HBASE-7023 URL: https://issues.apache.org/jira/browse/HBASE-7023 Project: HBase Issue Type: Improvement Components: IPC/RPC Reporter: stack Forward-port the size-based throttle that is out in the 0.89fb branch. It's nicer than what we have in trunk, where we just count queue items.
[jira] [Updated] (HBASE-7023) Forward-port size-based HBaseServer callQueue throttle from 0.89fb branch
[ https://issues.apache.org/jira/browse/HBASE-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7023: Attachment: 6727-fb.txt Patch from 0.89-fb branch. This depends on HBASE-6728. Forward-port size-based HBaseServer callQueue throttle from 0.89fb branch Key: HBASE-7023 URL: https://issues.apache.org/jira/browse/HBASE-7023 Project: HBase Issue Type: Improvement Components: IPC/RPC Reporter: stack Fix For: 0.96.0 Attachments: 6727-fb.txt Forward-port the size-based throttle that is out in the 0.89fb branch. It's nicer than what we have in trunk, where we just count queue items.
[jira] [Updated] (HBASE-7023) Forward-port HBASE-6727 size-based HBaseServer callQueue throttle from 0.89fb branch
[ https://issues.apache.org/jira/browse/HBASE-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-7023: Fix Version/s: 0.96.0 Summary: Forward-port HBASE-6727 size-based HBaseServer callQueue throttle from 0.89fb branch (was: Forward-port size-based HBaseServer callQueue throttle from 0.89fb branch) Forward-port HBASE-6727 size-based HBaseServer callQueue throttle from 0.89fb branch Key: HBASE-7023 URL: https://issues.apache.org/jira/browse/HBASE-7023 Project: HBase Issue Type: Improvement Components: IPC/RPC Reporter: stack Fix For: 0.96.0 Attachments: 6727-fb.txt Forward-port the size-based throttle that is out in the 0.89fb branch. It's nicer than what we have in trunk, where we just count queue items.
[jira] [Commented] (HBASE-7018) Fix and Improve TableDescriptor caching for bulk assignment
[ https://issues.apache.org/jira/browse/HBASE-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480861#comment-13480861 ]
stack commented on HBASE-7018: Nice numbers G. Minor: FileNotFoundException is an IOException, so is the FNFE redundant? +1 on patch (after figuring the test fail above). Good stuff.
Fix and Improve TableDescriptor caching for bulk assignment
Key: HBASE-7018 URL: https://issues.apache.org/jira/browse/HBASE-7018 Project: HBase Issue Type: Bug Components: regionserver Reporter: Gregory Chanan Assignee: Gregory Chanan Fix For: 0.94.3, 0.96.0 Attachments: HBASE-7018-94.patch, HBASE-7018-94-v2.patch, HBASE-7018-trunk.patch
HBASE-6214 backported HBASE-5998 (Bulk assignment: regionserver optimization by using a temporary cache for table descriptors when receiving an open regions request), but it's buggy on 0.94 (0.96 appears correct):
{code}
HTableDescriptor htd = null;
if (htds == null) {
  htd = this.tableDescriptors.get(region.getTableName());
} else {
  htd = htds.get(region.getTableNameAsString());
  if (htd == null) {
    htd = this.tableDescriptors.get(region.getTableName());
    htds.put(region.getRegionNameAsString(), htd);
  }
}
{code}
i.e. we look up the map by tableName but write to it by regionName. Even fixing this, it looks like there are areas for improvement:
1) FSTableDescriptors already has a cache (though it goes to the NameNode each time through to check we have the latest copy). May as well combine these two caches; that might be a performance win as well, since we don't need to write to multiple caches.
2) FSTableDescriptors makes two RPCs to the NameNode when it encounters a new table. So the total number of RPCs necessary for a bulk assign is:
without caching: #regions + #tables
with caching: min(#regions, #tables) + #tables = #tables + #tables = 2 * #tables
We can make this only one RPC, yielding: #tables
Probably not a big deal for most users, but in a multi-tenant situation where the number of regions being bulk assigned approaches the number of tables being bulk assigned, this could be a nice performance win. Benchmarks coming.
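The key mismatch described in HBASE-7018 can be illustrated in isolation. The sketch below is not the attached patch and does not use HBase types: DescriptorCacheSketch, lookup, and the loader function are hypothetical stand-ins for the per-request map (htds) and the backing tableDescriptors lookup. The only point it makes is that the read and the write use the same table-name key, so the backing store is consulted once per distinct table.

```java
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for the per-request descriptor cache; not HBase code. */
class DescriptorCacheSketch {
    /**
     * Returns the descriptor for tableName, consulting the per-request cache first.
     * A null cache means "no caching", mirroring the htds == null branch in the issue.
     */
    static <D> D lookup(Map<String, D> requestCache, String tableName,
                        Function<String, D> loader) {
        if (requestCache == null) {
            return loader.apply(tableName);
        }
        D htd = requestCache.get(tableName);
        if (htd == null) {
            htd = loader.apply(tableName);
            requestCache.put(tableName, htd);   // same key as the get above
        }
        return htd;
    }
}
```

With a HashMap as the cache and five regions spread over two tables, the loader runs twice; with the regionName key from the buggy snippet, every lookup would miss and the loader would run once per region.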
[jira] [Assigned] (HBASE-7023) Forward-port HBASE-6727 size-based HBaseServer callQueue throttle from 0.89fb branch
[ https://issues.apache.org/jira/browse/HBASE-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-7023: Assignee: Ted Yu Forward-port HBASE-6727 size-based HBaseServer callQueue throttle from 0.89fb branch Key: HBASE-7023 URL: https://issues.apache.org/jira/browse/HBASE-7023 Project: HBase Issue Type: Improvement Components: IPC/RPC Reporter: stack Assignee: Ted Yu Fix For: 0.96.0 Attachments: 6727-fb.txt Forward-port the size-based throttle that is out in the 0.89fb branch. It's nicer than what we have in trunk, where we just count queue items.
[jira] [Updated] (HBASE-6977) Multithread processing ZK assignment events
[ https://issues.apache.org/jira/browse/HBASE-6977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6977: Status: Open (was: Patch Available) Multithread processing ZK assignment events Key: HBASE-6977 URL: https://issues.apache.org/jira/browse/HBASE-6977 Project: HBase Issue Type: Improvement Components: Region Assignment Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Attachments: trunk-6977_v1.patch Related to HBASE-6976 and HBASE-6611. ZK event processing is a bottleneck for assignments, since there is only one ZK event thread. If we can use multiple threads, it should be better. With multiple threads, the order of events could be messed up. However, if we always pass all events related to one region to the same worker thread, the order should be kept. We need to play with it and find out how much performance improvement we can get.
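The ordering argument in HBASE-6977 (all events for one region go to the same worker) can be sketched as follows. This is not trunk-6977_v1.patch: RegionEventDispatcher, its methods, and the choice of hashing the region name onto a fixed array of single-threaded executors are assumptions made for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical dispatcher that pins each region's events to one worker thread. */
class RegionEventDispatcher {
    private final ExecutorService[] workers;

    RegionEventDispatcher(int nWorkers) {
        workers = new ExecutorService[nWorkers];
        for (int i = 0; i < nWorkers; i++) {
            // One thread per worker, so events handed to a worker run in submission order.
            workers[i] = Executors.newSingleThreadExecutor();
        }
    }

    /** Same region name always maps to the same worker; floorMod handles negative hashes. */
    int workerIndexFor(String regionName) {
        return Math.floorMod(regionName.hashCode(), workers.length);
    }

    void submit(String regionName, Runnable event) {
        workers[workerIndexFor(regionName)].execute(event);
    }

    /** Stops accepting events and waits up to timeoutMillis per worker for queued ones. */
    boolean shutdownAndWait(long timeoutMillis) {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
        try {
            for (ExecutorService w : workers) {
                if (!w.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return false;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true;
    }
}
```

Events for different regions may still interleave across workers; only the per-region order is preserved, which is the property the issue description relies on.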
[jira] [Commented] (HBASE-6733) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
[ https://issues.apache.org/jira/browse/HBASE-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480896#comment-13480896 ] Lars Hofhansl commented on HBASE-6733: Looks good. I can commit this to 0.94 (either here or in a new porting jira). [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2] Key: HBASE-6733 URL: https://issues.apache.org/jira/browse/HBASE-6733 Project: HBase Issue Type: Bug Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.96.0 Attachments: 6733-1.patch, 6733-2.patch, 6733-3.patch, HBASE-6733-0.94.patch The failure is in TestReplication.queueFailover (fails due to unreplicated rows). I have come across two problems: 1. The sleepMultiplier is not properly reset when the currentPath is changed (in ReplicationSource.java). 2. ReplicationExecutor sometimes removes files to replicate from the queue too early, resulting in the corresponding edits going missing. Here the problem is that the log-file length that the replication executor finds is not the most up-to-date one, and hence it doesn't read anything from there; ultimately, when there is a log roll, the replication queue gets a new entry, and the executor drops the old entry out of the queue.