[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709870#comment-13709870 ] Lars Hofhansl commented on HBASE-8806: -- Going to commit later today. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, hbase-8806-0.94-microbenchmark-no-dupe-rows.txt, hbase-8806-0.94-microbenchmarks-dupe-rows.txt, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708525#comment-13708525 ] Lars Hofhansl commented on HBASE-8806: -- Any chance to redo the perf test just for the case with no dups, just to be sure? (Or post the test code, and I'll do it) I still have a slight preference for this patch (over HBASE-8877) for 0.94. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708607#comment-13708607 ] Hadoop QA commented on HBASE-8806: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12592344/hbase-8806-0.94-microbenchmark-no-dupe-rows.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6342//console This message is automatically generated. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, hbase-8806-0.94-microbenchmark-no-dupe-rows.txt, hbase-8806-0.94-microbenchmarks-dupe-rows.txt, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708587#comment-13708587 ] Hadoop QA commented on HBASE-8806: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12592342/hbase-8806-0.94-v6-microbenchmark.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6341//console This message is automatically generated. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, hbase-8806-0.94-v6-microbenchmark-no-dupe-rows.txt, hbase-8806-0.94-v6-microbenchmark.txt, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13708835#comment-13708835 ] Lars Hofhansl commented on HBASE-8806: -- Thanks [~davelatham]! I would like to commit this patch to 0.94. I think the perf impact is an acceptable trade off. We can undo this if/when we apply HBASE-8877 to 0.94 (also see my question about reference counting there). Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, hbase-8806-0.94-microbenchmark-no-dupe-rows.txt, hbase-8806-0.94-microbenchmarks-dupe-rows.txt, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709180#comment-13709180 ] Dave Latham commented on HBASE-8806: Sounds good to me. +1 to commit. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, hbase-8806-0.94-microbenchmark-no-dupe-rows.txt, hbase-8806-0.94-microbenchmarks-dupe-rows.txt, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709435#comment-13709435 ] Lars Hofhansl commented on HBASE-8806: -- Cool. Any objections? I am feeling strangely indecisive about this vs HBASE-8877 now (especially with ref counting). A consideration is that rather than having different implementations of reentrant locks between 0.94 and 0.95+ it might be better only introduce reentrant locks in 0.95+ and do the less intrusive fix in 0.94 (i.e. this patch). So, unless I hear objections I am going to commit v6 here to 0.94 only and then close this issue. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, hbase-8806-0.94-microbenchmark-no-dupe-rows.txt, hbase-8806-0.94-microbenchmarks-dupe-rows.txt, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709452#comment-13709452 ] Anoop Sam John commented on HBASE-8806: --- bq.A consideration is that rather than having different implementations of reentrant locks between 0.94 and 0.95+ it might be better only introduce reentrant locks in 0.95+ and do the less intrusive fix in 0.94 (i.e. this patch). +1 Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, hbase-8806-0.94-microbenchmark-no-dupe-rows.txt, hbase-8806-0.94-microbenchmarks-dupe-rows.txt, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707896#comment-13707896 ] Lars Hofhansl commented on HBASE-8806: -- [~davelatham] and [~churromorales], is my reading of the numbers correct? This cannot be committed if it slows down the normal case by almost 100%. Also, did you test this exact patch (with the slight interface change that avoid creating HashedBytes multiple times)? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707897#comment-13707897 ] Lars Hofhansl commented on HBASE-8806: -- Looking at the numbers again this looks like a measurement glitch. Only the case with no duplicates is slow, and there is no reason why it would be slower all the cases with duplicates. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707917#comment-13707917 ] Dave Latham commented on HBASE-8806: [~lhofhansl] Agreed, don't see a reason it should be much slower in that case. Also, you can see the stddev of the measurements is high there, so my guess is it's related to not so good benchmarking (not enough burn in time). However, please check out the work in HBASE-8877 as there's an alternate patch there for 0.94 as well. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13704355#comment-13704355 ] Lars Hofhansl commented on HBASE-8806: -- Looking forward to the perf data. I do not foresee a measurable performance impact in the normal case. I am pretty sure now that we ran into this at Salesforce in our testing, but didn't have time to investigate that thoroughly and just concluded that it had to do with the slave cluster running on VMs. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13705001#comment-13705001 ] rahul gidwani commented on HBASE-8806: -- Hi Lars, We at flurry have been running with this patch for the past month. I have not noticed any significant performance impact for the normal case. Although we have seen significant improvements with replication. We are no longer seeing our replication queues growing on the sink side. For some batches it would take upwards of 3 minutes to complete which would cause quite a backlog. With the patch applied this is no longer the case. I have attached the performance test results. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13705020#comment-13705020 ] Hadoop QA commented on HBASE-8806: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12591712/row_lock_perf_results.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6300//console This message is automatically generated. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13705171#comment-13705171 ] Lars Hofhansl commented on HBASE-8806: -- Hmm... So the normal case is impacted: Without patch: {quote} STATISTICS FOR DUPLICATE ROWKEY PERCENTAGE: 0.0 DescriptiveStatistics: n: 100 min: 30.0 max: 343.0 mean: 55.450001 std dev: 32.44213963724377 median: 50.0 skewness: 7.236118305713237 kurtosis: 63.33779590624729 {quote} With patch: {quote} STATISTICS FOR DUPLICATE ROWKEY PERCENTAGE: 0.0 DescriptiveStatistics: n: 100 min: 46.0 max: 435.0 mean: 99.840003 std dev: 58.07113714830262 median: 85.0 skewness: 3.5101957571545754 kurtosis: 15.495906287057384 {quote} So on average the requests now take 100ms vs 55ms before. Or am I misreading the results? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.98.0, 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch, row_lock_perf_results.txt If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13703824#comment-13703824 ] stack commented on HBASE-8806: -- I'd be +1 on this going into trunk until the reentrant stuff is done. Was there perf test done? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13703887#comment-13703887 ] rahul gidwani commented on HBASE-8806: -- Mr. Stack I will run a perf test. Here is the basic logic, I create a batchMutation of size 25k and measure performance for puts. Each pass takes in a percentage value of duplicate rowkeys for that batch [0, .5, .1, .15, .2, .25, .3, .35, .4, .5] For each percentage of duplicate rowkeys we run batchMutate 5 different times storing the result in a DescriptiveStatistic. I will do this for trunk (no patch applied) and trunk (patch applied) and post my results along with the source. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13701926#comment-13701926 ] Lars Hofhansl commented on HBASE-8806: -- Thanks [~davelatham]. How are we doing on this one? Obvious I'm +1 on v6. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700942#comment-13700942 ] Dave Latham commented on HBASE-8806: Filed HBASE-8877 to track the reentrant approach separately and uploaded a new patch there with some cleanup (and fixed heap size logic). Let's continue that discussion there. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699855#comment-13699855 ] Lars Hofhansl commented on HBASE-8806: -- btw. every now and then the testRedundantRowKeys fails (acquires 4 lock instead of 2), I assume this happens due to the breakup of the batches. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699859#comment-13699859 ] Hadoop QA commented on HBASE-8806: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590792/8806-0.94-v4.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6212//console This message is automatically generated. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699867#comment-13699867 ] Lars Hofhansl commented on HBASE-8806: -- Should we commit something like the 0.94 patch to trunk as well, for now; and then regroup and add re-entrant locks in a different jira? With HashedBytes passed directly to getLock(...) the performance impact should be negligible. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699872#comment-13699872 ] Hadoop QA commented on HBASE-8806: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590795/8806-0.94-v5.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6213//console This message is automatically generated. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699986#comment-13699986 ] Anoop Sam John commented on HBASE-8806: --- bq.Should we commit something like the 0.94 patch to trunk as well, for now; and then regroup and add re-entrant locks in a different jira? +1 for that.. Going through the latest patch on 94 nit if (providedLockId == null) { + rowsAlreadyLocked.add(currentRow); can be else if ?.. Even though no functional issues Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1360#comment-1360 ] Anoop Sam John commented on HBASE-8806: --- For the normal case there will be always a rowsAlreadyLocked.contains(currentRow) check for every row. One way which could favor this normal case is have this check after acquiredLockId = getLock(providedLockId, currentRow, shouldBlock); iff acquiredLockId == null Well there is no problem in calling getLock before the new contains check as shouldBlock will be false except for the 1st time. When shouldBlock is false it will immediately return with null. Also for the 1st time any way rowsAlreadyLocked will be empty too. In this, the case where the duplicate entries are there in batch, will have to do some extra ops in getLock. Just saying. what do you say Lars? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700019#comment-13700019 ] ramkrishna.s.vasudevan commented on HBASE-8806: --- Is it ok if we sort it out on the client side itself? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700110#comment-13700110 ] Dave Latham commented on HBASE-8806: {quote}The changes in pom.xml are not needed for this fix, right ?{quote} That's right, sorry about that. Just helps get m2eclipse to compile everything correctly. {quote}We need change all the CP hooks signatures? CPs will(should) be able to acquire locks and release?{quote} The CP changes required would be in RegionObserver: - {{preBatchMutate(...,MiniBatchOperationInProgressPairMutation, Integer miniBatchOp)}} becomes - {{preBatchMutate(...,MiniBatchOperationInProgressMutation miniBatchOp)}} - {{postBatchMutate(...,MiniBatchOperationInProgressPairMutation, Integer miniBatchOp)}} becomes - {{postBatchMutate(...,MiniBatchOperationInProgressMutation miniBatchOp)}} I don't have much experience with coprocessor usage, but it seems unlikely to me someone would require manipulating the set of row locks in the middle of a miniBatch mutation. Would it be prudent to poll the mailing list first? Re: patch v5 Looks good to me. Anoop's notion of favoring the case where the lock is new versus the lock is already owned is an interesting one. Two possibilities of code with two cases: Code A: patch v5 - first check rowsAlreadyLocked then getLock - Case 1 (lock is new): hash comparison in rowsAlreadyLocked likely returns !contains quickly. getLock likely also quickly finds no hash match in concurrent hashmap and inserts. - Case 2 (lock is already owned): hashCode match then bytewise equals comparison in rowsAlreadyLocked.contains. getLock not called Code B: first try getLock, then if unable to getLock check rowsAlreadyLocked - Case 1 (lock is new): getLock quickly finds no hash match in concurrent hashmap and inserts. - Case 2 (lock is already owned): getLock checks concurrent hashmap finds hashCode match then does bytewise equals comparison. then rowsAlreadyLocked also gets hashCode match then bytewise equals comparison So Code A has an extra hash check for Case 1 and Code B has an extra hash check and bytewise equal compare for Case 2. I'd favor current patch (Code A) as it's a smaller and constant time addition and I think the code is a bit more intuitive, but like the property in Code B that it has no impact on Case 1 compared to the existing 0.94 releases. I'd be content if Rahul's tests on the latest patch show imapct is small. {quote}Is it ok if we sort it out on the client side itself? {quote} Not for 0.94 as it would be changing the contract on existing clients which may suddenly fail to respect row locks if they aren't updated. Though it would be interesting to ask clients to sort for performance gains and then re-sort with a stable sort on the server. But if people are happy with the other approaches discussed, they sound good to me. {quote}Should we commit something like the 0.94 patch to trunk as well, for now; and then regroup and add re-entrant locks in a different jira?{quote} +1 so long as people are comfortable with performance. Let's give Rahul a chance to run his perf test on this patch if no one's in a hurry. I will try to update my reentrant locks patch for trunk. After posting it up I was considering introducing that lock groups idea too and using it for checkAndPut. May play with the code there unless someone beats me to it. Thanks to everyone for the attention on this issue! Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700197#comment-13700197 ] Lars Hofhansl commented on HBASE-8806: -- [~davelatham] Agreed that in the (usual) case of different rows the overhead is minimal (just checking precalculated hashes). I'd be surprised if anybody can show a measurable performance impact because of that. [~anoop.hbase] Yes, that should be else-if. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700208#comment-13700208 ] Anoop Sam John commented on HBASE-8806: --- Agree with Dave and Lars. I was just saying an option :).. Yes unless a performance test is done I also believe there wont be much of overhead.. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700210#comment-13700210 ] Hadoop QA commented on HBASE-8806: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590912/8806-0.94-v6.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6219//console This message is automatically generated. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13700226#comment-13700226 ] Hadoop QA commented on HBASE-8806: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590906/HBASE-8806-threadBasedRowLocks-v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplicationQueueFailoverCompressed Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6218//console This message is automatically generated. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: 8806-0.94-v4.txt, 8806-0.94-v5.txt, 8806-0.94-v6.txt, HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch, HBASE-8806-threadBasedRowLocks-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i =
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698769#comment-13698769 ] Lars Hofhansl commented on HBASE-8806: -- Yeah, the extra byte comparisons required in the normal case would be my concern too. Please attach the simple patch. :) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698803#comment-13698803 ] Hadoop QA commented on HBASE-8806: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590608/HBASE-8806.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6202//console This message is automatically generated. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation,
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698811#comment-13698811 ] ramkrishna.s.vasudevan commented on HBASE-8806: --- Patch looks good. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698874#comment-13698874 ] Lars Hofhansl commented on HBASE-8806: -- Anoop's patch LGTM. * Should that be currentThread().getId() instead? * Do you want to include the test of the original issue? [~churromorales] and [~davelatham], what do you guys think? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698877#comment-13698877 ] Anoop Sam John commented on HBASE-8806: --- bq.•Should that be currentThread().getId() instead? Ya better.. bq.•Do you want to include the test of the original issue? Let me see.. Will upload the new version. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699035#comment-13699035 ] Anoop Sam John commented on HBASE-8806: --- I cannot include the same test as it asserts the number of times getLock() is being called. I am not able to see any simple way to assert the fix. One way I can think of is having a CP and assert the number of time preBatchMutate() hook is getting called. But this will need a RegionServerServices to be available. We need a Minicluster up.. Sounds fine? Any other easy way can suggest? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699057#comment-13699057 ] Hadoop QA commented on HBASE-8806: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590639/HBASE-8806-threadBasedRowLocks.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6206//console This message is automatically generated. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699077#comment-13699077 ] Dave Latham commented on HBASE-8806: To explain a bit more this patch changes the HRegion row lock methods to these alone: - void getRowLock(byte[] row) throws IOException (acquire, throw if unable to acqurie before timeout) - boolean tryRowLock(byte[] row) (acquire without wait and return true iff successful) - void releaseMyRowLocks() (release all row locks held by the current thread) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699403#comment-13699403 ] Lars Hofhansl commented on HBASE-8806: -- I think the client side lock issue is solved by treating client provided locks as before as I suggested in my comment above (July 2nd) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699427#comment-13699427 ] Lars Hofhansl commented on HBASE-8806: -- And, anyway, for 0.94 I'd prefer a more local change just to doMiniBatchMutation. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699530#comment-13699530 ] Lars Hofhansl commented on HBASE-8806: -- Specifically, for 0.94 I'd prefer the fix with the row cache (the first two patches). In order to reduce the performance impact we can probably pass HashedBytes to getLock directly, instead of creating them in doMiniBatchMutation(...) and then again in internalObtainRowLock(...). For 0.96 Dave's patch is nice. For that patch, we could even invent lock groups, simply by passing a grouping object to {get|try}RowLock and releaseMyRowLocks. To group by thread we'd pass in the currentThread object; and we could also group only for this batch, by passing batchOp as group object. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699612#comment-13699612 ] Ted Yu commented on HBASE-8806: --- @Dave: The changes in pom.xml are not needed for this fix, right ? I tried to apply your patch on trunk but there are several chunks in HRegion.java which don't apply cleanly. Once I have a clean patch, I can put it on a cluster and try out. Thanks Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699619#comment-13699619 ] stack commented on HBASE-8806: -- [~ted_yu] Dave said above (This patch is not from the current trunk but an earlier point on 0.95 so may need to be updated a bit) How about helping out by doing the update yourself? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699634#comment-13699634 ] rahul gidwani commented on HBASE-8806: -- Hey folks, Sorry for the late response. Lars, I will update the patch for .94 to pass in hashed bytes to getRowLock and subsequently internalObtainRowLock and resubmit the patch for .94. Dave is on vacation this week so I will try to get his patch to apply to trunk. I have also written a few performance tests to see how these various changes affect doMiniBatchMutation in various cases, such as: all unique row keys, 5% duplicate row keys, 10% duplicate row keys, etc... This weekend I will run those tests against both patches (dave's and .94 as well as the current trunk) and post results. The .94 patch is almost ready, I am just in the process of making sure all tests pass before I attach another patch. cheers and sorry for the delayed response. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699701#comment-13699701 ] Hadoop QA commented on HBASE-8806: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590763/HBASE-8806-0.94.10-v3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified tests. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6210//console This message is automatically generated. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13699759#comment-13699759 ] Anoop Sam John commented on HBASE-8806: --- We need change all the CP hooks signatures? CPs will(should) be able to acquire locks and release? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch, HBASE-8806-0.94.10-v3.patch, HBASE-8806.patch, HBASE-8806-threadBasedRowLocks.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697841#comment-13697841 ] Lars Hofhansl commented on HBASE-8806: -- [~anoop.hbase] Suggested almost the same above (should've read that first). Also as Anoop suggests in an earlier comment, we need to performance test this carefully for the normal case. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697844#comment-13697844 ] Lars Hofhansl commented on HBASE-8806: -- Yet another approach is to sort the KVs (aren't they sorted anyway?) and then acquire a lock for each changing row key and apply all edits with the same row key using the lock we acquired for the first KV for this row. That way we do not need to remember prior locks. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697875#comment-13697875 ] Lars Hofhansl commented on HBASE-8806: -- W.r.t. the attached patch, we still need to honor the providedLockId (if provided). If the providedLockId is not null, we should follow the old logic. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698534#comment-13698534 ] Anoop Sam John commented on HBASE-8806: --- bq.Yet another approach is to sort the KVs (aren't they sorted anyway?) Yes it will be sorted.. bq.then acquire a lock for each changing row key and apply all edits ... Here also there will be some sort byte[] comparison will come in normal case also.. By saying normal I mean normal batch() calls where the duplicate RKs mostly wont be there (Will there be?) So my thinking was how this scenario we can minimize the impact of new checks. That is why I was saying the option of keeping the Thread name also along with the latch and only when the lock is acquired check the thread names. Yes Lars also said the same above.. Pls see in my comment above the bold portion (ie.while releaseRowLock the 1st occurrence unlocks the row so that other threads can acquire it) Any way in the current patch also the release happens this way only.. As per the current way of code in HRegion this is just fine.. If you okey Lars I can attach the simple patch I made on this issue. :) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Priority: Critical Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13695323#comment-13695323 ] Anoop Sam John commented on HBASE-8806: --- Yes Rahul, I have seen this and because of this there won’t be any functional issue as such. Just adding an extra entry into Set. :) In releaseRowLock {code} if (lockId == null) return; // null lock id, do nothing HashedBytes rowKey = lockIds.remove(lockId); if (rowKey == null) { LOG.warn(Release unknown lockId: + lockId); return; } {code} You will get this warn log for the duplicate rows in minibatch. Do this log to be really warn level? Can we make it info only? One more observation is when the mini batch contain same row say 4 times, the 1st time the rowlock is acquired and remaining 3 will assume the lock is there with it. Later when the releaseRowLock happens the 1st occurrence of row itself will unlock the row(You can see the unlock op in a loop).. Well this is just fine because by the time this unlock is getting called all the write ops are over and it is just okey that other threads can get lock on the row. Another way I am thinking on this similar lines (ie. 1st occurrence of row acquire the lock and remaining just assumes it has the lock and *while releaseRowLock the 1st occurrence unlocks the row* so that other threads can acquire it) is to have a RowLockContext which contains the Latch and the thread name which has acquired the lock. We store this context, instead of the latch, in the Map. When, trying to lock, if context for the row is already there in Map, compare the thread names. Instead of saving every row ref in a Set I am just keeping the thread name in a wrapper obj. Will that be better in a normal op when there are no duplicate rows in the batch? I just tried it and have a patch for Trunk. If it sounds okey, I can upload that patch. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694888#comment-13694888 ] rahul gidwani commented on HBASE-8806: -- while the logic is not wrong, i agree its not as clear as can be. a few lines down you see {code} if (failedToAcquire) { // We failed to grab another lock assert !shouldBlock : Should never fail to get lock when blocking; break; // stop acquiring more rows for this batch } {code} but we shouldn't be adding to a set for no reason, I agree. I will fix this and submit a new patch. I will also look at how this affects the throughput and provide you guys with some results. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694130#comment-13694130 ] Dave Latham commented on HBASE-8806: A little more background on how this came up. We're currently replicating writes in both directions between two large clusters. Occasionally we would see one node's replication queue start falling behind, and once it got behind it appeared to go slower than it did while it was caught up! It would get into a cycle of replicating a batch of 25000 edits with each batch taking something like 3 minutes. Examining threads on the node receiving the writes would show the handler thread in stacks like {noformat} IPC Server handler 68 on 60020 daemon prio=10 tid=0x2aaac0d14800 nid=0x3548 runnable [0x4 java.lang.Thread.State: RUNNABLE at java.util.ArrayList.init(ArrayList.java:112) at com.google.common.collect.Lists.newArrayListWithCapacity(Lists.java:168) at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2129) at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2059) at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3571) at sun.reflect.GeneratedMethodAccessor83.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) {noformat} The 25000 edits were being sorted by row, with many rows editing up having multiple puts in a batch. Each time HRegion.doMiniBatchMutation encounters multiple puts to the same row it would fail to acquire the lock on that row for the second put, slowing it down. This patch makes it able to handle the full batch in one go. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694147#comment-13694147 ] rahul gidwani commented on HBASE-8806: -- I will provide a patch for trunk, no problem. I should have it by tomorrow Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694475#comment-13694475 ] Anoop Sam John commented on HBASE-8806: --- In your use case the fix approach looks reasonable But what about the normal put flow? There will be perf degradation as we have to deal with Set now for every row in the batch? We should test the write throughput with and without patch. Mind attaching a test result? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694487#comment-13694487 ] Liyin Tang commented on HBASE-8806: --- Is this issue similar as HBASE-6930 ? We solved the problem by sorting the rows for each multiput batch in the client side. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694493#comment-13694493 ] Anoop Sam John commented on HBASE-8806: --- At HRegion it will try to acquire the lock for all the rows in the batch (creating a minibatch) How you are avoiding trying lock the same row by same thread again and again? Not able to get it from the patch attached in the issue HBASE-6930. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13694496#comment-13694496 ] Anoop Sam John commented on HBASE-8806: --- [~churromorales] Going through the patch now. {code} +acquiredLockId = getLock(providedLockId, mutation.getRow(), shouldBlock); +if (acquiredLockId == null) { +failedToAcquire = true; +} +rowsAlreadyLocked.add(currentRow); {code} When not able to get a lock(acquiredLockId==null), then also adding the currentRow to the already locked rows list? Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-8806) Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows.
[ https://issues.apache.org/jira/browse/HBASE-8806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13693622#comment-13693622 ] stack commented on HBASE-8806: -- Approach looks good. You keep the hash for the life of the put only it seems. Is this necessary? - if (!isRowLocked(lockid)) { -throw new IOException(Invalid row lock); + byte[] rowFromLock = getRowFromLock(lockid); + if (!Bytes.equals(rowFromLock, row)) { +throw new IOException(Invalid row lock: LockId: + lockid + holds the lock for row: + Bytes.toString(rowFromLock) + but wanted lock for row: + Bytes.toString(row)); } Nice test. Any chance of a refactor for trunk so can submit via hadoopqa? Thanks. Row locks are acquired repeatedly in HRegion.doMiniBatchMutation for duplicate rows. Key: HBASE-8806 URL: https://issues.apache.org/jira/browse/HBASE-8806 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.5 Reporter: rahul gidwani Fix For: 0.95.2, 0.94.10 Attachments: HBASE-8806-0.94.10.patch, HBASE-8806-0.94.10-v2.patch If we already have the lock in the doMiniBatchMutation we don't need to re-acquire it. The solution would be to keep a cache of the rowKeys already locked for a miniBatchMutation and If we already have the rowKey in the cache, we don't repeatedly try and acquire the lock. A fix to this problem would be to keep a set of rows we already locked and not try to acquire the lock for these rows. We have tested this fix in our production environment and has improved replication performance quite a bit. We saw a replication batch go from 3+ minutes to less than 10 seconds for batches with duplicate row keys. {code} static final int ACQUIRE_LOCK_COUNT = 0; @Test public void testRedundantRowKeys() throws Exception { final int batchSize = 10; String tableName = getClass().getSimpleName(); Configuration conf = HBaseConfiguration.create(); conf.setClass(HConstants.REGION_IMPL, MockHRegion.class, HeapSize.class); MockHRegion region = (MockHRegion) TestHRegion.initHRegion(Bytes.toBytes(tableName), tableName, conf, Bytes.toBytes(a)); ListPairMutation, Integer someBatch = Lists.newArrayList(); int i = 0; while (i batchSize) { if (i % 2 == 0) { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(0)), null)); } else { someBatch.add(new PairMutation, Integer(new Put(Bytes.toBytes(1)), null)); } i++; } long startTime = System.currentTimeMillis(); region.batchMutate(someBatch.toArray(new Pair[0])); long endTime = System.currentTimeMillis(); long duration = endTime - startTime; System.out.println(duration: + duration + ms); assertEquals(2, ACQUIRE_LOCK_COUNT); } @Override public Integer getLock(Integer lockid, byte[] row, boolean waitForLock) throws IOException { ACQUIRE_LOCK_COUNT++; return super.getLock(lockid, row, waitForLock); } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira