[jira] [Comment Edited] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15580983#comment-15580983 ] Allan Yang edited comment on HBASE-16698 at 10/17/16 2:31 AM:
--
Yes, I know the sync operation will batch as many edits as possible. When you wait for the latch, you are actually waiting for sync as well, so in my analysis waiting for sync and waiting for the latch should take the same time. I have no idea why waiting for sync is faster; the only difference is that if we choose to wait for sync, steps 5 and 6 in {{doMiniBatchMutation}} are done without any blocking.

> Performance issue: handlers stuck waiting for CountDownLatch inside
> WALKey#getWriteEntry under high writing workload
>
> Key: HBASE-16698
> URL: https://issues.apache.org/jira/browse/HBASE-16698
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Affects Versions: 1.2.3
> Reporter: Yu Li
> Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-16698.branch-1.patch, HBASE-16698.branch-1.v2.patch,
> HBASE-16698.patch, HBASE-16698.v2.patch, hadoop0495.et2.jstack
>
> As titled, on our production environment we observed 98 out of 128 handlers
> get stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside
> {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that it is mainly caused by
> advancing mvcc in the append logic. Below is some detailed analysis:
> Under the current branch-1 code logic, all batch puts call
> {{WALKey#getWriteEntry}} after appending the edit to the WAL, and
> {{seqNumAssignedLatch}} is only released when the corresponding append call
> is handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}).
> Because we currently use a single event handler for the ringbuffer, the
> append calls are handled one by one (actually a lot of our current logic
> depends on this sequential handling), and this becomes a bottleneck
> under a high writing workload.
> The worst part is that by default we only use one WAL per RS, so appends on
> all regions are handled sequentially, which causes contention among
> different regions...
> To fix this, we could make use of the "sequential appends" mechanism:
> grab the WriteEntry before publishing the append onto the ringbuffer
> and use it as the sequence id; we only need to add a lock to make "grab
> WriteEntry" and "append edit" a transaction. This will still cause contention
> inside a region but avoids contention between different regions. This
> solution is already verified in our online environment and has proved
> effective.
> Notice that for the master (2.0) branch, since we already changed the write
> pipeline to sync before writing the memstore (HBASE-15158), this issue only
> exists for the ASYNC_WAL writes scenario.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
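The hand-off the description walks through (a handler appends, then parks on {{seqNumAssignedLatch}} until the single ring-buffer consumer stamps the sequence id) can be sketched as a toy model. This is illustrative Java only, not HBase code; the class and method names below are made up for the sketch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model of the latch hand-off: a handler publishes an append, then
// blocks on a CountDownLatch until the single consumer assigns the seq id.
public class SeqIdLatchDemo {
    // Stand-in for WALKey: the sequence id is only readable once the
    // consumer has stamped it and released the latch.
    static final class Entry {
        final CountDownLatch assigned = new CountDownLatch(1);
        volatile long seqId = -1;

        long awaitSeqId() throws InterruptedException {
            assigned.await();                 // handlers park here under load
            return seqId;
        }
    }

    public static long runOnce() {
        try {
            BlockingQueue<Entry> ringBuffer = new LinkedBlockingQueue<>();
            // Single consumer thread: stamps sequence ids strictly one by
            // one, which is the serialization point the issue describes.
            Thread consumer = new Thread(() -> {
                long next = 1;
                try {
                    while (true) {
                        Entry e = ringBuffer.take();
                        e.seqId = next++;
                        e.assigned.countDown();   // release the waiting handler
                    }
                } catch (InterruptedException ignored) { }
            });
            consumer.setDaemon(true);
            consumer.start();

            Entry e = new Entry();
            ringBuffer.put(e);                    // "append" published
            long id = e.awaitSeqId();             // blocks until stamped
            consumer.interrupt();
            return id;
        } catch (InterruptedException ie) {
            throw new RuntimeException(ie);
        }
    }

    public static void main(String[] args) {
        System.out.println("assigned seqId = " + runOnce());
    }
}
```

With one consumer stamping ids for every region's appends, every handler on the RS queues behind this single thread, which is the contention the issue reports.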
[jira] [Comment Edited] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15580945#comment-15580945 ] Allan Yang edited comment on HBASE-16698 at 10/17/16 2:02 AM:
--
After carefully reviewing the code of branch-1.2, I understand your problem: in branch-1.2 the handler is stuck waiting on a CountDownLatch after appending the WALKey, in order to get the WriteEntry, and the latch is released only after sync completes. But my question is: even if you solve this problem, the handlers still have to wait for {{syncOrDefer}} to complete. So either you wait for the latch, or you wait for {{syncOrDefer}}. What is the difference?
[jira] [Comment Edited] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15576730#comment-15576730 ] Yu Li edited comment on HBASE-16698 at 10/14/16 10:29 PM:
--
And I'd say the previously existing {{cell.getSequenceId() == 0}} check in {{HRegion#applyFamilyMapToMemstore}} was some kind of protection mechanism, from what I've observed :-)
[jira] [Comment Edited] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15574578#comment-15574578 ] Allan Yang edited comment on HBASE-16698 at 10/14/16 7:52 AM:
--
The CountDownLatch can't be moved, since we need to wait until the data we have written can be seen (through advancing mvcc). Since your online cluster runs a forked 1.1.2 version, does your patch fix this problem, [~carp84]?
[jira] [Comment Edited] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15574517#comment-15574517 ] Heng Chen edited comment on HBASE-16698 at 10/14/16 7:24 AM:
--
I think [~allan163] is right. It is different between branch-1.1 and branch-1.2: on branch-1.1 we wait for the seqId to be assigned after sync, so the issue is invalid for branch-1.1. It seems the CountDownLatch could be removed for SYNC_WAL durability? [~carp84], your online cluster is branch-1.2, right?
[jira] [Comment Edited] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15574381#comment-15574381 ] Allan Yang edited comment on HBASE-16698 at 10/14/16 6:15 AM:
--
{code}
// --
// STEP 8. Advance mvcc. This will make this put visible to scanners and getters.
// --
if (writeEntry != null) {
  mvcc.completeMemstoreInsertWithSeqNum(writeEntry, walKey);
  writeEntry = null;
}
{code}
This is in {{doMiniBatchMutation}} of branch-1.1. In {{completeMemstoreInsertWithSeqNum}} it will use the seqid in {{walKey}} to advance the mvcc; I think that's where [~carp84] said it is 'stuck at CountDownLatch'. My point is: even if we don't need to sync the WAL, the batch still has to block here to advance mvcc, and that is a problem. But if we choose to sync the WAL, the seqid in {{walKey}} should already have been assigned by the sync operation, so handlers shouldn't get stuck here.
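The mvcc advance in STEP 8 is what serializes the batches: a put only becomes visible once every earlier sequence id has also completed. Below is a minimal toy model of such a read point. It is illustrative only; {{MvccModel}} and its methods are hypothetical names, not HBase's actual MultiVersionConcurrencyControl API.

```java
import java.util.concurrent.ConcurrentSkipListSet;

// Toy mvcc read point: a write becomes visible to readers only once all
// earlier writes have completed, so an out-of-order completion must wait.
public class MvccModel {
    private final ConcurrentSkipListSet<Long> completed = new ConcurrentSkipListSet<>();
    private volatile long readPoint = 0;

    // Mark seqId complete and advance the read point over any contiguous run.
    public synchronized void complete(long seqId) {
        completed.add(seqId);
        while (completed.remove(readPoint + 1)) {
            readPoint++;
        }
    }

    public long getReadPoint() { return readPoint; }

    public static void main(String[] args) {
        MvccModel mvcc = new MvccModel();
        mvcc.complete(2);   // out-of-order completion: not visible yet
        System.out.println(mvcc.getReadPoint()); // 0
        mvcc.complete(1);   // fills the gap, both become visible
        System.out.println(mvcc.getReadPoint()); // 2
    }
}
```

This is why the handler in STEP 8 must have the sequence id in hand before it can complete its WriteEntry: the read point cannot move past an entry whose id is still unassigned.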
[jira] [Comment Edited] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571584#comment-15571584 ] Yu Li edited comment on HBASE-16698 at 10/13/16 11:16 AM:
--
Thanks for revisiting this, [~stack]. Yes sir, we've been running with this in production for more than 2 months and everything looks good: no more handlers stuck at the CountDownLatch ever since, and no data loss observed. And yes, let's get this in with the option off by default for branch-1.2/1.3, and we can revisit whether to turn it on later, when I have time to provide more perf data with YCSB. :-)
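The fix this thread converged on (grabbing the WriteEntry before publishing the append onto the ringbuffer, with a lock making sequence-id assignment and publication one atomic step) could be sketched roughly as follows. This is a sketch under stated assumptions, not the actual patch; {{PreAssignSeqId}}, {{Ring}}, and {{append}} are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch of pre-assigning the sequence id: the handler obtains
// the id itself under a per-region lock, so it never blocks on the single
// ring-buffer consumer. Contention remains within a region, but regions no
// longer contend with each other.
public class PreAssignSeqId {
    private final AtomicLong nextSeqId = new AtomicLong(0);
    private final ReentrantLock appendLock = new ReentrantLock(); // per-region

    // Minimal stand-in for the ring buffer's publish side.
    public interface Ring { void publish(long seqId, byte[] edit); }

    public long append(byte[] edit, Ring ringBuffer) {
        appendLock.lock();
        try {
            long seqId = nextSeqId.incrementAndGet(); // "grab WriteEntry"
            ringBuffer.publish(seqId, edit);          // "append edit" with that id
            return seqId;                             // caller proceeds, no latch
        } finally {
            appendLock.unlock();
        }
    }

    public static void main(String[] args) {
        PreAssignSeqId region = new PreAssignSeqId();
        Ring noop = (seqId, edit) -> { };
        System.out.println(region.append(new byte[0], noop)); // 1
        System.out.println(region.append(new byte[0], noop)); // 2
    }
}
```

The lock is what preserves the invariant the ring buffer used to provide for free: sequence ids appear in the WAL in the same order they were assigned.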