[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15833400#comment-15833400 ]

Yu Li commented on HBASE-16698:
---

This JIRA introduced the mvcc preassign feature, but it is only implemented for Put and has some problems with Append/Increment. HBASE-17471 tracks this issue and aims to fix it (by [~allan163]).

> Performance issue: handlers stuck waiting for CountDownLatch inside
> WALKey#getWriteEntry under high writing workload
>
>
> Key: HBASE-16698
> URL: https://issues.apache.org/jira/browse/HBASE-16698
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Affects Versions: 1.2.3
> Reporter: Yu Li
> Assignee: Yu Li
> Fix For: 2.0.0, 1.4.0
>
> Attachments: hadoop0495.et2.jstack, HBASE-16698.branch-1.patch,
> HBASE-16698.branch-1.v2.patch, HBASE-16698.branch-1.v2.patch,
> HBASE-16698.patch, HBASE-16698.v2.patch
>
>
> As titled, in our production environment we observed 98 out of 128 handlers
> stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside
> {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that it is mainly caused by
> advancing mvcc in the append logic. Below is a detailed analysis:
> Under the current branch-1 code logic, all batch puts call
> {{WALKey#getWriteEntry}} after appending the edit to the WAL, and
> {{seqNumAssignedLatch}} is only released when the corresponding append call is
> handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}).
> Because we currently use a single event handler for the ringbuffer, the
> append calls are handled one by one (actually lots of our current logic
> depends on this sequential handling), and this becomes a bottleneck
> under high writing workload.
> The worst part is that by default we only use one WAL per RS, so appends on
> all regions are handled sequentially, which causes contention among
> different regions...
> To fix this, we could also make use of the "sequential appends" mechanism:
> we could grab the WriteEntry before publishing the append onto the ringbuffer
> and use it as the sequence id, except that we need to add a lock to make "grab
> WriteEntry" and "append edit" a transaction. This will still cause contention
> inside a region but avoids contention between different regions. This
> solution has already been verified in our online environment and proved to be
> effective.
> Notice that for the master (2.0) branch, since we already changed the write
> pipeline to sync before writing to the memstore (HBASE-15158), this issue only
> exists for the ASYNC_WAL write scenario.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
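As a rough illustration of the fix proposed in the issue text above, the following toy model makes "grab WriteEntry" and "append edit" one critical section under a region-local lock, so the sequence id is known before the edit is published. It is only a sketch under stated assumptions: `PreassignSketch`, `Region`, `Append` and the queue standing in for the ringbuffer are hypothetical names, not HBase classes, and the real patch may differ in every detail.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model only: none of these types are HBase classes. */
public class PreassignSketch {

    /** An edit that is already stamped with its sequence id when queued. */
    static final class Append {
        final String region;
        final long seqId;

        Append(String region, long seqId) {
            this.region = region;
            this.seqId = seqId;
        }
    }

    /** Per-region state: a write-number counter guarded by a region-local lock. */
    static final class Region {
        private final String name;
        private final Queue<Append> ring; // stand-in for the shared ring buffer
        private final ReentrantLock appendLock = new ReentrantLock();
        private long nextWriteNumber = 1;

        Region(String name, Queue<Append> ring) {
            this.name = name;
            this.ring = ring;
        }

        /** "Grab WriteEntry" and "append edit" as one critical section. */
        long append() {
            appendLock.lock();
            try {
                long seqId = nextWriteNumber++;    // id is assigned up front
                ring.add(new Append(name, seqId)); // published in id order
                return seqId;                      // caller never waits on a latch
            } finally {
                appendLock.unlock();
            }
        }
    }

    /** Returns true if each region's ids appear in the queue as 1, 2, 3, ... */
    static boolean run(int regions, int writersPerRegion, int appendsPerWriter) {
        Queue<Append> ring = new ConcurrentLinkedQueue<>();
        Thread[] writers = new Thread[regions * writersPerRegion];
        for (int r = 0; r < regions; r++) {
            Region region = new Region("region-" + r, ring);
            for (int w = 0; w < writersPerRegion; w++) {
                Thread t = new Thread(() -> {
                    for (int i = 0; i < appendsPerWriter; i++) {
                        region.append();
                    }
                });
                writers[r * writersPerRegion + w] = t;
                t.start();
            }
        }
        try {
            for (Thread t : writers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Check that the queue order matches the id order within every region.
        Map<String, Long> last = new HashMap<>();
        for (Append a : ring) {
            long expected = last.getOrDefault(a.region, 0L) + 1;
            if (a.seqId != expected) {
                return false;
            }
            last.put(a.region, a.seqId);
        }
        return last.size() == regions;
    }

    public static void main(String[] args) {
        System.out.println("per-region order preserved: " + run(4, 8, 1000));
    }
}
```

Writers on the same region still contend on `appendLock`, which mirrors the remark that contention remains inside a region, while writers on different regions never wait for each other's ids.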
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15833383#comment-15833383 ]

Yu Li commented on HBASE-16698:
---

HBASE-17482 is a fix for the master branch that uses the approach described in [this comment | https://issues.apache.org/jira/browse/HBASE-16698?focusedCommentId=15576721=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15576721].
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15833381#comment-15833381 ]

Yu Li commented on HBASE-16698:
---

I will take the following actions soon:
1. Close this JIRA, since all the code has already gone into master and branch-1.
2. Open another JIRA to document the mvcc preassign feature, including:
* A design doc of the mvcc preassign mechanism, to be attached here
* A performance report for SYNC_WAL/ASYNC_WAL/SKIP_WAL with mvcc preassign (we can open a new JIRA if any issue is found later)
3. Link any newly found issues related to mvcc preassign to this JIRA.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15833378#comment-15833378 ]

Yu Li commented on HBASE-16698:
---

After some review of the mechanism, let me further explain the theoretical performance improvement for SYNC_WAL on branch-1.

First, let me recap the current process of doMiniBatchMutation on branch-1:
1. acquire locks
2. update timestamps
3. build WAL edit
4. append edit to WAL
5. write to memstore
6. sync WAL or defer

Just before step #5, we have the code below before the patch:
{code}
//
// Acquire the latest mvcc number
// --
if (!isInReplay) {
  writeEntry = walKey.getWriteEntry();
  mvccNum = writeEntry.getWriteNumber();
} else {
  mvccNum = batchOp.getReplaySequenceId();
}
//
// STEP 5. Write back to memstore
// Write to memstore. It is ok to write to memstore
// first without syncing the WAL because we do not roll
// forward the memstore MVCC. The MVCC will be moved up when
// the complete operation is done. These changes are not yet
// visible to scanners till we update the MVCC. The MVCC is
// moved only when the sync is complete.
// --
{code}
In {{walKey#getWriteEntry}} we wait for the CountDownLatch, which is released by {{RingBufferEventHandler#append}}, or more precisely by {{FSWALEntry#stampRegionSequenceId}}. Since the ringbuffer is handled sequentially, this creates contention among puts on all regions of the same regionserver.

Notice that under high put load this makes step #5 wait, and in some cases the wait may be longer than the sync IO time, so that when execution arrives at step #6, {{FSHLog#blockOnSync}} returns directly. This mainly answers the question below, which [~allan163] asked in a previous comment.
bq. But, my question is, even if you solved this problem, the handlers still have to waitting for syncOrDefer to complete

Let me prepare a doc and upload it here to make this easier to understand without reading the long comment list (smile).
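The latch wait discussed in this message can be modelled in a few lines. The sketch below is an assumption-laden toy, not HBase code: `LatchWaitSketch` and `Entry` are hypothetical names, and a `LinkedBlockingQueue` stands in for the ringbuffer. Each handler publishes an entry and then blocks on a `CountDownLatch` until the single consumer thread has stamped the sequence id, which is the serialization point the comment describes.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy model only: none of these types are HBase classes. */
public class LatchWaitSketch {

    /** A queued edit whose sequence id is assigned later by the consumer. */
    static final class Entry {
        private final CountDownLatch seqNumAssigned = new CountDownLatch(1);
        private volatile long seqId;

        /** Called by the single consumer: stamp the id, then release the waiter. */
        void stamp(long id) {
            seqId = id;
            seqNumAssigned.countDown();
        }

        /** Called by a handler: blocks until the consumer has stamped the id. */
        long awaitSeqId() throws InterruptedException {
            seqNumAssigned.await();
            return seqId;
        }
    }

    /** Runs the given number of handlers and returns the ids they observed. */
    static Set<Long> run(int handlers) {
        BlockingQueue<Entry> ring = new LinkedBlockingQueue<>();
        Set<Long> seen = ConcurrentHashMap.newKeySet();

        // One consumer handles all entries one by one, for every handler.
        Thread consumer = new Thread(() -> {
            long next = 1;
            try {
                for (int i = 0; i < handlers; i++) {
                    ring.take().stamp(next++);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        Thread[] threads = new Thread[handlers];
        for (int i = 0; i < handlers; i++) {
            threads[i] = new Thread(() -> {
                Entry entry = new Entry();
                ring.add(entry); // "append edit to WAL"
                try {
                    seen.add(entry.awaitSeqId()); // handler is parked here
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("distinct ids handed out: " + run(16).size());
    }
}
```

In this model every handler, whatever region it writes to, waits for the same consumer thread, so the waiting time grows with the queue depth.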
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823179#comment-15823179 ]

Hadoop QA commented on HBASE-16698:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s {color} | {color:red} HBASE-16698 does not apply to branch-1. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12833969/HBASE-16698.branch-1.v2.patch |
| JIRA Issue | HBASE-16698 |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/5265/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15600905#comment-15600905 ]

stack commented on HBASE-16698:
---

On HBASE-3899, I remembered it as an experiment that was never finished (maybe someone else has a better memory of what happened then). I went back and looked at the code, and it seems it was never hooked up in Apache HBase. Only TestDelayedRpc used it (it may have been working internally at FB). It was reverted because it was not being used and much of the machinery it had wired up had been moved or removed (HBaseServer).

Above I say owning the dfsclient is the way out of our WAL perf trough, but the SEDA core, as you remind us [~carp84], would be another, where threads can be let go after dumping their payload on the WAL to go bring in more data while waiting on the syncs to come home.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15599149#comment-15599149 ]

Yu Li commented on HBASE-16698:
---

bq. Lets not backport to 1.2 until it in 1.3. Thats how we generally do it. Else the discontinuity confuses.
Ok, got it, thanks for the confirmation [~stack]

bq. On master, when I do a jstack with some load, almost all the handlers are waiting for sync()... For async, we still have to have the latch I think.
I see, and that makes sense. Let me test on the master branch to make sure that 1) the patch here introduces no perf regression for SYNC_WAL and 2) it benefits ASYNC_WAL.

bq. The other reason I was asking about this is that I have a hacked up patch which divides the batchMutate() into 3 phases... After sync some other handler or thread will complete the work.
Thanks for bringing this up and mentioning the paper [~enis]. I think this matches the idea of the "SEDA" JIRA mentioned weeks ago, and we also have some initial work in progress here in Alibaba-search. I believe this could increase our overall throughput and is worth a standalone JIRA for further discussion (smile).

I also glanced at HBASE-3899; it seems like a similar idea, but the commit was somehow reverted... mind telling the whole story sir [~stack]?
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15598586#comment-15598586 ]

stack commented on HBASE-16698:
---

Reinstitute hbase-3899?
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15598402#comment-15598402 ]

stack commented on HBASE-16698:
---

[~enis] you think we could park 'context' in the log, let the handler go do another transaction, and then, as it is leaving the logging subsystem, check whether there are any completed transactions to return to the client? That'd be cool.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15598266#comment-15598266 ] stack commented on HBASE-16698: --- bq. On master, when I do a jstack with some load, almost all the handlers are waiting for sync(), and since the memstore insert does not happen until sync() completes, we do not have to wait for the latch. Actually, the latch can be removed as well (for Durability.SYNC_WAL). For async, we still have to have the latch I think. Yeah, the reordering of the write pipeline in Master changes the equation. It is currently at a 'safe' place. Review and discussion could buy us some more improvement here especially it now much easier to reason about what is happening given the reordering. Consider too though that master branch write path will change again if/when asyncWAL becomes default (there is no ringbuffer when asyncwal). I am of the opinion that we need to get a handle on the dfsclient's packet-sending rhythm if we are to make any progress WAL writing. In studies over the last few days, it is beyond our influence and does the same old behavior whatever we do on our side (ringbuffer aggregations and appends for sure help but having to rely on five syncer threads each interrupting packet formation hopefully w/ jigger so not too many null sends is voodoo engineering and says to me that we need to own the client -- e.g. asyncwal -- or expose more means of controlling the flow in dfsclient#dfsoutputstream to us, the client). Thanks for the paper reference [~enis] Looks great. Handlers having to wait on syncs messes us up (or, not being async in our core messes us up -- take your pick). We should be able to make do with one sync'ing thread but when only one syncer, we are aggregating 70 handlers waiting on sync in primitive tests which means 70 handlers that are stuck NOT adding more load on the server; hence 5 syncers each aggregating 10 or 12 syncs works a bit better. 
What are you thinking regarding where the handler goes after it starts the sync? Does it go back to the client? (FB had a Delay thing hacked in once that seems similar.) How is the 'pickup' done? It sounds great.

On the rewrite of batchMutate: our HRegion has loads of duplication across methods. Do you see the batchMutate refactor working elsewhere for other methods?

Thanks.

> Performance issue: handlers stuck waiting for CountDownLatch inside
> WALKey#getWriteEntry under high writing workload
>
> Key: HBASE-16698
> URL: https://issues.apache.org/jira/browse/HBASE-16698
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Affects Versions: 1.2.3
> Reporter: Yu Li
> Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-16698.branch-1.patch,
> HBASE-16698.branch-1.v2.patch, HBASE-16698.branch-1.v2.patch,
> HBASE-16698.patch, HBASE-16698.v2.patch, hadoop0495.et2.jstack
>
> As titled, on our production environment we observed 98 out of 128 handlers
> get stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside
> {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that the problem is mainly caused by
> advancing mvcc in the append logic. Below is some detailed analysis:
> Under current branch-1 code logic, all batch puts will call
> {{WALKey#getWriteEntry}} after appending the edit to the WAL, and
> {{seqNumAssignedLatch}} is only released when the corresponding append call is
> handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}).
> Because currently we're using a single event handler for the ringbuffer, the
> append calls are handled one by one (actually lots of our current logic
> depends on this sequential handling), and this becomes a bottleneck
> under high writing workload.
> The worst part is that by default we only use one WAL per RS, so appends on
> all regions are handled sequentially, which causes contention among
> different regions...
> To fix this, we could also make use of the "sequential appends" mechanism:
> we could grab the WriteEntry before publishing the append onto the ringbuffer
> and use it as the sequence id, only that we need to add a lock to make "grab
> WriteEntry" and "append edit" a transaction. This will still cause contention
> inside a region but could avoid contention between different regions. This
> solution is already verified in our online environment and proved to be
> effective.
> Notice that for the master (2.0) branch, since we already changed the write
> pipeline to sync before writing the memstore (HBASE-15158), this issue only
> exists for the ASYNC_WAL writes scenario.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
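Editor's illustration of the fix proposed in the issue description: a minimal sketch, assuming a plain queue in place of the ringbuffer and a plain counter in place of mvcc. The class names `SketchAppend`, `SketchRegion`, and `PreassignSketchDemo` are hypothetical and are not HBase classes; the actual patch touches `HRegion`, `WALKey`, and `FSWALEntry` and may look quite different. The point shown is only that "grab the sequence id" and "publish the append" happen together under a region-local lock, so the caller already holds its id when `append` returns and never waits on a latch released by the single consumer.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for an append published to the shared WAL queue; not an HBase class. */
final class SketchAppend {
    final String region;
    final long sequenceId;

    SketchAppend(String region, long sequenceId) {
        this.region = region;
        this.sequenceId = sequenceId;
    }
}

/** Hypothetical per-region state: a sequence counter guarded by a region-local lock. */
final class SketchRegion {
    private final String name;
    private final ReentrantLock appendLock = new ReentrantLock();
    private long nextSequenceId = 1; // guarded by appendLock

    SketchRegion(String name) {
        this.name = name;
    }

    /**
     * Assigns the sequence id and publishes the append as one transaction.
     * Only writers to the same region contend on the lock; writers to other
     * regions use their own lock and are not blocked here.
     */
    long append(BlockingQueue<SketchAppend> sharedWalQueue) {
        appendLock.lock();
        try {
            long sequenceId = nextSequenceId++;
            // The unbounded queue stands in for the single shared ringbuffer.
            sharedWalQueue.add(new SketchAppend(name, sequenceId));
            return sequenceId;
        } finally {
            appendLock.unlock();
        }
    }
}

final class PreassignSketchDemo {
    public static void main(String[] args) {
        BlockingQueue<SketchAppend> walQueue = new LinkedBlockingQueue<>();
        SketchRegion a = new SketchRegion("region-a");
        SketchRegion b = new SketchRegion("region-b");
        a.append(walQueue);
        b.append(walQueue);
        a.append(walQueue);
        // Within each region, queue order matches sequence-id order.
        for (SketchAppend entry : walQueue) {
            System.out.println(entry.region + " seq=" + entry.sequenceId);
        }
    }
}
```

Because the lock covers both steps, entries for one region reach the shared queue in sequence-id order, which is the property the description relies on when it calls the two steps "a transaction".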
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15596488#comment-15596488 ]

Enis Soztutar commented on HBASE-16698:
---------------------------------------

On master, when I do a jstack with some load, almost all the handlers are waiting for sync(), and since the memstore insert does not happen until sync() completes, we do not have to wait for the latch. Actually, the latch can be removed as well (for Durability.SYNC_WAL). For async, we still have to have the latch I think.

The other reason I was asking about this is that I have a hacked-up patch which divides batchMutate() into 3 phases. The first phase computes the WALEdit to write, the second is the actual append + sync(), and the last is all the other steps after sync() (memstore insert, etc). This architecture makes it possible to do the flush pipelining idea talked about here: http://pandis.net/resources/vldb10aether.pdf (section 4). The theory is that we can free up the handler to process more requests while we are waiting for the sync to happen. After the sync, some other handler or thread will complete the work.
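Editor's illustration of the three-phase idea in the comment above: a minimal sketch, assuming in-memory lists in place of the WAL and memstore and a single-thread executor in place of the syncer. `ThreePhaseWriteSketch` and its members are hypothetical names; this is not the hacked-up patch mentioned in the comment, which is not attached to this thread. It shows only the control flow: the caller returns as soon as phase 2 has been handed off, and phase 3 runs when the sync completes.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical three-phase write path; none of these names are HBase APIs. */
final class ThreePhaseWriteSketch {
    // Stand-in for the WAL syncer thread.
    private final ExecutorService syncer = Executors.newSingleThreadExecutor();
    // Stand-ins for the durable log and the in-memory store.
    final List<String> wal = new CopyOnWriteArrayList<>();
    final List<String> memstore = new CopyOnWriteArrayList<>();

    /**
     * Returns immediately after scheduling the append + sync, so the calling
     * handler is free to take another request while the sync is in flight.
     */
    CompletableFuture<Void> batchMutate(String row, String value) {
        // Phase 1: compute the edit on the handler thread.
        String edit = row + "=" + value;
        return CompletableFuture
            // Phase 2: append + sync on the syncer thread.
            .runAsync(() -> wal.add(edit), syncer)
            // Phase 3: post-sync work (memstore insert), run once the sync is done.
            .thenRun(() -> memstore.add(edit));
    }

    void shutdown() {
        syncer.shutdown();
    }

    public static void main(String[] args) {
        ThreePhaseWriteSketch region = new ThreePhaseWriteSketch();
        CompletableFuture<Void> done = region.batchMutate("row1", "v1");
        // The handler could serve other requests here; the demo just waits.
        done.join();
        System.out.println("wal=" + region.wal + " memstore=" + region.memstore);
        region.shutdown();
    }
}
```

The sketch leaves open the question raised later in the thread about how the response is picked up and returned to the client once phase 3 finishes.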
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595796#comment-15595796 ]

stack commented on HBASE-16698:
-------------------------------

Let's not backport to 1.2 until it is in 1.3. That's how we generally do it; otherwise the discontinuity confuses.

On master, yeah, the benefit this patch brings is theoretical, though the tests done here would seem to promise a master branch benefit. The patch has been committed to master. HBASE-16873 is about doing something better.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593787#comment-15593787 ]

Yu Li commented on HBASE-16698:
-------------------------------

Thanks for the confirmation [~mantonov]
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593766#comment-15593766 ]

Yu Li commented on HBASE-16698:
-------------------------------

Thanks for the question [~enis]. The above numbers are from branch-1 code, not master. For the master branch, the issue description only argues in theory and no real testing has been done. Let me add the testing and upload the data later.

It's confirmed we won't pull this into branch-1.3 until 1.3.0 is released and 1.3.1 comes out. For branch-1.2, [~stack], could you help confirm, sir? Thanks.

I will resolve this JIRA once the perf data against the master branch is as good as expected and a decision for branch-1.2 is made. :-)
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593072#comment-15593072 ]

Enis Soztutar commented on HBASE-16698:
---------------------------------------

Time to resolve this?

[~carp84] thanks for the explanations earlier. One last question: did you do the tests with 2.0 code or branch-1 code? The issue description says:
{code}
Notice that for master (2.0) branch since we already change the write pipeline to sync before writing memstore (HBASE-15158), this issue only exists for the ASYNC_WAL writes scenario.
{code}
So we have committed this only for ASYNC_WAL for the master code? We wait for the whole sync() to happen before proceeding to the memstore inserts anyway in master.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592549#comment-15592549 ]

Hudson commented on HBASE-16698:
--------------------------------

SUCCESS: Integrated in Jenkins build HBase-1.4 #482 (See [https://builds.apache.org/job/HBase-1.4/482/])
HBASE-16698 Performance issue: handlers stuck waiting for CountDownLatch (liyu: rev a7a4e17f1d04d389f87ad22da96d72cd3be050d9)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSWALEntry.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592458#comment-15592458 ]

stack commented on HBASE-16698:
-------------------------------

Makes sense [~mantonov]
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592412#comment-15592412 ]

Mikhail Antonov commented on HBASE-16698:
-----------------------------------------

[~stack] [~carp84] after testing on my side I think we've reached the point where the most pressing issues on 1.3 have been fixed, so I really want to draw the line and get the first RC for it out. We may be able to pull it in later, in 1.3.1. Should be good for branch-1 meanwhile.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591092#comment-15591092 ]

Yu Li commented on HBASE-16698:
-------------------------------

Thank you sir [~stack], I've just pushed it into branch-1 :-)
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590613#comment-15590613 ]

stack commented on HBASE-16698:
-------------------------------

+1 on push to branch-1. Let me know if you want me to do it for you [~carp84], at your service.

[~mantonov] You want this back in 1.3? Can be off by default?
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590555#comment-15590555 ]

Yu Li commented on HBASE-16698:
---

[~stack] mind if I push this to branch-1 (branch-1.4) first to avoid further code rebase? Since the YCSB data indicates that performance is better w/ patch in both the single-region and multiple-region scenarios, I think this also applies to branch-1.2/1.3, but I will wait for your decision on whether to let it in, [~stack] [~mantonov].

Echoing the YCSB data here for your convenience.

Testing environment:
{noformat}
YCSB 0.7.0
4 physical client nodes, 8 YCSB processes per node, 32 threads per YCSB process
recordcount=3,200,000, fieldcount=1, fieldlength=1024, insertproportion=1, requestdistribution=uniform
{noformat}

200 regions:
||TestCase||Round||Throughput||AverageLatency(us)||
|w/o patch|Round-1|66554.48|15263.36|
|w/ patch|Round-1|91472.48|11098.85|
|w/o patch|Round-2|66083.53|15382.01|
|w/ patch|Round-2|91420.26|11104.37|

single region:
||TestCase||Throughput||AverageLatency(us)||
|w/o patch|69924.42|14544.38|
|w/ patch|86373.70|11770.09|

> Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
>
>                 Key: HBASE-16698
>                 URL: https://issues.apache.org/jira/browse/HBASE-16698
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>    Affects Versions: 1.2.3
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16698.branch-1.patch, HBASE-16698.branch-1.v2.patch, HBASE-16698.branch-1.v2.patch, HBASE-16698.patch, HBASE-16698.v2.patch, hadoop0495.et2.jstack
>
> As titled, on our production environment we observed 98 out of 128 handlers
> stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside
> {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that it is mainly caused by
> advancing mvcc in the append logic. Below is some detailed analysis:
> Under the current branch-1 code logic, all batch puts call
> {{WALKey#getWriteEntry}} after appending the edit to the WAL, and
> {{seqNumAssignedLatch}} is only released when the corresponding append call is
> handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}).
> Because we currently use a single event handler for the ringbuffer, the
> append calls are handled one by one (in fact, lots of our current logic
> depends on this sequential handling), and this becomes a bottleneck
> under high writing workload.
> The worst part is that by default we only use one WAL per RS, so appends on
> all regions are dealt with sequentially, which causes contention among
> different regions...
> To fix this, we could also make use of the "sequential appends" mechanism:
> we could grab the WriteEntry before publishing the append onto the ringbuffer
> and use it as the sequence id, only that we need to add a lock to make "grab
> WriteEntry" and "append edit" a transaction. This will still cause contention
> inside a region but could avoid contention between different regions. This
> solution is already verified in our online environment and proved to be
> effective.
> Notice that for the master (2.0) branch, since we already changed the write
> pipeline to sync before writing memstore (HBASE-15158), this issue only
> exists for the ASYNC_WAL writes scenario.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
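The fix described in the issue text above (grab the WriteEntry before publishing the append, with a lock making "grab WriteEntry" and "append edit" one transaction) can be sketched roughly as follows. This is a minimal, hypothetical illustration and not the HBase patch: the names {{PreassignSketch}}, {{Region}} and {{Append}} are invented for the example, a plain {{LinkedBlockingQueue}} stands in for the ringbuffer, and a simple counter stands in for mvcc. The point it tries to show is that the handler gets its sequence id immediately, without waiting for the single consumer, while the per-region lock keeps each region's ids in order inside the shared queue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch; none of these names are HBase classes. */
class PreassignSketch {

  /** One WAL append: the region it belongs to and its pre-assigned sequence id. */
  static final class Append {
    final String region;
    final long seqId;

    Append(String region, long seqId) {
      this.region = region;
      this.seqId = seqId;
    }
  }

  /** A region that assigns its own sequence ids before publishing to the shared WAL queue. */
  static final class Region {
    private final String name;
    private final BlockingQueue<Append> wal; // stand-in for the shared ringbuffer
    private final ReentrantLock appendLock = new ReentrantLock();
    private long lastSeqId = 0; // guarded by appendLock; stand-in for mvcc

    Region(String name, BlockingQueue<Append> wal) {
      this.name = name;
      this.wal = wal;
    }

    /** "Grab WriteEntry" and "append edit" as one transaction; returns the id at once. */
    long append() {
      appendLock.lock();
      try {
        long seqId = ++lastSeqId;          // grab the sequence id first
        wal.add(new Append(name, seqId));  // then publish, still under the lock
        return seqId;                      // no latch wait on the consumer side
      } finally {
        appendLock.unlock();               // released on every path
      }
    }
  }

  /** True if each region's ids appear in the queue as 1, 2, 3, ... with no gaps. */
  static boolean perRegionOrderHolds(BlockingQueue<Append> wal) {
    Map<String, Long> last = new HashMap<>();
    for (Append a : wal) {
      long prev = last.getOrDefault(a.region, 0L);
      if (a.seqId != prev + 1) {
        return false;
      }
      last.put(a.region, a.seqId);
    }
    return true;
  }

  /** Runs concurrent writers against several regions sharing one queue. */
  static BlockingQueue<Append> run(int regions, int threadsPerRegion, int appendsPerThread) {
    BlockingQueue<Append> wal = new LinkedBlockingQueue<>();
    List<Thread> threads = new ArrayList<>();
    for (int r = 0; r < regions; r++) {
      Region region = new Region("region-" + r, wal);
      for (int t = 0; t < threadsPerRegion; t++) {
        Thread th = new Thread(() -> {
          for (int i = 0; i < appendsPerThread; i++) {
            region.append();
          }
        });
        threads.add(th);
        th.start();
      }
    }
    for (Thread th : threads) {
      try {
        th.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return wal;
  }
}
```

In this sketch writers on different regions never take the same lock, so they contend only on the queue itself; writers on the same region still serialize on {{appendLock}}, which matches the trade-off stated in the description (contention inside a region remains, contention between regions is avoided).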
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15588348#comment-15588348 ]

Yu Li commented on HBASE-16698:
---

Perf data for one single region.

Test environment (nothing changed except no presplit on the target table):
{noformat}
YCSB 0.7.0
4 physical client nodes, 8 YCSB processes per node, 32 threads per YCSB process
recordcount=3,200,000, fieldcount=1, fieldlength=1024, insertproportion=1, requestdistribution=uniform
1 single RS, 1 single region (no presplit), handlercount=128, hbase.wal.storage.policy=ALL_SSD
{noformat}

And the comparison data (one round):
||TestCase||Throughput||AverageLatency(us)||
|w/o patch|69924.42|14544.38|
|w/ patch|86373.70|11770.09|

From the result we can see that even with one single region, performance w/ patch is better under high concurrency, which indicates that the disruptor publish-and-consume processing costs more time than the lock. I saw less CountDownLatch waiting in jstack when testing w/o patch here than in the multiple-region run, which could explain why the w/o patch throughput is better than that against multiple regions. [~chenheng] FYI.
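For readers who want the relative gains rather than the raw figures, the arithmetic can be reproduced with a few lines of Java. This is only a convenience sketch: the class name {{YcsbGain}} and the helper {{percentChange}} are made up for the example, and the inputs are the throughput and latency values quoted in this thread. Worked out by hand, the throughput gain comes to roughly 37 to 38 percent with 200 regions and about 24 percent with a single region.

```java
/** Hypothetical helper for comparing the YCSB numbers quoted in this thread. */
class YcsbGain {

  /** Percentage by which {@code after} differs from {@code before} (negative means lower). */
  static double percentChange(double before, double after) {
    return (after - before) / before * 100.0;
  }

  public static void main(String[] args) {
    // Each row: label, throughput w/o patch, throughput w/ patch,
    // average latency (us) w/o patch, average latency (us) w/ patch.
    Object[][] rows = {
      {"200 regions, round 1", 66554.48, 91472.48, 15263.36, 11098.85},
      {"200 regions, round 2", 66083.53, 91420.26, 15382.01, 11104.37},
      {"single region", 69924.42, 86373.70, 14544.38, 11770.09},
    };
    for (Object[] row : rows) {
      double throughput = percentChange((Double) row[1], (Double) row[2]);
      double latency = percentChange((Double) row[3], (Double) row[4]);
      System.out.printf("%s: throughput %+.1f%%, latency %+.1f%%%n",
          row[0], throughput, latency);
    }
  }
}
```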
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586840#comment-15586840 ]

stack commented on HBASE-16698:
---

[~carp84] points out above that WALPE doesn't exercise this patch AT ALL, so the above runs were useless (which would account for why the difference was so small). Ignore the above. Dumb on my part.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586638#comment-15586638 ]

Hudson commented on HBASE-16698:

FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1810 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1810/])
Revert "Revert "HBASE-16698 Performance issue: handlers stuck waiting (stack: rev ec1adb7baaca5b89ff11a24f26f49fec63e754d8)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSWALEntry.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
Revert "Revert "HBASE-16698 Performance issue: handlers stuck waiting (stack: rev 0d40a52ee82651866ad124183367edb4d9c52dda)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586213#comment-15586213 ]

Yu Li commented on HBASE-16698:
---

I also have no idea why the patch introduces the findbugs issue; I will take a look at it later...
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586211#comment-15586211 ]

Yu Li commented on HBASE-16698:
---

Thanks [~stack] for opening the sub-issue. Regarding the WALPE test, mind explaining why this patch would affect it? It seems to me WALPE only tests the WAL append and sync part, without writing through doMiniBatchMutation. Thanks.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586070#comment-15586070 ]

Hadoop QA commented on HBASE-16698:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 2m 2s | branch-1 passed |
| +1 | compile | 0m 29s | branch-1 passed with JDK v1.8.0_101 |
| +1 | compile | 0m 35s | branch-1 passed with JDK v1.7.0_80 |
| +1 | checkstyle | 0m 56s | branch-1 passed |
| +1 | mvneclipse | 0m 16s | branch-1 passed |
| -1 | findbugs | 1m 53s | hbase-server in branch-1 has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 24s | branch-1 passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 32s | branch-1 passed with JDK v1.7.0_80 |
| +1 | mvninstall | 0m 46s | the patch passed |
| +1 | compile | 0m 31s | the patch passed with JDK v1.8.0_101 |
| +1 | javac | 0m 31s | the patch passed |
| +1 | compile | 0m 35s | the patch passed with JDK v1.7.0_80 |
| +1 | javac | 0m 35s | the patch passed |
| +1 | checkstyle | 0m 55s | the patch passed |
| +1 | mvneclipse | 0m 17s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 17m 21s | The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | hbaseprotoc | 0m 15s | the patch passed |
| -1 | findbugs | 2m 23s | hbase-server generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) |
| +1 | javadoc | 0m 33s | the patch passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 37s | the patch passed with JDK v1.7.0_80 |
| -1 | unit | 81m 16s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 113m 38s | |

|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion$BatchOperationInProgress) does not release lock on all paths At HRegion.java:on all paths At HRegion.java:[line 3310] |
| Failed junit tests | hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 |
| | hadoop.hbase.mapreduce.TestMultiTableSnapshotInputFormat |
| Timed out junit tests | org.apache.hadoop.hbase.TestHBaseTestingUtility |
| | org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd |

|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:b2c5d84
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585979#comment-15585979 ]

stack commented on HBASE-16698:
---

OK on my suggestion actually making for a larger patch when I thought it would make a smaller one.

[~mantonov] OK to backport this, but off by default?

I filed a sub-issue here, HBASE-16873, to see if we can do better in master.

Thanks [~carp84]. Am trying this patch with the WALPE test. Will report back.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585776#comment-15585776 ]

stack commented on HBASE-16698:
---

I reverted the revert of this change in master (so the master patch and the fixup for the findbugs warning are back in the master branch). The findbugs warning out of branch-1 is not the same as the complaint that came out of master (and was fixed in a subsequent commit).
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585766#comment-15585766 ]

stack commented on HBASE-16698:
---

Numbers look good.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585433#comment-15585433 ]

Yu Li commented on HBASE-16698:
---

OK, here are more performance numbers with YCSB. First the testing environment:
{noformat}
YCSB 0.7.0
4 physical client nodes, 8 YCSB processes per node, 32 threads per YCSB process
recordcount=3,200,000, fieldcount=1, fieldlength=1024, insertproportion=1, requestdistribution=uniform
1 single RS, regionnumber(presplit)=200, handlercount=128, hbase.wal.storage.policy=ALL_SSD
patch applied on not latest but recent branch-1 code (commit 06cc123849aefb67570f0c016829b53ab958721b), not latest because using the same package doing PE testing
{noformat}
And the comparison data (two rounds):
||TestCase||Round||Throughput||AverageLatency(us)||
|w/o patch|Round-1|66554.48|15263.36|
|w/ patch|Round-1|91472.48|11098.85|
|w/o patch|Round-2|66083.53|15382.01|
|w/ patch|Round-2|91420.26|11104.37|

This should be sufficient to prove the effect of the patch under heavy load on multiple regions. Feel free to reproduce the test and let me know if you get any different results.

More data is coming for heavy load on one single region, to see whether there's a perf regression for that case.

> Performance issue: handlers stuck waiting for CountDownLatch inside
> WALKey#getWriteEntry under high writing workload
>
> Key: HBASE-16698
> URL: https://issues.apache.org/jira/browse/HBASE-16698
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Affects Versions: 1.2.3
> Reporter: Yu Li
> Assignee: Yu Li
> Fix For: 2.0.0
>
> Attachments: HBASE-16698.branch-1.patch,
> HBASE-16698.branch-1.v2.patch, HBASE-16698.patch, HBASE-16698.v2.patch,
> hadoop0495.et2.jstack
>
> As titled, on our production environment we observed 98 out of 128 handlers
> get stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside
> {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that the problem is mainly caused by
> advancing mvcc in the append logic. Below is some detailed analysis:
> Under current branch-1 code logic, all batch puts will call
> {{WALKey#getWriteEntry}} after appending edit to WAL, and
> {{seqNumAssignedLatch}} is only released when the corresponding append call is
> handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}).
> Because currently we're using a single event handler for the ringbuffer, the
> append calls are handled one by one (actually lots of our current logic
> depends on this sequential handling), and this becomes a bottleneck
> under high writing workload.
> The worst part is that by default we only use one WAL per RS, so appends on
> all regions are handled sequentially, which causes contention among
> different regions...
> To fix this, we could also make use of the "sequential appends" mechanism:
> we could grab the WriteEntry before publishing the append onto the ringbuffer
> and use it as sequence id, only that we need to add a lock to make "grab
> WriteEntry" and "append edit" a transaction. This will still cause contention
> inside a region but could avoid contention between different regions. This
> solution is already verified in our online environment and proved to be
> effective.
> Notice that for master (2.0) branch since we already changed the write
> pipeline to sync before writing memstore (HBASE-15158), this issue only
> exists for the ASYNC_WAL writes scenario.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
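As a quick check on the YCSB table in the comment above, the relative change can be computed directly from the quoted figures. The sketch below is only arithmetic on those four rows; the class and method names are made up for illustration and are not part of YCSB or HBase.

```java
import java.util.Locale;

// Relative-change arithmetic on the YCSB numbers quoted in the comment above.
// YcsbDelta and percentChange are illustrative names, not part of any tool.
final class YcsbDelta {

    /** Percentage change from before to after; a positive value means an increase. */
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Each row is {without patch, with patch}, one row per round.
        double[][] throughput = { {66554.48, 91472.48}, {66083.53, 91420.26} };
        double[][] latencyUs  = { {15263.36, 11098.85}, {15382.01, 11104.37} };
        for (int round = 0; round < throughput.length; round++) {
            System.out.printf(Locale.ROOT,
                "Round-%d: throughput %+.1f%%, average latency %+.1f%%%n",
                round + 1,
                percentChange(throughput[round][0], throughput[round][1]),
                percentChange(latencyUs[round][0], latencyUs[round][1]));
        }
    }
}
```

On these numbers the patch gives roughly 37 to 38 percent more throughput and about 27 to 28 percent lower average latency in both rounds.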
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585090#comment-15585090 ]

Yu Li commented on HBASE-16698:
---

Thanks for coming back sir [~stack] :-)

bq. what if you added a new constructor on HLogKey, one that took a WriteEntry...
I did try this and, believe me, it makes the patch bigger because of the new constructor (smile). And a new preAssignedWriteEntry won't interfere with the existing writeEntry, so maybe the logic is easier to understand?

bq. What to do for 1.3? Backport but flip the switch to false? We'd have to ask Mikhail.
Yes, I guess false for now, and we could turn it to true later if the perf numbers show no regression (I'm running the YCSB case but still haven't got all the numbers). And yes, I'd also like to hear Mikhail's opinion.

bq. For Master, should we try and do something better? Try batching up sequenceid assign.
I also tried batch seqId assignment but found it hard to pass the UTs (I think we currently have quite a lot of logic that depends on sequential seqId assignment, the {{SafePointZigZagLatch}} for example), so I applied the current, lower-risk solution here to resolve our online issue (smile). But yes, it is definitely worth trying, since batch seqId assignment is the real solution in my opinion.

bq. Apply a version of this patch in the meantime?
Yes, still a good workaround IMHO (smile).
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584509#comment-15584509 ]

stack commented on HBASE-16698:
---

[~allan163] Thanks for the questions.

[~carp84] So, the story is clearer now after the recent discussion. I'm +1 on the patch for branch-1.

Since I spent more time looking at the patch (smile), what if you added a new constructor on HLogKey, one that took a WriteEntry. Then you wouldn't need setPreAssignedWriteEntry nor preAssignedWriteEntry... just assign writeEntry in the constructor. It'd make the patch smaller/clearer? But no biggie.

What to do for 1.3? Backport but flip the switch to false? We'd have to ask Mikhail.

I should see if this patch applies to 1.2 because I know at least one crew who'd be interested.

For Master, should we try and do something better? Try batching up sequenceid assign. Apply a version of this patch in the meantime?

Thanks [~carp84]
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584453#comment-15584453 ]

stack commented on HBASE-16698:
---

Let me answer [~enis]. I just went through this issue again and the patch.

* Our write path has gone through a bunch of change. Some stepped (the Xiaomi redo, the intro of the ringbuffer). Others evolutions (reorder because we rely on mvcc instead of row locks). It can be hard to keep it all straight. For example, [~allan163]'s comment above is against 1.1 but [~carp84]'s patch is for the next version on -- 1.2 but patched back to 1.1.
* Agree we should pick an approach with fall-back just-in-case. The patch here has that. The patch also has the benefit of having been run in production showing good numbers.
* The lock is region-scoped. It is not across the ringbuffer. The RB can make progress on other region appends.
* The perf gain looks to be the result of two phenomena:
1. parallelism: a single thread stamping every edit with a sequence id -- having to cross a region-scoped synchronize on each impression -- marching in order over all appends looks to be slower than a stamping that is done with some parallelism, as each handler does its own imprint, though there is friction as each handler has to contend on the reentrant lock with other handlers that are in the same region trying to do the same thing; and
2. no-wait: with the new patch, the handler can make progress after calling append, where before it could not until the RB consumer on the other side of the RB had let go of the latch.

The RB is good as transmission between N handlers and the single WAL writer. The notion that the single consumer manages sequenceid assignment in line w/ the appends to WAL, while appealing because of its simplicity, seems to hold up throughput because our sequenceid is by region.
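The region-scoped lock described in the comment above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not HBase code: `Region`, `Append`, and the `ConcurrentLinkedQueue` standing in for the ring buffer are all hypothetical. Each handler takes its own region's lock, grabs the next sequence id, and publishes the append inside the same critical section, so handlers of different regions never contend with each other, while per-region order in the shared queue still matches sequence-id order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical simulation of handler-side stamping under a region-scoped lock.
final class RegionScopedStamping {

    /** Hypothetical per-region state: a sequence counter guarded by the region's own lock. */
    static final class Region {
        final String name;
        final ReentrantLock lock = new ReentrantLock();
        long nextSeqId = 1; // guarded by lock
        Region(String name) { this.name = name; }
    }

    /** Hypothetical append record carrying its pre-assigned sequence id. */
    static final class Append {
        final String region;
        final long seqId;
        Append(String region, long seqId) { this.region = region; this.seqId = seqId; }
    }

    /** Stands in for the ring buffer shared by all regions on the server. */
    final ConcurrentLinkedQueue<Append> ringBuffer = new ConcurrentLinkedQueue<>();

    /** "Grab sequence id" and "publish append" form one transaction per region. */
    void append(Region r) {
        r.lock.lock();
        try {
            long seqId = r.nextSeqId++;
            ringBuffer.add(new Append(r.name, seqId));
        } finally {
            r.lock.unlock();
        }
    }

    /** True if, for every region, ids appear in the shared queue in strictly increasing order. */
    boolean perRegionOrderHolds() {
        Map<String, Long> last = new HashMap<>();
        for (Append a : ringBuffer) {
            Long prev = last.put(a.region, a.seqId);
            if (prev != null && prev >= a.seqId) {
                return false;
            }
        }
        return true;
    }

    /** Runs several handler threads per region and waits for all of them to finish. */
    static RegionScopedStamping run(int regions, int handlersPerRegion, int appendsPerHandler) {
        RegionScopedStamping s = new RegionScopedStamping();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < regions; i++) {
            Region r = new Region("region-" + i);
            for (int h = 0; h < handlersPerRegion; h++) {
                Thread t = new Thread(() -> {
                    for (int n = 0; n < appendsPerHandler; n++) {
                        s.append(r);
                    }
                });
                threads.add(t);
                t.start();
            }
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return s;
    }
}
```

Calling `RegionScopedStamping.run(3, 4, 500)` and then `perRegionOrderHolds()` checks the ordering property. Moving the `ringBuffer.add` call outside the lock would let two handlers of the same region publish in the opposite order to their ids, which is why the description calls the pair a transaction.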
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581685#comment-15581685 ]

Yu Li commented on HBASE-16698:
---

Thanks for the quick feedback [~chenheng]. The previous decision was: for branch-1 (that is, branch-1.4) we turn it on by default, while for branch-1.2/1.3 we keep it off until more perf numbers are supplied.

[~stack] and [~enis], please share your thoughts, thanks.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581674#comment-15581674 ]

Yu Li commented on HBASE-16698:
---

The current process of {{doMiniBatchMutation}} for branch-1 is like:
1. acquire locks
2. update timestamps
3. build WAL edit
4. append edit to WAL
5. write to memstore
6. sync WAL or defer

Without the patch, the mvcc number is obtained between steps #4 and #5, which serializes steps #4 and #5 even though appending the edit to the WAL is asynchronous through the disruptor.
With the patch, the mvcc number is preassigned and steps #4 and #5 can run in parallel.

Since we update the global {{highestSyncedSequence}} in {{SyncRunner}}, the sooner a handler arrives at {{syncOrDefer}}, the greater the chance that it releases other sync tasks, or is itself released, without waiting on a real sync operation.

To quote and emphasize: the patch here limits the contention to region level instead of regionserver level, and parallel writes on different regions will benefit.

What's more, all the back-and-forth discussion here is around SYNC_WAL, but don't forget the ASYNC_WAL writes; the improvement on ASYNC_WAL is far more obvious.
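The difference between the two flows in the comment above (waiting on the latch after step #4 versus preassigning the mvcc number) can be sketched as follows. This is a simplified model under stated assumptions, not the actual patch: `Edit`, `appendThenWait`, `appendPreassigned`, and `startConsumer` are hypothetical names, one instance models a single region, the `synchronized` method stands in for the region-scoped lock, and a `LinkedBlockingQueue` stands in for the disruptor ring buffer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of one region; a real region would use one flow or the other.
final class LatchVersusPreassign {

    /** Hypothetical stand-in for a WAL entry; not the real FSWALEntry or WALKey. */
    static final class Edit {
        volatile long seqId = -1;
        final CountDownLatch seqNumAssignedLatch = new CountDownLatch(1);
    }

    final LinkedBlockingQueue<Edit> ringBuffer = new LinkedBlockingQueue<>();
    final AtomicLong regionSeqId = new AtomicLong(0);

    /** Old flow: step #4 publishes, then the handler blocks until the consumer stamps the edit. */
    long appendThenWait() {
        Edit e = new Edit();
        ringBuffer.add(e);
        try {
            e.seqNumAssignedLatch.await(); // step #5 cannot start before this returns
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
        return e.seqId;
    }

    /** Patched flow (simplified): the id is taken before publishing, so the handler never waits. */
    synchronized long appendPreassigned() {
        Edit e = new Edit();
        e.seqId = regionSeqId.incrementAndGet();
        ringBuffer.add(e);
        return e.seqId; // the handler can go straight on to step #5
    }

    /** Single consumer: stamps edits that carry no pre-assigned id, then releases the latch. */
    Thread startConsumer() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Edit e = ringBuffer.take();
                    if (e.seqId < 0) {
                        e.seqId = regionSeqId.incrementAndGet();
                    }
                    e.seqNumAssignedLatch.countDown();
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // exit quietly on shutdown
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

In this model `appendPreassigned()` returns even when no consumer thread is running, whereas `appendThenWait()` only returns once the single consumer has reached that edit in the queue. That is the serialization between steps #4 and #5 that the comment describes.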
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581074#comment-15581074 ]

Heng Chen commented on HBASE-16698:
---

Patch for branch-1 LGTM. +1

We will turn it on by default on branch-1, right? Just confirming with all of you. :)
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581067#comment-15581067 ]

Heng Chen commented on HBASE-16698:
---

{quote}
so in my analysis, waiting for sync and waiting for latch should take the same time. Have no idea why waiting for sync is faster
{quote}
[~allan163] Not exactly. Currently we wait for the seqId to be assigned in one handler, but we do sync in multiple threads in parallel (5 by default).
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580983#comment-15580983 ]

Allan Yang commented on HBASE-16698:

Yes, I know the sync operation will batch as many as possible. When you wait for the latch, you are actually waiting for sync as well, so in my analysis, waiting for sync and waiting for the latch should take the same time. I have no idea why waiting for sync is faster; the only difference is that if you choose to wait for sync, step 5 and step 6 in {{doMiniBatchMutation}} are done without any blocking.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580977#comment-15580977 ]

Yu Li commented on HBASE-16698:
---

{quote}
Update patch for branch-1 after code rebase. Also make it able to work together with HBASE-16768 (we should also call HRegion#updateSequenceId when mvcc is preassigned after HBASE-16768, please refer to patch for the reason. I found this problem because TestHRegion#testReverseScanner_StackOverflow failed w/o the new changes)
{quote}
Requesting some binding +1s on the new branch-1 patch, gentlemen [~stack] [~enis] [~chenheng] [~eclark], thanks. :-)
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580967#comment-15580967 ]

Yu Li commented on HBASE-16698:
-------------------------------

Actually we won't sync one by one but a bunch at a time; put differently, the {{syncOrDefer}} calls can be regarded as running in parallel under heavy workload. Refer to the while loop in {{FSHLog$SyncRunner#run}} and to {{FSHLog$SyncRunner#releaseSyncFutures}} for more details.

And believe it or not, the issue got resolved with the patch in our production cluster with 1000+ nodes. :-)

> Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16698
>                 URL: https://issues.apache.org/jira/browse/HBASE-16698
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>    Affects Versions: 1.2.3
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16698.branch-1.patch, HBASE-16698.branch-1.v2.patch, HBASE-16698.patch, HBASE-16698.v2.patch, hadoop0495.et2.jstack
>
>
> As titled, on our production environment we observed 98 out of 128 handlers get stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that it is mainly caused by advancing mvcc in the append logic. Below is some detailed analysis:
> Under the current branch-1 code logic, all batch puts call {{WALKey#getWriteEntry}} after appending the edit to the WAL, and {{seqNumAssignedLatch}} is only released when the corresponding append call is handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}). Because we currently use a single event handler for the ringbuffer, the append calls are handled one by one (actually lots of our current logic depends on this sequential handling), and this becomes a bottleneck under high writing workload.
> The worst part is that by default we only use one WAL per RS, so appends on all regions are handled sequentially, which causes contention among different regions...
> To fix this, we could also make use of the "sequential appends" mechanism: grab the WriteEntry before publishing the append onto the ringbuffer and use it as the sequence id. We only need to add a lock to make "grab WriteEntry" and "append edit" a transaction. This will still cause contention inside a region but avoids contention between different regions. This solution is already verified in our online environment and proved to be effective.
> Notice that for the master (2.0) branch, since we already changed the write pipeline to sync before writing the memstore (HBASE-15158), this issue only exists for the ASYNC_WAL writes scenario.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
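The fix proposed in the issue description (grab the sequence id and publish the append as one transaction under a per-region lock) can be illustrated with a small sketch. This is not the HBase patch: `SketchEdit`, `SketchRegion` and `PreassignSketch` are hypothetical names, a `ConcurrentLinkedQueue` stands in for the ringbuffer, and a plain counter stands in for the mvcc WriteEntry. It only shows that, with a lock per region, each region's ids land in the shared queue in order while different regions never contend on the same lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

/** One edit tagged with its region and its pre-assigned sequence id (hypothetical type). */
final class SketchEdit {
    final String region;
    final long seqId;
    SketchEdit(String region, long seqId) { this.region = region; this.seqId = seqId; }
}

/** Hypothetical stand-in for a region: it owns its own counter and its own lock. */
final class SketchRegion {
    private final String name;
    private final ReentrantLock appendLock = new ReentrantLock();
    private long nextSeqId = 1; // guarded by appendLock

    SketchRegion(String name) { this.name = name; }

    /** Grab the id and publish the edit as one step, so queue order matches id order. */
    long append(Queue<SketchEdit> sharedWal) {
        appendLock.lock();
        try {
            long seqId = nextSeqId++;
            sharedWal.add(new SketchEdit(name, seqId));
            return seqId; // the caller knows the id at once, there is no latch to wait on
        } finally {
            appendLock.unlock();
        }
    }
}

final class PreassignSketch {
    /** Runs concurrent appends, then checks that every region's ids are consecutive in the queue. */
    static boolean perRegionOrderHolds(int regions, int handlersPerRegion, int appendsPerHandler) {
        Queue<SketchEdit> sharedWal = new ConcurrentLinkedQueue<>();
        Thread[] handlers = new Thread[regions * handlersPerRegion];
        for (int r = 0; r < regions; r++) {
            SketchRegion region = new SketchRegion("region-" + r);
            for (int h = 0; h < handlersPerRegion; h++) {
                Thread t = new Thread(() -> {
                    for (int i = 0; i < appendsPerHandler; i++) {
                        region.append(sharedWal);
                    }
                });
                handlers[r * handlersPerRegion + h] = t;
                t.start();
            }
        }
        try {
            for (Thread t : handlers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        Map<String, Long> lastSeen = new HashMap<>();
        for (SketchEdit e : sharedWal) {
            long prev = lastSeen.getOrDefault(e.region, 0L);
            if (e.seqId != prev + 1) {
                return false; // an id was published out of order for this region
            }
            lastSeen.put(e.region, e.seqId);
        }
        return sharedWal.size() == regions * handlersPerRegion * appendsPerHandler;
    }

    public static void main(String[] args) {
        System.out.println(perRegionOrderHolds(3, 4, 1000));
    }
}
```

Handlers of the same region still serialize on `appendLock`, which matches the description's remark that contention remains inside a region.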
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580945#comment-15580945 ]

Allan Yang commented on HBASE-16698:
------------------------------------

After carefully reviewing the code of branch-1.2, I understand your problem: in branch-1.2, after appending the WALKey the handler is stuck waiting for the CountDownLatch in order to get the WriteEntry, and the latch is released only after the sync completes.
My question is this: even if you solve this problem, the handlers still have to wait for {{syncOrDefer}} to complete. So either you wait for the latch, or you wait for {{syncOrDefer}}. What is the difference?
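On the cost of waiting for {{syncOrDefer}}: another comment in this thread points out that syncs are not performed one by one but in batches. The following is a rough sketch of that batching idea only, under the assumption that one physical sync can release every sync request queued so far. `GroupSyncSketch`, `requestSync` and `syncOnce` are hypothetical names and do not mirror the real `FSHLog$SyncRunner` code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical group-sync helper: many waiters are released by a single physical sync. */
final class GroupSyncSketch {
    private final BlockingQueue<CompletableFuture<Void>> pending = new LinkedBlockingQueue<>();
    private final AtomicInteger physicalSyncs = new AtomicInteger();

    /** Called by a handler: registers a sync request and returns a future to wait on. */
    CompletableFuture<Void> requestSync() {
        CompletableFuture<Void> future = new CompletableFuture<>();
        pending.add(future);
        return future;
    }

    /** One pass of the sync loop: drain what is queued, sync once, release the whole batch. */
    int syncOnce() {
        List<CompletableFuture<Void>> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (batch.isEmpty()) {
            return 0;
        }
        physicalSyncs.incrementAndGet(); // stands in for one flush of the underlying stream
        for (CompletableFuture<Void> future : batch) {
            future.complete(null);
        }
        return batch.size();
    }

    int physicalSyncCount() {
        return physicalSyncs.get();
    }
}
```

In this model the per-handler wait is amortized over the batch, whereas a latch released by a single sequential consumer is paid once per append.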
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579655#comment-15579655 ]

Hadoop QA commented on HBASE-16698:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 2m 6s | branch-1 passed |
| +1 | compile | 0m 32s | branch-1 passed with JDK v1.8.0_101 |
| +1 | compile | 0m 34s | branch-1 passed with JDK v1.7.0_80 |
| +1 | checkstyle | 0m 56s | branch-1 passed |
| +1 | mvneclipse | 0m 16s | branch-1 passed |
| -1 | findbugs | 1m 53s | hbase-server in branch-1 has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 27s | branch-1 passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 32s | branch-1 passed with JDK v1.7.0_80 |
| +1 | mvninstall | 0m 46s | the patch passed |
| +1 | compile | 0m 30s | the patch passed with JDK v1.8.0_101 |
| +1 | javac | 0m 30s | the patch passed |
| +1 | compile | 0m 34s | the patch passed with JDK v1.7.0_80 |
| +1 | javac | 0m 34s | the patch passed |
| +1 | checkstyle | 0m 55s | the patch passed |
| +1 | mvneclipse | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 16m 32s | The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | hbaseprotoc | 0m 14s | the patch passed |
| -1 | findbugs | 2m 12s | hbase-server generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) |
| +1 | javadoc | 0m 24s | the patch passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 34s | the patch passed with JDK v1.7.0_80 |
| -1 | unit | 72m 57s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 104m 15s | |

|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion$BatchOperationInProgress) does not release lock on all paths At HRegion.java:on all paths At HRegion.java:[line 3310] |
| Failed junit tests | hadoop.hbase.mapreduce.TestMultiTableSnapshotInputFormat |
| Timed out junit tests | org.apache.hadoop.hbase.TestHBaseTestingUtility |
| | org.apache.hadoop.hbase.quotas.TestQuotaThrottle |
| | org.apache.hadoop.hbase.regionserver.TestClusterId |

|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:b2c5d84 |
| JIRA Patch
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15579535#comment-15579535 ]

Yu Li commented on HBASE-16698:
-------------------------------

[~allan163] and [~enis], please let me know whether my explanation addresses your concerns, so we can move on here. Thanks.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576919#comment-15576919 ]

Hudson commented on HBASE-16698:
--------------------------------

SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #1787 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1787/])
Revert "HBASE-16698 Performance issue: handlers stuck waiting for (stack: rev f555b5be9c4574be7969c734270bd8922f522391)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
Revert "HBASE-16698 Performance issue: handlers stuck waiting for (stack: rev 13baf4d37a7d3b4b0194dc616c8ac15959efa18f)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSWALEntry.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576798#comment-15576798 ]

Enis Soztutar commented on HBASE-16698:
---------------------------------------

The other thing is that we cannot maintain two different code paths for this core piece. We should pick an approach and go with it. The only acceptable exception is if the plan is to switch to a new approach and we keep the old implementation as a safeguard.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576766#comment-15576766 ]

Enis Soztutar commented on HBASE-16698:
---------------------------------------

I also have a hard time understanding the patch. We are still serializing the seq id assignment, but this time via a lock. This lock is also used by the handlers to append to the ring buffer, which basically means that we are going back to the old model (0.98) and get no benefit from the disruptor.
Do the perf gains come from fewer context switches when there is less contention for the mvcc lock? We have to serialize the edits via the ring buffer and assign seq ids in the same order anyway. Is the RBEH doing too much work for appending? Maybe we need two consumers for the ring buffer, one for assigning seq ids and the other for doing the actual append?
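For reference, the baseline behavior that the issue description attributes to branch-1 (each handler publishes an append, then blocks on a latch until the single ring buffer consumer stamps a sequence id) can be sketched as below. This is a simplified model, not HBase code: `PendingAppend` and `LatchModelSketch` are hypothetical names and a `LinkedBlockingQueue` stands in for the disruptor ring buffer.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical pending append: the handler waits on the latch until an id is stamped. */
final class PendingAppend {
    private final CountDownLatch seqAssigned = new CountDownLatch(1);
    private volatile long seqId = -1;

    /** Called by the single consumer thread. */
    void stamp(long id) {
        seqId = id;
        seqAssigned.countDown();
    }

    /** Called by the handler; blocks until the consumer has processed this append. */
    long awaitSeqId() throws InterruptedException {
        seqAssigned.await();
        return seqId;
    }
}

final class LatchModelSketch {
    /** Starts one consumer and the given number of handlers; returns the id each handler saw. */
    static long[] run(int handlers) {
        BlockingQueue<PendingAppend> ring = new LinkedBlockingQueue<>();
        Thread consumer = new Thread(() -> {
            long next = 1;
            try {
                // Single consumer: appends from all handlers are stamped strictly one by one.
                for (int i = 0; i < handlers; i++) {
                    ring.take().stamp(next++);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        long[] seen = new long[handlers];
        Thread[] threads = new Thread[handlers];
        for (int h = 0; h < handlers; h++) {
            final int idx = h;
            threads[h] = new Thread(() -> {
                PendingAppend append = new PendingAppend();
                ring.add(append);
                try {
                    seen[idx] = append.awaitSeqId(); // the wait the jstack in this issue shows
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[h].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Every handler's latency here includes the time the consumer spends on all appends queued ahead of it, regardless of which region they belong to.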
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576730#comment-15576730 ]

Yu Li commented on HBASE-16698:
-------------------------------

And I'd say the previously existing {{cell.getSequenceId() == 0}} check in {{applyFamilyMapToMemstore}} is some kind of protection mechanism, from what I've observed :-)
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576097#comment-15576097 ]

stack commented on HBASE-16698:
-------------------------------

Pardon me. I was looking in the master branch. Let me revert this patch from the master branch. This discussion is not done.

> Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16698
>                 URL: https://issues.apache.org/jira/browse/HBASE-16698
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>    Affects Versions: 1.1.6, 1.2.3
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16698.branch-1.patch, HBASE-16698.patch, HBASE-16698.v2.patch, hadoop0495.et2.jstack
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574596#comment-15574596 ]

binlijin commented on HBASE-16698:
----------------------------------

I find that master, branch-1.2 and branch-1.1 are all different.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574588#comment-15574588 ]

binlijin commented on HBASE-16698:
----------------------------------

So there is no problem on the master version, only on some of the branch-1 versions?
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574578#comment-15574578 ]

Allan Yang commented on HBASE-16698:
------------------------------------

The CountDownLatch can't be removed, since we need to wait until the data we have written can be seen (by advancing mvcc). Since your online cluster is a forked 1.1.2 version, does your patch fix this problem, [~carp84]?
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574563#comment-15574563 ]

binlijin commented on HBASE-16698:
----------------------------------

A forked 1.1.2 version.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574517#comment-15574517 ]

Heng Chen commented on HBASE-16698:
-----------------------------------

I think you are right, [~allan163]. The behavior differs between branch-1.1 and branch-1.2. On branch-1.1, we wait for the seqId to be assigned after sync, so the issue is invalid for branch-1.1.

It seems the CountDownLatch could be removed for SYNC_WAL durability?
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574381#comment-15574381 ]

Allan Yang commented on HBASE-16698:
------------------------------------

{code}
// ------------------------------------
// STEP 8. Advance mvcc. This will make this put visible to scanners and getters.
// ------------------------------------
if (writeEntry != null) {
  mvcc.completeMemstoreInsertWithSeqNum(writeEntry, walKey);
  writeEntry = null;
}
{code}

This is in {{doMiniBatchMutation}} on branch-1.1. {{completeMemstoreInsertWithSeqNum}} gets the seqid from {{walKey}} to advance the mvcc; I think that is where [~carp84] said handlers are 'stuck at CountDownLatch'.

My point is: even if we don't need to sync the WAL, the batch still has to block here to advance mvcc, and that is a problem. But if we choose to sync the WAL, the seqid in walKey should already have been assigned during the sync operation, so handlers shouldn't get stuck here.
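To make the "advance mvcc" step in the comment above more concrete, here is a small Java sketch of the general idea: a write becomes visible to readers only once its write entry, and every earlier one, has been completed. This is a simplified illustration under stated assumptions, not HBase's MultiVersionConcurrencyControl; `TinyMvcc`, `begin`, `complete`, and `getReadPoint` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified model of multi-version concurrency control.
final class TinyMvcc {

  /** Handle for one in-flight write. */
  static final class WriteEntry {
    final long writeNumber;
    private boolean completed;

    WriteEntry(long writeNumber) {
      this.writeNumber = writeNumber;
    }
  }

  private long writePoint = 0; // highest write number handed out
  private long readPoint = 0;  // highest write number visible to readers
  private final Deque<WriteEntry> pending = new ArrayDeque<>();

  /** Starts a write and assigns it the next write number. */
  synchronized WriteEntry begin() {
    WriteEntry entry = new WriteEntry(++writePoint);
    pending.addLast(entry);
    return entry;
  }

  /**
   * Marks a write as done, then advances the read point over the
   * contiguous prefix of completed writes. A completed write stays
   * invisible while an earlier write is still in flight.
   */
  synchronized void complete(WriteEntry entry) {
    entry.completed = true;
    while (!pending.isEmpty() && pending.peekFirst().completed) {
      readPoint = pending.removeFirst().writeNumber;
    }
  }

  /** Readers may see every write numbered at or below this value. */
  synchronized long getReadPoint() {
    return readPoint;
  }
}
```

In this model, completing the second of two writes first leaves the read point unchanged; it only moves once the first write completes too. That ordering dependency is why a handler cannot finish the step until the sequence id for its own write is known.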
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574336#comment-15574336 ]

stack commented on HBASE-16698:
-------------------------------

bq. Why handlers stuck at CountDownLatch?

They are waiting on sync threads to finish up their sync so they can return to the client.

Where is mvcc.completeMemstoreInsertWithSeqNum? I can't find it.

Can you please say more, [~allan163]? I'm not following exactly what you are saying. It sounds interesting though.

Thanks.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574087#comment-15574087 ]

Hudson commented on HBASE-16698:
--------------------------------

FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1782 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1782/])
HBASE-16698 Performance issue: handlers stuck waiting for CountDownLatch (stack: rev e1923b7c0c14b435ea0d9eb306d968f1927a0c6e)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573654#comment-15573654 ]

stack commented on HBASE-16698:
-------------------------------

Committed the addendum below to address this FindBugs complaint:

Code  Warning
UL    org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion$BatchOperationInProgress) does not release lock on all paths

Bug type UL_UNRELEASED_LOCK
In class org.apache.hadoop.hbase.regionserver.HRegion
In method org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion$BatchOperationInProgress)
At HRegion.java:[line 3313]

stack-MBP:hbase stack$ git show -1
commit e1923b7c0c14b435ea0d9eb306d968f1927a0c6e
Author: Michael Stack
Date:   Thu Oct 13 17:16:47 2016 -0700

    HBASE-16698 Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload; ADDENDUM. Fix findbugs

diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
index 3715ca1..a486599 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
@@ -3310,7 +3310,6 @@ public class HRegion implements HeapSize, PropagatingConfigurationObserver, Regi
         this.mvcc.advanceTo(batchOp.getReplaySequenceId());
       } else {
         // writeEntry won't be empty if not in replay mode
-        assert writeEntry != null;
         mvcc.completeAndWait(writeEntry);
         writeEntry = null;
       }

@appy kicked me for committing with a FindBugs warning (thanks, @appy).
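For readers unfamiliar with the UL_UNRELEASED_LOCK warning quoted above: it is raised when a static analyzer finds a path, often an exceptional one, on which a lock that was acquired is never released. The usual remedy in Java is to release the lock in a `finally` block. The sketch below shows that general pattern only; it is not the HRegion code, and `LockOnAllPaths` and its methods are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example of releasing a lock on every path, including failures.
final class LockOnAllPaths {
  private final ReentrantLock lock = new ReentrantLock();
  private long counter;

  /** Increments the counter under the lock; may fail part-way through. */
  long incrementOrFail(boolean fail) {
    lock.lock();
    try {
      if (fail) {
        // Anything that can throw between lock() and unlock() is a
        // potential "unreleased lock" path unless unlock() is in finally.
        throw new IllegalStateException("simulated failure");
      }
      return ++counter;
    } finally {
      lock.unlock(); // runs on both the normal and the exceptional path
    }
  }

  /** True if any thread currently holds the lock. */
  boolean isLocked() {
    return lock.isLocked();
  }
}
```

After a failed call the lock is free again, so a later call can proceed normally.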
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572906#comment-15572906 ]

Hudson commented on HBASE-16698:
--------------------------------

FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1780 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1780/])
HBASE-16698 Performance issue: handlers stuck waiting for CountDownLatch (stack: rev 9b13514483991889cd6ebe097c3c8eb0e7983e6d)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSWALEntry.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALKey.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571587#comment-15571587 ]

Yu Li commented on HBASE-16698:
-------------------------------

Thanks for the review, [~chenheng]. Sure, I will upload numbers for the single-region-single-RS case later, but probably after 11.11 :-)
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571584#comment-15571584 ]

Yu Li commented on HBASE-16698:
-------------------------------

Thanks for revisiting this, [~stack].

Yes sir, we have been running with this in production for more than 2 months and everything looks good: no handlers stuck at the CountDownLatch since then, and no data loss observed.

And yes, let's get this in with the option set to off by default. We can revisit whether to turn it on later, when I have time to provide more perf data with YCSB. :-)
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570840#comment-15570840 ]

Heng Chen commented on HBASE-16698:
---
The numbers seem to be for 20 regions on one RS. If you have time, please also upload numbers for one region on one RS; I am very interested in that case. As [~stack] said, setting it to off by default is fine with me. BTW, the patch LGTM, +1 for it.

> Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
>
> Key: HBASE-16698
> URL: https://issues.apache.org/jira/browse/HBASE-16698
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Affects Versions: 1.1.6, 1.2.3
> Reporter: Yu Li
> Assignee: Yu Li
> Fix For: 2.0.0, 1.3.0, 1.4.0
> Attachments: HBASE-16698.branch-1.patch, HBASE-16698.patch, HBASE-16698.v2.patch, hadoop0495.et2.jstack
>
> As titled, in our production environment we observed 98 out of 128 handlers stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside {{WALKey#getWriteEntry}} under a high write workload.
> After digging into the problem, we found that it is mainly caused by advancing mvcc in the append logic. Below is some detailed analysis:
> Under the current branch-1 code logic, all batch puts call {{WALKey#getWriteEntry}} after appending the edit to the WAL, and {{seqNumAssignedLatch}} is only released when the corresponding append call is handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}). Because we currently use a single event handler for the ring buffer, the append calls are handled one by one (in fact, a lot of our current logic depends on this sequential handling), and this becomes a bottleneck under a high write workload.
> The worst part is that by default we only use one WAL per RS, so appends on all regions are handled sequentially, which causes contention among different regions...
> To fix this, we could also make use of the "sequential appends" mechanism: we could grab the WriteEntry before publishing the append onto the ring buffer and use it as the sequence id, only we need to add a lock to make "grab WriteEntry" and "append edit" a transaction. This will still cause contention inside a region but avoids contention between different regions. This solution has already been verified in our online environment and proved to be effective.
> Note that for the master (2.0) branch, since we have already changed the write pipeline to sync before writing the memstore (HBASE-15158), this issue only exists for the ASYNC_WAL write scenario.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
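The "grab WriteEntry, then append edit, as one transaction" idea from the description can be sketched as a toy model. This is only an illustration under stated assumptions: `PreassignSketch`, `Region`, `Append` and `ringBuffer` are hypothetical names and are not HBase classes, a plain counter stands in for mvcc, and a concurrent queue stands in for the WAL ring buffer. It shows that a per-region lock keeps each region's sequence ids in the same order as its appends, while writers to different regions never share a lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model only; none of these types are HBase classes. */
class PreassignSketch {
  /** One append as it would sit on the shared ring buffer. */
  static final class Append {
    final int region;
    final long seqId;
    Append(int region, long seqId) { this.region = region; this.seqId = seqId; }
  }

  /** Stand-in for the single WAL ring buffer shared by all regions. */
  static final ConcurrentLinkedQueue<Append> ringBuffer = new ConcurrentLinkedQueue<>();

  /** Hypothetical per-region state: a sequence counter guarded by a region lock. */
  static final class Region {
    final int id;
    private long nextSeq = 1;
    private final ReentrantLock lock = new ReentrantLock();
    Region(int id) { this.id = id; }

    /** Assigns the sequence id and publishes the append atomically. */
    long append() {
      lock.lock();
      try {
        long seq = nextSeq++;                // "grab WriteEntry"
        ringBuffer.add(new Append(id, seq)); // "append edit"
        return seq;                          // known without waiting on the consumer
      } finally {
        lock.unlock();
      }
    }
  }

  /** Runs concurrent writers and checks that every region's ids appear in order. */
  static boolean runAndCheckOrder(int regions, int threadsPerRegion, int appendsPerThread) {
    ringBuffer.clear();
    List<Thread> threads = new ArrayList<>();
    for (int r = 0; r < regions; r++) {
      Region region = new Region(r);
      for (int t = 0; t < threadsPerRegion; t++) {
        Thread th = new Thread(() -> {
          for (int i = 0; i < appendsPerThread; i++) {
            region.append();
          }
        });
        threads.add(th);
        th.start();
      }
    }
    try {
      for (Thread th : threads) {
        th.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    long[] last = new long[regions];
    for (Append a : ringBuffer) {
      if (a.seqId != last[a.region] + 1) {
        return false; // a region's ids were published out of order
      }
      last[a.region] = a.seqId;
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println("per-region order preserved: " + runAndCheckOrder(4, 4, 1000));
  }
}
```

Dropping the lock in `append()` would let two writers of the same region publish in the opposite order to their ids, which is the reason the description asks for a transaction around the two steps.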
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570145#comment-15570145 ]

stack commented on HBASE-16698:
---
So, [~carp84], are you running w/ this in production? Should I apply this to master and branch-1 for hbase-1.4, and to branch-1.3 and branch-1.2, but with this option set to off by default?
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15540760#comment-15540760 ]

Hadoop QA commented on HBASE-16698:
---
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 26s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 1m 59s | branch-1 passed |
| +1 | compile | 0m 31s | branch-1 passed with JDK v1.8.0_101 |
| +1 | compile | 0m 33s | branch-1 passed with JDK v1.7.0_111 |
| +1 | checkstyle | 0m 53s | branch-1 passed |
| +1 | mvneclipse | 0m 16s | branch-1 passed |
| -1 | findbugs | 1m 51s | hbase-server in branch-1 has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 24s | branch-1 passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 33s | branch-1 passed with JDK v1.7.0_111 |
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 0m 30s | the patch passed with JDK v1.8.0_101 |
| +1 | javac | 0m 30s | the patch passed |
| +1 | compile | 0m 32s | the patch passed with JDK v1.7.0_111 |
| +1 | javac | 0m 32s | the patch passed |
| +1 | checkstyle | 0m 55s | the patch passed |
| +1 | mvneclipse | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 16m 4s | The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | hbaseprotoc | 0m 14s | the patch passed |
| -1 | findbugs | 2m 6s | hbase-server generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1) |
| +1 | javadoc | 0m 23s | the patch passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 32s | the patch passed with JDK v1.7.0_111 |
| -1 | unit | 82m 44s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 113m 20s | |

|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion$BatchOperationInProgress) does not release lock on all paths At HRegion.java:on all paths At HRegion.java:[line 3313] |
| Timed out junit tests | org.apache.hadoop.hbase.regionserver.TestClusterId |

|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:date2016-10-02 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12831253/HBASE-16698.branch-1.patch |
| JIRA Issue | HBASE-16698 |
| Optional Tests | asflicense javac javadoc unit
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15540746#comment-15540746 ]

Yu Li commented on HBASE-16698:
---
BTW, this testing was done against the latest branch-1 code, not master.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15540744#comment-15540744 ]

Yu Li commented on HBASE-16698:
---
This is a common workload, nothing special. I hope this simple result answers your question, [~chenheng]; otherwise please wait for the YCSB result. Thanks.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15540740#comment-15540740 ]

Yu Li commented on HBASE-16698:
---
OK, here is the result with a single RS and the PE tool, using the command below:
{{hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --table=PERandomWrite --presplit=20 --latency randomWrite 20}}

round-1:
||Type||AverageTime(ms)||ThroughputPerClient||
|w/ patch|376150|2.74MB/s|
|w/o patch|382549|2.69MB/s|

round-2:
||Type||AverageTime(ms)||ThroughputPerClient||
|w/ patch|381925|2.70MB/s|
|w/o patch|385666|2.67MB/s|

round-3:
||Type||AverageTime(ms)||ThroughputPerClient||
|w/ patch|364555|2.83MB/s|
|w/o patch|374948|2.75MB/s|

When testing w/o the patch we could easily see handlers waiting on the CountDownLatch. This simple result shows the effect of the patch to some extent, but not very clearly (I'm afraid the PE output for writes is not well formatted and we cannot see metrics like throughput directly...). I will upload more test results with a YCSB workload later.
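To put the three PE rounds on one scale, the relative reduction in average time can be computed directly from the figures in the tables. The snippet below is plain arithmetic on those posted numbers; the class and method names are made up for illustration. It gives roughly 1.7%, 1.0% and 2.8% for rounds 1 to 3, which matches the remark that the effect is visible but not dramatic in this test.

```java
/** Arithmetic on the PE numbers posted in this thread; no HBase dependency. */
class PeImprovement {
  /** Reduction in average time, as a percentage of the time without the patch. */
  static double reductionPercent(long withPatchMs, long withoutPatchMs) {
    return 100.0 * (withoutPatchMs - withPatchMs) / withoutPatchMs;
  }

  public static void main(String[] args) {
    // {with patch, without patch} average time in ms, per round
    long[][] rounds = { {376150, 382549}, {381925, 385666}, {364555, 374948} };
    for (int i = 0; i < rounds.length; i++) {
      System.out.printf("round-%d: %.2f%% less average time with the patch%n",
          i + 1, reductionPercent(rounds[i][0], rounds[i][1]));
    }
  }
}
```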
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15528581#comment-15528581 ]

Heng Chen commented on HBASE-16698:
---
{quote}
so the main problem is sequential appends and the logic that getting MVCC has to wait for the relative append to finish.
{quote}
Yeah, but it is exactly because of this sequential handling that we can avoid a lock while keeping mvcc and WAL in the same order. So I am not sure under which workloads performance will improve. And I think "per-table configuration" makes sense if we can do it.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526894#comment-15526894 ]

Yu Li commented on HBASE-16698:
---
I don't have many benchmark numbers at hand, but on our online cluster we see no regression in average put time.

Note that the problem exists even if there's only one region but many parallel writes (yes, after another look I think I stated something wrong; the issue stands even with only one region). Allow me to quote the existing code flow for append handling:
{noformat}
RingBufferEventHandler grabs one append
-> FSHLog#append is called
-> FSWALEntry#stampRegionSequenceId is called
-> One CountDownLatch is released
-> RingBufferEventHandler grabs another append
-> Another CountDownLatch is released
-> Repeat
{noformat}
So the main problem is *sequential* appends and the logic that getting MVCC has to wait for the corresponding append to finish.

I'll supply some perf numbers with YCSB, but it will be some days from now because of some troublesome online issues... Or it would be highly appreciated if anyone could help with the benchmark testing.
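The flow quoted above can be imitated with a small toy model: many writer threads each publish an entry and then block on that entry's latch, while one consumer thread stamps sequence ids and releases the latches one at a time. This is a sketch under stated assumptions, not HBase code: `LatchFlowSketch`, `Entry` and the queue used as a ring buffer are hypothetical stand-ins, and only the field name `seqNumAssignedLatch` is borrowed from the issue text.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy model of the quoted flow; these are not HBase classes. */
class LatchFlowSketch {
  /** An append whose sequence id is only known once the consumer stamps it. */
  static final class Entry {
    final CountDownLatch seqNumAssignedLatch = new CountDownLatch(1);
    volatile long seqId = -1;
  }

  /** Runs the given number of writers against one consumer; returns the id each writer saw. */
  static long[] run(int handlers) {
    BlockingQueue<Entry> ringBuffer = new LinkedBlockingQueue<>();
    long[] seen = new long[handlers];

    // Single consumer: stamps ids strictly one by one, like a single event handler.
    Thread consumer = new Thread(() -> {
      try {
        for (long seq = 1; seq <= handlers; seq++) {
          Entry e = ringBuffer.take();
          e.seqId = seq;
          e.seqNumAssignedLatch.countDown(); // releases exactly one waiting writer
        }
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
      }
    });
    consumer.start();

    Thread[] writers = new Thread[handlers];
    for (int i = 0; i < handlers; i++) {
      final int idx = i;
      writers[i] = new Thread(() -> {
        Entry e = new Entry();
        ringBuffer.add(e);                 // publish the append
        try {
          e.seqNumAssignedLatch.await();   // the writer parks here until stamped
          seen[idx] = e.seqId;
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
        }
      });
      writers[i].start();
    }

    try {
      for (Thread w : writers) {
        w.join();
      }
      consumer.join();
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(ie);
    }
    return seen;
  }

  public static void main(String[] args) {
    long[] seen = run(8);
    System.out.println("writers released: " + seen.length);
  }
}
```

In this model every writer's wait time grows with the number of entries queued ahead of it, regardless of which region the entry belongs to, which is the contention the comment describes.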
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15524987#comment-15524987 ]

Heng Chen commented on HBASE-16698:
---
How much will performance degrade when the ops are all for just one region? [~carp84], do you have some performance results? In our production cluster (not a big cluster), many tables have just a few regions but high QPS, so I am a little worried about it if we make it the default.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15523055#comment-15523055 ]

Yu Li commented on HBASE-16698:
---
bq. Why not? If a false positive and you can't clean it up...

Because doMiniBatchMutate is a big and critical method, and I'm afraid adding such a suppress will make us ignore some real bugs in future changes... Is this a valid concern, or should I still add the suppress? [~stack]

bq. On the patch, I'd be good w/ it going in as off by default in branch-1 and on by default in master branch.

OK, let me prepare a branch-1 patch.
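For context on the "does not release lock on all paths" warning under discussion, here is a minimal sketch of one pattern that can draw such a warning even when it is correct: the lock is taken only when a feature flag is on, and a boolean records whether it must be released in `finally`. A static checker may be unable to tie the flag to the lock state. Whether this is the exact shape of the code in the actual patch is not confirmed by the thread; `LockReleaseSketch`, `updateLock` and `mutate` are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Toy illustration of conditional locking; not taken from HRegion. */
class LockReleaseSketch {
  private final ReentrantLock updateLock = new ReentrantLock(); // hypothetical name
  private long nextSeq = 1; // only touched by one thread in this demo

  /** Takes the lock only when preassign is on; always releases what it took. */
  long mutate(boolean preassign, boolean failMidway) {
    boolean locked = false;
    try {
      if (preassign) {
        updateLock.lock();
        locked = true;
      }
      long seq = nextSeq++;
      if (failMidway) {
        throw new IllegalStateException("simulated failure");
      }
      return seq;
    } finally {
      if (locked) {
        updateLock.unlock(); // runs on both the normal and the exceptional path
      }
    }
  }

  boolean isLocked() {
    return updateLock.isLocked();
  }

  public static void main(String[] args) {
    LockReleaseSketch s = new LockReleaseSketch();
    try {
      s.mutate(true, true);
    } catch (IllegalStateException expected) {
      // the failure path must still leave the lock free
    }
    System.out.println("lock held after failure: " + s.isLocked());
  }
}
```

The trade-off raised in the thread still applies: suppressing the warning on a large method hides this known case but would also hide any real unreleased-lock bug introduced there later.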
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521978#comment-15521978 ]

stack commented on HBASE-16698:
---
On the patch, I'd be good w/ it going in as off by default in branch-1 and on by default in master branch.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521975#comment-15521975 ]

stack commented on HBASE-16698:
-------------------------------

bq. but I don't think it's a good idea doing the same thing for doMiniBatchMutate.

Why not? If a false positive and you can't clean it up, add the suppress with your justification.

Thanks [~carp84]
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15521114#comment-15521114 ]

Yu Li commented on HBASE-16698:
-------------------------------

Checked the failed UT cases below from the HadoopQA report and confirmed they all pass locally:
{noformat}
org.apache.hadoop.hbase.client.TestReplicasClient
org.apache.hadoop.hbase.client.TestFromClientSide
org.apache.hadoop.hbase.client.TestIncrementFromClientSideWithCoprocessor
org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient
org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence
{noformat}
I have seen some of these cases fail in the HadoopQA reports of several JIRAs; not sure whether any JIRA already tracks them.

Regarding the findbugs issue:
{noformat}
Bug type UL_UNRELEASED_LOCK (click for details)
In class org.apache.hadoop.hbase.regionserver.HRegion
In method org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion$BatchOperation)
At HRegion.java:[line 3262]
{noformat}
I think it's a findbugs false positive similar to this one on [stackoverflow|http://stackoverflow.com/questions/5408940/possible-findbugs-false-positive-of-ul-unreleased-lock-exception-path]. I can see some methods, such as {{HRegion#doClose}}, suppress the findbugs warning through {{@edu.umd.cs.findbugs.annotations.SuppressWarnings}}, but I don't think it's a good idea to do the same thing for {{doMiniBatchMutate}}. Any suggestions [~stack]?
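For readers unfamiliar with this class of warning, the sketch below shows one code shape in which a lock is taken conditionally and released through a flag in `finally`. It is a generic illustration, not the code of `doMiniBatchMutate`; `ConditionalLockDemo`, `mutate`, and `heldAfter` are hypothetical names, and whether FindBugs flags this exact shape has not been verified here. The point is only that a lock can be released on every path while the release is not a plain `lock()` / `try` / `finally` `unlock()` pair that a static analyzer recognizes easily.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Generic illustration: a conditionally acquired lock that is still released on all paths. */
public class ConditionalLockDemo {
    static final ReentrantLock lock = new ReentrantLock();

    /** Takes the lock only when needLock is true; the flag guarantees a matching unlock. */
    static int mutate(boolean needLock, int value) {
        boolean locked = false;
        try {
            if (needLock) {
                lock.lock();
                locked = true;
            }
            if (value < 0) {
                throw new IllegalArgumentException("negative value");
            }
            return value + 1;
        } finally {
            if (locked) {
                lock.unlock(); // runs on both the normal and the exception path
            }
        }
    }

    /** Calls mutate, swallows the expected exception, and reports whether the lock is still held. */
    static boolean heldAfter(boolean needLock, int value) {
        try {
            mutate(needLock, value);
        } catch (IllegalArgumentException expected) {
            // the exception path is exactly what is being checked
        }
        return lock.isLocked();
    }

    public static void main(String[] args) {
        System.out.println(heldAfter(true, 1));
        System.out.println(heldAfter(true, -1));
    }
}
```

If a warning on code like this cannot be removed by restructuring, suppressing it with a written justification, as suggested in the reply, keeps the reasoning next to the code.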
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15518312#comment-15518312 ]

Yu Li commented on HBASE-16698:
-------------------------------

OK, so firstly, setting it per table is an option, not a necessity; it's just like all other configuration properties. Secondly, consider this case: with multiple WAL enabled and {{hbase.wal.regiongrouping.strategy}} set to {{NamespaceGroupingStrategy}}, if namespace A has and will only ever have one table, then we could set this property to false for that table.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15518215#comment-15518215 ]

Hadoop QA commented on HBASE-16698:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 3m 43s | master passed |
| +1 | compile | 0m 43s | master passed |
| +1 | checkstyle | 0m 52s | master passed |
| +1 | mvneclipse | 0m 17s | master passed |
| +1 | findbugs | 1m 58s | master passed |
| +1 | javadoc | 0m 31s | master passed |
| +1 | mvninstall | 0m 51s | the patch passed |
| +1 | compile | 0m 40s | the patch passed |
| +1 | javac | 0m 40s | the patch passed |
| +1 | checkstyle | 0m 50s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 29m 9s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | hbaseprotoc | 0m 14s | the patch passed |
| -1 | findbugs | 1m 59s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 29s | the patch passed |
| -1 | unit | 89m 31s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 19s | The patch does not generate ASF License warnings. |
| | | 133m 2s | |

|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion$BatchOperation) does not release lock on all paths At HRegion.java:[line 3262] |
| Timed out junit tests | org.apache.hadoop.hbase.client.TestReplicasClient |
| | org.apache.hadoop.hbase.client.TestFromClientSide |
| | org.apache.hadoop.hbase.client.TestIncrementFromClientSideWithCoprocessor |
| | org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient |
| | org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence |

|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12830123/HBASE-16698.v2.patch |
| JIRA Issue | HBASE-16698 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux c6777619b1c9 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 7ed93f8 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/3700/artifact/patchprocess/new-findbugs-hbase-server.html |
| unit |
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15518057#comment-15518057 ]

Hadoop QA commented on HBASE-16698:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 3m 35s | master passed |
| +1 | compile | 0m 40s | master passed |
| +1 | checkstyle | 0m 51s | master passed |
| +1 | mvneclipse | 0m 15s | master passed |
| +1 | findbugs | 1m 59s | master passed |
| +1 | javadoc | 0m 29s | master passed |
| +1 | mvninstall | 0m 49s | the patch passed |
| +1 | compile | 0m 40s | the patch passed |
| +1 | javac | 0m 40s | the patch passed |
| +1 | checkstyle | 0m 48s | the patch passed |
| +1 | mvneclipse | 0m 15s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 28m 36s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | hbaseprotoc | 0m 13s | the patch passed |
| -1 | findbugs | 2m 8s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 28s | the patch passed |
| -1 | unit | 84m 20s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 13s | The patch does not generate ASF License warnings. |
| | | 126m 56s | |

|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion$BatchOperation) does not release lock on all paths At HRegion.java:[line 3262] |
| Failed junit tests | hadoop.hbase.client.TestBlockEvictionFromClient |
| Timed out junit tests | org.apache.hadoop.hbase.client.TestReplicasClient |
| | org.apache.hadoop.hbase.TestClusterBootOrder |
| | org.apache.hadoop.hbase.client.TestHCM |
| | org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient |

|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12830123/HBASE-16698.v2.patch |
| JIRA Issue | HBASE-16698 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux cbcd94f50984 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 4082424 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/3698/artifact/patchprocess/new-findbugs-hbase-server.html |
| unit |
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517980#comment-15517980 ]

binlijin commented on HBASE-16698:
----------------------------------

I think setting it per table is not a good idea, because all regions on the same region server share the same HLogs, so they interact with each other.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517773#comment-15517773 ]

Yu Li commented on HBASE-16698:
-------------------------------

btw, I will try to get some perf comparison data through a YCSB benchmark later if I can spare some time; we've been crazy busy preparing for Alibaba's 11-11 festival recently (not advertising :-P)...
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517763#comment-15517763 ]

Yu Li commented on HBASE-16698:
-------------------------------

bq. So no speed up if one region only on a regionserver. Ok. I buy it.

Yes :-)

bq. On what Elliott Clark raises... Your patch looks safe to me though but let me think on it more

Sure, let me also revisit the whole logic. But maybe I'm not that worried, since this fix has already run in our production environment for weeks with no problem observed :-) In theory, I think the lock plus the ringbuffer's sequential handling makes sure MVCC and WAL have the same order.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517748#comment-15517748 ]

stack commented on HBASE-16698:
-------------------------------

bq. So all CountDownLatch are released in sequential, no parallelism...

So no speed up if one region only on a regionserver. Ok. I buy it.

On what [~eclark] raises, let's be careful. He spent a bunch of time tracking a super weird issue where sequenceids in the WAL were not monotonically increasing because of a hole in our locking/reasoning. Your patch looks safe to me though, but let me think on it more.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517721#comment-15517721 ] Yu Li commented on HBASE-16698: --- bq. So, why is this patch faster? In current implementation, contention is farmed out to be per WALKey instance. Each has its own latch. Yes, each WALKey has its own latch, but the contention is not on the latch itself, but the sequential handling of ringbuffer event. The whole process is like: {noformat} RingBufferEventHandler grab one append -> FSHLog#append is called -> FSWALEntry#stampRegionSequenceId is called -> One CountDownLatch is released -> RingBufferEventHandler grab another append -> Another CountDownLatch is released -> Repeat {noformat} So all CountDownLatch are released in sequential, no parallelism... bq. I was thinking there a correctness issue but the numbering/mvcc is scoped to the region so if you lock across the region append while getting the mvcc, and this is only place mvcc is incremented, then all should be good Yes, agree. And it seems our mighty [~eclark] has the same concern here. Hope this answers your question also [~eclark] :-) bq. Pity we have to lock. Could we be more radical and use the ringbuffer bucket number? Then no locking needed. The change would be way more intrusive though. You'd have to change a lot Cannot agree more... Actually I ever tried to use multiple event handlers, but too much logic to make sure if breaking sequential append, so I finally quit... But I agree that we should revisit this sometime later, worth the efforts I think. 
> Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
>
>                 Key: HBASE-16698
>                 URL: https://issues.apache.org/jira/browse/HBASE-16698
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>    Affects Versions: 1.1.6, 1.2.3
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-16698.patch, hadoop0495.et2.jstack
>
> As titled, on our production environment we observed 98 out of 128 handlers
> get stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside
> {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that it is mainly caused by
> advancing mvcc in the append logic. Below is some detailed analysis:
> Under current branch-1 code logic, all batch puts will call
> {{WALKey#getWriteEntry}} after appending the edit to the WAL, and
> {{seqNumAssignedLatch}} is only released when the corresponding append call is
> handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}).
> Because currently we're using a single event handler for the ringbuffer, the
> append calls are handled one by one (actually a lot of our current logic
> depends on this sequential handling), and this becomes a bottleneck
> under high writing workload.
> The worst part is that by default we only use one WAL per RS, so appends on
> all regions are handled sequentially, which causes contention among
> different regions...
> To fix this, we could also make use of the "sequential appends" mechanism:
> we could grab the WriteEntry before publishing the append onto the ringbuffer
> and use it as the sequence id, except that we need to add a lock to make "grab
> WriteEntry" and "append edit" a transaction. This will still cause contention
> inside a region but could avoid contention between different regions. This
> solution has already been verified in our online environment and proved to be
> effective.
> Notice that for the master (2.0) branch, since we already changed the write
> pipeline to sync before writing memstore (HBASE-15158), this issue only
> exists for the ASYNC_WAL writes scenario.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517689#comment-15517689 ]

stack commented on HBASE-16698:
---

bq. Previously the CountDownLatch will be released one by one due to ringbuffer sequential handling, so writes on different regions will race.

Ok. Helps if I look at the right branch (smile). Makes sense. It is branch-1, so the flag makes sense. We might enable it by default in 1.3 since it is not out yet ([~mantonov] FYI).

So, why is this patch faster? In current implementation, contention is farmed out to be per WALKey instance. Each has its own latch. This patch swaps this model for an upfront contention on a ReentrantLock that is scoped to the Region. You think that the freeing of the latches in order costs >>> reentrant lock on every append?

I was thinking there a correctness issue but the numbering/mvcc is scoped to the region so if you lock across the region append while getting the mvcc, and this is only place mvcc is incremented, then all should be good (the lock is to ensure ordering of appends only, so it doesn't have to be across all mvcc.begin invocations).

Pity we have to lock. Could we be more radical and use the ringbuffer bucket number? Then no locking needed. The change would be way more intrusive though. You'd have to change a lot.

On the patch, I'd say move these defines to the class where they are used:

    /** Config key for using mvcc pre-assign feature for put */
    public static final String HREGION_MVCC_PRE_ASSIGN = "hbase.hregion.mvcc.preassign";
    public static final boolean DEFAULT_HREGION_MVCC_PRE_ASSIGN = true;

They are used only once, in HRegion. HConstants is/was a bad idea.

Otherwise, patch looks good to me (That jstack is crazy)
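The correctness argument in this comment (a region-scoped lock held across "get the mvcc number" and "publish the append") can be sketched in plain Java. This is only an illustration under stated assumptions, not the patch itself: `RegionPreAssignSketch`, `preAssignLock`, the `AtomicLong` standing in for mvcc and the `ConcurrentLinkedQueue` standing in for the ringbuffer are all hypothetical. The check at the end confirms that entries reach the queue in strictly increasing sequence order even with several writer threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

public class RegionPreAssignSketch {
    // Hypothetical stand-ins: one counter and one lock per region.
    private final AtomicLong mvcc = new AtomicLong();
    private final ReentrantLock preAssignLock = new ReentrantLock();
    // Stand-in for the shared ringbuffer; it records the publish order.
    private final ConcurrentLinkedQueue<Long> ring;

    RegionPreAssignSketch(ConcurrentLinkedQueue<Long> ring) {
        this.ring = ring;
    }

    /** Assigns a sequence number and publishes it as one atomic step. */
    long append() {
        preAssignLock.lock();
        try {
            long seq = mvcc.incrementAndGet(); // "grab WriteEntry"
            ring.add(seq);                     // "append edit"
            return seq;
        } finally {
            preAssignLock.unlock();
        }
    }

    /** Runs concurrent writers and checks that publish order equals sequence order. */
    static boolean publishedInOrder(int threads, int appendsPerThread)
            throws InterruptedException {
        ConcurrentLinkedQueue<Long> ring = new ConcurrentLinkedQueue<>();
        RegionPreAssignSketch region = new RegionPreAssignSketch(ring);
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    region.append();
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            w.join();
        }
        long prev = 0;
        for (long seq : ring) {
            if (seq != prev + 1) {
                return false; // a gap or reordering would show up here
            }
            prev = seq;
        }
        return prev == (long) threads * appendsPerThread;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(publishedInOrder(8, 1000)); // prints true
    }
}
```

Because the lock belongs to one region object, writers on a different region (a different instance in this sketch) would never contend on it, which matches the trade-off discussed above: contention stays inside a region instead of spreading across all regions sharing a WAL.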
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517679#comment-15517679 ]

Elliott Clark commented on HBASE-16698:
---

If the mvcc order isn't the same as the WAL order, then there's a chance of ACID violations.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517572#comment-15517572 ]

Yu Li commented on HBASE-16698:
---

Grep for "CountDownLatch.await" in the jstack and you can see 98 handlers waiting there.
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517563#comment-15517563 ]

Yu Li commented on HBASE-16698:
---

Thanks for chiming in, boss [~stack] :-)

bq. Yes, intentionally one-by-one in a single thread... How is it a bottleneck

Since there's only one WAL per RS and a single event handler for the ringbuffer, this creates contention among writes on different regions. I also forgot to attach the jstack taken when the issue happened online (oops...); let me upload it. (The version is our modified 1.1.2, so line numbers may not match, but it should be enough to show the issue.)

bq. ...Would need to set the ringbuffer initial sequence to be that of the most recent edit for the region...Would be interested to hear/see what you are thinking

Oh, I meant to call {{writeEntry = mvcc.begin();}} and set it into {{WALKey}} before publishing the append to the ringbuffer; the lock and the ringbuffer's sequential mechanism make sure that writes with a lower mvcc/sequenceId are written into the WAL first. Please check the patch for more details and let me know your thoughts, sir.

bq. I see an added reentrant lock. Otherwise, all else is the same?

In the patch we call {{writeEntry = mvcc.begin();}} and set it into {{WALKey}} before publishing the append to the ringbuffer, so we won't block waiting for the CountDownLatch. Previously the CountDownLatch will be released one by one due to ringbuffer sequential handling, so writes on different regions will race. Please check the attached jstack. :-)
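The difference between the two paths in this reply can be sketched as follows. This is a hedged illustration, not the real `WALKey`: the nested `Key` class, its `preAssigned` field and the `stampSequenceId` method are hypothetical names. A key that carries a pre-assigned write entry returns it immediately, while a key without one has to wait until the single consumer stamps it.

```java
import java.util.concurrent.CountDownLatch;

public class PreAssignedKeySketch {
    /** Hypothetical stand-in for a WALKey; not the real HBase class. */
    static final class Key {
        private final CountDownLatch seqNumAssignedLatch = new CountDownLatch(1);
        private volatile long assignedByHandler = -1;
        private final long preAssigned; // -1 means "not pre-assigned"

        Key(long preAssigned) {
            this.preAssigned = preAssigned;
        }

        /** Old path: called by the single ringbuffer consumer. */
        void stampSequenceId(long seq) {
            assignedByHandler = seq;
            seqNumAssignedLatch.countDown();
        }

        /** Returns at once when pre-assigned, otherwise waits for the consumer. */
        long getWriteEntry() throws InterruptedException {
            if (preAssigned >= 0) {
                return preAssigned; // new path: no latch wait
            }
            seqNumAssignedLatch.await(); // old path: blocks until stamped
            return assignedByHandler;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Key fast = new Key(42);
        System.out.println(fast.getWriteEntry()); // prints 42 without waiting

        Key slow = new Key(-1);
        Thread consumer = new Thread(() -> slow.stampSequenceId(7));
        consumer.start();
        System.out.println(slow.getWriteEntry()); // prints 7 once stamped
        consumer.join();
    }
}
```

The pre-assigned value is only safe to use if it was obtained and published under the same region-scoped lock, as the reply explains; this sketch leaves that part out and shows only why the handler thread no longer parks on the latch.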
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517531#comment-15517531 ]

stack commented on HBASE-16698:
---

Oh. Saw the patch. Why is this faster? I see an added reentrant lock. Otherwise, all else is the same? Maybe I am not following. Thanks [~carp84]
[jira] [Commented] (HBASE-16698) Performance issue: handlers stuck waiting for CountDownLatch inside WALKey#getWriteEntry under high writing workload
[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517490#comment-15517490 ]

stack commented on HBASE-16698:
---

Thank you for digging in here [~carp84]. This bit I don't follow:

bq. the append calls are handled one by one (actually lot's of our current logic depending on this sequential dealing logic), and this becomes a bottleneck under high writing workload

Yes, intentionally one-by-one in a single thread (no need of locks, logic is easier to reason about, and less likely we'll stay on core). How is it a bottleneck?

bq. that we could grab the WriteEntry before publishing append onto ringbuffer and use it as sequence id

I considered doing this. Would need to set the ringbuffer initial sequence to be that of the most recent edit for the region. It is always increasing so it could work, but I was wary of tying mvcc to a ringbuffer intrinsic. Would be interested to hear/see what you are thinking [~carp84].

bq. only that we need to add a lock to make "grab WriteEntry" and "append edit" a transaction. ...

Won't this undo some of the ringbuffer benefit? Or, maybe I'm misunderstanding.

bq. This solution is already verified in our online environment and proved to be effective.

Production experience beats all theoretical reasoning (smile).