[ https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15574381#comment-15574381 ]
Allan Yang edited comment on HBASE-16698 at 10/14/16 6:15 AM:
--------------------------------------------------------------

{code}
      // ------------------------------------------------------------------
      // STEP 8. Advance mvcc. This will make this put visible to scanners and getters.
      // ------------------------------------------------------------------
      if (writeEntry != null) {
        mvcc.completeMemstoreInsertWithSeqNum(writeEntry, walKey);
        writeEntry = null;
      }
{code}
It's in {{doMiniBatchMutation}} of branch-1.1. {{completeMemstoreInsertWithSeqNum}} reads the seqid from {{walKey}} to advance the mvcc; I think that's where [~carp84] said 'stuck at CountDownLatch'.
My point is that even if we don't need to sync the WAL, the batch still has to wait here to advance mvcc, and that is a problem. But if we choose to sync the WAL, the seqid in walKey should already have been assigned during the sync operation, so handlers shouldn't get stuck here.
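
To make the stall described above concrete, here is a minimal sketch in plain Java, not HBase code. The class {{LatchStallSketch}}, the {{Key}} type and the {{assignIds}} helper are hypothetical names invented for illustration; the only assumption taken from the issue is that a single consumer stamps sequence ids one at a time while each handler waits on a per-entry latch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

class LatchStallSketch {

    /** Hypothetical stand-in for a WAL key: its sequence id is stamped later by the consumer. */
    static final class Key {
        private final CountDownLatch seqNumAssigned = new CountDownLatch(1);
        private volatile long seqId = -1L;

        void stamp(long id) {
            seqId = id;
            seqNumAssigned.countDown(); // releases the waiting handler
        }

        long awaitSeqId() throws InterruptedException {
            seqNumAssigned.await(); // the handler parks here until the consumer reaches its entry
            return seqId;
        }
    }

    /** Runs the given number of handler threads against one consumer; returns the ids, sorted. */
    static List<Long> assignIds(int handlers) throws InterruptedException {
        BlockingQueue<Key> queue = new LinkedBlockingQueue<>();
        Thread consumer = new Thread(() -> {
            long next = 1L;
            try {
                while (true) {
                    queue.take().stamp(next++); // entries are stamped strictly one by one
                }
            } catch (InterruptedException e) {
                // interrupted by assignIds() once all handlers are done
            }
        });
        consumer.start();

        List<Long> ids = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < handlers; i++) {
            Thread t = new Thread(() -> {
                try {
                    Key key = new Key();
                    queue.put(key);            // "append": publish the entry
                    ids.add(key.awaitSeqId()); // "advance mvcc": needs the id, so it waits
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        consumer.interrupt();
        consumer.join();
        Collections.sort(ids);
        return ids;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(assignIds(4));
    }
}
```

Every handler in this sketch depends on the single consumer thread for its id, so the time a handler spends in {{awaitSeqId}} grows with the number of entries queued ahead of it, regardless of whether a sync was requested.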

> Performance issue: handlers stuck waiting for CountDownLatch inside
> WALKey#getWriteEntry under high writing workload
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16698
>                 URL: https://issues.apache.org/jira/browse/HBASE-16698
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>    Affects Versions: 1.1.6, 1.2.3
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16698.branch-1.patch, HBASE-16698.patch,
> HBASE-16698.v2.patch, hadoop0495.et2.jstack
>
>
> As titled, on our production environment we observed 98 out of 128 handlers
> get stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside
> {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that it is mainly caused by
> advancing mvcc in the append logic. Below is some detailed analysis:
> Under the current branch-1 code logic, all batch puts call
> {{WALKey#getWriteEntry}} after appending the edit to the WAL, and
> {{seqNumAssignedLatch}} is only released when the corresponding append call is
> handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}).
> Because we currently use a single event handler for the ringbuffer, the
> append calls are handled one by one (in fact, lots of our current logic
> depends on this sequential handling), and this becomes a bottleneck
> under a high writing workload.
> The worst part is that by default we only use one WAL per RS, so appends on
> all regions are handled sequentially, which causes contention among
> different regions...
> To fix this, we could also make use of the "sequential appends" mechanism:
> we could grab the WriteEntry before publishing the append onto the ringbuffer
> and use it as the sequence id, except that we need to add a lock to make "grab
> WriteEntry" and "append edit" a transaction. This will still cause contention
> inside a region but could avoid contention between different regions. This
> solution is already verified in our online environment and proved to be
> effective.
> Notice that for the master (2.0) branch, since we have already changed the write
> pipeline to sync before writing memstore (HBASE-15158), this issue only
> exists for the ASYNC_WAL writes scenario.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
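
The fix proposed in the issue description (grab the id first, then publish, with a lock making the two steps one transaction) could look roughly like the sketch below. This is plain Java under stated assumptions, not the attached patch: {{PreassignedSeqIdSketch}}, {{Region}}, {{Entry}} and {{publishedInOrder}} are hypothetical names, a plain per-region counter stands in for the mvcc WriteEntry, and a concurrent queue stands in for the ringbuffer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

class PreassignedSeqIdSketch {

    /** Hypothetical entry published to the shared WAL queue. */
    static final class Entry {
        final String region;
        final long seqId;

        Entry(String region, long seqId) {
            this.region = region;
            this.seqId = seqId;
        }
    }

    /** Hypothetical region holding its own counter and append lock. */
    static final class Region {
        final String name;
        private final ReentrantLock appendLock = new ReentrantLock();
        private long nextSeqId = 1L; // guarded by appendLock

        Region(String name) {
            this.name = name;
        }

        /** Grabs the id and publishes the entry as one step; returns the id without waiting. */
        long append(Queue<Entry> wal) {
            appendLock.lock();
            try {
                long id = nextSeqId++;
                wal.add(new Entry(name, id));
                return id;
            } finally {
                appendLock.unlock();
            }
        }
    }

    /** Returns true if, per region, entries reached the shared queue in seqId order 1, 2, 3, ... */
    static boolean publishedInOrder(int regions, int threadsPerRegion, int appendsPerThread)
            throws InterruptedException {
        Queue<Entry> wal = new ConcurrentLinkedQueue<>();
        List<Thread> threads = new ArrayList<>();
        for (int r = 0; r < regions; r++) {
            Region region = new Region("region-" + r);
            for (int t = 0; t < threadsPerRegion; t++) {
                Thread th = new Thread(() -> {
                    for (int i = 0; i < appendsPerThread; i++) {
                        region.append(wal);
                    }
                });
                threads.add(th);
                th.start();
            }
        }
        for (Thread th : threads) {
            th.join();
        }
        // Walk the queue in publication order and check each region's ids are consecutive.
        Map<String, Long> lastSeen = new HashMap<>();
        for (Entry e : wal) {
            long prev = lastSeen.getOrDefault(e.region, 0L);
            if (e.seqId != prev + 1) {
                return false;
            }
            lastSeen.put(e.region, e.seqId);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(publishedInOrder(3, 4, 1000));
    }
}
```

Because the lock is per region, writers to the same region still serialize on it, while writers to different regions only meet at the shared queue; that matches the trade-off stated in the description.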