[
https://issues.apache.org/jira/browse/HBASE-16698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15517689#comment-15517689
]
stack commented on HBASE-16698:
-------------------------------
bq. Previously the CountDownLatch will be released one by one due to ringbuffer
sequential handling, so writes on different regions will race.
Ok. Helps if I look at the right branch (smile). Makes sense. It is branch-1 so
the flag makes sense. We might enable it by default in 1.3 since it's not out
yet ([~mantonov] FYI).
So, why is this patch faster? In the current implementation, contention is
farmed out per WALKey instance; each has its own latch. This patch swaps that
model for upfront contention on a ReentrantLock that is scoped to the Region.
You think that freeing the latches in order costs far more than taking a
reentrant lock on every append?
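For illustration, here is a toy Java sketch of the per-WALKey latch model under discussion. Class and field names (KeySketch, and a BlockingQueue standing in for the disruptor ring buffer) are hypothetical simplifications, not HBase code; only the seqNumAssignedLatch handoff mirrors the real shape, where a single handler thread stamps sequence ids one by one and every writer parks on its own latch:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class LatchModelSketch {
    // Stand-in for a WALKey: one latch per append, released only by the
    // single handler thread, so all writers serialize behind it.
    static class KeySketch {
        final CountDownLatch seqNumAssignedLatch = new CountDownLatch(1);
        volatile long seqNum = -1;

        long getWriteEntry() throws InterruptedException {
            seqNumAssignedLatch.await(); // writers park here under load
            return seqNum;
        }
    }

    static String demo() throws InterruptedException {
        // Stand-in for the ring buffer feeding the single event handler.
        BlockingQueue<KeySketch> ring = new LinkedBlockingQueue<>();
        AtomicLong seq = new AtomicLong();

        // Single "RingBufferEventHandler": stamps ids strictly in order.
        Thread handler = new Thread(() -> {
            try {
                while (true) {
                    KeySketch k = ring.take();
                    k.seqNum = seq.incrementAndGet();
                    k.seqNumAssignedLatch.countDown();
                }
            } catch (InterruptedException ignored) {
            }
        });
        handler.setDaemon(true);
        handler.start();

        KeySketch k1 = new KeySketch();
        KeySketch k2 = new KeySketch();
        ring.put(k1);
        ring.put(k2);
        return k1.getWriteEntry() + "," + k2.getWriteEntry();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

The point of the sketch: even appends to unrelated regions wait on the same single handler thread before their latch opens.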
I was thinking there was a correctness issue, but the numbering/mvcc is scoped
to the region, so if you lock across the region append while getting the mvcc,
and this is the only place mvcc is incremented, then all should be good (the
lock is to ensure ordering of appends only.... so it doesn't have to be across
all mvcc.begin invocations).
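By contrast, a toy sketch of the pre-assign model in the patch: one ReentrantLock per region makes "advance mvcc" plus "publish append" a single transaction, so appends to different regions no longer contend with each other. RegionSketch, the AtomicLong mvcc counter, and the StringBuilder "WAL" are all hypothetical stand-ins for HRegion, MultiVersionConcurrencyControl, and the ring buffer publish:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

public class PreAssignSketch {
    static class RegionSketch {
        final ReentrantLock appendLock = new ReentrantLock(); // region-scoped
        final AtomicLong mvcc = new AtomicLong();             // only bumped here
        final StringBuilder wal = new StringBuilder();        // stand-in for ring buffer publish

        long append(String edit) {
            appendLock.lock();
            try {
                // Grab the WriteEntry (sequence id) up front, then publish:
                // the lock makes these two steps one transaction.
                long seq = mvcc.incrementAndGet();
                wal.append(edit).append('@').append(seq).append(';');
                return seq;
            } finally {
                appendLock.unlock();
            }
        }
    }

    static String demo() {
        RegionSketch r1 = new RegionSketch();
        RegionSketch r2 = new RegionSketch();
        r1.append("a");
        r1.append("b");
        r2.append("c"); // independent lock and mvcc: no cross-region contention
        return r1.wal + "|" + r2.wal;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Within one region, appends still serialize on the lock (the remaining contention the comment accepts), but each region pays only for its own traffic.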
Pity we have to lock. Could we be more radical and use the ringbuffer bucket
number? Then no locking would be needed. The change would be way more intrusive
though; you'd have to change a lot.
On the patch, move these defines to the class where they are used, I'd say:

/** Config key for using mvcc pre-assign feature for put */
public static final String HREGION_MVCC_PRE_ASSIGN = "hbase.hregion.mvcc.preassign";
public static final boolean DEFAULT_HREGION_MVCC_PRE_ASSIGN = true;

They are used only once, in HRegion. HConstants is/was a bad idea.
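A minimal sketch of what that placement looks like, with the key declared next to its single consumer. The key and default follow the patch; ConfigPlacementSketch and the Map-based getBoolean are hypothetical stand-ins for HRegion and Hadoop's Configuration.getBoolean:

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigPlacementSketch {
    // Declared in the consuming class rather than in a grab-bag constants file.
    /** Config key for using mvcc pre-assign feature for put */
    public static final String HREGION_MVCC_PRE_ASSIGN = "hbase.hregion.mvcc.preassign";
    public static final boolean DEFAULT_HREGION_MVCC_PRE_ASSIGN = true;

    // Stand-in for Configuration.getBoolean(key, default).
    static boolean getBoolean(Map<String, String> conf, String key, boolean dflt) {
        String v = conf.get(key);
        return v == null ? dflt : Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Feature is on by default; operators can opt out per the flag.
        System.out.println(getBoolean(conf, HREGION_MVCC_PRE_ASSIGN,
            DEFAULT_HREGION_MVCC_PRE_ASSIGN));
        conf.put(HREGION_MVCC_PRE_ASSIGN, "false");
        System.out.println(getBoolean(conf, HREGION_MVCC_PRE_ASSIGN,
            DEFAULT_HREGION_MVCC_PRE_ASSIGN));
    }
}
```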
Otherwise, patch looks good to me (That jstack is crazy)
> Performance issue: handlers stuck waiting for CountDownLatch inside
> WALKey#getWriteEntry under high writing workload
> --------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-16698
> URL: https://issues.apache.org/jira/browse/HBASE-16698
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Affects Versions: 1.1.6, 1.2.3
> Reporter: Yu Li
> Assignee: Yu Li
> Attachments: HBASE-16698.patch, hadoop0495.et2.jstack
>
>
> As titled, on our production environment we observed 98 out of 128 handlers
> get stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside
> {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that it is mainly caused by
> advancing mvcc in the append logic. Below is some detailed analysis:
> Under the current branch-1 code logic, all batch puts call
> {{WALKey#getWriteEntry}} after appending the edit to the WAL, and
> {{seqNumAssignedLatch}} is only released when the corresponding append call is
> handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}).
> Because we currently use a single event handler for the ringbuffer, the
> append calls are handled one by one (actually a lot of our current logic
> depends on this sequential handling), and this becomes a bottleneck
> under a high writing workload.
> The worst part is that by default we use only one WAL per RS, so appends on
> all regions are handled sequentially, which causes contention among
> different regions...
> To fix this, we could make use of the "sequential appends" mechanism: grab
> the WriteEntry before publishing the append onto the ringbuffer and use it as
> the sequence id; we only need to add a lock to make "grab WriteEntry" and
> "append edit" a transaction. This still causes contention inside a region but
> avoids contention between different regions. This solution has already been
> verified in our online environment and proved to be effective.
> Notice that for the master (2.0) branch, since we already changed the write
> pipeline to sync before writing the memstore (HBASE-15158), this issue only
> exists for the ASYNC_WAL writes scenario.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)