Yu Li commented on HBASE-16698:

Thanks for chiming in [~enis]

bq. We are still serializing the seq assigning, but this time via a lock. This lock is used from handlers to append to the ring buffer as well
True, but the lock is per *region* while the disruptor is per *regionserver*. 
In other words, the current implementation w/o the patch serializes parallel 
writes on *different* regions (they all wait on the same disruptor for their 
mvcc number, with a single WAL by default). Changing to a lock limits the 
contention to the region level.

IMHO one benefit of the disruptor is that it makes the append asynchronous, so 
appending to the WAL and inserting into the MemStore could be parallelized, but 
the current impl doesn't take full advantage of it. From another perspective, 
currently we make sure the mvcc number and region sequence id are uniform by 
stamping them when the append starts, while w/ the patch we stamp them at the 
very beginning and use the assigned number for both the WAL append and the 
MemStore insertion.
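
To make this concrete, below is a minimal, self-contained sketch of the "stamp 
first, then publish" idea. All names here ({{RegionWriteContext}}, 
{{WalAppendTask}}, the plain queue standing in for the disruptor ring buffer) 
are hypothetical simplifications for illustration, not the classes touched by 
the patch, and the real code also has to handle failed appends and mvcc 
completion, which this sketch omits.

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified stand-in for the per-region write path.
public class RegionWriteContext {
  // Per-region lock: only writes to the *same* region contend here.
  private final ReentrantLock appendLock = new ReentrantLock();
  // Stand-in for the per-region mvcc / sequence id counter.
  private final AtomicLong sequenceId = new AtomicLong(0);
  // Stand-in for the per-regionserver ring buffer drained by the WAL writer.
  private final BlockingQueue<WalAppendTask> ringBuffer;

  public RegionWriteContext(BlockingQueue<WalAppendTask> ringBuffer) {
    this.ringBuffer = ringBuffer;
  }

  /** Assigns the sequence id and publishes the append as one transaction. */
  public long append(byte[] edit) throws InterruptedException {
    appendLock.lock();
    try {
      // "Grab WriteEntry": the sequence id is decided up front...
      long seq = sequenceId.incrementAndGet();
      // ...and the WAL append carrying that id is published before the lock
      // is released, so ids reach the ring buffer in order for this region.
      ringBuffer.put(new WalAppendTask(seq, edit));
      // The caller can insert into the MemStore with the same id, without
      // waiting for the ring buffer handler to stamp it.
      return seq;
    } finally {
      appendLock.unlock();
    }
  }

  /** Immutable task drained by the single WAL writer thread. */
  public static final class WalAppendTask {
    final long sequenceId;
    final byte[] edit;
    WalAppendTask(long sequenceId, byte[] edit) {
      this.sequenceId = sequenceId;
      this.edit = edit;
    }
  }
}
{code}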

bq. The only acceptable thing is that if the plan is to switch to a new approach, and we will keep the old implementation as a safe guard.
Agreed, and this is why I'm making it optional for now. This way operators 
could easily roll back if any fatal bug is observed, raise a JIRA here, and 
won't be blocked before we address it.
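
Just to illustrate what "optional" could look like, a gating sketch follows; 
the property name is made up for this example and is not necessarily the key 
the patch introduces, the point is only the {{Configuration#getBoolean}} 
pattern with a conservative default.

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hypothetical feature gate for the new sequence-id assignment path.
public class SeqIdAssignmentChooser {
  // Made-up property name, for illustration only.
  static final String EARLY_SEQID_KEY = "hbase.wal.early.seqid.assignment";

  /** Defaults to the existing behaviour; operators opt in, and can flip it
   *  back off to roll back if a fatal bug shows up. */
  public static boolean useEarlyAssignment(Configuration conf) {
    return conf.getBoolean(EARLY_SEQID_KEY, false);
  }
}
{code}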

> Performance issue: handlers stuck waiting for CountDownLatch inside 
> WALKey#getWriteEntry under high writing workload
> --------------------------------------------------------------------------------------------------------------------
>                 Key: HBASE-16698
>                 URL: https://issues.apache.org/jira/browse/HBASE-16698
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance
>    Affects Versions: 1.2.3
>            Reporter: Yu Li
>            Assignee: Yu Li
>             Fix For: 2.0.0
>         Attachments: HBASE-16698.branch-1.patch, 
> HBASE-16698.branch-1.v2.patch, HBASE-16698.patch, HBASE-16698.v2.patch, 
> hadoop0495.et2.jstack
> As titled, on our production environment we observed 98 out of 128 handlers 
> get stuck waiting for the CountDownLatch {{seqNumAssignedLatch}} inside 
> {{WALKey#getWriteEntry}} under a high writing workload.
> After digging into the problem, we found that it is mainly caused by 
> advancing mvcc in the append logic. Below is some detailed analysis:
> Under the current branch-1 code logic, all batch puts call 
> {{WALKey#getWriteEntry}} after appending the edit to the WAL, and 
> {{seqNumAssignedLatch}} is only released when the corresponding append call is 
> handled by RingBufferEventHandler (see {{FSWALEntry#stampRegionSequenceId}}). 
> Because we currently use a single event handler for the ringbuffer, the 
> append calls are handled one by one (actually lots of our current logic 
> depends on this sequential handling), and this becomes a bottleneck 
> under high writing workload.
> The worst part is that by default we only use one WAL per RS, so appends on 
> all regions are handled sequentially, which causes contention among 
> different regions...
> To fix this, we could make use of the "sequential appends" mechanism: 
> grab the WriteEntry before publishing the append onto the ringbuffer 
> and use it as the sequence id; we only need to add a lock to make "grab 
> WriteEntry" and "append edit" a transaction. This will still cause contention 
> inside a region but avoids contention between different regions. This 
> solution has already been verified in our online environment and proved to be 
> effective.
> Notice that for the master (2.0) branch, since we already changed the write 
> pipeline to sync before writing the memstore (HBASE-15158), this issue only 
> exists for the ASYNC_WAL write scenario.
