[ 
https://issues.apache.org/jira/browse/HBASE-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919033#comment-13919033
 ] 

Liyin Tang commented on HBASE-10659:
------------------------------------

1) Since updating the memstore is much faster than HLog syncing, one 
memstore-update-thread seems to be sufficient. Alternatively, we can make it 
configurable so that each HLogSyncer thread has a corresponding 
memstore-update-thread.

2) The HLogSyncer thread batches multiple transactions from different IPC 
writer threads into a group commit and syncs this group commit to the HLog 
stream. The memstore-update-thread then takes this group commit and updates 
the corresponding memstore in sequence id order.
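
A minimal sketch of the hand-off described in the two points above, under stated assumptions: the class and method names (GroupCommitSketch, Txn, syncOnce, applyOnce) are hypothetical and not HBase APIs, the HLog write is only a comment, and the memstore is stood in for by a list of applied sequence ids. For brevity the syncer drains the pending queue with poll() rather than swapping collections under a lock. With a single syncer and a single memstore-update-thread connected by a FIFO queue, groups are applied in sequence id order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class GroupCommitSketch {
    /** Hypothetical stand-in for one transaction (a WALEdit plus its sequence id). */
    static final class Txn {
        final String edit;
        long seqId = -1; // assigned by the syncer
        Txn(String edit) { this.edit = edit; }
    }

    private final ConcurrentLinkedQueue<Txn> pending = new ConcurrentLinkedQueue<>();
    private final BlockingQueue<List<Txn>> synced = new LinkedBlockingQueue<>();
    private long nextSeqId = 1;                   // touched only by the syncer thread
    final List<Long> applied = new ArrayList<>(); // touched only by the memstore thread

    /** IPC writer side: enqueue the transaction and return immediately. */
    void append(String edit) { pending.add(new Txn(edit)); }

    /** Syncer side: form one group commit, assign sequence ids, hand it off. */
    int syncOnce() {
        List<Txn> group = new ArrayList<>();
        Txn t;
        while ((t = pending.poll()) != null) {
            t.seqId = nextSeqId++;
            group.add(t);
        }
        if (group.isEmpty()) return 0;
        // A real implementation would write and sync the group to the HLog here.
        synced.add(group);
        return group.size();
    }

    /** Memstore side: take the next synced group and apply it in order. */
    int applyOnce() {
        try {
            List<Txn> group = synced.take();
            for (Txn t : group) applied.add(t.seqId); // stand-in for the memstore update
            return group.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 0;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GroupCommitSketch p = new GroupCommitSketch();
        final int writers = 4, perWriter = 100, total = writers * perWriter;
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < writers; w++) {
            final int id = w;
            threads.add(new Thread(() -> {
                for (int i = 0; i < perWriter; i++) p.append("w" + id + "-" + i);
            }));
        }
        threads.add(new Thread(() -> { // the single HLogSyncer thread
            int done = 0;
            while (done < total) {
                int n = p.syncOnce();
                if (n == 0) Thread.yield();
                done += n;
            }
        }));
        threads.add(new Thread(() -> { // the single memstore-update-thread
            int done = 0;
            while (done < total && !Thread.currentThread().isInterrupted()) {
                done += p.applyOnce();
            }
        }));
        for (Thread th : threads) th.start();
        for (Thread th : threads) th.join();
        for (int i = 1; i < p.applied.size(); i++) {
            if (p.applied.get(i) <= p.applied.get(i - 1)) throw new AssertionError("out of order");
        }
        System.out.println("applied " + p.applied.size() + " edits in sequence id order");
    }
}
```

Making the memstore-update-thread count configurable, as suggested in 1), would amount to pairing each syncer with its own hand-off queue and consumer in this sketch.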

> [89-fb] Optimize the threading model in HBase write path
> --------------------------------------------------------
>
>                 Key: HBASE-10659
>                 URL: https://issues.apache.org/jira/browse/HBASE-10659
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Liyin Tang
>
> Recently, we have done multiple prototypes to optimize the HBase (0.89) write 
> path. Based on the simulator results, the following model is able to 
> achieve much higher overall throughput with fewer threads.
> IPC Writer Threads Pool: 
> IPC handler threads prepare all Put requests and append the WALEdit, as 
> one transaction, to a concurrent collection while holding a read lock, and 
> then simply return.
> HLogSyncer Thread:
> Each HLogSyncer thread corresponds to one HLog stream. It swaps the 
> concurrent collection while holding a write lock, then iterates over all the 
> elements in the previous concurrent collection, generates the sequence id for 
> each transaction, and writes them to the HLog. After the HLog sync is done, it 
> appends these transactions as a batch to a blocking queue. 
> Memstore Update Thread:
> The memstore update thread polls the blocking queue and updates the 
> memstore for each transaction, using the sequence id as MVCC. Once the 
> memstore update is done, it dispatches to the responder thread pool to return 
> to the client.
> Responder Thread Pool:
> The responder thread pool returns the RPC calls in parallel. 
> We are still evaluating this model and will share more results/numbers once 
> they are ready. We would really appreciate any comments in advance!
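
The read-lock append and write-lock swap described in the quoted issue could be sketched roughly as follows. This is an illustration only: SwapBufferSketch and its methods are hypothetical names, not HBase classes, and the choice of ConcurrentLinkedQueue as the concurrent collection is an assumption. Many IPC handlers can append concurrently under the shared read lock, while the syncer briefly takes the exclusive write lock so that no handler can still be appending to the collection it is about to drain.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SwapBufferSketch<T> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Guarded by lock: read under the read lock, replaced under the write lock.
    private Queue<T> current = new ConcurrentLinkedQueue<>();

    /** IPC handler side: appenders share the read lock, so they do not block each other. */
    public void append(T txn) {
        lock.readLock().lock();
        try {
            current.add(txn);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Syncer side: install a fresh collection and return the previous one. */
    public Queue<T> swap() {
        lock.writeLock().lock();
        try {
            Queue<T> previous = current;
            current = new ConcurrentLinkedQueue<>();
            return previous; // no appender holds a reference to this any more
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        SwapBufferSketch<String> buf = new SwapBufferSketch<>();
        buf.append("put-1");
        buf.append("put-2");
        Queue<String> batch = buf.swap();
        System.out.println("batch size: " + batch.size());
    }
}
```

The write lock is held only for the pointer swap, so handlers are blocked for a very short time regardless of how long the subsequent HLog sync takes.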



--
This message was sent by Atlassian JIRA
(v6.2#6252)
