[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

Lars Hofhansl (JIRA) Tue, 18 Jun 2013 15:03:36 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687258#comment-13687258
 ]


Lars Hofhansl commented on HBASE-8755:
--------------------------------------

bq. For 0.94 users, one WAL edit meant one sync. Meaning if you failed to 
sync it was for your edit only. Not really a huge deal, but that was my concern.

Is that true, though? Say two service threads call doMiniBatchMutation on 
behalf of two clients at roughly the same time. Both threads would write to the 
WAL without sync'ing, and then both would sync. Depending on the interleaving, 
the first could sync all edits and the second could be a no-op (the {{txid <= 
this.syncedTillHere}} test) in the write-write-sync-sync case.
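
To make the write-write-sync-sync case concrete, here is a minimal Java sketch. It is not the actual HLog code: the class name GroupSyncSketch, the methods append()/sync()/realSyncs(), and the fields lastAppendedTxid and realSyncs are hypothetical, and both methods are simply synchronized, which is coarser than the real locking. Only the {{txid <= syncedTillHere}} check is taken from the discussion above.

```java
public class GroupSyncSketch {
    private long lastAppendedTxid = 0; // highest txid written to the (simulated) WAL
    private long syncedTillHere = 0;   // highest txid covered by a completed sync
    private int realSyncs = 0;         // syncs that actually did work

    /** Write an edit without syncing and return its txid. */
    public synchronized long append() {
        return ++lastAppendedTxid;
    }

    /** Sync up to at least txid; skip the work if an earlier sync covered it. */
    public synchronized void sync(long txid) {
        if (txid <= syncedTillHere) {
            return; // another thread's sync already covered this edit
        }
        // Assumption: a real WAL would sync the underlying file here.
        syncedTillHere = lastAppendedTxid;
        realSyncs++;
    }

    public synchronized int realSyncs() {
        return realSyncs;
    }

    public static void main(String[] args) {
        GroupSyncSketch wal = new GroupSyncSketch();
        long first = wal.append();  // write on behalf of client 1
        long second = wal.append(); // write on behalf of client 2
        wal.sync(first);            // covers both edits
        wal.sync(second);           // no-op: second <= syncedTillHere
        System.out.println("real syncs: " + wal.realSyncs());
    }
}
```

With this interleaving only the first sync() does any work, so a failed sync in the first thread would already affect the second thread's edit.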

It is true that today both threads execute a sync in the context of their own 
thread, and with this patch that would no longer be the case. In fact that is 
the main concern with this patch: it is harder to trace the code to verify 
that an edit is truly sync'ed in all cases. I am also concerned about possible 
new/different deadlock scenarios, of the kind the existing code is already 
riddled with.
Also, I wonder whether some folks rely on changing deferred-sync-interval.

If I weren't in Germany right now with no VPN access, I'd offer to load 0.94 
with the patch onto one of our test clusters and run some workload on it...

                
> A new write thread model for HLog to improve the overall HBase write 
> throughput
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-8755
>                 URL: https://issues.apache.org/jira/browse/HBASE-8755
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Feng Honghua
>         Attachments: HBASE-8755-0.94-V0.patch, HBASE-8755-0.94-V1.patch, 
> HBASE-8755-trunk-V0.patch
>
>
> In the current write model, each write handler thread (executing put()) 
> individually goes through a full 'append (hlog local buffer) => HLog writer 
> append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, 
> which causes heavy contention on updateLock and flushLock.
> The only optimization, checking whether the current syncedTillHere > txid in 
> the hope that another thread has already written/synced this txid to hdfs so 
> that the write/sync can be skipped, helps much less than expected.
> Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi 
> proposed a new write thread model for writing hdfs sequence files, and the 
> prototype implementation shows a 4X throughput improvement (from 17000 to 
> 70000+). 
> I applied this new write thread model to HLog, and the performance test in 
> our test cluster shows about a 3X throughput improvement (from 12150 to 31520 
> for 1 RS, from 22000 to 70000 for 5 RS). The 1 RS write throughput (1K 
> row-size) even beats that of Bigtable (Percolator, published in 2011, says 
> Bigtable's write throughput then was 31002). I can provide the detailed 
> performance test results if anyone is interested.
> The changes for the new write thread model are as follows:
>  1> All put handler threads append their edits to HLog's local pending 
> buffer (and notify the AsyncWriter thread that there are new edits in the 
> local buffer);
>  2> All put handler threads wait in the HLog.syncer() function for the 
> underlying threads to finish the sync that covers their txid;
>  3> A single AsyncWriter thread is responsible for retrieving all the buffered 
> edits from HLog's local pending buffer and writing them to hdfs 
> (hlog.writer.append); it notifies the AsyncFlusher thread that there are new 
> writes to hdfs that need a sync;
>  4> A single AsyncFlusher thread is responsible for issuing a sync to hdfs 
> to persist the writes made by AsyncWriter; it notifies the AsyncNotifier 
> thread that the sync watermark has increased;
>  5> A single AsyncNotifier thread is responsible for notifying all pending 
> put handler threads that are waiting in the HLog.syncer() function;
>  6> There is no LogSyncer thread any more (since the AsyncWriter/AsyncFlusher 
> threads always do the job it did).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
