[jira] [Commented] (HBASE-8755) A new write thread model for HLog to improve the overall HBase write throughput

stack (JIRA) Tue, 17 Sep 2013 22:33:22 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770455#comment-13770455
 ]


stack commented on HBASE-8755:
------------------------------

Trying this on hdfs.  It takes WAY longer.  Threads stuck here:

{code}
"t1" prio=10 tid=0x00007f49fd4fc800 nid=0xfc4 in Object.wait() [0x00007f49d0d99000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:1795)
        - locked <0x0000000423dd1cc8> (a java.util.LinkedList)
        at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1689)
        at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1582)
        at org.apache.hadoop.hdfs.DFSOutputStream.sync(DFSOutputStream.java:1567)
        at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:116)
        at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:135)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1072)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1195)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog.append(FSHLog.java:910)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog.append(FSHLog.java:844)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog.append(FSHLog.java:838)
        at org.apache.hadoop.hbase.regionserver.wal.HLogPerformanceEvaluation$HLogPutBenchmark.run(HLogPerformanceEvaluation.java:110)
        at java.lang.Thread.run(Thread.java:662)
{code}

Here are the numbers I have so far WITHOUT the patch:


/tmp/log-patch1.1.txt:2013-09-17 21:19:31,495 INFO  [main] wal.HLogPerformanceEvaluation: Summary: threads=1, iterations=1000000 took 991.258s 1008.819ops/s
/tmp/log-patch1.2.txt:2013-09-17 21:35:04,715 INFO  [main] wal.HLogPerformanceEvaluation: Summary: threads=1, iterations=1000000 took 924.881s 1081.220ops/s
/tmp/log-patch1.3.txt:2013-09-17 21:51:32,416 INFO  [main] wal.HLogPerformanceEvaluation: Summary: threads=1, iterations=1000000 took 979.312s 1021.125ops/s
/tmp/log-patch5.1.txt:2013-09-17 22:07:31,712 INFO  [main] wal.HLogPerformanceEvaluation: Summary: threads=5, iterations=1000000 took 950.968s 5257.800ops/s
/tmp/log-patch5.2.txt:2013-09-17 22:23:39,680 INFO  [main] wal.HLogPerformanceEvaluation: Summary: threads=5, iterations=1000000 took 939.312s 5323.045ops/s

Will clean up later, but the per-thread write rate is roughly constant (about 
1,000 ops/s) whether 1 or 5 threads.  Will see what happens at 50.  Looks like 
I need to modify HLogPE.  It calls FSHLog#doWrite directly, which is not as 
interesting since it bypasses the locks in FSHLog when it does not call append.
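
The Summary figures above are consistent with ops/s being computed as threads * iterations / elapsed seconds, which would put the per-thread rate at roughly 1,000 ops/s in every run. A quick sketch to check that arithmetic follows. The class and method names are made up for illustration, and the formula is inferred from the logged numbers, not read out of HLogPerformanceEvaluation:

```java
public class HLogPESummaryCheck {
    /** Aggregate rate, ASSUMING ops/s = threads * iterations / elapsed seconds (inferred from the logs). */
    public static double aggregateOpsPerSec(int threads, long iterationsPerThread, double seconds) {
        return threads * iterationsPerThread / seconds;
    }

    /** Rate seen by one thread under the same assumption. */
    public static double perThreadOpsPerSec(long iterationsPerThread, double seconds) {
        return iterationsPerThread / seconds;
    }

    public static void main(String[] args) {
        // {threads, elapsed seconds} copied from the Summary lines above.
        double[][] runs = { {1, 991.258}, {1, 924.881}, {1, 979.312}, {5, 950.968}, {5, 939.312} };
        for (double[] run : runs) {
            int threads = (int) run[0];
            System.out.printf("threads=%d aggregate=%.3f ops/s per-thread=%.3f ops/s%n",
                threads,
                aggregateOpsPerSec(threads, 1_000_000L, run[1]),
                perThreadOpsPerSec(1_000_000L, run[1]));
        }
    }
}
```

If the assumption holds, going from 1 to 5 threads scales the aggregate rate about 5x while each thread still spends about the same time per append.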

Will be back.
                
> A new write thread model for HLog to improve the overall HBase write 
> throughput
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-8755
>                 URL: https://issues.apache.org/jira/browse/HBASE-8755
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance, wal
>            Reporter: Feng Honghua
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.96.1
>
>         Attachments: 8755trunkV2.txt, HBASE-8755-0.94-V0.patch, 
> HBASE-8755-0.94-V1.patch, HBASE-8755-trunk-V0.patch, HBASE-8755-trunk-V1.patch
>
>
> In the current write model, each write handler thread (executing put()) 
> individually goes through a full 'append (hlog local buffer) => HLog writer 
> append (write to hdfs) => HLog writer sync (sync hdfs)' cycle for each write, 
> which incurs heavy contention on updateLock and flushLock.
> The only optimization, checking whether the current syncTillHere > txid in the 
> expectation that another thread has already written/synced this txid to hdfs 
> so the write/sync can be omitted, helps much less than expected.
> Three of my colleagues (Ye Hangjun / Wu Zesheng / Zhang Peng) at Xiaomi 
> proposed a new write thread model for writing hdfs sequence files, and the 
> prototype implementation shows a 4X throughput improvement (from 17000 to 
> 70000+). 
> I applied this new write thread model to HLog, and the performance test in our 
> test cluster shows about a 3X throughput improvement (from 12150 to 31520 for 
> 1 RS, from 22000 to 70000 for 5 RS). The 1 RS write throughput (1K row-size) 
> even beats that of BigTable (Percolator, published in 2011, says Bigtable's 
> write throughput then was 31002). I can provide the detailed performance test 
> results if anyone is interested.
> The changes for the new write thread model are as below:
>  1> All put handler threads append their edits to HLog's local pending buffer 
> (this notifies the AsyncWriter thread that there are new edits in the local 
> buffer);
>  2> All put handler threads wait in the HLog.syncer() function for the 
> underlying threads to finish the sync that contains their txid;
>  3> A single AsyncWriter thread is responsible for retrieving all the buffered 
> edits from HLog's local pending buffer and writing them to hdfs 
> (hlog.writer.append) (it notifies the AsyncFlusher thread that there are new 
> writes to hdfs that need a sync);
>  4> A single AsyncFlusher thread is responsible for issuing a sync to hdfs 
> to persist the writes made by AsyncWriter (it notifies the AsyncNotifier 
> thread that the sync watermark has increased);
>  5> A single AsyncNotifier thread is responsible for notifying all pending 
> put handler threads that are waiting in the HLog.syncer() function;
>  6> There is no LogSyncer thread any more (since the 
> AsyncWriter/AsyncFlusher threads always do the same job it did).
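
For readers trying to picture steps 1 through 5 of the quoted description, here is a minimal, self-contained sketch of the hand-off between the put handlers and the three background threads. It is NOT the code from the attached patches. Every class, field, and method name below is hypothetical, the "hdfs" write is an in-memory list, and the "sync" is only a counter. It just shows how one sync can cover many buffered edits while handlers wait on a watermark:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of the handler -> AsyncWriter -> AsyncFlusher -> AsyncNotifier hand-off. */
public class WalPipelineSketch {
    interface Loop { void run() throws InterruptedException; }

    private final Object bufferLock = new Object();
    private List<String> pending = new ArrayList<>();        // local pending buffer (bufferLock)
    private long appendedTxid = 0;                           // guarded by bufferLock
    private final Object writeLock = new Object();
    private long writtenTxid = 0;                            // guarded by writeLock
    private final Object flushLock = new Object();
    private long flushedTxid = 0;                            // guarded by flushLock
    private final Object syncLock = new Object();
    private long syncedTxid = 0;                             // guarded by syncLock
    private final List<String> fakeLog = new ArrayList<>();  // stands in for the hdfs writer
    private final AtomicLong syncCalls = new AtomicLong();   // counts simulated hdfs syncs

    /** Step 1: a put handler buffers its edit and wakes the writer thread. */
    public long append(String edit) {
        synchronized (bufferLock) {
            pending.add(edit);
            bufferLock.notifyAll();
            return ++appendedTxid;
        }
    }

    /** Step 2: a put handler blocks until the sync watermark covers its txid. */
    public void sync(long txid) throws InterruptedException {
        synchronized (syncLock) { while (syncedTxid < txid) syncLock.wait(); }
    }

    /** Step 3: drain the whole pending buffer at once and "write" it. */
    private void writerLoop() throws InterruptedException {
        while (true) {
            List<String> batch;
            long upTo;
            synchronized (bufferLock) {
                while (pending.isEmpty()) bufferLock.wait();
                batch = pending;
                pending = new ArrayList<>();
                upTo = appendedTxid;
            }
            synchronized (fakeLog) { fakeLog.addAll(batch); }
            synchronized (writeLock) { writtenTxid = upTo; writeLock.notifyAll(); }
        }
    }

    /** Step 4: issue one simulated sync for everything written so far. */
    private void flusherLoop() throws InterruptedException {
        long done = 0;
        while (true) {
            synchronized (writeLock) {
                while (writtenTxid <= done) writeLock.wait();
                done = writtenTxid;
            }
            syncCalls.incrementAndGet();
            synchronized (flushLock) { flushedTxid = done; flushLock.notifyAll(); }
        }
    }

    /** Step 5: publish the new watermark and wake all waiting handlers. */
    private void notifierLoop() throws InterruptedException {
        long done = 0;
        while (true) {
            synchronized (flushLock) {
                while (flushedTxid <= done) flushLock.wait();
                done = flushedTxid;
            }
            synchronized (syncLock) { syncedTxid = done; syncLock.notifyAll(); }
        }
    }

    private static void startDaemon(String name, Loop loop) {
        Thread t = new Thread(() -> {
            try { loop.run(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }, name);
        t.setDaemon(true);
        t.start();
    }

    public int loggedEdits() { synchronized (fakeLog) { return fakeLog.size(); } }
    public long syncCalls() { return syncCalls.get(); }

    /** Runs the given number of handler threads, each doing append + sync per edit. */
    public static WalPipelineSketch runHandlers(int threads, int editsPerThread) {
        WalPipelineSketch wal = new WalPipelineSketch();
        startDaemon("AsyncWriter", wal::writerLoop);
        startDaemon("AsyncFlusher", wal::flusherLoop);
        startDaemon("AsyncNotifier", wal::notifierLoop);
        List<Thread> handlers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread h = new Thread(() -> {
                try {
                    for (int i = 0; i < editsPerThread; i++) wal.sync(wal.append("t" + id + "-" + i));
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
            handlers.add(h);
            h.start();
        }
        try { for (Thread h : handlers) h.join(); }
        catch (InterruptedException e) { throw new IllegalStateException(e); }
        return wal;
    }

    public static void main(String[] args) {
        WalPipelineSketch wal = runHandlers(5, 1000);
        System.out.println("edits=" + wal.loggedEdits() + " syncs=" + wal.syncCalls());
    }
}
```

With several handlers running, the number of simulated syncs is at most the number of edits and is usually lower, since one pass of the flusher covers every edit the writer has drained since the previous pass. That batching is the effect the description attributes to the new model. Whether it helps against real hdfs sync latency is exactly what the benchmark runs are meant to show.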

