[jira] [Commented] (HBASE-16890) Analyze the performance of AsyncWAL and fix the same

Duo Zhang (JIRA) Fri, 04 Nov 2016 08:29:43 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15636688#comment-15636688
 ]


Duo Zhang commented on HBASE-16890:
-----------------------------------

Ah, I can also observe the same result with a larger data set: FSHLog is 
faster. And I think I have found the direct cause.

The command is
{noformat}
./bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred 
--presplit=50 --size=50 --columns=50 --valueSize=200 --writeToWAL=true 
--bloomFilter=NONE randomWrite 50
{noformat}

When running with FSHLog, we flushed 702 times and the average flush size 
was 85.3227MB. For AsyncFSWAL with my patch, we flushed 850 times and the 
average flush size was 71.609MB.

Usually the flush is triggered by the log roller because there are too many 
WAL files.
{noformat}
2016-11-04 22:28:18,925 INFO  
[regionserver/c4-hadoop-build01.bj/10.132.4.49:16020.logRoller] 
wal.AbstractFSWAL: Too many WALs; count=33, max=32; forcing flush of 6 
regions(s): 7f18ef6867d0a36627930da34818069f, 
7fdd29e6e2e6be34b2ea97c9a06281d0, d12c296bd1cb70b2ce78e9a3bc914318, 
9207d10a0f22877079d3896d6cb6ebb2, d2b6ac38e6edf675225a71748fb1274e, 
ad371a623567b35a784256e4f05c5f3a
{noformat}
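
To count how often the log roller forces a flush, a message like the one 
above can be pulled apart with a small script. This is only a sketch: the 
regular expression is written against this single sample message (joined back 
onto one line), and the function name parse_forced_flush is made up for 
illustration.

```python
import re

# Pattern written against the one sample message quoted in this comment.
# Assumption: other HBase versions may word the message differently.
TOO_MANY_WALS = re.compile(
    r"Too many WALs; count=(\d+), max=(\d+); "
    r"forcing flush of (\d+) regions?\(s\): (.*)"
)


def parse_forced_flush(line):
    """Return (wal_count, wal_max, region_names), or None if no match."""
    m = TOO_MANY_WALS.search(line)
    if m is None:
        return None
    # Region names are comma-separated encoded region names.
    regions = [r.strip() for r in m.group(4).split(",") if r.strip()]
    return int(m.group(1)), int(m.group(2)), regions


# The log message from this comment, with the mail line wrapping removed.
sample = (
    "2016-11-04 22:28:18,925 INFO  "
    "[regionserver/c4-hadoop-build01.bj/10.132.4.49:16020.logRoller] "
    "wal.AbstractFSWAL: Too many WALs; count=33, max=32; "
    "forcing flush of 6 regions(s): "
    "7f18ef6867d0a36627930da34818069f, 7fdd29e6e2e6be34b2ea97c9a06281d0, "
    "d12c296bd1cb70b2ce78e9a3bc914318, 9207d10a0f22877079d3896d6cb6ebb2, "
    "d2b6ac38e6edf675225a71748fb1274e, ad371a623567b35a784256e4f05c5f3a"
)

parsed = parse_forced_flush(sample)
if parsed is not None:
    wal_count, wal_max, regions = parsed
    print(wal_count, wal_max, len(regions))
```

Running parse_forced_flush over every line of a region server log and summing 
the region counts would give a rough number of forced flushes per run.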

For FSHLog, we rolled 491 times with an average roll size of 130.329MB. 
For AsyncFSWAL with my patch, we rolled 584 times with an average roll size 
of 109.666MB.
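
As a quick sanity check, multiplying each count by its average size suggests 
that both runs moved about the same amount of data (roughly 60GB flushed and 
64GB rolled), so the difference is mostly in how often we flush and roll. The 
sketch below only redoes that arithmetic with the numbers quoted in this 
comment; the variable and function names are made up for illustration.

```python
# (count, average size in MB), copied from the numbers in this comment.
flushes = {"FSHLog": (702, 85.3227), "AsyncFSWAL": (850, 71.609)}
rolls = {"FSHLog": (491, 130.329), "AsyncFSWAL": (584, 109.666)}


def total_mb(count, avg_mb):
    """Total volume in MB implied by an event count and an average size."""
    return count * avg_mb


flush_totals = {name: total_mb(*v) for name, v in flushes.items()}
roll_totals = {name: total_mb(*v) for name, v in rolls.items()}

for name in flushes:
    print(
        f"{name}: flushed {flush_totals[name]:.1f}MB in {flushes[name][0]} "
        f"flushes, rolled {roll_totals[name]:.1f}MB in {rolls[name][0]} rolls"
    )

# How many more flushes and rolls AsyncFSWAL needed for a similar volume.
flush_ratio = flushes["AsyncFSWAL"][0] / flushes["FSHLog"][0]
roll_ratio = rolls["AsyncFSWAL"][0] / rolls["FSHLog"][0]
print(f"flush count ratio {flush_ratio:.2f}, roll count ratio {roll_ratio:.2f}")
```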

In general, the roll size of FSHLog is a little larger than that of 
AsyncFSWAL (which means AsyncFSWAL is a little faster when rolling?). But I 
think the main reason is that, for AsyncFSWAL, there are 78 rolls whose size 
is far below the expected roll size, between 10MB and 20MB. I think this is 
why we perform so badly in PE. We need to find out where the abnormal rolling 
comes from.
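
A rough estimate of how much the 78 small rolls drag down the AsyncFSWAL 
average can be sketched as follows. The exact sizes of those rolls are not 
listed here, so the code assumes they average somewhere between 10MB and 20MB; 
that assumption, and the function name, are for illustration only. Under it, 
the remaining rolls average roughly 123MB to 125MB, which is much closer to 
the FSHLog average of 130.329MB.

```python
def normal_roll_average(total_rolls, overall_avg_mb, small_rolls, small_avg_mb):
    """Average size of the rolls left after removing the small ones.

    small_avg_mb is an assumed average for the small rolls; the comment only
    says they fall between 10MB and 20MB.
    """
    total_mb = total_rolls * overall_avg_mb
    small_mb = small_rolls * small_avg_mb
    return (total_mb - small_mb) / (total_rolls - small_rolls)


# AsyncFSWAL numbers from this comment: 584 rolls, 109.666MB average,
# 78 of them abnormally small.
for assumed_small_avg in (10.0, 15.0, 20.0):
    avg = normal_roll_average(584, 109.666, 78, assumed_small_avg)
    print(f"small rolls at {assumed_small_avg:.0f}MB -> others average {avg:.2f}MB")
```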

Have you guys observed the same thing? [~ram_krish] [~anoopsamjohn].

Will go out for two days this weekend. Will come back to dig next Monday.

Thanks.

> Analyze the performance of AsyncWAL and fix the same
> ----------------------------------------------------
>
>                 Key: HBASE-16890
>                 URL: https://issues.apache.org/jira/browse/HBASE-16890
>             Project: HBase
>          Issue Type: Sub-task
>          Components: wal
>    Affects Versions: 2.0.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0
>
>         Attachments: AsyncWAL_disruptor.patch, AsyncWAL_disruptor_1 
> (2).patch, AsyncWAL_disruptor_3.patch, AsyncWAL_disruptor_3.patch, 
> AsyncWAL_disruptor_4.patch, AsyncWAL_disruptor_6.patch, 
> HBASE-16890-rc-v2.patch, HBASE-16890-rc-v3.patch, 
> HBASE-16890-remove-contention-v1.patch, HBASE-16890-remove-contention.patch, 
> Screen Shot 2016-10-25 at 7.34.47 PM.png, Screen Shot 2016-10-25 at 7.39.07 
> PM.png, Screen Shot 2016-10-25 at 7.39.48 PM.png, async.svg, classic.svg, 
> contention.png, contention_defaultWAL.png
>
>
> Tests reveal that AsyncWAL under load in single node cluster performs slower 
> than the Default WAL. This task is to analyze and see if we could fix it.
> See some discussions in the tail of JIRA HBASE-15536.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
