[ https://issues.apache.org/jira/browse/HBASE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ramkrishna.s.vasudevan updated HBASE-16890:
-------------------------------------------
Attachment: AsyncWAL_disruptor_3.patch
Updated patch. It fixes the failing test cases; all of them pass now.
The idea is to use the RingBuffer's sequence only to publish the event on
that sequence, and NOT to use it as the WAL's sequence id. Instead, the txid
is created in the RB's onEvent(), as was done previously. This releases the
handlers quickly and so removes the contention. The SyncFutures are no longer
constructed with a txid; it is set on them in the RB's onEvent().
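To make the idea concrete, here is a minimal sketch of the pattern, not the patch itself. It uses only the JDK: a `BlockingQueue` stands in for the Disruptor RingBuffer, `consume()` stands in for onEvent(), and `PendingSync` is a hypothetical stand-in for SyncFuture. All class and method names here are assumptions made for illustration; none of them come from HBase or the Disruptor API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

public class TxidInConsumerSketch {
    /** Hypothetical stand-in for a SyncFuture: created WITHOUT a txid. */
    static final class PendingSync {
        final CompletableFuture<Long> txid = new CompletableFuture<>();
    }

    // Stand-in for the ring buffer: handlers only enqueue and return.
    private final BlockingQueue<PendingSync> queue = new LinkedBlockingQueue<>();
    // Touched only by the single consumer thread, so a plain long is enough.
    private long nextTxid = 1;

    /** Handler side: publish the event and return without allocating a txid. */
    PendingSync publish() {
        PendingSync p = new PendingSync();
        queue.add(p);
        return p;
    }

    /** Consumer side (the onEvent() analogue): the txid is assigned here. */
    void consume(int count) throws InterruptedException {
        for (int i = 0; i < count; i++) {
            queue.take().txid.complete(nextTxid++);
        }
    }

    /** Runs some handler threads against one consumer; returns sorted txids. */
    static List<Long> run(int handlers, int perHandler) throws Exception {
        TxidInConsumerSketch wal = new TxidInConsumerSketch();
        int total = handlers * perHandler;
        Thread consumer = new Thread(() -> {
            try {
                wal.consume(total);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        List<PendingSync> all = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int h = 0; h < handlers; h++) {
            Thread t = new Thread(() -> {
                for (int i = 0; i < perHandler; i++) {
                    all.add(wal.publish());
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        consumer.join();

        List<Long> ids = new ArrayList<>();
        for (PendingSync p : all) {
            ids.add(p.txid.get());
        }
        Collections.sort(ids);
        return ids;
    }

    public static void main(String[] args) throws Exception {
        List<Long> ids = run(4, 1000);
        System.out.println("assigned " + ids.size() + " txids, last=" + ids.get(ids.size() - 1));
    }
}
```

Because only the consumer thread touches the counter, the handlers never contend on txid allocation; they pay only for the enqueue.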
Running WALPE:
{code}
./hbase org.apache.hadoop.hbase.wal.WALPerformanceEvaluation -threads 100 \
  -iterations 25000 -qualifiers 25 -keySize 50 -valueSize 200
{code}
With FSHLog
{code}
Summary: threads=100, iterations=25000, syncInterval=0 took 99.258s
25186.887ops/s
{code}
With Duo's patch
{code}
Summary: threads=100, iterations=25000, syncInterval=0 took 98.965s
25261.457ops/s
{code}
With AsyncWAL_disruptor_3.patch
{code}
Summary: threads=100, iterations=25000, syncInterval=0 took 85.893s
29105.982ops/s
{code}
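For comparing the three runs, the reported ops/s figures are consistent with threads * iterations / elapsed seconds. That formula is inferred from the numbers above, not taken from the WALPE source. The small sketch below recomputes the throughput and the relative gain, which comes out at roughly 15% over both FSHLog and Duo's patch. The class and method names are hypothetical.

```java
public class WalpeThroughput {
    /** Throughput assuming ops = threads * iterations (inferred, see lead-in). */
    static double opsPerSec(int threads, int iterations, double seconds) {
        return (double) threads * iterations / seconds;
    }

    /** Relative gain of candidate over baseline, in percent. */
    static double gainPercent(double baseline, double candidate) {
        return (candidate / baseline - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Elapsed times copied from the three Summary lines above.
        double fshlog = opsPerSec(100, 25000, 99.258);
        double duo = opsPerSec(100, 25000, 98.965);
        double disruptor3 = opsPerSec(100, 25000, 85.893);

        System.out.printf("FSHLog:      %.3f ops/s%n", fshlog);
        System.out.printf("Duo's patch: %.3f ops/s%n", duo);
        System.out.printf("disruptor_3: %.3f ops/s%n", disruptor3);
        System.out.printf("gain vs FSHLog: %.1f%%%n", gainPercent(fshlog, disruptor3));
        System.out.printf("gain vs Duo:    %.1f%%%n", gainPercent(duo, disruptor3));
    }
}
```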
[~Apache9] and [[email protected]]
Can you have a look at this?
In the meantime I will test with the PE tool on a cluster.
> Analyze the performance of AsyncWAL and fix the same
> ----------------------------------------------------
>
> Key: HBASE-16890
> URL: https://issues.apache.org/jira/browse/HBASE-16890
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Affects Versions: 2.0.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: AsyncWAL_disruptor.patch, AsyncWAL_disruptor_1
> (2).patch, AsyncWAL_disruptor_3.patch,
> HBASE-16890-remove-contention-v1.patch, HBASE-16890-remove-contention.patch,
> Screen Shot 2016-10-25 at 7.34.47 PM.png, Screen Shot 2016-10-25 at 7.39.07
> PM.png, Screen Shot 2016-10-25 at 7.39.48 PM.png, async.svg, classic.svg,
> contention.png, contention_defaultWAL.png
>
>
> Tests reveal that AsyncWAL under load in a single-node cluster performs
> slower than the default WAL. This task is to analyze and see if we can fix it.
> See some discussions in the tail of JIRA HBASE-15536.