[jira] [Commented] (HBASE-16890) Analyze the performance of AsyncWAL and fix the same

Duo Zhang (JIRA) Tue, 01 Nov 2016 02:30:03 -0700

    [ https://issues.apache.org/jira/browse/HBASE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15624913#comment-15624913 ]


Duo Zhang commented on HBASE-16890:
-----------------------------------

Another difference is that the old implementation did not increase the txid 
when syncing. Now we just use the RingBuffer sequence as the txid, so a sync 
request also increases the txid. This means we need a new algorithm to 
determine whether a sync has already finished.

The new logic is in the AsyncFSWAL.finishSync method. It works like this:

1. If highestSyncedTxid is less than highestProcessedAppendTxid, it works like 
the old way: only finish the sync requests whose txid is less than 
highestSyncedTxid, because we still have unacked appends.
2. If highestSyncedTxid is greater than or equal to highestProcessedAppendTxid, 
all outstanding appends have been acked.
  a. waitingAppendEntries is also empty. This means there is no pending append 
ahead of any sync request we currently track, so just finish them all and 
increase highestSyncedTxid if possible.
  b. waitingAppendEntries is not empty. Let lowestUnprocessedAppendTxid be the 
txid of the first append in waitingAppendEntries. We can then be sure that 
there is no append in the range (highestProcessedAppendTxid, 
lowestUnprocessedAppendTxid), so we can finish all the sync requests in this 
range and set highestSyncedTxid to lowestUnprocessedAppendTxid - 1.
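
To make the three cases concrete, here is a minimal sketch of the decision 
logic. It is an illustration only, not the actual AsyncFSWAL code: the class 
SyncTrackerSketch, the pendingSyncs list and the finishSyncsBelow helper are 
hypothetical names, everything is reduced to bare txids, and "increase 
highestSyncedTxid if possible" in case 2a is read as "raise it to the highest 
finished sync txid", which is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for illustration; this is not the real AsyncFSWAL.
final class SyncTrackerSketch {
  long highestSyncedTxid;
  long highestProcessedAppendTxid;
  // txids of appends that have not been processed yet, lowest first
  final Deque<Long> waitingAppendEntries = new ArrayDeque<>();
  // txids of sync requests that have not been finished yet (hypothetical name)
  final List<Long> pendingSyncs = new ArrayList<>();

  // Finishes every sync request that is safe to finish and returns their txids.
  List<Long> finishSync() {
    if (highestSyncedTxid < highestProcessedAppendTxid) {
      // Case 1: unacked appends remain, so behave like the old way.
      return finishSyncsBelow(highestSyncedTxid);
    }
    if (waitingAppendEntries.isEmpty()) {
      // Case 2a: no append is waiting, so every tracked sync can be finished.
      List<Long> finished = finishSyncsBelow(Long.MAX_VALUE);
      for (long txid : finished) {
        // Assumed reading of "increase highestSyncedTxid if possible".
        highestSyncedTxid = Math.max(highestSyncedTxid, txid);
      }
      return finished;
    }
    // Case 2b: no append lies between highestProcessedAppendTxid and the
    // first waiting append, so syncs below that append can be finished.
    long lowestUnprocessedAppendTxid = waitingAppendEntries.peekFirst();
    highestSyncedTxid = lowestUnprocessedAppendTxid - 1;
    return finishSyncsBelow(lowestUnprocessedAppendTxid);
  }

  // Removes and returns the pending syncs whose txid is strictly below the bound.
  private List<Long> finishSyncsBelow(long exclusiveBound) {
    List<Long> finished = new ArrayList<>();
    for (Iterator<Long> it = pendingSyncs.iterator(); it.hasNext();) {
      long txid = it.next();
      if (txid < exclusiveBound) {
        it.remove();
        finished.add(txid);
      }
    }
    return finished;
  }

  public static void main(String[] args) {
    // Case 2b example: appends up to txid 8 are acked, an append with
    // txid 11 is waiting, and syncs 9, 10 and 12 are tracked.
    SyncTrackerSketch wal = new SyncTrackerSketch();
    wal.highestSyncedTxid = 8;
    wal.highestProcessedAppendTxid = 8;
    wal.waitingAppendEntries.add(11L);
    wal.pendingSyncs.add(9L);
    wal.pendingSyncs.add(10L);
    wal.pendingSyncs.add(12L);
    System.out.println(wal.finishSync() + " " + wal.highestSyncedTxid);
  }
}
```

In the example in main, the syncs with txid 9 and 10 sit between the last 
acked append (8) and the first waiting append (11), so they can be finished, 
while the sync with txid 12 has to wait for the append with txid 11.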

I tried TestAsyncFSWAL.testSyncNoAppend locally. The algorithm here works well.

> Analyze the performance of AsyncWAL and fix the same
> ----------------------------------------------------
>
>                 Key: HBASE-16890
>                 URL: https://issues.apache.org/jira/browse/HBASE-16890
>             Project: HBase
>          Issue Type: Sub-task
>          Components: wal
>    Affects Versions: 2.0.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0
>
>         Attachments: AsyncWAL_disruptor.patch, AsyncWAL_disruptor_1 
> (2).patch, HBASE-16890-remove-contention.patch, Screen Shot 2016-10-25 at 
> 7.34.47 PM.png, Screen Shot 2016-10-25 at 7.39.07 PM.png, Screen Shot 
> 2016-10-25 at 7.39.48 PM.png, async.svg, classic.svg, contention.png, 
> contention_defaultWAL.png
>
>
> Tests reveal that AsyncWAL under load on a single-node cluster performs slower 
> than the default WAL. This task is to analyze this and see if we can fix it.
> See some of the discussion at the tail of JIRA HBASE-15536.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
