[
https://issues.apache.org/jira/browse/HBASE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15624913#comment-15624913
]
Duo Zhang commented on HBASE-16890:
-----------------------------------
Another difference is that the old implementation did not increase the txid
when syncing. Now we use the sequence of the RingBuffer as the txid, so a sync
request also increases the txid. We therefore need a new algorithm to determine
whether a sync has already finished.
The new logic is in the AsyncFSWAL.finishSync method. It works like this:
1. If highestSyncedTxid is less than highestProcessedAppendTxid, it works the
old way: only finish the sync requests whose txid is less than
highestSyncedTxid, because we still have unacked appends.
2. If highestSyncedTxid is greater than or equal to highestProcessedAppendTxid,
it means that all outstanding appends have been acked.
  a. waitingAppendEntries is also empty. This means no append is pending ahead
  of any of the sync requests we currently track, so just finish them all and
  increase highestSyncedTxid if possible.
  b. waitingAppendEntries is not empty. Let lowestUnprocessedAppendTxid = "the
  txid of the first append in waitingAppendEntries". Then we can be sure that
  there is no append in the range (highestProcessedAppendTxid,
  lowestUnprocessedAppendTxid), so we can finish all the sync requests in this
  range and set highestSyncedTxid to lowestUnprocessedAppendTxid - 1.
I ran TestAsyncFSWAL.testSyncNoAppend locally, and the algorithm works well.
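
The cases above can be sketched in Java roughly as follows. This is a
simplified, hypothetical model and not the actual AsyncFSWAL code: the class
name FinishSyncSketch, the helper finishSyncsBelow, and the two deques of bare
txids are assumptions made for illustration. The strict comparisons follow the
wording of the cases above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical, simplified model of the bookkeeping described above. */
class FinishSyncSketch {
    long highestSyncedTxid;           // highest txid known to be synced
    long highestProcessedAppendTxid;  // highest append txid already processed
    // Txids of appends that have not been processed yet, in ascending order.
    final Deque<Long> waitingAppendTxids = new ArrayDeque<>();
    // Txids of sync requests that have not been finished yet, in ascending order.
    final Deque<Long> pendingSyncTxids = new ArrayDeque<>();

    /** Finishes every pending sync whose txid is below the bound; returns the count. */
    private int finishSyncsBelow(long exclusiveBound) {
        int finished = 0;
        while (!pendingSyncTxids.isEmpty() && pendingSyncTxids.peekFirst() < exclusiveBound) {
            pendingSyncTxids.pollFirst();
            finished++;
        }
        return finished;
    }

    /** Returns the number of sync requests finished by this call. */
    int finishSync() {
        if (highestSyncedTxid < highestProcessedAppendTxid) {
            // Case 1: some appends are still unacked, so behave the old way.
            return finishSyncsBelow(highestSyncedTxid);
        }
        if (waitingAppendTxids.isEmpty()) {
            // Case 2a: no append is pending, so every tracked sync can finish.
            int finished = pendingSyncTxids.size();
            if (finished > 0) {
                long lastSyncTxid = pendingSyncTxids.peekLast();
                highestSyncedTxid = Math.max(highestSyncedTxid, lastSyncTxid);
                pendingSyncTxids.clear();
            }
            return finished;
        }
        // Case 2b: only syncs issued before the first waiting append can finish.
        long lowestUnprocessedAppendTxid = waitingAppendTxids.peekFirst();
        highestSyncedTxid = lowestUnprocessedAppendTxid - 1;
        return finishSyncsBelow(lowestUnprocessedAppendTxid);
    }

    public static void main(String[] args) {
        FinishSyncSketch wal = new FinishSyncSketch();
        wal.highestSyncedTxid = 5;
        wal.highestProcessedAppendTxid = 5;
        wal.pendingSyncTxids.add(6L);
        wal.pendingSyncTxids.add(7L);
        wal.pendingSyncTxids.add(9L);
        wal.waitingAppendTxids.add(8L);  // an append is waiting between the syncs
        int finished = wal.finishSync();
        System.out.println("finished=" + finished
            + " highestSyncedTxid=" + wal.highestSyncedTxid);
    }
}
```

In the main method the state falls under case 2b: the syncs with txid 6 and 7
are finished, highestSyncedTxid becomes 7, and the sync with txid 9 stays
pending because the append with txid 8 has not been processed yet.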
> Analyze the performance of AsyncWAL and fix the same
> ----------------------------------------------------
>
> Key: HBASE-16890
> URL: https://issues.apache.org/jira/browse/HBASE-16890
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Affects Versions: 2.0.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: AsyncWAL_disruptor.patch, AsyncWAL_disruptor_1
> (2).patch, HBASE-16890-remove-contention.patch, Screen Shot 2016-10-25 at
> 7.34.47 PM.png, Screen Shot 2016-10-25 at 7.39.07 PM.png, Screen Shot
> 2016-10-25 at 7.39.48 PM.png, async.svg, classic.svg, contention.png,
> contention_defaultWAL.png
>
>
> Tests reveal that AsyncWAL under load in a single-node cluster performs slower
> than the Default WAL. This task is to analyze the cause and see if we can fix it.
> See the discussion at the tail of JIRA HBASE-15536.