[jira] [Commented] (HBASE-16689) Durability == ASYNC_WAL means no SYNC

stack (JIRA) Thu, 22 Sep 2016 21:38:33 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-16689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515385#comment-15515385
 ]


stack commented on HBASE-16689:
-------------------------------

h1. History

We committed the below in time for 0.98.0:

Author: Michael Stack <[email protected]>
Date:   Fri Dec 13 17:32:09 2013 +0000

    HBASE-8755 A new write thread model for HLog to improve the overall HBase write throughput

    git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1550778 13f79535-47bb-0310-9956-ffa450edef68

It removed the LogSyncer thread because it didn't fit the new model. From the 
comments in the above issue:

+   * 6). No LogSyncer thread any more (since there is always AsyncWriter/AsyncFlusher threads
+   *     do the same job it does)

And from reviews:

{code}
[stack] How does deferred log flush still work when you remove stuff like 
optionalFlushInterval? You say '...don't pend on HLog.syncer() waiting for its 
txid to be sync-ed' but that is another behavior than what we had here 
previously.
===> When say 'still support deferred log flush' I mean for 'deferred log 
flush' it can still response write success to client without wait/pend on 
syncer(txid),
in this sense, the AsyncWriter/AsyncSyncer do what the previous LogSyncer does 
from the point view of the write handler threads: clients don't wait for the 
write persist before get reponse success.
{code}

The above got further clarification over in HBASE-10324:

{code}
"By the new write thread model introduced by HBASE-8755, some 
deferred-log-flush/Durability API/code/names should be change accordingly:
1. no timer-triggered deferred-log-flush since flush is always done by async 
threads, so configuration 'hbase.regionserver.optionallogflushinterval' is no 
longer needed
2. the async writer-syncer-notifier threads will always be triggered 
implicitly, this semantic is that it always holds that 
'hbase.regionserver.optionallogflushinterval' > 0, so deferredLogSyncDisabled 
in HRegion.java which affects durability behavior should always be false
3. what HTableDescriptor.isDeferredLogFlush really means is the write can 
return without waiting for the sync is done, so the interface name should be 
changed to isAsyncLogFlush/setAsyncLogFlush to reflect their real meaning"
{code}

Reading the patch, we simply always synced. There was no support for deferred 
log flush.

In 1.0.0, a new WAL refactor was brought in by HBASE-10156. It removed all 
vestiges of deferred log flush, which weren't working anyway. But it also 
changed the model: it added support for Durability, with the action taken 
depending on how durability is set. ASYNC_WAL became a pass-through for sync, 
that is, no sync is done.
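
To make the consequence concrete, here is a toy model of the behavior described above. It is a sketch under stated assumptions, not HBase code: ToyWal, ToyDurability and their methods are hypothetical names made up for illustration, and the two lists stand in for RS-side buffers and data that has reached the DN.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, not HBase code. All names here are hypothetical.
enum ToyDurability { SYNC_WAL, ASYNC_WAL }

class ToyWal {
    // Edits appended but not yet synced (stand-in for RS-side buffers).
    private final List<String> buffered = new ArrayList<>();
    // Edits that a sync has pushed out (stand-in for data on the DN).
    private final List<String> durable = new ArrayList<>();

    void append(String edit) {
        buffered.add(edit);
    }

    // A sync pushes out every outstanding append, whichever table it came from.
    void sync() {
        durable.addAll(buffered);
        buffered.clear();
    }

    // Models the behavior described above: ASYNC_WAL appends but never asks for a sync.
    void write(String edit, ToyDurability durability) {
        append(edit);
        if (durability == ToyDurability.SYNC_WAL) {
            sync();
        }
    }

    int bufferedCount() { return buffered.size(); }

    int durableCount() { return durable.size(); }
}
```

In this model, writes made with ASYNC_WAL stay in the buffered list until some other write with SYNC_WAL comes along and syncs. That is the same effect as a sync from another table on a shared WAL pushing out ASYNC_WAL appends.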

> Durability == ASYNC_WAL means no SYNC
> -------------------------------------
>
>                 Key: HBASE-16689
>                 URL: https://issues.apache.org/jira/browse/HBASE-16689
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 1.0.3, 1.1.6, 1.2.3
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>
> Setting DURABILITY=ASYNC_WAL on a Table suspends all syncs for all Table 
> appends. If all tables on a cluster have this setting, data is flushed 
> from the RS to the DN at some arbitrary time, and a bunch may just hang out in 
> DFSClient buffers on the RS side indefinitely if writes are sporadic, at 
> least until there is a WAL roll -- a log roll sends a sync through the write 
> pipeline to flush out any outstanding appends -- or a region close, which does 
> similar, or we crash and drop the data in the RS-side buffers.
> This is probably not what a user expects when they set ASYNC_WAL (I could not 
> find anywhere that we clearly document what ASYNC_WAL means). Worse, old-time 
> users probably associate ASYNC_WAL with DEFERRED_FLUSH, an old 
> HTableDescriptor config that was deprecated and replaced by ASYNC_WAL. 
> DEFERRED_FLUSH ran a background thread -- LogSyncer -- that, on a configurable 
> interval, sent a sync down the write pipeline so any outstanding appends 
> since the start of the last interval got pushed out to the DN. ASYNC_WAL doesn't 
> do this (see below for history on how we let go of the LogSyncer feature).
> Of note, we always sync meta edits. You can't turn this off. Also, given WALs 
> are per regionserver, if other regions on the RS are from tables that have 
> sync set, these writes will push out to the DN any appends done on tables 
> that have DEFERRED/ASYNC_WAL set.
> To fix, we could do a few things:
>  * Simple and comprehensive would be always queuing a sync, even if ASYNC_WAL 
> is set but we let go of Handlers as soon as we write the memstore -- we don't 
> wait on the sync to complete as we do with the default setting of 
> Durability=SYNC_WAL.
>  * Be like a 'real' database and add in a sync after N bytes of data have 
> been appended (configurable) or after M milliseconds have passed, whichever 
> threshold happens first. The size check would be easy. The sync-every-M-millis 
> would mean another thread.
> Let me take a look and report back. Will file a bit of history on how we got 
> here in next comment.
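
As an illustration of the second option in the quoted description (sync after N bytes or M milliseconds, whichever comes first), here is a minimal sketch of the threshold check alone. It is not HBase code and not a proposed patch: SyncThreshold, its fields and its methods are hypothetical, and the caller passes in the current time so the logic does not depend on any particular clock or thread.

```java
// Hypothetical sketch, not HBase code: decides when a deferred sync is due.
class SyncThreshold {
    private final long maxBytes;   // "N bytes" threshold
    private final long maxMillis;  // "M milliseconds" threshold
    private long pendingBytes = 0;
    private long firstPendingAtMillis = -1; // -1 means nothing is outstanding

    SyncThreshold(long maxBytes, long maxMillis) {
        this.maxBytes = maxBytes;
        this.maxMillis = maxMillis;
    }

    // Record an append; returns true if the caller should queue a sync now.
    boolean onAppend(long bytes, long nowMillis) {
        if (firstPendingAtMillis < 0) {
            firstPendingAtMillis = nowMillis;
        }
        pendingBytes += bytes;
        return isDue(nowMillis);
    }

    // Intended to be polled by a timer thread to cover the sporadic-write case.
    boolean isDue(long nowMillis) {
        if (firstPendingAtMillis < 0) {
            return false; // nothing outstanding, nothing to sync
        }
        return pendingBytes >= maxBytes
            || nowMillis - firstPendingAtMillis >= maxMillis;
    }

    // Reset once a sync has gone through the write pipeline.
    void onSync() {
        pendingBytes = 0;
        firstPendingAtMillis = -1;
    }
}
```

The size check rides along on the append path, while the time check needs something to call isDue periodically, which is the "another thread" cost mentioned in the description.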



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
