stack updated HBASE-16689:
    Fix Version/s:     (was: 2.0.0)

> Durability == ASYNC_WAL means no SYNC
> -------------------------------------
>                 Key: HBASE-16689
>                 URL: https://issues.apache.org/jira/browse/HBASE-16689
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 1.0.3, 1.1.6, 1.2.3
>         Environment: At least get the above doc into the refguide.
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
> Setting DURABILITY=ASYNC_WAL on a Table suspends all syncs for all 
> Table appends. If all tables on a cluster have this setting, data is flushed 
> from the RS to the DN at some arbitrary time and a bunch may just hang out in 
> DFSClient buffers on the RS-side indefinitely if writes are sporadic, at 
> least until there is a WAL roll -- a log roll sends a sync through the write 
> pipeline to flush out any outstanding appends -- or a region close which does 
> similar.... or we crash and drop the data in the RS buffers.
> This is probably not what a user expects when they set ASYNC_WAL (We don't 
> doc anywhere that I could find clearly what ASYNC_WAL means). Worse, old-time 
> users probably associate ASYNC_WAL and DEFERRED_FLUSH, an old 
> HTableDescriptor config that was deprecated and replaced by ASYNC_WAL. 
> DEFERRED_FLUSH ran a background thread -- LogSyncer -- that on a configurable 
> interval, sent a sync down the write pipeline so any outstanding appends 
> since the last interval start get pushed out to the DN.  ASYNC_WAL doesn't 
> do this (see below for history on how we let go of the LogSyncer feature).
> Of note, we always sync meta edits. You can't turn this off. Also, given WALs 
> are per regionserver, if other regions on the RS are from tables that have 
> sync set, these writes will push out to the DN any appends done on tables 
> that have DEFERRED/ASYNC_WAL set.
> To fix, we could do a few things:
>  * Simple and comprehensive would be always queuing a sync, even if ASYNC_WAL 
> is set but we let go of Handlers as soon as we write the memstore -- we don't 
> wait on the sync to complete as we do with the default setting of 
> Durability=SYNC_WAL.
>  * Be like a 'real' database and add in a sync after N bytes of data have 
> been appended (configurable) or after M milliseconds have passed, whichever 
> threshold happens first. The size check would be easy. The sync-every-M-millis 
> would mean another thread.
>  * Doc what ASYNC_WAL means (and other durability options)
> Let me take a look and report back. Will file a bit of history on how we got 
> here in next comment.
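
To make the second option above concrete, here is a rough sketch of the "sync after N bytes or M milliseconds, whichever comes first" bookkeeping. This is not HBase code: the class DeferredSyncPolicy and its methods onAppend, onTick and onSync are hypothetical names made up for illustration, and the caller is assumed to pass in the clock and to issue the actual sync itself.

```java
/**
 * Hypothetical helper (not an HBase class): tracks unsynced appends and
 * says when a sync is due, by size (N bytes) or by age (M milliseconds).
 */
final class DeferredSyncPolicy {
    private final long maxUnsyncedBytes;    // N: sync once this many bytes are pending
    private final long maxDelayMillis;      // M: sync once the oldest pending append is this old
    private long unsyncedBytes = 0;
    private long oldestUnsyncedMillis = -1; // -1 means nothing is pending

    DeferredSyncPolicy(long maxUnsyncedBytes, long maxDelayMillis) {
        this.maxUnsyncedBytes = maxUnsyncedBytes;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** Record an append; returns true if the size threshold now calls for a sync. */
    synchronized boolean onAppend(long bytes, long nowMillis) {
        if (oldestUnsyncedMillis < 0) {
            oldestUnsyncedMillis = nowMillis; // first append since the last sync
        }
        unsyncedBytes += bytes;
        return unsyncedBytes >= maxUnsyncedBytes;
    }

    /** Called periodically by the extra thread; true if the time threshold is hit. */
    synchronized boolean onTick(long nowMillis) {
        return oldestUnsyncedMillis >= 0
            && nowMillis - oldestUnsyncedMillis >= maxDelayMillis;
    }

    /** Call once a sync has been sent down the write pipeline. */
    synchronized void onSync() {
        unsyncedBytes = 0;
        oldestUnsyncedMillis = -1;
    }
}
```

The size check rides along on the append path, so only the time check needs the extra thread calling onTick. Handlers would not wait on either sync, which keeps the ASYNC_WAL latency profile while bounding how long appends can sit in buffers.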

This message was sent by Atlassian JIRA
