stack updated HBASE-16689:
Fix Version/s: (was: 2.0.0)
> Durability == ASYNC_WAL means no SYNC
> Key: HBASE-16689
> URL: https://issues.apache.org/jira/browse/HBASE-16689
> Project: HBase
> Issue Type: Bug
> Components: wal
> Affects Versions: 1.0.3, 1.1.6, 1.2.3
> Environment: At least get the above doc into the refguide.
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Setting DURABILITY=ASYNC_WAL on a Table suspends all syncs for all appends to
> that Table. If all tables on a cluster have this setting, data is flushed
> from the RS to the DN at some arbitrary time and a bunch may just hang out in
> DFSClient buffers on the RS-side indefinitely if writes are sporadic, at
> least until there is a WAL roll -- a log roll sends a sync through the write
> pipeline to flush out any outstanding appends -- or a region close which does
> similar... or we crash and drop the data in the RS buffers.
> This is probably not what a user expects when they set ASYNC_WAL (We don't
> doc anywhere that I could find clearly what ASYNC_WAL means). Worse, old-time
> users probably associate ASYNC_WAL and DEFERRED_FLUSH, an old
> HTableDescriptor config that was deprecated and replaced by ASYNC_WAL.
> DEFERRED_FLUSH ran a background thread -- LogSyncer -- that on a configurable
> interval, sent a sync down the write pipeline so any outstanding appends
> since the last interval start get pushed out to the DN. ASYNC_WAL doesn't
> do this (see below for history on how we let go of the LogSyncer feature).
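For illustration only, the LogSyncer-style behaviour described above could look roughly like the sketch below. This is not the old HBase code: the class name PeriodicSyncer, its methods, and the counters are all hypothetical, and the place where a real sync would be sent down the write pipeline is only marked with a comment.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a LogSyncer-style background syncer; not HBase code. */
final class PeriodicSyncer implements AutoCloseable {
    // Appends recorded since the last sync was issued.
    private final AtomicLong unsyncedAppends = new AtomicLong();
    // Number of syncs issued so far (for observation only).
    private final AtomicLong syncCount = new AtomicLong();
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-syncer");
            t.setDaemon(true); // do not keep the JVM alive for the syncer
            return t;
        });

    PeriodicSyncer(long intervalMillis) {
        // Every intervalMillis, push out whatever has been appended since the last sync.
        scheduler.scheduleWithFixedDelay(
            this::syncIfNeeded, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Called by writers after an append that did not wait on a sync. */
    void append() {
        unsyncedAppends.incrementAndGet();
    }

    /** Issues a sync only if there are outstanding appends; returns true if it did. */
    boolean syncIfNeeded() {
        if (unsyncedAppends.getAndSet(0) > 0) {
            // Assumption: the real sync down the write pipeline would go here.
            syncCount.incrementAndGet();
            return true;
        }
        return false;
    }

    long syncCount() {
        return syncCount.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The point of the sketch is that writers never block on the sync, yet no append sits unsynced for much longer than one interval.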
> Of note, we always sync meta edits. You can't turn this off. Also, given WALs
> are per regionserver, if other regions on the RS are from tables that have
> sync set, these writes will push out to the DN any appends done on tables
> that have DEFERRED/ASYNC_WAL set.
> To fix, we could do a few things:
> * Simple and comprehensive would be always queuing a sync, even if ASYNC_WAL
> is set but we let go of Handlers as soon as we write the memstore -- we don't
> wait on the sync to complete as we do with the default setting of
> * Be like a 'real' database and add in a sync after N bytes of data have
> been appended (configurable) or after M milliseconds have passed, whichever
> threshold happens first. The size check would be easy. The sync-every-M-millis
> would mean another thread.
> * Doc what ASYNC_WAL means (and other durability options)
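As a rough sketch of the second option above (sync after N bytes or M milliseconds, whichever comes first), the decision logic might look something like the following. Everything here is hypothetical: ThresholdSyncPolicy, its method names, and the caller-supplied clock are assumptions made for illustration, not existing HBase API or configuration.

```java
/** Hypothetical "sync after N bytes or M millis" policy; not HBase API. */
final class ThresholdSyncPolicy {
    private final long maxUnsyncedBytes;   // N: size threshold
    private final long maxUnsyncedMillis;  // M: age threshold
    private long unsyncedBytes;
    // Time of the oldest append not yet covered by a sync; -1 when nothing is outstanding.
    private long oldestUnsyncedAtMillis = -1;

    ThresholdSyncPolicy(long maxUnsyncedBytes, long maxUnsyncedMillis) {
        this.maxUnsyncedBytes = maxUnsyncedBytes;
        this.maxUnsyncedMillis = maxUnsyncedMillis;
    }

    /** Records an append and reports whether the caller should queue a sync now. */
    synchronized boolean onAppend(int bytes, long nowMillis) {
        if (oldestUnsyncedAtMillis < 0) {
            oldestUnsyncedAtMillis = nowMillis;
        }
        unsyncedBytes += bytes;
        return shouldSync(nowMillis);
    }

    /** True once either threshold is crossed; a timer thread would poll this for the M-millis case. */
    synchronized boolean shouldSync(long nowMillis) {
        if (oldestUnsyncedAtMillis < 0) {
            return false; // nothing outstanding
        }
        return unsyncedBytes >= maxUnsyncedBytes
            || nowMillis - oldestUnsyncedAtMillis >= maxUnsyncedMillis;
    }

    /** Resets the counters once a sync has covered all outstanding appends. */
    synchronized void onSyncCompleted() {
        unsyncedBytes = 0;
        oldestUnsyncedAtMillis = -1;
    }
}
```

The size check runs inline on the append path. The time check only fires if something calls shouldSync when writes are sporadic, which is why the M-millis option implies another thread.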
> Let me take a look and report back. Will file a bit of history on how we got
> here in next comment.