Inline.

J-D
On Mon, Jan 11, 2010 at 8:12 PM, Joydeep Sarma <jssa...@apache.org> wrote:
> ok - hadn't thought about it that way - but yeah with a default of 1 -
> the semantics seem correct.
>
> under high load - some batching would automatically happen at this
> setting (or so one would think - not sure if hdfs appends are blocked
> on pending syncs (in which case the batching wouldn't quite happen i
> think) - cc'ing Dhruba).

Yes, this is our version of group commit.

>
> if the performance with setting of 1 doesn't work out - we may need an
> option to delay acks until actual syncs .. (most likely we would be
> able to compromise on latency to get higher throughput - but wouldn't
> be willing to compromise on data integrity)

Good idea. We don't currently support that feature, although we have the
opposite running by default, which is deferred log flush. Tables with
deferred log flush are never sync'ed explicitly; they rely on the
LogSyncer thread's awaitNanos timeout (configurable) or on syncs
triggered by tables that are highly durable (non-deferred). In our
opinion, a cluster with a healthy mix of deferred and non-deferred
tables still guarantees a very high level of durability with the default
setting.
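
To make the interplay above concrete, here is a minimal sketch of group commit combined with deferred flush. It is not HBase's actual HLog/LogSyncer code: the class GroupCommitSketch and all of its methods and fields are hypothetical names made up for illustration, and the "sync" is an empty placeholder where a real implementation would call into HDFS. It assumes a single background thread that syncs either when a non-deferred edit asks for it or when a configurable awaitNanos timeout expires.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch only; names and structure are assumptions, not HBase code. */
final class GroupCommitSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition syncRequested = lock.newCondition();
    private final Condition syncDone = lock.newCondition();
    private final long flushIntervalNanos;
    private long appended = 0;   // edits appended so far
    private long synced = 0;     // edits covered by the last completed sync
    private long syncCalls = 0;  // number of syncs actually performed
    private boolean requested = false;
    private boolean running = true;

    GroupCommitSketch(long flushIntervalMillis) {
        this.flushIntervalNanos = TimeUnit.MILLISECONDS.toNanos(flushIntervalMillis);
        Thread syncer = new Thread(this::syncLoop, "log-syncer-sketch");
        syncer.setDaemon(true);
        syncer.start();
    }

    /** Deferred edits return at once; non-deferred edits wait for a covering sync. */
    void append(boolean deferred) {
        lock.lock();
        try {
            long myEdit = ++appended;
            if (deferred) return;
            requested = true;
            syncRequested.signal();
            while (synced < myEdit) syncDone.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.unlock();
        }
    }

    private void syncLoop() {
        lock.lock();
        try {
            while (running) {
                long remaining = flushIntervalNanos;
                // Wake up on an explicit request or when the timeout expires.
                while (running && !requested && remaining > 0) {
                    remaining = syncRequested.awaitNanos(remaining);
                }
                requested = false;
                if (synced == appended) continue;  // nothing new to sync
                long target = appended;            // one sync covers every edit up to here
                lock.unlock();
                try {
                    // Placeholder: a real log would sync to the filesystem here.
                    // Edits appended meanwhile are picked up by the next pass.
                } finally {
                    lock.lock();
                }
                synced = target;
                syncCalls++;
                syncDone.signalAll();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.unlock();
        }
    }

    /** Waits until at least n edits are synced; returns false on timeout. */
    boolean awaitSynced(long n, long timeoutMillis) {
        long remaining = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (synced < n) {
                if (remaining <= 0) return false;
                remaining = syncDone.awaitNanos(remaining);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    long syncedCount() { lock.lock(); try { return synced; } finally { lock.unlock(); } }
    long syncCallCount() { lock.lock(); try { return syncCalls; } finally { lock.unlock(); } }

    /** Stops the syncer; in this sketch, pending deferred edits are simply dropped. */
    void close() { lock.lock(); try { running = false; syncRequested.signal(); } finally { lock.unlock(); } }
}
```

In this sketch, a deferred edit is made durable either by the timeout or by piggybacking on the sync that a later non-deferred edit triggers, which is the "healthy mix" effect described above. An option to delay acks until actual syncs, as Joydeep suggests, would roughly amount to having deferred callers also wait on syncDone without signalling syncRequested.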