2010/5/16 Tatsuya Kawano <tatsuya6...@gmail.com>

> 2. On Hadoop trunk, I'd prefer not to hflush() every single put, but rely
> on un-flushed replicas on HDFS nodes, so I can avoid the performance penalty.
> Will this still be durable? Will HMaster see un-flushed appends right after a
> region server failure?
>
>
If you don't call hflush(), you can still lose edits up to the last block
boundary, since hflush() is required to persist block locations to the
namenode.

hflush() does *not* sync to disk - it just makes sure that the edits are in
memory on all of the replicas.
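
To make the flush-versus-sync distinction concrete, here is a rough
local-file analogy in plain Java. It is not HDFS code and says nothing about
how the DFS client is implemented: the class name FlushDemo and the temp file
are made up for illustration. The assumption is only that a local
BufferedOutputStream behaves loosely like an un-hflushed writer (data sits in
the writer's buffer), flush() loosely like hflush() (data is handed off and
readers can see it, but nothing is forced to disk), and FileDescriptor.sync()
like a true disk sync.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FlushDemo {
    /**
     * Writes one small "edit" to a temp file and returns the file size a
     * reader would observe at three points:
     * {before flush, after flush, after sync}.
     */
    public static long[] run() {
        try {
            Path log = Files.createTempFile("edits", ".log");
            try (FileOutputStream file = new FileOutputStream(log.toFile());
                 BufferedOutputStream out = new BufferedOutputStream(file, 8192)) {
                out.write("edit1".getBytes(StandardCharsets.US_ASCII));
                // The edit is still in the writer's buffer: a reader sees nothing,
                // and a writer crash here would lose it.
                long beforeFlush = Files.size(log);

                // flush() hands the bytes to the OS. Readers can now see them,
                // but they are not guaranteed to be on disk yet.
                out.flush();
                long afterFlush = Files.size(log);

                // sync() asks the OS to force the bytes to the storage device.
                file.getFD().sync();
                long afterSync = Files.size(log);

                return new long[] {beforeFlush, afterFlush, afterSync};
            } finally {
                Files.deleteIfExists(log);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = run();
        System.out.println(sizes[0] + " " + sizes[1] + " " + sizes[2]);
    }
}
```

The size only changes at the flush() call, which is the point of the analogy:
visibility comes from the flush, durability on disk is a separate step.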

I have some patches staged for CDH3 that will also make the performance of
this quite competitive by pipelining hflushes - basically, pipelining has
little to no effect on throughput and adds only a few ms of latency to each
write.

-Todd


-- 
Todd Lipcon
Software Engineer, Cloudera
