Hi, 

On 05/17/2010, at 11:50 PM, Todd Lipcon wrote:

> 2010/5/16 Tatsuya Kawano <tatsuya6...@gmail.com>
> 
>> 2. On Hadoop trunk, I'd prefer not to hflush() every single put, but rely
>> on un-flushed replicas on HDFS nodes, so I can avoid the performance penalty.
>> Will this still be durable? Will HMaster see un-flushed appends right after a
>> region server failure?
>> 
>> 
> If you don't call hflush(), you can still lose edits up to the last block
> boundary, since hflush is required to persist block locations to the
> namenode.
> 
> hflush() does *not* sync to disk - it just makes sure that the edits are in
> memory on all of the replicas.
> 
> I have some patches staged for CDH3 that will also make the performance of
> this quite competitive by pipelining hflushes - basically it has little to
> no effect on throughput, but only a few ms penalty on each write.


Thanks Todd. I thought hflush() synced to disk, but I was wrong. The patches 
you've staged for CDH3 sound like exactly what I wanted!

Are those patches already in the current CDH3 beta?
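
Just to check my own understanding of the semantics you described, here is a 
rough sketch I put together (not HLog code; it assumes Hadoop trunk / 0.21+, 
where FSDataOutputStream exposes hflush(), and the path is just an example):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HflushSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    FSDataOutputStream out = fs.create(new Path("/tmp/hlog-sketch"));
    out.writeBytes("put row1/cf:q1=v1\n");  // stand-in for a WAL edit

    // Without this call, edits can be lost up to the last block boundary,
    // since block locations only reach the namenode on hflush()/close().
    out.hflush();   // pushes the edit into the replicas' memory; does NOT force it to disk
    // out.hsync(); // would additionally force the edit to disk on each datanode

    out.close();
  }
}

So as I understand it, an hflush()'d edit survives a region server crash, but 
a simultaneous crash of all replica nodes could still lose it.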



On 05/17/2010, at 2:22 PM, Ryan Rawson wrote:

> 2010/5/16 Tatsuya Kawano <tatsuya6...@gmail.com>:
>> 1. On Hadoop 0.20.x (without the HDFS-200 patch), I must close the HLog to make its
>> entries durable, right? While rolling the HLog does this, how about a region
>> server failure?
> 
> The problem is that during failure how do you execute user code?  If
> the JVM segfaults hard, we have no opportunity to execute Java code.

Thanks Ryan. That's right. The OS can also crash from a hardware failure (memory, CPU), 
and the network can be disconnected at any time. In those cases, we don't have any 
opportunity to execute Java code. 

Is there anything the data node can do after detecting a client timeout? 

And how many edits could I lose? If a log roll never happens, could it be up to 
dfs.block.size (64 MB by default)? 
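
For context, here is the back-of-the-envelope I'm doing (a sketch of my own; 
the property names and defaults are quoted from memory, so treat them as 
assumptions):

import org.apache.hadoop.conf.Configuration;

public class LossWindowSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // HDFS block size: 64 MB (67108864 bytes) by default in 0.20.
    long blockSize = conf.getLong("dfs.block.size", 67108864L);

    // HBase rolls the HLog periodically; the default period is one hour (I think).
    long rollPeriodMs = conf.getLong("hbase.regionserver.logroll.period", 3600000L);

    // My worry: without HDFS-200 / sync, un-closed HLog data only seems safe
    // up to the last completed block, so the worst case looks bounded by one
    // block of edits, or by whatever accumulates before the next roll.
    System.out.printf("Up to %d bytes of edits per block, roll every %d ms%n",
        blockSize, rollPeriodMs);
  }
}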


Thanks, 
Tatsuya

-- 
河野 達也
Tatsuya Kawano (mr.)
Tokyo, Japan

