Re: HLog durability on the current and future Hadoop releases

Todd Lipcon Mon, 17 May 2010 18:45:21 -0700

2010/5/17 Tatsuya Kawano <tatsuya6...@gmail.com>

>
> Hi,
>
> On 05/17/2010, at 11:50 PM, Todd Lipcon wrote:
>
> > 2010/5/16 Tatsuya Kawano <tatsuya6...@gmail.com>
> >
> >> 2. On Hadoop trunk, I'd prefer not to hflush() every single put, but rely
> >> on un-flushed replicas on HDFS nodes, so I can avoid the performance
> >> penalty. Will this still be durable? Will HMaster see un-flushed appends
> >> right after a region server failure?
> >>
> >>
> > If you don't call hflush(), you can still lose edits up to the last block
> > boundary, since hflush is required to persist block locations to the
> > namenode.
> >
> > hflush() does *not* sync to disk - it just makes sure that the edits are in
> > memory on all of the replicas.
> >
> > I have some patches staged for CDH3 that will also make the performance of
> > this quite competitive by pipelining hflushes - basically it has little to
> > no effect on throughput, but only a few ms penalty on each write.
>
>
> Thanks Todd. I thought hflush() synced to disk, but I was wrong. It seems
> the patches you have staged for CDH3 are just what I wanted!
>
> Is your stuff already on the current CDH3 beta?
>


Not yet - still undergoing testing in my "lab" cluster. We should have a
beta out next month.
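
As a local-filesystem analogy for the hflush() point quoted above (edits held in memory versus edits synced to disk), here is a minimal sketch. It uses only java.nio, not the HDFS API; the class name FlushVsSync and the helper appendEdit are made up for illustration. A write without force() is readable but may still sit in the OS page cache, which is roughly the guarantee described for hflush() on each replica; force(true) is the extra step that hflush() does not perform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Illustrative only: a local-file analogy, not HBase or HDFS code.
final class FlushVsSync {

    // Appends one edit to the log. With syncToDisk == false the bytes are
    // handed to the OS (memory) only; with true they are also forced to the
    // storage device, similar in spirit to fsync.
    static void appendEdit(FileChannel log, String edit, boolean syncToDisk)
            throws IOException {
        ByteBuffer buf =
                ByteBuffer.wrap((edit + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            log.write(buf);
        }
        if (syncToDisk) {
            log.force(true); // flush file content and metadata to the device
        }
    }

    // Writes two edits to a temporary log file and reads them back.
    static List<String> demo() {
        try {
            Path file = Files.createTempFile("hlog-sketch", ".log");
            try (FileChannel log = FileChannel.open(file,
                    StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
                appendEdit(log, "put row1", false); // visible, not yet forced
                appendEdit(log, "put row2", true);  // forced to disk
                return Files.readAllLines(file, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Both edits are readable after the run; only the second has been explicitly forced to the device, so only it would be expected to survive a power loss on this one machine.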


>
>
>
> On 05/17/2010, at 2:22 PM, Ryan Rawson wrote:
>
> > 2010/5/16 Tatsuya Kawano <tatsuya6...@gmail.com>:
> >> 1. On Hadoop 0.20.x (without HDFS-200 patch), I must close HLog to make its
> >> entries durable, right? While rolling HLog does this, what about a region
> >> server failure?
> >
> > The problem is: during a failure, how do you execute user code? If
> > the JVM segfaults hard, we have no opportunity to execute Java code.
>
> Thanks Ryan. That's right. The OS can also crash from a hardware failure
> (memory, CPU), and the network can be disconnected at any time. In those
> cases, we don't have any opportunity to execute Java code.
>
> Is there anything the data node can do after detecting a client timeout?
>
> And how many edits could I lose? If a log roll never happens, is it going
> to be up to dfs.block.size (64MB by default)?
>
>
> Thanks,
> Tatsuya
>
> --
> 河野 達也
> Tatsuya Kawano (mr.)
> Tokyo, Japan
>
>
>
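
To put a rough number on the question quoted above: if the exposure really is bounded by one block (the thread does not confirm this for 0.20.x), the count of edits at risk depends on the average edit size. A small sketch of that arithmetic follows; the class name, the method name, and the sample edit sizes are hypothetical, and only the 64MB default comes from the question itself.

```java
// Illustrative arithmetic only; the edit sizes are assumptions, not measurements.
final class UnflushedEditBound {

    // Approximate number of whole edits that fit in one block, i.e. the
    // worst case if everything since the last block boundary is lost.
    static long maxEditsAtRisk(long blockSizeBytes, long avgEditSizeBytes) {
        if (blockSizeBytes <= 0 || avgEditSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return blockSizeBytes / avgEditSizeBytes;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // dfs.block.size default from the thread
        System.out.println(maxEditsAtRisk(blockSize, 1024)); // 1 KB edits
        System.out.println(maxEditsAtRisk(blockSize, 200));  // 200-byte edits
    }
}
```

Under these assumptions a single 64MB block holds 65536 edits of 1 KB each, or 335544 edits of 200 bytes each, so the smaller the edits, the more of them sit behind one block boundary.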


-- 
Todd Lipcon
Software Engineer, Cloudera
