Re: Hadoop 0.19.1

Doug Judd Mon, 02 Feb 2009 18:19:01 -0800

Comments inline ...

On Mon, Feb 2, 2009 at 4:23 PM, Konstantin Shvachko <s...@yahoo-inc.com> wrote:

> > What do you recommend?
>
> In general, there may be people/organizations that will not accept
> reduced functionality in exchange for stability; this is understandable.
> I would propose creating a separate (unofficial, experimental) branch
> that would track changes like HADOOP-4379. The branch may later either
> die when the mainstream is fixed, or be merged into the trunk if the
> changes prove to be stable.

Sure, that sounds reasonable.  One thing I would caution against is spending
a lot of time doing incremental patchwork on something that needs a
ground-up overhaul.  I would much rather wait a couple of months longer and
get software that is based on a well thought out design that is
fundamentally sound.  Ultimately that will be the fastest path to stability.

>
> > 1. the file length (as returned by getFileStatus) is incorrect
>
> Maybe the following workaround will be useful.
> When you read from a file, always try to read more data than the length
> reported by the name-node. How much more? The size of one block would be
> enough, or even just up to the next (ceiling) block boundary.

I could certainly implement a workaround; however, from an API standpoint,
the filesystem (IMHO) should always give you a way to obtain the real length
of the file.  The semantics of the current getFileStatus() make it difficult
to reason about the state of your filesystem.  It basically returns a
"possibly stale" version of the length.  I would prefer to wait for an
implementation that gives an accurate answer and spend my time and energy
helping to test that one, rather than spending a bunch of time implementing
a workaround for the current version.
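
For illustration only, the read-past-the-reported-length workaround suggested
above might look roughly like the sketch below. It is not Hadoop code: a plain
java.io.InputStream stands in for the HDFS input stream, and LengthProbe,
probeLength, reportedLength, and blockSize are hypothetical names. It uses the
"one extra block" variant: it tries to read up to reportedLength + blockSize
bytes and returns how many bytes were actually there. Whether a given HDFS
release really serves bytes beyond the reported length is an assumption here,
not something the sketch verifies.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Hypothetical helper, not part of Hadoop: estimates the real length of a
 * file by reading past the (possibly stale) length reported by the name-node.
 */
final class LengthProbe {
    private LengthProbe() {}

    /**
     * Reads from the start of {@code in} and counts the bytes that are really
     * available, attempting at most reportedLength + blockSize bytes.
     */
    static long probeLength(InputStream in, long reportedLength, long blockSize)
            throws IOException {
        long limit = reportedLength + blockSize; // one block past the reported length
        byte[] buf = new byte[8192];
        long total = 0;
        while (total < limit) {
            int want = (int) Math.min(buf.length, limit - total);
            int n = in.read(buf, 0, want);
            if (n < 0) {
                break; // genuine end of file
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a file holding 150 bytes while the reported length is 100.
        InputStream in = new ByteArrayInputStream(new byte[150]);
        long actual = probeLength(in, 100, 64);
        System.out.println("reported=100 actual=" + actual);
    }
}
```

If the returned count equals reportedLength + blockSize, the probe hit its
limit rather than end of file, so the caller would have to probe again with a
larger bound. That ambiguity is part of why an accurate length from the
filesystem itself is preferable.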

> > 2. When an application comes up after a crash, it seems to hang for about
> > 60
>
> Don't have enough context on that, sorry.

I spoke too soon on this.  HDFS was hanging on lease recovery because I was
opening the file in append mode to force lease recovery (at Dhruba's
suggestion) so that it would update the NameNode with the proper length.  If
I had a method of obtaining the accurate length of the file, I wouldn't need
to do this.  Hence, I didn't bother filing an issue on this.

- Doug

> Thanks,
> --Konstantin
>
> Doug Judd wrote:
>
>> Sounds good.  I would much rather wait and have fsync() done correctly in
>> 0.20 than get some sort of hacked version in 0.19.  I'll create a couple of
>> issues and mark them for 0.20.  Thanks.
>>
>> - Doug
>>
>> On Mon, Feb 2, 2009 at 1:51 PM, Owen O'Malley <omal...@apache.org> wrote:
>>
>>> On Feb 2, 2009, at 12:51 PM, Doug Judd wrote:
>>>
>>>> What do you recommend?  Is there any way we could get these two issues
>>>> fixed for 0.19.1, or should I file issues for them and get them on the
>>>> schedule for 0.19.2?
>>>
>>> Given the outstanding problems and general level of uncertainty, I'd favor
>>> releasing a 0.19.1 with the equivalent of the 0.18.3 disable on fsync and
>>> append. Let's get them fixed in 0.20 first and then we can debate whether
>>> the rewards of pushing them back into an 0.19.2 would make sense. I'm pretty
>>> uncomfortable at the moment with how the entire functional complex seems to
>>> cause a continuous stream of problems.
>>>
>>> -- Owen
>>>
>>>
>>
