RE: Hadoop 0.19.1

Koji Noguchi Fri, 06 Feb 2009 11:39:12 -0800

Sanjay,

> The team at Yahoo had to scramble to recover the lost data, 
> put in several emergency patches to deal with 
> the append code.
>
This makes it sound like all of our lost blocks were due to append.
Yes, we did lose many blocks/files in the past few months, but I'd say
it was more of a combination of bugs and not solely from append.

Some that I remember: 

HADOOP-4556: in 0.17.2. Nothing to do with append.
HADOOP-4935: Probably more from the new feature of replicating corrupted
blocks (that I requested...)
and append related issues.

I have no opinion about branching, but can we assume that data loss
could happen again and think about a slightly better snapshot?

Current '-upgrade' almost works as long as I have a 'renew!' button
instead of requiring a dfs restart. 

Koji

-----Original Message-----
From: Sanjay Radia [mailto:sra...@yahoo-inc.com] 
Sent: Thursday, February 05, 2009 4:25 PM
To: core-dev@hadoop.apache.org
Subject: Re: Hadoop 0.19.1

On Feb 4, 2009, at 3:38 AM, Steve Loughran wrote:

> Sanjay Radia wrote:
> >
> > On Feb 2, 2009, at 4:23 PM, Konstantin Shvachko wrote:
> >
> >>
> >>  >  What do you recommend?
> >>
> >> In general. There may be people/organizations, which will not compromise
> >> on the reduced functionality in favor of the stability, this is
> >> understandable.
> >> I would propose to create a separate (unofficial experimental) branch,
> >> which would track changes like HADOOP-4379. The branch may later either
> >> die when the main stream is fixed or be merged with the trunk if the
> >> changes proved to be stable.
> >>
> >
> >
> > This is a very interesting suggestion.
> > Many in the team have come to the conclusion that complex projects like
> > append should be done on a separate branch in the first place and
> > integrated with trunk when the project is stable.
> >
>
> There's a lot to be said for branching; I'm also looking at git so I can
> do my service lifecycle stuff under SCM properly.
>
> but the cost of merging can be high. I'd estimate 1 morning/week is
> spent updating my local SVN and then seeing that everything still works.
> If hudson could both test the branches and test any merged branches,
> life would be better
>

I agree on the cost of merging.
When a project is branched, after a while one can spend as much as
30% of cycles merging in changes.
But when a system is used in production to store data, we cannot afford
to have users lose their data.
The team at Yahoo had to scramble to recover the lost data and put in
several emergency patches to deal with the append code.

I am all for extending hudson testing for branches, but hudson testing,
while helpful, will not be sufficient for big projects because hudson
does not have a comprehensive set of tests.
Each new release is tested significantly beyond the hudson tests.

For me the lesson is that large, complex projects should be branched.
(This is how commercial software products are engineered.)
There will be increased cost to the project team, but overall the
community will have more solid releases and the total cost to the
community in delivering the technology will be smaller.

sanjay
>
>
> The other problem is incompatible branches: the more branches you have
> live, the higher the merge cost.
>
> That said, Git promises wonderful things, and we ought to be able to set
> up Apache support for git for people wanting to do their own branches
> -svn would still be the official SCM tool
>
