Sanjay,

> The team at Yahoo had to scramble to recover the lost data,
> put in several emergency patches to deal with
> the append code.

This makes it sound like all of our lost blocks were due to append.
Yes, we did lose many blocks/files in the past few months, but I'd say it was more a combination of bugs and not solely from append.
Some that I remember:

HADOOP-4556: in 0.17.2. Nothing to do with append.
HADOOP-4935: Probably more from the new feature of replicating corrupted blocks (that I requested...) and append related issues.

I have no opinion about branching, but can we assume that data loss could happen again and think about a slightly better snapshot?
Current '-upgrade' almost works as long as I have a 'renew!' button instead of requiring a dfs restart.

Koji

-----Original Message-----
From: Sanjay Radia [mailto:sra...@yahoo-inc.com]
Sent: Thursday, February 05, 2009 4:25 PM
To: core-dev@hadoop.apache.org
Subject: Re: Hadoop 0.19.1

On Feb 4, 2009, at 3:38 AM, Steve Loughran wrote:

> Sanjay Radia wrote:
> >
> > On Feb 2, 2009, at 4:23 PM, Konstantin Shvachko wrote:
> >
> >>
> >> > What do you recommend?
> >>
> >> In general. There may be people/organizations, which will not
> >> compromise on the reduced functionality in favor of the stability,
> >> this is understandable.
> >> I would propose to create a separate (unofficial experimental)
> >> branch, which would track changes like HADOOP-4379. The branch may
> >> later either die when the main stream is fixed or be merged with
> >> the trunk if the changes proved to be stable.
> >>
> >
> >
> > This is a very interesting suggestion.
> > Many in the team have come to the conclusion that complex projects
> > like append should be done on a separate branch in the first place
> > and integrated with trunk when the project is stable.
> >
>
> There's a lot to be said for branching; I'm also looking at git so I
> can do my service lifecycle stuff under SCM properly.
>
> but the cost of merging can be high. I'd estimate 1 morning/week is
> spent updating my local SVN and then seeing that everything still
> works.
> If hudson could both test the branches and test any merged branches,
> life would be better
>

I agree on the cost of merging.
When a project is branched, after a while one can spend as much as 30% of cycles merging in changes.
But when a system is used in production to store data we cannot afford to have users lose their data.
The team at Yahoo had to scramble to recover the lost data, put in several emergency patches to deal with the append code.

I am all for extending hudson testing for branches, but hudson testing, while helpful, will not be sufficient for big projects because hudson does not have a comprehensive set of tests. Each new release is tested significantly beyond the hudson tests.

For me the lesson is that large complex projects should be branched. (This is how commercial software products are engineered).
There will be increased cost to the project team, but overall, the community will have more solid releases and the total cost to the community in delivering the technology will be smaller.

sanjay

>
>
> The other problem is incompatible branches: the more branches you have
> live, the higher the merge cost.
>
> That said, Git promises wonderful things, and we ought to be able to
> set up Apache support for git for people wanting to do their own
> branches
> -svn would still be the official SCM tool
>