On Apr 23, 2010, at 10:38 PM, Chris Douglas wrote:

>> I'm not proposing any more back- or forward-porting than will be done anyway.
>> 
> 
> Because trunk is the shared repository that contains the security
> work. And a working append. And dozens of smaller, but important
> features including the 1.0 APIs. Symlinks. Optimizations to the
> shuffle. Splittable bzip compression. Stability and scalability fixes
> to the NameNode and JobTracker. Unicorns and happiness.
> 

I'm for anything that gets all the goodies above out in a release.  I don't 
care if they all get into one release or are spread out over two or three.
Right now, about 1/4 of the above (e.g. happiness, but no unicorns) is in 
CDH2/3.  Trunk has stalled; getting new -- CORE -- features requires using 
other branches.

Although I would like to see the changes that these other branches have in 
Apache's SVN, they belong in trunk.  0.20 is old already.  It's the old, stable 
branch now and new stuff should go into newer releases.  I've been waiting for 
things like the Shuffle refactor (30% performance improvement for some of my 
job flows) for a long time.

Just because Y! is not going to upgrade their deployment past their branch for 
a long time does not mean the rest of the community has to wait.  I lived on 
0.19.2 in production until very recently -- it became a solid branch without Y! 
or Facebook.  Without the same testing muscle, it might take one or two more 
minor releases to stabilize, but the community's release schedule IMO 
desperately needs to become more independent of the biggest players.  

Trunk should be moved forward and incorporate Cloudera's and Yahoo's improvements 
aggressively.  It's OK to have a 0.x.0 release that isn't completely stable yet, 
or backed by the biggest users.   It is important to incorporate improvements 
made by productive contributors into actual releases in a timely fashion, or 
else those contributors will roll their own versions and eventually diverge 
significantly from the community rather than wait to get value from their work.


> Stabilizing, packaging, and testing trunk is drudgery, but it can be shared.
> 
> I can see the value in restarting collaboration between major
> contributors by reestablishing a common branch, and 0.20 will probably
> be more successful in that respect, at least earlier. However, I
> continue to oppose sinking combined energy into 0.20 at the expense of
> trunk, for reasons already discussed at length. -C
> 

I would love to see an Apache release with new, useful features and 
enhancements.  That could be a 0.20 with all or most of the Y! and Cloudera 
stuff in there.  However, if any such effort slows down progress on trunk -- 
forget it.  Get a 0.21 or 0.22 out with whatever features are ready, and move 
the ball forward on trunk.   We should not encourage 0.20 to live forever.  

0.21 and 0.22 should be releases that are compelling enough for Y!, Cloudera, 
and anyone else with their own customizations to want to move to for their own 
sake.


>> Doug
>> 
