Agreed.

I think we moved to trunk because of lazy serde from what Zheng tells me (I was 
out of office when this happened)...

Regarding performance fixes, I would rather categorize performance regressions 
as blocker bugs and keep performance improvements as features. By that measure 
I think lazy serde was fine as a feature. I think we should just have let 0.2 
stabilize and deployed lazy serde when we released 0.2 and cut out a 0.3 branch 
and moved our systems to 0.3. Keeping the criteria for what gets categorized as 
a blocker tight is quite critical otherwise we will always be in danger of a 
constant feature creep and that would totally defeat the purpose of 
stabilization. In any case if we had been able to stabilize in a month's time 
say for 0.2, I do not think the users would be too unhappy to get the lazy 
serde a month late. So by that token I would not categorize it to be a 
blocker as such.

One constant problem is that the best stress testing environment that we have 
for Hive right now is our production work load at FB. So I am not sure whether 
we can give a certificate of stability to a branch if we at FB pull in patches 
and run a version that is different from the release. Though of course others 
are always free to get the patches from the JIRA and apply them as they see 
fit. I am not sure how to address this. Thoughts?

Ashish

-----Original Message-----
From: Joydeep Sen Sarma [mailto:jssa...@facebook.com] 
Sent: Tuesday, March 10, 2009 11:37 AM
To: hive-dev@hadoop.apache.org
Subject: RE: branching Hive and getting to first release

I am in general agreement - but the problem is that the mail below doesn't explain 
why trunk was deployed.

Performance fixes are like critical bugs. We cannot run a production cluster 
that's hurting for performance on non-performant software. To that extent - it 
was a mistake for us to consider lazyserde to be a 'feature' (which is why we 
didn't back-port it to 0.2). So is hive-223 for example - we just need to have 
it asap in deployment - and by conventional definition - it certainly wasn't a 
regression that would go into a bug fix branch. I suspect there may be more 
such jiras.

One way of looking at this is that we either branched too early, or we need to 
reconsider what goes into a branch.

The other way to look at this is that every cluster administrator (including 
the one at Facebook - who is just like any user of Hive) - needs to have the 
option to pull in latest patches that are critical to his/her deployment. The 
success of Hive and the happiness of its internal Facebook users should not 
and cannot be at odds with each other.


-----Original Message-----
From: Ashish Thusoo [mailto:athu...@facebook.com]
Sent: Tuesday, March 10, 2009 11:08 AM
To: hive-dev@hadoop.apache.org
Subject: RE: branching Hive and getting to first release

I think a big part of what killed 0.2 was the fact that we decided to deploy 
trunk into production because of some features that the internal users were 
asking for, instead of just continuing with the 0.2 branch. What I want to 
stress is that we cannot do that going forward. Once we branch out 0.3, we have 
to let 0.3 soak in production till we have at least 2 weeks of run with no 
blockers (I did not mean that we will just certify a branch to be a release 
after 2 weeks - what I meant was that we have at least 2 weeks of run with no 
blockers) before we cut out a release from the branch. Again I must stress that 
we have to continue deploying the candidate branch into production and we 
cannot move the production machines to trunk as that will completely kill the 
branch (as happened with 0.2). We have to really isolate blocker bug fixes from 
features and we have to understand that we cannot roll out features overnight 
(as we have done so far for our users at FB) as doing that will make it 
absolutely hopeless in getting any branch stable.

Having said that, we could move to a model where we make a new branch (not a 
release) from trunk once the previous candidate branch is released instead of 
having a train of branches at every 2 weeks. I am fine with that too. What is 
perhaps more critical is that we have a firm commitment that we are not going 
to deploy new features into production till we stabilize 0.3 and we should set 
the expectations accordingly...

Ashish

-----Original Message-----
From: Johan Oskarsson [mailto:jo...@oskarsson.nu]
Sent: Tuesday, March 10, 2009 9:52 AM
To: hive-dev@hadoop.apache.org
Subject: Re: branching Hive and getting to first release

+1, sounds like a solid plan.

Joydeep Sen Sarma wrote:
> I am also a little worried about a lot of releases and managing them. perhaps 
> what's clouding my judgement is that there are a lot of critical bugs yet to 
> be fixed - so I don't see how we can stabilize the first release in a couple 
> of weeks - or even a month (which is what killed 0.2 I think to some extent).
> 
> I would say that the first release is somewhat special. We are fixing a 
> boatload of issues from a very large push of code (all of it!). In subsequent 
> releases - there wouldn't be as many bugs - and a faster release cycle would 
> be feasible.
> 
> So my vote would be to branch now (before predicate push down), get the 
> release stable as fast as possible (but potentially wait as long as it takes) 
> - and then only start cutting more branches. Over time - we can converge to a 
> faster release cycle - but right now this seems dubious to me.
> 
> Can't put a newborn into kindergarten directly man .. :-)
> 
> -----Original Message-----
> From: Johan Oskarsson [mailto:jo...@oskarsson.nu]
> Sent: Tuesday, March 10, 2009 3:43 AM
> To: hive-dev@hadoop.apache.org
> Subject: Re: branching Hive and getting to first release
> 
> I'm worried that trying to create a new release every other week will 
> be too often. Isn't there a risk that we're still fixing bugs in 0.3 
> when the 0.5 branch is cut if we run into something unexpected?
> It seems Hadoop is suffering from this issue a bit lately even though 
> they branch quarterly, 0.19 still has lots of issues open when people 
> are committing patches to 0.21 (trunk). Granted Hadoop is a much 
> larger codebase with more patches applied.
> 
> That said, I won't oppose trying the period suggested and see how it 
> goes, it's quite easy to change after all.
> 
> /Johan
> 
> Ashish Thusoo wrote:
>> For 0.2 we had set a feature freeze date on the 28th of Jan and as I 
>> had mentioned in the previous email, the plan was to cut a branch on the last 
>> Wednesday of every month and then issue a vote for making it a release once 
>> it ran satisfactorily (no blocker bugs) for at least 2 weeks @ facebook. 
>> Accordingly I was hoping that we would limit the changes that would go into 
>> the branch (0.2) in this case to the blocker bugs only but it seems that we 
>> had some feature creep and as a result we switched to using trunk at 
>> facebook without giving sufficient time for 0.2 to stabilize. It also means 
>> that perhaps waiting for a month for each release is too long at this stage 
>> at least for FB. If others are in agreement, how about we do the following 
>> going forward..
>>
>>
>> Cut a branch every other Wednesday, only check in the most critical blocker 
>> bugs into the branch and reserve the features for trunk which will be picked 
>> up in the next branch and religiously deploy only the versions of the branch 
>> at FB. We can start off a vote to make a branch an official release once we 
>> have at least 2 weeks of run on the branch without any blocker bugs (i.e. we 
>> did not have a need to upgrade the production machines at FB).
>>
>> We can start off by creating a 0.3 branch this Wednesday accordingly...
>>
>> Once we have an agreement on this we can document this procedure on the wiki 
>> and religiously follow it. Without controlling the tendency of a feature 
>> creep it would be difficult to get a stable version out...
>>
>> Thoughts?
>>
>> Ashish
>>
>>
>>
>> -----Original Message-----
>> From: Johan Oskarsson [mailto:jo...@oskarsson.nu]
>> Sent: Tuesday, March 03, 2009 2:54 AM
>> To: hive-dev@hadoop.apache.org
>> Subject: Re: branching Hive and getting to first release
>>
>> To be honest I must've missed that 0.2 was branched (I found the email now 
>> though), was there a feature freeze date set?
>>
>> After branching shouldn't we have moved the non critical issues to 0.3 and 
>> pushed for fixing the remaining bugs in order to release?
>>
>> That aside, I don't have a strong opinion whether the next release is
>> 0.2 or 0.3, since there hasn't been an Apache release yet. How about setting 
>> a feature freeze date now and take it from there?
>>
>> /Johan
>>
>> Joydeep Sen Sarma wrote:
>>> Hey folks,
>>>
>>> A few of us were chatting earlier today (some Facebook and Cloudera folks) 
>>> on best approach to get to a first Hive release.
>>>
>>> While 0.2 has been branched - it seems awkward to base the first release on 
>>> it. The reason is twofold:
>>>
>>> - new changes to trunk since 0.2 have been relatively contained 
>>> AFAIK (so no added instability). As evidence - Facebook has reverted to 
>>> running trunk in production for the last week or so.
>>> - the changes that have gone into trunk since 0.2 are extremely 
>>> important from performance perspective. This includes the LazySerDe that 
>>> Zheng added and upcoming hive-232.
>>>
>>> So one proposal is to branch 0.3 at this point and try to make that first 
>>> official release for Hive.
>>>
>>> This does look a little haphazard - and the natural question is whether we 
>>> can stick to this (or we end up repeating this once we throw in some more 
>>> goodies). The feeling is that this may be a good time - hive-279 has major 
>>> changes to the hive compiler and branching 0.3 before those changes are 
>>> checked in gives us a good chance of producing a stable release with good 
>>> performance (and the major changes will probably prevent us from repeating 
>>> this trick going forward :)).
>>>
>>> What do people think?
>>>
>>> Joydeep
>>>
> 
