In the end, I think, it boils down to the established process. Anybody can open a jira and propose a patch. If it gets +1's from a few committers and no -1's we should commit it. As I said on HBASE-7965, if we cannot convince Jon and Elliott that this is safe to do, we should not do it (either because Enis and I agree, or because Jon -1's it). No hard feelings either way, I hope (none from my side at least).
It seems we're mostly in agreement and just differ a bit in what constitutes a feature vs. a bug fix.

-- Lars

________________________________
From: Jonathan Hsieh <[email protected]>
To: [email protected]
Cc: lars hofhansl <[email protected]>
Sent: Saturday, March 2, 2013 8:26 AM
Subject: Re: [DISCUSS] More new feature backports to 0.94.

To be clear, a key point is that unit testing is required but not sufficient. I need anecdotes about system testing with at least some unexpected fault handling and stress. If the feature is still actively being developed, it should go into a dev branch (GitHub or svn) that eventually merges. Some info about perf would be nice as well if that is affected.

In cases that aren't too burdensome, I would prefer consecutive individual commits to a stable branch as opposed to a single mega patch. This of course is a case-by-case decision. (Snapshots is about 80 patches, way too burdensome.)

Jon.

On Sat, Mar 2, 2013 at 8:14 AM, Ted Yu <[email protected]> wrote:

> bq. I would want to see this feature come in as a big bang -- get it
> complete enough in trunk before backporting the pieces to a stable branch.
>
> I agree with Jon on this point.
> Porting in one big patch allows us to think through related use cases.
> Another benefit is that there wouldn't be a glitch in the API, in case the first batch of backports went into 0.94.x and the second batch goes into 0.94.x+1.
> Running the feature through the test suite in trunk continuously gives us time to discover defects before the backport.
>
> Cheers
>
> On Sat, Mar 2, 2013 at 7:36 AM, Jonathan Hsieh <[email protected]> wrote:
>
>> In general, I have a preference against backporting features for the reasons that Enis, Elliott, and Jean-Marc consider valid. To be clear, this preference doesn't mean I am -1 to all backports onto the stable apache branch.
>> Let's do it case-by-case; my main ask is to make major backports rare and to make it the norm to require significantly more evidence of testing than usual. I will -1 a major backport that lacks this evidence. This will come up again in the future.
>>
>> With the cases Lars proposed -- I prefer #3 (just say no) but find #1 (be very careful) acceptable given a higher level of evidence. #2 (new release branch) is onerous -- I'd rather we just get preview-release branches out more frequently to not have to deal with this. Arguably, the preview-release branches serve the purpose of getting releases out more frequently and giving a feature time to harden from a few common points. My hope is that these preview releases will replace what were the 0.x.0 and 0.x.1 releases from previous versions.
>>
>> So what kind of evidence would I like to see? We can use the snapshots case as an example.
>>
>> When backporting snapshots was brought up, I actually preferred that we not backport that feature. There was demand, so we agreed that we'd do it but not backport it until it is "rock solid". Here's evidence to support the case that the feature and backport are solid:
>> * Its code history is publicly documented and has been available since December.
>> * Its design documentation has been available for even longer.
>> * The feature is mostly additive and doesn't affect vital paths.
>> * It was tested against trunk and then later tested against a 0.94 variant that is closer to the target apache branch.
>> * The version in the trunk branch has been reviewed by 5 committers.
>> * Limitations are either documented (please let me know if we should improve it more) or non-critical.
>> * Testing and hardening anecdotes have been documented in the original and backport jira. There has been some relatively long term testing and fault injection testing (roughly 4-6 weeks).
>> * It will be backported in a "big bang" -- all pieces get added or none will.
>>
>> This is a level I consider to be stronger than the normal testing expected for a patch. Ideally, something at least this level is what I would expect for other major backports. Do we agree on that?
>>
>> For the table locks case, some of this may be a misperception in timing from my point of view. I see a notification about this in jira which makes me think it is more imminent. Looking into it, I see that currently the development and application of the zk table lock feature isn't complete -- the mechanism is committed but it isn't applied and integrated into all the operations (split, assign etc. still on the way). I've asked for documentation and Enis has graciously added a great design doc that will help reviewers understand it. I'd love to be able to spend time system testing to really beat it up or at least have anecdotes from folks about their efforts on the apache version. Finally, I would want to see this feature come in as a big bang -- get it complete enough in trunk before backporting the pieces to a stable branch.
>>
>> I haven't invested time into the online merge backport decision but my instinct there is to not port the feature as well. It is less risky since it is an additive feature but has less reward since we already have a less-friendly-but-comparable mechanism. Since merge seems similar to split (which took a while to get right), testing its correctness in failure cases at the system level would be a prereq.
>>
>> Jon.
>>
>> On Sat, Mar 2, 2013 at 3:43 AM, Nicolas Liochon <[email protected]> wrote:
>>
>> > New feature is a red herring imho: To me the only question is the regression risk. And a feature can have a much lower regression risk than a bug fix. I guess we've all seen a fix for a non critical bug putting down a production system.
>> > Being able to backport features is a competitive advantage that leverages a good architecture and a good test suite. Maintaining a branch adds a cost for everybody: if you have a bug to fix in 0.94.6.1, you need to fix it in 0.94.7 as well. So we should do it only if we really have to, and plan it accordingly (i.e. we should not have to create a 0.94.7.1 a week after the creation of the 0.94.6.1).
>> >
>> > In the future, the test suite should also help us to estimate and minimize the risk. We're not there yet, but having good test coverage is key for version 1 imho.
>> >
>> > So that makes me +1 for backport, and 0 for branching (+1 if there is a good reason and a plan, but here it's a theoretical discussion, so... ;-) )
>> >
>> > Nicolas
>> >
>> > On Sat, Mar 2, 2013 at 4:44 AM, lars hofhansl <[email protected]> wrote:
>> >
>> > > I did mean "stabilizing". What I was trying to point out is that stuff we backport might stabilize HBase.
>> > >
>> > > ________________________________
>> > > From: Ted Yu <[email protected]>
>> > > To: [email protected]; lars hofhansl <[email protected]>
>> > > Sent: Friday, March 1, 2013 7:30 PM
>> > > Subject: Re: [DISCUSS] More new feature backports to 0.94.
>> > >
>> > > bq. That is only if we do not backport stabilizing "features".
>> > > Did you mean destabilizing above :-)
>> > >
>> > > My preference is option #1. With option #2, the community would be dealing with one more branch which would increase the amount of work validating each release candidate.
>> > >
>> > > To me, the difference between option #2 and the upcoming release candidates of 0.95 would blur.
>> > >
>> > > Cheers
>> > >
>> > > On Fri, Mar 1, 2013 at 7:24 PM, lars hofhansl <[email protected]> wrote:
>> > >
>> > > > That is only if we do not backport stabilizing "features".
>> > > > There is an "opportunity cost" to be paid if we take too rigorous an approach, too.
>> > > >
>> > > > Take for example table-locks (which prompted this discussion). With that in place we can do safe online schema changes (that won't fail and leave the table in an undefined state when a concurrent split happens); it also allows for online merge.
>> > > >
>> > > > Now, is that a destabilizing "feature", or will it make HBase more stable and hence is an "improvement"? Depends on viewpoint, doesn't it?
>> > > > -- Lars
>> > > >
>> > > > ________________________________
>> > > > From: Jean-Marc Spaggiari <[email protected]>
>> > > > To: [email protected]
>> > > > Sent: Friday, March 1, 2013 7:12 PM
>> > > > Subject: Re: [DISCUSS] More new feature backports to 0.94.
>> > > >
>> > > > @Lars: No, not any concern about anything already backported. Just a preference for #2 because it seems to make things more stable and easier to manage. New feature = new release. New sub-releases are for fixes and improvements, but not new features. Also, if we backport a feature to one or many previous releases, we will also have to backport all the fixes each time there is an issue. So we will have more maintenance work on previous releases.
>> > > >
>> > > > 2013/3/1 Enis Söztutar <[email protected]>:
>> > > > > I think the current way of risk vs rewards analysis is working well. We will just continue doing that on a case by case basis, discussing the implications on individual issues.
>> > > > >
>> > > > > On Fri, Mar 1, 2013 at 6:46 PM, Lars Hofhansl <[email protected]> wrote:
>> > > > >
>> > > > >> BTW are you concerned about any specific back port we did in the past? So far we have not seen any destabilization in any of the 0.94 releases.
>> > > > >>
>> > > > >> Jean-Marc Spaggiari <[email protected]> wrote:
>> > > > >>
>> > > > >> >Hi Lars, in your option #2, does it mean you will stop back-porting the new features when it becomes a "long-term" release? If so, I'm for option #2...
>> > > > >> >
>> > > > >> >JM
>> > > > >> >
>> > > > >> >2013/3/1 Enis Söztutar <[email protected]>:
>> > > > >> >> Thanks Lars, I think it is a good listing of the options we have.
>> > > > >> >>
>> > > > >> >> I'll be +1 for #1 and #2, with #1 being a preference.
>> > > > >> >>
>> > > > >> >> Enis
>> > > > >> >>
>> > > > >> >> On Fri, Mar 1, 2013 at 6:10 PM, lars hofhansl <[email protected]> wrote:
>> > > > >> >>
>> > > > >> >>> So it seems that until we have a stable 0.96 (maybe 0.96.1 or 0.96.2) we have three options:
>> > > > >> >>> 1. Backport new features to 0.94 as we see fit as long as we do not destabilize 0.94.
>> > > > >> >>> 2. Declare a certain point release (0.94.6 looks like a good candidate) as "long term", create an 0.94.6 branch (in addition to the usual 0.94.6 tag) and then create 0.94.6.x fix only releases. I would volunteer to maintain a 0.94.6 branch in addition to the 0.94 branch.
>> > > > >> >>> 3. Categorically do not backport new features into 0.94 and defer to 0.95.
>> > > > >> >>>
>> > > > >> >>> I'd be +1 on option #1 and #2, and -1 on option #3.
>> > > > >> >>>
>> > > > >> >>> -- Lars
>> > > > >> >>>
>> > > > >> >>> ________________________________
>> > > > >> >>> From: Jonathan Hsieh <[email protected]>
>> > > > >> >>> To: [email protected]; lars hofhansl <[email protected]>
>> > > > >> >>> Sent: Friday, March 1, 2013 3:11 PM
>> > > > >> >>> Subject: Re: [DISCUSS] More new feature backports to 0.94.
>> > > > >> >>>
>> > > > >> >>> I think we are basically agreeing -- my primary concern is that bringing new features into vital paths introduces more risk. I'd rather not backport major new features unless we achieve a higher level of assurance through system and basic fault injection testing.
>> > > > >> >>>
>> > > > >> >>> For the three current examples -- snapshots, zk table locks, online merge -- I actually would prefer not including any in apache 0.94. Of the bunch, I feel the table locks are the most risky since they affect vital paths a user must use, whereas snapshots and online merge are features that a user could choose to use but does not necessarily have to use. I'll voice my concerns, reasons for concerns, and justifications on the individual jiras.
>> > > > >> >>>
>> > > > >> >>> I do feel that new features being in a dev/preview release like 0.95 aligns well and doesn't create situations where different versions have different feature sets. New features should be introduced and hardened in a dev/preview version, and then turn into the production ready versions after they've been proven out a bit.
>> > > > >> >>>
>> > > > >> >>> Jon.
>> > > > >> >>>
>> > > > >> >>> On Fri, Mar 1, 2013 at 11:00 AM, lars hofhansl <[email protected]> wrote:
>> > > > >> >>>
>> > > > >> >>> > This is an open source project; as long as there is a volunteer to backport a patch I see no problem with doing this.
>> > > > >> >>> > The only thing we as the community should ensure is that it must be demonstrated that the patch does not destabilize the 0.94 code base; that has to be done on a case by case basis.
>> > > > >> >>> >
>> > > > >> >>> > Also, there is no stable release of HBase other than 0.94 (0.95 is not stable, and we specifically state that it should not be used in production).
>> > > > >> >>> >
>> > > > >> >>> > -- Lars
>> > > > >> >>> >
>> > > > >> >>> > ________________________________
>> > > > >> >>> > From: Jonathan Hsieh <[email protected]>
>> > > > >> >>> > To: [email protected]
>> > > > >> >>> > Sent: Friday, March 1, 2013 8:31 AM
>> > > > >> >>> > Subject: [DISCUSS] More new feature backports to 0.94.
>> > > > >> >>> >
>> > > > >> >>> > I was thinking more about HBASE-7360 (backport snapshots to 0.94) and also saw HBASE-7965 which suggests porting some major-ish features (table locks, online merge) into the apache 0.94 line. We should chat about what we want to do about new features and bringing them into stable versions (0.94 today) and in general criteria we use for future versions.
>> > > > >> >>> >
>> > > > >> >>> > This is similar to the snapshots backport discussion and earlier backport discussions. Here's my understanding of high level points we basically agree upon.
>> > > > >> >>> > * Backporting new features to the previous major version incurs more cost when developing new features, pushes back efforts on making the trunk versions and reduces incentive to move to newer versions.
>> > > > >> >>> > * Backporting new features to earlier versions (0.9x.0, 0.9x.1) is reasonable since they are generally less stable.
>> > > > >> >>> > * Backporting new features to later versions (0.9x.5, 0.9x.6) is less reasonable -- (ex: a 0.94.6 or 0.94.7 should only include robust features).
>> > > > >> >>> > * Backporting orthogonal features (snapshots) seems less risky than core changing features.
>> > > > >> >>> > * An exception: If multiple distributions declare intent to backport, it makes sense to backport a feature (snapshots for example).
>> > > > >> >>> >
>> > > > >> >>> > Some new circumstances and discussion topics:
>> > > > >> >>> > * We now have a dev branch (0.95) with looser compat requirements that we could more readily release with dev/preview versions. Shouldn't this reduce the need to backport features to the apache stable branches? Would these releases "replace" the 0.x.0 or 0.x.1 releases?
>> > > > >> >>> > * For major features in later versions we should raise the bar on the amount of testing and probably be more explicit about what testing is done (unit tests not sufficient, system testing stories/reports a requirement).
>> > > > >> >>> > Any other suggestions?
>> > > > >> >>> >
>> > > > >> >>> > Jon.
>> > > > >> >>> >
>> > > > >> >>> > --
>> > > > >> >>> > // Jonathan Hsieh (shay)
>> > > > >> >>> > // Software Engineer, Cloudera
>> > > > >> >>> > // [email protected]
>> > > > >> >>>
>> > > > >> >>> --
>> > > > >> >>> // Jonathan Hsieh (shay)
>> > > > >> >>> // Software Engineer, Cloudera
>> > > > >> >>> // [email protected]
>>
>> --
>> // Jonathan Hsieh (shay)
>> // Software Engineer, Cloudera
>> // [email protected]

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [email protected]
