I am on the same page as N here.

There are "features" that make HBase more stable. Table locks (when implemented 
correctly and finished of course) could be such a feature.


I do not understand how 0.95 plays into this. There is no rolling upgrade path 
from 0.94 to 0.95 and 0.95 is not a stable release.

-- Lars



________________________________
 From: Jonathan Hsieh <[email protected]>
To: [email protected] 
Cc: lars hofhansl <[email protected]> 
Sent: Saturday, March 2, 2013 7:36 AM
Subject: Re: [DISCUSS] More new feature backports to 0.94.
 

In general, I have a preference against backporting  features for the reasons 
that Enis, Elliott, and Jean-Marc consider valid.  To be clear, this preference 
doesn't mean I am -1 to all backports onto the stable apache branch.  Let's do 
it case-by-case; my main ask is to make major backports rare and to make it the 
norm to require significantly more evidence of testing than usual.  I will -1 a 
major backport that lacks this evidence.  This will come up again in the future.

With the cases Lars proposed -- I prefer #3 (just say no) but find #1 (be very 
careful) acceptable given a higher level of evidence.  #2 (new release branch) is 
onerous -- I'd rather we just get preview-release branches out more frequently 
to not have to deal with this.  Arguably, the preview-release branches serve the 
purpose of getting releases out more frequently and giving a feature time to 
harden from a few common points.  My hope is that these preview releases will 
replace what were the 0.x.0 and 0.x.1 releases from previous versions.

So what kind of evidence would I like to see? We can use snapshots case as an 
example.
When backporting snapshots was brought up, I actually preferred that we not 
backport that feature.  There was demand, so we agreed that we'd do it but not 
backport it until it is "rock solid".  Here's evidence to support the case that 
the feature and backport are solid:

* Its code history is publicly documented and has been available since 
December.
* Its design documentation has been available for even longer.  
* The feature is mostly additive and doesn't affect vital paths.
* It was tested against trunk and then later tested against a 0.94 variant that 
is closer to the target apache branch.  
* The version in the trunk branch has been reviewed by 5 committers.
* Limitations are either documented (please let me know if we should improve it 
more) or non-critical.
* Testing and hardening anecdotes have been documented in the original and 
backport jira.  There has been some relatively long term testing and fault 
injection testing (roughly 4-6 weeks).
* It will be backported in a "big bang" -- all pieces get added or none will.
 
This is a level I consider to be stronger than the normal testing expected for 
a patch.  Ideally, something at least this level is what I would expect for 
other major backports.  Do we agree on that?

For the table locks case, some of this may be a misperception in 
timing from my point of view.  I see a notification about this in jira which 
makes me think it is more imminent.   Looking into it, I see that currently the 
development and application of the zk table lock feature isn't complete -- the 
mechanism is committed but it isn't applied and integrated into all the 
operations (split, assign etc still on the way).  I've asked for documentation 
and Enis has graciously added a great design doc that will help reviewers 
understand it.  I'd love to be able to spend time system testing to really 
beat it up or at least have anecdotes from folks about their efforts on the 
apache version.  Finally, I would want to see this feature come in as a big 
bang -- get it complete enough in trunk before backporting the pieces to a 
stable branch.

I haven't invested time into the online merge backport decision but my instinct 
there is to not port the feature as well.  It is less risky since it is an 
additive feature but has less reward since we already have a 
less-friendly-but-comparable mechanism.  Since merge seems similar to split 
(which took a while to get right), testing its correctness in failure cases at 
the system level would be a prereq.

Jon.


On Sat, Mar 2, 2013 at 3:43 AM, Nicolas Liochon <[email protected]> wrote:

>New feature is a red herring imho: To me the only question is the
>regression risk. And a feature can have a much lower regression risk than
>a bug fix. I guess we've all seen a fix for a non critical bug putting down
>a production system. Being able to backport features is a competitive
>advantage that leverages on a good architecture and a good test suite.
>Maintaining a branch adds a cost for everybody: if you have a bug to fix in
>0.94.6.1, you need to fix it in 0.94.7 as well. So we should do it only if we
>really have to, and plan it accordingly (i.e. we should not have to create
>a 0.94.7.1 a week after the creation of the 0.94.6.1).
>
>In the future, the test suite should also help us to estimate and minimize
>the risk. We're not there yet, but having a good test coverage is key for
>version 1 imho.
>
>So that makes me +1 for backport, and  0 for branching (+1 if there is a
>good reason and a plan, but here it's a theoretical discussion, so,... ;-) )
>
>Nicolas
>
>
>
>On Sat, Mar 2, 2013 at 4:44 AM, lars hofhansl <[email protected]> wrote:
>
>> I did mean "stabilizing". What I was trying to point out is that stuff we
>> backport might stabilize HBase.
>>
>>
>>
>> ________________________________
>>  From: Ted Yu <[email protected]>
>> To: [email protected]; lars hofhansl <[email protected]>
>> Sent: Friday, March 1, 2013 7:30 PM
>> Subject: Re: [DISCUSS] More new feature backports to 0.94.
>>
>> bq. That is only if we do not backport stabilizing "features".
>> Did you mean destabilizing above :-)
>>
>> My preference is option #1. With option #2, the community would be dealing
>> with one more branch which would increase the amount of work validating
>> each release candidate.
>>
>> To me, the difference between option #2 and the upcoming release candidates
>> of 0.95 would blur.
>>
>> Cheers
>>
>> On Fri, Mar 1, 2013 at 7:24 PM, lars hofhansl <[email protected]> wrote:
>>
>> > That is only if we do not backport stabilizing "features". There is an
>> > "opportunity cost" to be paid if we take a too rigorous approach too.
>> >
>> > Take for example table-locks (which prompted this discussion). With that in
>> > place we can do safe online schema changes (that won't fail and leave
>> > the table in an undefined state when a concurrent split happens), it
>> > also allows for online merge.
>> >
>> > Now, is that a destabilizing
>> > "feature", or will it make HBase more stable and hence is an
>> > "improvement"? Depends on viewpoint, doesn't it?
>> > -- Lars
>> >
>> >
>> > ________________________________
>> >  From: Jean-Marc Spaggiari <[email protected]>
>> > To: [email protected]
>> > Sent: Friday, March 1, 2013 7:12 PM
>> > Subject: Re: [DISCUSS] More new feature backports to 0.94.
>> >
>> > @Lars: No, not any concern about anything already backported. Just a
>> > preference to #2 because it seems to make things more stable and
>> > easier to manage. New feature = new release. Given new sub-releases
>> > are for fixes and improvements, but not new features. Also, if we
>> > backport a feature in one or many previous releases, we will have to
>> > backport also all the fixes each time there will be an issue. So we
>> > will have more maintenance work on previous releases.
>> >
>> > 2013/3/1 Enis Söztutar <[email protected]>:
>> > > I think the current way of risk vs rewards analysis is working well. We
>> > > will just continue doing that on a case by case basis, discussing the
>> > > implications on individual issues.
>> > >
>> > >
>> > >
>> > > On Fri, Mar 1, 2013 at 6:46 PM, Lars Hofhansl <[email protected]>
>> > wrote:
>> > >
>> > >> BTW are you concerned about any specific back port we did in the past?
>> > So
>> > >> far we have not seen any destabilization in any of the 0.94 releases.
>> > >>
>> > >> Jean-Marc Spaggiari <[email protected]> wrote:
>> > >>
>> > >> >Hi Lars, #2, does it mean you will stop back-porting the new features
>> > >> >when it will become a "long-term" release? If so, I'm for option
>> #2...
>> > >> >
>> > >> >JM
>> > >> >
>> > >> >In your option
>> > >> >2013/3/1 Enis Söztutar <[email protected]>:
>> > >> >> Thanks Lars, I think it is a good listing of the options we have.
>> > >> >>
>> > >> >> I'll be +1 for #1 and #2, with #1 being a preference.
>> > >> >>
>> > >> >> Enis
>> > >> >>
>> > >> >>
>> > >> >> On Fri, Mar 1, 2013 at 6:10 PM, lars hofhansl <[email protected]>
>> > wrote:
>> > >> >>
>> > >> >>> So it seems that until we have a stable 0.96 (maybe 0.96.1 or
>> > 0.96.2)
>> > >> we
>> > >> >>> have three options:
>> > >> >>> 1. Backport new features to 0.94 as we see fit as long as we do
>> not
>> > >> >>> destabilize 0.94.
>> > >> >>> 2. Declare a certain point release (0.94.6 looks like a good
>> > >> candidate) as
>> > >> >>> a "long term", create an 0.94.6 branch (in addition to the usual
>> > 0.94.6
>> > >> >>> tag) and then create 0.94.6.x fix only releases. I would volunteer
>> > to
>> > >> >>> maintain a 0.94.6 branch in addition to the 0.94 branch.
>> > >> >>> 3. Categorically do not backport new features into 0.94 and defer
>> to
>> > >> 0.95.
>> > >> >>>
>> > >> >>> I'd be +1 on option #1 and #2, and -1 on option #3.
>> > >> >>>
>> > >> >>> -- Lars
>> > >> >>>
>> > >> >>>
>> > >> >>>
>> > >> >>> ________________________________
>> > >> >>>  From: Jonathan Hsieh <[email protected]>
>> > >> >>> To: [email protected]; lars hofhansl <[email protected]>
>> > >> >>> Sent: Friday, March 1, 2013 3:11 PM
>> > >> >>> Subject: Re: [DISCUSS] More new feature backports to 0.94.
>> > >> >>>
>> > >> >>> I think we are basically agreeing -- my primary concern is
>> bringing
>> > new
>> > >> >>> features in vital paths introduces more risk, I'd rather not
>> > backport
>> > >> major
>> > >> >>> new features unless we achieve a higher level of assurance through
>> > >> system
>> > >> >>> and basic fault injection testing.
>> > >> >>>
>> > >> >>> For the three current examples -- snapshots, zk table locks,
>> online
>> > >> merge
>> > >> >>> -- I actually would prefer not including any in apache 0.94.  Of
>> the
>> > >> bunch,
>> > >> >>> I feel the table locks are the most risky since it affects vital
>> > paths
>> > >> a
>> > >> >>> user must use,  where as snapshots and online merge are features
>> > that a
>> > >> >>> user could choose to use but does not necessarily have to use.
>> I'll
>> > >> voice
>> > >> >>> my concerns, reason for concerns, and justifications on the
>> > individual
>> > >> >>> jiras.
>> > >> >>>
>> > >> >>> I do feel that new features being in a dev/preview release like
>> 0.95
>> > >> aligns
>> > >> >>> well and doesn't create situations where different versions have
>> > >> different
>> > >> >>> feature sets.  New features should be introduced and hardened in a
>> > >> >>> dev/preview version, and then turn into the production ready
>> versions
>> > >> after
>> > >> >>> they've been proven out a bit.
>> > >> >>>
>> > >> >>> Jon.
>> > >> >>>
>> > >> >>> On Fri, Mar 1, 2013 at 11:00 AM, lars hofhansl <[email protected]>
>> > >> wrote:
>> > >> >>>
>> > >> >>> > This is an open source project, as long as there is a volunteer
>> to
>> > >> >>> > backport a patch I see no problem with doing this.
>> > >> >>> > The only thing we as the community should ensure is that it must
>> > be
>> > >> >>> > demonstrated that the patch does not destabilize the 0.94 code
>> > base;
>> > >> that
>> > >> >>> > has to be done on a case by case basis.
>> > >> >>> >
>> > >> >>> >
>> > >> >>> > Also, there is no stable release of HBase other than 0.94 (0.95
>> is
>> > >> not
>> > >> >>> > stable, and we specifically state that it should not be used in
>> > >> >>> production).
>> > >> >>> >
>> > >> >>> > -- Lars
>> > >> >>> >
>> > >> >>> >
>> > >> >>> >
>> > >> >>> > ________________________________
>> > >> >>> >  From: Jonathan Hsieh <[email protected]>
>> > >> >>> > To: [email protected]
>> > >> >>> > Sent: Friday, March 1, 2013 8:31 AM
>> > >> >>> > Subject: [DISCUSS] More new feature backports to 0.94.
>> > >> >>> >
>> > >> >>> > I was thinking more about HBASE-7360 (backport snapshots to
>> 0.94)
>> > and
>> > >> >>> also
>> > >> >>> > saw HBASE-7965 which suggests porting some major-ish features
>> > (table
>> > >> >>> locks,
>> > >> >>> > online merge) in to the apache 0.94 line.   We should chat about
>> > >> what we
>> > >> >>> > want to do about new features and bringing them into stable
>> > versions
>> > >> >>> (0.94
>> > >> >>> > today) and in general criteria we use for future versions.
>> > >> >>> >
>> > >> >>> > This is similar to the snapshots backport discussion and earlier
>> > >> backport
>> > >> >>> > discussions.  Here's my understanding of  high level points we
>> > >> basically
>> > >> >>> > agree upon.
>> > >> >>> > * Backporting new features to the previous major version incurs
>> > more
>> > >> cost
>> > >> >>> > when developing new features,  pushes back efforts on making the
>> > >> trunk
>> > >> >>> > versions and reduces incentive to move to newer versions.
>> > >> >>> > * Backporting new features to earlier versions (0.9x.0, 0.9x.1)
>> is
>> > >> >>> > reasonable since they are generally less stable.
>> > >> >>> > * Backporting new features to later version (0.9x.5, 0.9x.6) is
>> > less
>> > >> >>> > reasonable --  (ex: a 0.94.6, or 0.94.7 should only include
>> robust
>> > >> >>> > features).
>> > >> >>> > * Backporting orthogonal features (snapshots) seems less risky
>> > than
>> > >> core
>> > >> >>> > changing features
>> > >> >>> > * An exception: If multiple distributions declare intent to
>> > backport, it
>> > >> >>> makes
>> > >> >>> > sense to backport a feature. (snapshots for example).
>> > >> >>> >
>> > >> >>> > Some new circumstances and discussion topics:
>> > >> >>> > * We now have a dev branch (0.95) with looser compat
>> requirements
>> > >> that we
>> > >> >>> > could more readily release with dev/preview versions.  Shouldn't
>> > this
>> > >> >>> > reduce the need to backport features to the apache stable
>> > branches?
>> > >> >>> Would
>> > >> >>> > releases of these releases "replace" the 0.x.0 or 0.x.1
>> releases?
>> > >> >>> > * For major features in later versions we should raise the bar
>> on
>> > the
>> > >> >>> > amount of testing and probably be more explicit about what testing
>> is
>> > >> done
>> > >> >>> > (unit tests not sufficient, system testing stories/reports a
>> > >> >>> requirement).
>> > >> >>> > Any other suggestions?
>> > >> >>> >
>> > >> >>> > Jon.
>> > >> >>> >
>> > >> >>> > --
>> > >> >>> > // Jonathan Hsieh (shay)
>> > >> >>> > // Software Engineer, Cloudera
>> > >> >>> > // [email protected]
>> > >> >>> >
>> > >> >>>
>> > >> >>>
>> > >> >>>
>> > >> >>> --
>> > >> >>> // Jonathan Hsieh (shay)
>> > >> >>> // Software Engineer, Cloudera
>> > >> >>> // [email protected]
>> > >> >>>
>> > >>
>> >
>>
>


-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera

// [email protected]
