Re: About how features are integrated to different HBase versions

2019-01-18 Thread Sean Busbey
I agree with Andrew that we can't both have maintenance releases and expect
every feature in ongoing branch-1 releases to be in branches-2.y.

Tracking when features become available across major versions fits well
with the "upgrade paths" section in the ref guide.

We've just gotten into the habit of filling it in only when a big
release is coming up.



On Fri, Jan 18, 2019, 23:46 张铎(Duo Zhang) wrote:

> Then we must have an upgrade path, for example, 1.5.x can only be upgraded
> to 2.2.x if you want all the features still there?
>
> Maybe we should have a release timeline for the first release of all the
> minor releases? So when users want to upgrade, they can choose a minor
> release which was released later than the current one.
>
> On Sat, Jan 19, 2019 at 13:15, Andrew Purtell wrote:
>
> > Also I think branch-1 releases will be done on a monthly cadence
> > independent of any branch-2 releases. This is because there are different
> > RMs at work with different needs and schedules.
> >
> > I can certainly help out some with branch-2 releasing if you need it,
> > FWIW.
> >
> > It may also help if we begin talking about 1.x and 2.x as separate
> > "products". This can help avoid confusion about features in 1.5 not in 2.1
> > but in 2.2. For all practical purposes they are separate products. Some of
> > our community develop and run branch-1. Others develop and run branch-2.
> > There is some overlap but the overlap is not total. The concerns will
> > diverge a bit. I think this is healthy. Everyone is attending to what they
> > need. Let's figure out how to make it work.

[jira] [Created] (HBASE-21746) RegionMover.stripServer will return the last server if the target server does not exist

2019-01-18 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21746:
------------------------------

             Summary: RegionMover.stripServer will return the last server if the target server does not exist
                 Key: HBASE-21746
                 URL: https://issues.apache.org/jira/browse/HBASE-21746
             Project: HBase
          Issue Type: Bug
            Reporter: Duo Zhang


It should return null for this case.
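
As a rough illustration of the behaviour described in this issue (a Python sketch, not the actual Java code in RegionMover): if a loop stores each candidate in a variable declared outside the loop and returns that variable at the end, a failed search hands back the last server examined instead of nothing. The function names and the "host,port,startcode" server-name strings below are assumptions made for this sketch only.

```python
from typing import List, Optional


def _matches(server: str, hostname: str, port: int) -> bool:
    # Assumed server-name layout for this sketch: "host,port,startcode".
    host, server_port = server.split(",")[:2]
    return host.lower() == hostname.lower() and server_port == str(port)


def strip_server_buggy(servers: List[str], hostname: str, port: int) -> Optional[str]:
    """Remove and return the matching server; wrongly returns the last one on a miss."""
    server = None
    for candidate in list(servers):  # iterate over a copy so removal is safe
        server = candidate
        if _matches(server, hostname, port):
            servers.remove(server)
            return server
    return server  # bug: still holds the last server that was examined


def strip_server_fixed(servers: List[str], hostname: str, port: int) -> Optional[str]:
    """Remove and return the matching server, or None if it is not in the list."""
    for candidate in list(servers):
        if _matches(candidate, hostname, port):
            servers.remove(candidate)
            return candidate
    return None
```

With a two-server list and a target that is not in it, the first variant returns the second server while the second variant returns None, which is the outcome the issue asks for.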





Re: About how features are integrated to different HBase versions

2019-01-18 Thread Duo Zhang
Then we must have an upgrade path, for example, 1.5.x can only be upgraded
to 2.2.x if you want all the features still there?

Maybe we should have a release timeline for the first release of all the
minor releases? So when users want to upgrade, they can choose a minor
release which was released later than the current one.
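
To make the timeline idea concrete, here is a minimal Python sketch of how a user could pick upgrade targets from such a list. It assumes the timeline is just a mapping from the first release of each minor line to its release date. The function names are hypothetical and the dates in the usage note are invented placeholders, not real HBase release dates.

```python
from datetime import date
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "2.2.0" into (2, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def upgrade_candidates(current: str, timeline: Dict[str, date]) -> List[str]:
    """List releases that are both a higher version and released later than `current`."""
    current_version = parse_version(current)
    current_date = timeline[current]
    candidates = [
        version
        for version, released in timeline.items()
        if parse_version(version) > current_version and released > current_date
    ]
    return sorted(candidates, key=parse_version)
```

For example, with placeholder dates `{"2.1.0": date(2019, 1, 1), "1.5.0": date(2019, 1, 2), "2.2.0": date(2019, 1, 3)}`, calling `upgrade_candidates("1.5.0", ...)` leaves only 2.2.0, because 2.1.0 is a higher version but was released earlier.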

On Sat, Jan 19, 2019 at 13:15, Andrew Purtell wrote:

> Also I think branch-1 releases will be done on a monthly cadence
> independent of any branch-2 releases. This is because there are different
> RMs at work with different needs and schedules.
>
> I can certainly help out some with branch-2 releasing if you need it,
> FWIW.
>
> It may also help if we begin talking about 1.x and 2.x as separate
> "products". This can help avoid confusion about features in 1.5 not in 2.1
> but in 2.2. For all practical purposes they are separate products. Some of
> our community develop and run branch-1. Others develop and run branch-2.
> There is some overlap but the overlap is not total. The concerns will
> diverge a bit. I think this is healthy. Everyone is attending to what they
> need. Let's figure out how to make it work.

Re: About how features are integrated to different HBase versions

2019-01-18 Thread Andrew Purtell
Also I think branch-1 releases will be done on a monthly cadence independent of 
any branch-2 releases. This is because there are different RMs at work with 
different needs and schedules. 

I can certainly help out some with branch-2 releasing if you need it, FWIW. 

It may also help if we begin talking about 1.x and 2.x as separate "products". 
This can help avoid confusion about features in 1.5 not in 2.1 but in 2.2. For 
all practical purposes they are separate products. Some of our community 
develop and run branch-1. Others develop and run branch-2. There is some 
overlap but the overlap is not total. The concerns will diverge a bit. I think 
this is healthy. Everyone is attending to what they need. Let's figure out how 
to make it work. 

> On Jan 18, 2019, at 9:04 PM, Andrew Purtell  wrote:
> 
> Also please be prepared to support forward evolution and maintenance of 
> branch-1 for, potentially, years. Because it is used in production and will 
> continue to do so for a long time. Features may end up in 1.6.0 that only 
> appear in 2.3 or 2.4. And in 1.7 that only appear in 2.5 or 2.6. This 
> shouldn't be confusing. We just need to document it. JIRA helps some, release 
> notes can help a lot more. Maybe in the future a feature to version matrix in 
> the book. 
> 
>> On Jan 18, 2019, at 8:59 PM, Andrew Purtell  wrote:
>> 
>> This can't work, because we can put things into a new minor that cannot go 
>> into a patch release. If you say instead 2.2.0 must have everything in 
>> 1.5.0, it can work. The alignment of features should happen at the minor 
>> releases. If we can also have alignment in patch releases too, that would be 
>> great, but can't be mandatory. 
>> 
>>> On Jan 18, 2019, at 7:12 PM, 张铎(Duo Zhang)  wrote:
>>> 
>>> Please see the red words carefully, I explicitly mentioned that, the newer
>>> version should be released LATER, if you want to get all the features.
>>> 
>>> For example, you release 2.1.0 yesterday, 1.5.0 today, and then 2.1.1
>>> tomorrow, it is OK that 2.1.0 does not have all the features that 1.5.0 has,
>>> but 2.1.1 should have all the features that 1.5.0 has.

Re: About how features are integrated to different HBase versions

2019-01-18 Thread Andrew Purtell
Also please be prepared to support forward evolution and maintenance of 
branch-1 for, potentially, years. Because it is used in production and will 
continue to do so for a long time. Features may end up in 1.6.0 that only 
appear in 2.3 or 2.4. And in 1.7 that only appear in 2.5 or 2.6. This shouldn't 
be confusing. We just need to document it. JIRA helps some, release notes can 
help a lot more. Maybe in the future a feature to version matrix in the book. 
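
A feature-to-version matrix could be as simple as the following Python sketch: for every feature, record the first release on each major line that carries it, then compare a source and a target release. The feature names and version numbers here are made up for illustration, and the data layout is only one possible assumption about what such a matrix would look like.

```python
from typing import Dict, List, Optional, Tuple

# feature -> {major line -> first release on that line carrying the feature}
Matrix = Dict[str, Dict[int, Optional[str]]]

# Hypothetical entries; not taken from real HBase release notes.
EXAMPLE_MATRIX: Matrix = {
    "feature-a": {1: "1.5.0", 2: "2.2.0"},
    "feature-b": {1: "1.6.0", 2: "2.3.0"},
}


def parse_version(text: str) -> Tuple[int, ...]:
    return tuple(int(part) for part in text.split("."))


def has_feature(matrix: Matrix, feature: str, release: str) -> bool:
    """True if `release` is at or after the first release on its major line with the feature."""
    version = parse_version(release)
    first = matrix[feature].get(version[0])
    return first is not None and version >= parse_version(first)


def lost_features(matrix: Matrix, source: str, target: str) -> List[str]:
    """Features present in `source` that would be missing after moving to `target`."""
    return sorted(
        name
        for name in matrix
        if has_feature(matrix, name, source) and not has_feature(matrix, name, target)
    )
```

With these placeholder entries, moving from 1.5.0 to 2.1.4 reports "feature-a" as lost, while moving from 1.5.0 to 2.2.0 reports nothing.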

> On Jan 18, 2019, at 8:59 PM, Andrew Purtell  wrote:
> 
> This can't work, because we can put things into a new minor that cannot go 
> into a patch release. If you say instead 2.2.0 must have everything in 1.5.0, 
> it can work. The alignment of features should happen at the minor releases. 
> If we can also have alignment in patch releases too, that would be great, but 
> can't be mandatory. 
> 
>> On Jan 18, 2019, at 7:12 PM, 张铎(Duo Zhang)  wrote:
>> 
>> Please see the red words carefully, I explicitly mentioned that, the newer
>> version should be released LATER, if you want to get all the features.
>> 
>> For example, you release 2.1.0 yesterday, 1.5.0 today, and then 2.1.1
>> tomorrow, it is OK that 2.1.0 does not have all the features that 1.5.0 has,
>> but 2.1.1 should have all the features that 1.5.0 has.
>> 
>> On Sat, Jan 19, 2019 at 10:23 AM, Sergey Shelukhin wrote:
>> 
>>> Consider that we actually cannot guarantee this without a time machine,
>>> because some "newer" versions are already released.
>>> 
>>> If we backport to 1.5 now, we cannot do anything about 2.0.0, 2.1.0,
>>> 2.1.2, etc. because they are already released... if the user upgrades from
>>> 1.5 to 2.0.1 for example, they will lose the feature no matter what.
>>> The only way to ensure is to
>>> - always update to latest dot version,
>>> - also for us to make sure we never release before releasing every "later"
>>> dot release (e.g. we cannot release 1.5 before 2.1.3, 2.0.N, etc. so
>>> there's the latest release for every line).
>>> - and also for us to make sure that every single dot line actually has a
>>> release - when e.g. 2.0.X line is abandoned that may not happen, so the
>>> latest version of 2.0.X will precede latest 1.Y because 1.Y may still be
>>> active (like as far as I recall 0.94 was getting dot releases even when
>>> 0.96 was abandoned) - so even if the user goes from 1.Y to the latest 2.0.X
>>> they will lose the feature.
>>> 
>>> I think this is kind of expected... I agree that it needs to be
>>> documented. To an extent it already is in JIRA where fixVersion may be
>>> "3.0, 2.2, 1.5", but it makes sense to document explicitly.
>>> 
>>> -Original Message-
>>> From: 张铎(Duo Zhang) 
>>> Sent: Friday, January 18, 2019 5:50 PM
>>> To: HBase Dev List 
>>> Subject: About how features are integrated to different HBase versions
>>> 
>>> I think we have a good discussion on HBASE-21034, where a feature is back
>>> ported to branch-1, but then folks think that we should not back port them
>>> to branch-2.1 and branch-2.0, as usually we should not add new features to
>>> minor release lines.
>>> 
>>> I think the reason why we do not want the feature in branch-2.1 and
>>> branch-2.0 is reasonable, but this will introduce another problem. As
>>> later, we will release a 1.5.0 which has the feature, but when a user later
>>> upgrades from 1.5.0 to 2.1.x or 2.0.x, it will find that the feature is
>>> gone, even though the 2.1.x or 2.0.x is released after 1.5.0 is released,
>>> as we do not port the feature to these two branches. This will be very
>>> confusing to users I'd say.
>>> 
>>> So I think we should guarantee that, a higher version of HBase release
>>> will always contain all the features of a HBase release with a lower
>>> version which is released earlier, unless explicitly mentioned(for example,
>>> DLR).
>>> 
>>> And this implies that, when we setup a new major release and make a new
>>> release on the first minor release line, then the develop branch for the
>>> previous major release will be useless, as said above, usually we do not
>>> want to port any new features to the minor release line of the new major
>>> release, then the new features should not be ported to previous major
>>> release, otherwise we will break the guarantee above. And this also means
>>> that, we could just use the 'develop' branch to make new releases.
>>> 


Re: About how features are integrated to different HBase versions

2019-01-18 Thread Andrew Purtell
This can't work, because we can put things into a new minor that cannot go into
a patch release. If you say instead 2.2.0 must have everything in 1.5.0, it can
work. The alignment of features should happen at the minor releases. If we can
have alignment in patch releases too, that would be great, but it can't be
mandatory.

> On Jan 18, 2019, at 7:12 PM, 张铎(Duo Zhang)  wrote:
> 
> Please see the red words carefully, I explicitly mentioned that, the newer
> version should be released LATER, if you want to get all the features.
> 
> For example, you release 2.1.0 yesterday, 1.5.0 today, and then 2.1.1
> tomorrow, it is OK that 2.1.0 does not have all the features that 1.5.0 has,
> but 2.1.1 should have all the features that 1.5.0 has.
> 
> On Sat, Jan 19, 2019 at 10:23 AM, Sergey Shelukhin wrote:
> 
>> Consider that we actually cannot guarantee this without a time machine,
>> because some "newer" versions are already released.
>> 
>> If we backport to 1.5 now, we cannot do anything about 2.0.0, 2.1.0,
>> 2.1.2, etc. because they are already released... if the user upgrades from
>> 1.5 to 2.0.1 for example, they will lose the feature no matter what.
>> The only way to ensure is to
>> - always update to latest dot version,
>> - also for us to make sure we never release before releasing every "later"
>> dot release (e.g. we cannot release 1.5 before 2.1.3, 2.0.N, etc. so
>> there's the latest release for every line).
>> - and also for us to make sure that every single dot line actually has a
>> release - when e.g. 2.0.X line is abandoned that may not happen, so the
>> latest version of 2.0.X will precede latest 1.Y because 1.Y may still be
>> active (like as far as I recall 0.94 was getting dot releases even when
>> 0.96 was abandoned) - so even if the user goes from 1.Y to the latest 2.0.X
>> they will lose the feature.
>> 
>> I think this is kind of expected... I agree that it needs to be
>> documented. To an extent it already is in JIRA where fixVersion may be
>> "3.0, 2.2, 1.5", but it makes sense to document explicitly.
>> 
>> -Original Message-
>> From: 张铎(Duo Zhang) 
>> Sent: Friday, January 18, 2019 5:50 PM
>> To: HBase Dev List 
>> Subject: About how features are integrated to different HBase versions
>> 
>> I think we have a good discussion on HBASE-21034, where a feature is back
>> ported to branch-1, but then folks think that we should not back port them
>> to branch-2.1 and branch-2.0, as usually we should not add new features to
>> minor release lines.
>> 
>> I think the reason why we do not want the feature in branch-2.1 and
>> branch-2.0 is reasonable, but this will introduce another problem. As
>> later, we will release a 1.5.0 which has the feature, but when a user later
>> upgrades from 1.5.0 to 2.1.x or 2.0.x, it will find that the feature is
>> gone, even though the 2.1.x or 2.0.x is released after 1.5.0 is released,
>> as we do not port the feature to these two branches. This will be very
>> confusing to users I'd say.
>> 
>> So I think we should guarantee that, a higher version of HBase release
>> will always contain all the features of a HBase release with a lower
>> version which is released earlier, unless explicitly mentioned(for example,
>> DLR).
>> 
>> And this implies that, when we setup a new major release and make a new
>> release on the first minor release line, then the develop branch for the
>> previous major release will be useless, as said above, usually we do not
>> want to port any new features to the minor release line of the new major
>> release, then the new features should not be ported to previous major
>> release, otherwise we will break the guarantee above. And this also means
>> that, we could just use the 'develop' branch to make new releases.
>> 


Re: About how features are integrated to different HBase versions

2019-01-18 Thread Duo Zhang
And yes, we should document this.

Do we also need a web page that lists all the releases on a timeline?
Then users could easily tell which version is safe to upgrade to.

On Sat, Jan 19, 2019 at 11:12 AM, 张铎(Duo Zhang) wrote:

> Please see the red words carefully, I explicitly mentioned that, the newer
> version should be released LATER, if you want to get all the features.
>
> For example, you release 2.1.0 yesterday, 1.5.0 today, and then 2.1.1
> tomorrow, it is OK that 2.1.0 does not have all the features that 1.5.0 has,
> but 2.1.1 should have all the features that 1.5.0 has.
>
> On Sat, Jan 19, 2019 at 10:23 AM, Sergey Shelukhin wrote:
>
>> Consider that we actually cannot guarantee this without a time machine,
>> because some "newer" versions are already released.
>>
>> If we backport to 1.5 now, we cannot do anything about 2.0.0, 2.1.0,
>> 2.1.2, etc. because they are already released... if the user upgrades from
>> 1.5 to 2.0.1 for example, they will lose the feature no matter what.
>> The only way to ensure is to
>> - always update to latest dot version,
>> - also for us to make sure we never release before releasing every
>> "later" dot release (e.g. we cannot release 1.5 before 2.1.3, 2.0.N, etc.
>> so there's the latest release for every line).
>> - and also for us to make sure that every single dot line actually has a
>> release - when e.g. 2.0.X line is abandoned that may not happen, so the
>> latest version of 2.0.X will precede latest 1.Y because 1.Y may still be
>> active (like as far as I recall 0.94 was getting dot releases even when
>> 0.96 was abandoned) - so even if the user goes from 1.Y to the latest 2.0.X
>> they will lose the feature.
>>
>> I think this is kind of expected... I agree that it needs to be
>> documented. To an extent it already is in JIRA where fixVersion may be
>> "3.0, 2.2, 1.5", but it makes sense to document explicitly.
>>
>> -Original Message-
>> From: 张铎(Duo Zhang) 
>> Sent: Friday, January 18, 2019 5:50 PM
>> To: HBase Dev List 
>> Subject: About how features are integrated to different HBase versions
>>
>> I think we have a good discussion on HBASE-21034, where a feature is back
>> ported to branch-1, but then folks think that we should not back port them
>> to branch-2.1 and branch-2.0, as usually we should not add new features to
>> minor release lines.
>>
>> I think the reason why we do not want the feature in branch-2.1 and
>> branch-2.0 is reasonable, but this will introduce another problem. As
>> later, we will release a 1.5.0 which has the feature, but when a user later
>> upgrades from 1.5.0 to 2.1.x or 2.0.x, it will find that the feature is
>> gone, even though the 2.1.x or 2.0.x is released after 1.5.0 is released,
>> as we do not port the feature to these two branches. This will be very
>> confusing to users I'd say.
>>
>> So I think we should guarantee that, a higher version of HBase release
>> will always contain all the features of a HBase release with a lower
>> version which is released earlier, unless explicitly mentioned(for example,
>> DLR).
>>
>> And this implies that, when we setup a new major release and make a new
>> release on the first minor release line, then the develop branch for the
>> previous major release will be useless, as said above, usually we do not
>> want to port any new features to the minor release line of the new major
>> release, then the new features should not be ported to previous major
>> release, otherwise we will break the guarantee above. And this also means
>> that, we could just use the 'develop' branch to make new releases.
>>
>


Re: About how features are integrated to different HBase versions

2019-01-18 Thread Duo Zhang
Please read the words in red carefully: I explicitly said that the newer
version must be released LATER if you want to get all the features.

For example, if you release 2.1.0 yesterday, 1.5.0 today, and then 2.1.1
tomorrow, it is OK that 2.1.0 does not have all the features that 1.5.0 has,
but 2.1.1 should have all the features that 1.5.0 has.

On Sat, Jan 19, 2019 at 10:23 AM, Sergey Shelukhin wrote:

> Consider that we actually cannot guarantee this without a time machine,
> because some "newer" versions are already released.
>
> If we backport to 1.5 now, we cannot do anything about 2.0.0, 2.1.0,
> 2.1.2, etc. because they are already released... if the user upgrades from
> 1.5 to 2.0.1 for example, they will lose the feature no matter what.
> The only way to ensure is to
> - always update to latest dot version,
> - also for us to make sure we never release before releasing every "later"
> dot release (e.g. we cannot release 1.5 before 2.1.3, 2.0.N, etc. so
> there's the latest release for every line).
> - and also for us to make sure that every single dot line actually has a
> release - when e.g. 2.0.X line is abandoned that may not happen, so the
> latest version of 2.0.X will precede latest 1.Y because 1.Y may still be
> active (like as far as I recall 0.94 was getting dot releases even when
> 0.96 was abandoned) - so even if the user goes from 1.Y to the latest 2.0.X
> they will lose the feature.
>
> I think this is kind of expected... I agree that it needs to be
> documented. To an extent it already is in JIRA where fixVersion may be
> "3.0, 2.2, 1.5", but it makes sense to document explicitly.
>
> -Original Message-
> From: 张铎(Duo Zhang) 
> Sent: Friday, January 18, 2019 5:50 PM
> To: HBase Dev List 
> Subject: About how features are integrated to different HBase versions
>
> I think we have a good discussion on HBASE-21034, where a feature is back
> ported to branch-1, but then folks think that we should not back port them
> to branch-2.1 and branch-2.0, as usually we should not add new features to
> minor release lines.
>
> I think the reason why we do not want the feature in branch-2.1 and
> branch-2.0 is reasonable, but this will introduce another problem. As
> later, we will release a 1.5.0 which has the feature, but when a user later
> upgrades from 1.5.0 to 2.1.x or 2.0.x, it will find that the feature is
> gone, even though the 2.1.x or 2.0.x is released after 1.5.0 is released,
> as we do not port the feature to these two branches. This will be very
> confusing to users I'd say.
>
> So I think we should guarantee that, a higher version of HBase release
> will always contain all the features of a HBase release with a lower
> version which is released earlier, unless explicitly mentioned(for example,
> DLR).
>
> And this implies that, when we setup a new major release and make a new
> release on the first minor release line, then the develop branch for the
> previous major release will be useless, as said above, usually we do not
> want to port any new features to the minor release line of the new major
> release, then the new features should not be ported to previous major
> release, otherwise we will break the guarantee above. And this also means
> that, we could just use the 'develop' branch to make new releases.
>


RE: About how features are integrated to different HBase versions

2019-01-18 Thread Sergey Shelukhin
Consider that we actually cannot guarantee this without a time machine, because 
some "newer" versions are already released.

If we backport to 1.5 now, we cannot do anything about 2.0.0, 2.1.0, 2.1.2, 
etc. because they are already released... if the user upgrades from 1.5 to 
2.0.1 for example, they will lose the feature no matter what. 
The only way to ensure this is to
- always update to latest dot version,
- also for us to make sure we never release before releasing every "later" dot 
release (e.g. we cannot release 1.5 before 2.1.3, 2.0.N, etc. so there's the 
latest release for every line).
- and also for us to make sure that every single dot line actually has a 
release - when e.g. 2.0.X line is abandoned that may not happen, so the latest 
version of 2.0.X will precede latest 1.Y because 1.Y may still be active (like 
as far as I recall 0.94 was getting dot releases even when 0.96 was abandoned) 
- so even if the user goes from 1.Y to the latest 2.0.X they will lose the 
feature.

I think this is kind of expected... I agree that it needs to be documented. To 
an extent it already is in JIRA where fixVersion may be "3.0, 2.2, 1.5", but it 
makes sense to document explicitly.
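
As a small sketch of how the fixVersion information could be used, the Python below parses a string such as "3.0, 2.2, 1.5" into the first carrying version per major line and checks a given release against it. The helper names are hypothetical, and the rule "same major line, at or after the listed version" is a simplifying assumption rather than a statement of how JIRA or the release process actually works.

```python
from typing import Dict, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Parse "2.2" or "2.2.0" into a three-part tuple, padding with zeros."""
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def first_fix_per_major(fix_versions: str) -> Dict[int, Version]:
    """Map each major line to the earliest version listed for it."""
    earliest: Dict[int, Version] = {}
    for raw in fix_versions.split(","):
        if not raw.strip():
            continue
        version = parse_version(raw)
        major = version[0]
        if major not in earliest or version < earliest[major]:
            earliest[major] = version
    return earliest


def carries_fix(release: str, fix_versions: str) -> bool:
    """Assume a release carries the change if it is at or after the listed version on its major line."""
    version = parse_version(release)
    first = first_fix_per_major(fix_versions).get(version[0])
    return first is not None and version >= first
```

Under that assumption, `carries_fix("2.0.1", "3.0, 2.2, 1.5")` is false while `carries_fix("1.5.0", ...)` and `carries_fix("2.2.0", ...)` are true, which matches the 1.5 to 2.0.1 example above.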

-----Original Message-----
From: 张铎(Duo Zhang)  
Sent: Friday, January 18, 2019 5:50 PM
To: HBase Dev List 
Subject: About how features are integrated to different HBase versions

I think we have a good discussion on HBASE-21034, where a feature is back 
ported to branch-1, but then folks think that we should not back port them to 
branch-2.1 and branch-2.0, as usually we should not add new features to minor 
release lines.

I think the reason why we do not want the feature in branch-2.1 and
branch-2.0 is reasonable, but this will introduce another problem. As later, we 
will release a 1.5.0 which has the feature, but when a user later upgrades from 
1.5.0 to 2.1.x or 2.0.x, it will find that the feature is gone, even though the 
2.1.x or 2.0.x is released after 1.5.0 is released, as we do not port the 
feature to these two branches. This will be very confusing to users I'd say.

So I think we should guarantee that, a higher version of HBase release will 
always contain all the features of a HBase release with a lower version which 
is released earlier, unless explicitly mentioned(for example, DLR).

And this implies that, when we setup a new major release and make a new release 
on the first minor release line, then the develop branch for the previous major 
release will be useless, as said above, usually we do not want to port any new 
features to the minor release line of the new major release, then the new 
features should not be ported to previous major release, otherwise we will 
break the guarantee above. And this also means that, we could just use the 
'develop' branch to make new releases.


Re: [DISCUSS] Moving towards a branch-2 line that can get the 'stable' pointer.

2019-01-18 Thread Duo Zhang
https://issues.apache.org/jira/browse/HBASE-21745


[jira] [Created] (HBASE-21745) Make HBCK2 be able to fix issues other than region assignment

2019-01-18 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21745:
-

             Summary: Make HBCK2 be able to fix issues other than region assignment
                 Key: HBASE-21745
                 URL: https://issues.apache.org/jira/browse/HBASE-21745
             Project: HBase
          Issue Type: Umbrella
          Components: hbase-operator-tools, hbck2
            Reporter: Duo Zhang


This is what [~apurtell] posted on the mailing list. HBCK2 should support:
{quote}
   - Rebuild meta from region metadata in the filesystem, aka offline meta
   rebuild.
   - Fix assignment errors (undeployed regions, double assignments (yes,
   should not be possible), etc)
   - Fix region holes, overlaps, and other errors in the region chain
   - Fix failed split and merge transactions that have failed to roll back
   due to some bug (related to previous)
   - Enumerate store files to determine file level corruption and sideline
   corrupt files
   - Fix hfile link problems (dangling / broken)
{quote}
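
As an illustration of the "region holes, overlaps" item, here is a small
Python sketch that scans a table's regions in start-key order and reports
gaps and overlaps. It is only a sketch of the check, not how HBCK1 or HBCK2
implement it; the function name, the tuple layout, and the use of empty keys
for the table boundaries are assumptions.

```python
def check_region_chain(regions):
    """Return (kind, previous_end, next_start) tuples for chain problems.

    regions is a list of (start_key, end_key) byte-string pairs. An empty
    start key means "start of table" and an empty end key means "end of
    table" (an assumed convention for this sketch).
    """
    problems = []
    ordered = sorted(regions, key=lambda r: r[0])
    if not ordered:
        return problems
    if ordered[0][0] != b"":
        problems.append(("hole", b"", ordered[0][0]))
    covered_end = ordered[0][1]  # furthest end key covered so far
    for start, end in ordered[1:]:
        if covered_end == b"" or start < covered_end:
            problems.append(("overlap", covered_end, start))
        elif start > covered_end:
            problems.append(("hole", covered_end, start))
        # Extend coverage; an empty end key reaches the end of the table.
        if covered_end != b"" and (end == b"" or end > covered_end):
            covered_end = end
    if covered_end != b"":
        problems.append(("hole", covered_end, b""))
    return problems
```

Detection is the easy half; deciding how to repair each reported hole or
overlap is where the real tooling work lies.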



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Moving towards a branch-2 line that can get the 'stable' pointer.

2019-01-18 Thread Duo Zhang
OK, the original issue is HBCK2 for AMv2, but here we need to do more, not
only for AMv2.

Let me open a new issue and post what Andrew said above there.


About how features are integrated to different HBase versions

2019-01-18 Thread Duo Zhang
We had a good discussion on HBASE-21034, where a feature was backported to
branch-1, but then folks thought that we should not backport it to
branch-2.1 and branch-2.0, as we usually do not add new features to minor
release lines.

I think the reason why we do not want the feature in branch-2.1 and
branch-2.0 is reasonable, but this introduces another problem. Later, we
will release a 1.5.0 which has the feature, but when a user upgrades from
1.5.0 to 2.1.x or 2.0.x, they will find that the feature is gone, even
though that 2.1.x or 2.0.x was released after 1.5.0, because we did not
port the feature to these two branches. This will be very confusing to
users, I'd say.

So I think we should guarantee that a higher version of HBase will always
contain all the features of a lower version that was released earlier,
unless explicitly mentioned (for example, DLR).

And this implies that, once we set up a new major release and make the
first release on its first minor release line, the develop branch for the
previous major release becomes useless. As said above, we usually do not
want to port any new features to the minor release lines of the new major
release, so new features should not be ported to the previous major release
either; otherwise we would break the guarantee above. And this also means
that we could just use the 'develop' branch to make new releases.
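
The proposed guarantee can be written down as a check over a release
timeline. The Python sketch below is only an illustration: the Release type,
the function name, the dates, and the feature name are all hypothetical, and
it ignores the "unless explicitly mentioned" exception.

```python
from datetime import date
from typing import NamedTuple


class Release(NamedTuple):
    version: tuple       # e.g. (1, 5, 0)
    released: date
    features: frozenset


def guarantee_violations(releases):
    """Find pairs that break the proposed guarantee.

    A pair (lower, higher) is a violation when `higher` has a higher
    version and was released later, yet lacks a feature of `lower`.
    """
    violations = []
    for lower in releases:
        for higher in releases:
            if (higher.version > lower.version
                    and higher.released > lower.released
                    and not lower.features <= higher.features):
                missing = sorted(lower.features - higher.features)
                violations.append((lower.version, higher.version, missing))
    return violations


# Hypothetical timeline: the dates and the feature name are made up.
releases = [
    Release((1, 5, 0), date(2019, 3, 1), frozenset({"feature-x"})),
    Release((2, 1, 3), date(2019, 4, 1), frozenset()),
    Release((2, 2, 0), date(2019, 5, 1), frozenset({"feature-x"})),
]
```

On this made-up timeline the only violation is the pair 1.5.0 and 2.1.3,
which is exactly the upgrade that would surprise a user.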


Re: [DISCUSS] Moving towards a branch-2 line that can get the 'stable' pointer.

2019-01-18 Thread Duo Zhang
OK, let me find the original HBCK2 issue and see how we can make progress
on it.

BTW, on scan performance, Zheng Hu has done some work in this issue that
gets about 40% performance back for the 100% scan case on YCSB:

https://issues.apache.org/jira/browse/HBASE-21657


[jira] [Created] (HBASE-21744) timeout for server list refresh calls

2019-01-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-21744:


             Summary: timeout for server list refresh calls
                 Key: HBASE-21744
                 URL: https://issues.apache.org/jira/browse/HBASE-21744
             Project: HBase
          Issue Type: Bug
            Reporter: Sergey Shelukhin


Not sure why yet, but we are seeing cases, when the cluster is in an overall
bad state, where after an RS dies and deletes its znode the notification
appears to be lost, so the master doesn't detect the failure. ZK itself
appears to be healthy and doesn't report anything special.
After some other change is made to the server list, the master rescans the
list and picks up the stale notification. It might make sense to add a config
that would trigger the refresh if it hasn't happened for a while (e.g. 1
minute).
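
A rough Python sketch of the proposed behavior, to show the shape of the
idea rather than an actual patch. The class, the `rescan` callback, and the
`max_age_seconds` setting are hypothetical names; the 60 second default
mirrors the "1 minute" example above.

```python
import time


class ServerListRefresher:
    """Forces a server list rescan if none happened recently.

    `rescan` is a hypothetical callback that re-reads the server list;
    `max_age_seconds` plays the role of the proposed config.
    """

    def __init__(self, rescan, max_age_seconds=60.0, clock=time.monotonic):
        self._rescan = rescan
        self._max_age = max_age_seconds
        self._clock = clock
        self._last_refresh = clock()

    def on_notification(self):
        """Normal path: a watch fired, so rescan and reset the timer."""
        self._do_refresh()

    def tick(self):
        """Called periodically; rescans only if the list went stale."""
        if self._clock() - self._last_refresh >= self._max_age:
            self._do_refresh()
            return True
        return False

    def _do_refresh(self):
        self._rescan()
        self._last_refresh = self._clock()
```

A lost notification then costs at most one refresh interval before the next
tick() picks up the change.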





Re: [DISCUSS] Moving towards a branch-2 line that can get the 'stable' pointer.

2019-01-18 Thread Andrew Purtell
Lars was testing the tip of branch-2 with Phoenix and said scans were 50%
slower than branch-1. I’ll try to get him to provide more details. Anyway,
after HBCK2 is complete, issues like that will come out in the testing we’d
do as part of sanity checking a move of the pointer.


[jira] [Created] (HBASE-21743) stateless assignment

2019-01-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-21743:


             Summary: stateless assignment
                 Key: HBASE-21743
                 URL: https://issues.apache.org/jira/browse/HBASE-21743
             Project: HBase
          Issue Type: Bug
            Reporter: Sergey Shelukhin


Running HBase for only a few weeks, we found a dozen or more assignment bugs
that all seem to have the same nature: split brain between two procedures; or
between a procedure and master startup (meta replica bugs); or between a
procedure and master shutdown (HBASE-21742); or between a procedure and
something else (when SCP had an incorrect region list persisted; I don't
recall the bug number).
To me, it starts to look like a pattern. In AMv1, concurrent interactions
were unclear and hard to reason about. In AMv2, despite the cleaner individual
pieces, the problem of unclear concurrent interactions has been preserved, and
has in fact grown because of operation state persistence and isolation.

Procedures are great for multi-step operations that need rollback and the
like, e.g. creating a table or snapshot, or even region splitting. However,
I'm not so sure about assignment.

We have the persisted information: region state in meta (including transition
states like opening or closing), and the server list as the WAL directory
list. Procedure state is not any more reliable than those (we can argue that
a meta update can fail, but so can a procv2 WAL flush, so we have to handle
cases of out-of-date information regardless). So we don't need any extra
state to decide on assignment, whether for recovery or balancing. In fact, as
mentioned in some bugs, deleting the procv2 WAL is often the best way to
recover the cluster, because the master can already figure out what to do
without additional state.

I think there should be an option for stateless assignment that does that.
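
To make the idea concrete, here is a small Python sketch that derives an
assignment plan from the two persisted inputs named above and nothing else.
It is an illustration under assumptions, not the AMv2 design or a proposed
implementation: the state names, the action names, and the function itself
are hypothetical, and table enable/disable state is ignored to keep it short.

```python
def plan_assignments(region_states, live_servers):
    """Decide what to do for each region from persisted state alone.

    region_states maps region name -> (state, server) as recorded in
    meta; live_servers is the set of servers believed alive (derived,
    per the text above, from the WAL directory list).
    """
    plan = {}
    for region, (state, server) in region_states.items():
        if server not in live_servers or state in ("CLOSED", "OFFLINE"):
            plan[region] = "assign"
        elif state == "OPEN":
            plan[region] = "leave"
        else:
            # OPENING or CLOSING on a live server: meta may be out of
            # date, so the server has to be asked before acting.
            plan[region] = "confirm-with-server"
    return plan
```

Because the plan is recomputed from meta and the server list each time, there
is no separate procedure state that can disagree with them.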






Re: [DISCUSS] Moving towards a branch-2 line that can get the 'stable' pointer.

2019-01-18 Thread Zach York
I agree with the sentiment around HBCK2. I think these kinds of recovery
tools are essential before marking something stable.

I also remember that when we did testing around HBase 2.x/2.1 we were
getting perf degradations and couldn't seem to get performance to be as
good as we were getting in the 1.x line.

- Zach

On Thu, Jan 17, 2019 at 11:06 PM Pankaj kr  wrote:

> Yeah, HBCK2/OfflineMetaRepair tools are really required to migrate old
> version data to HBase-2. We have use cases where we use these tools
> to rebuild the meta for further region assignment.
> A similar discussion is going on in HBASE-21665: after fixing the NPE and
> rebuilding the meta, the master doesn't assign the regions, as we skip the
> empty regions while loading meta during master startup.
>
> A big +1 from my side on this...
>
> Regards,
> Pankaj
>
> -----Original Message-----
> From: 张铎(Duo Zhang) [mailto:palomino...@gmail.com]
> Sent: 18 January 2019 11:55
> To: HBase Dev List 
> Subject: Re: [DISCUSS] Moving towards a branch-2 line that can get the
> 'stable' pointer.
>
> So the first priority is to make progress on HBCK2? If we all agree, let's
> start to work.
>
> Andrew Purtell wrote on Fri, Jan 18, 2019 at 12:31 PM:
>
> > Sorry, let me add... Check all the boxes on that list and I'm +1 for
> > moving the stable pointer (modulo some time to pound on the candidate
> > to really put it through its paces, like two weeks of chaos...)
> >
> > On Thu, Jan 17, 2019 at 8:28 PM Andrew Purtell 
> > wrote:
> >
> > > I do not believe we should move the stable pointer to any 2.x until
> > > HBCK2 is feature complete. We can discuss what that milestone should
> > > look like.
> > > At a minimum, I think we need:
> > >
> > >    - Rebuild meta from region metadata in the filesystem, aka offline
> > >    meta rebuild.
> > >    - Fix assignment errors (undeployed regions, double assignments
> > >    (yes, should not be possible), etc)
> > >    - Fix region holes, overlaps, and other errors in the region chain
> > >    - Fix failed split and merge transactions that have failed to roll
> > >    back due to some bug (related to previous)
> > >    - Enumerate store files to determine file level corruption and
> > >    sideline corrupt files
> > >    - Fix hfile link problems (dangling / broken)
> > >
> > > This is a list of the real problems I have had to fix in production
> > > at least once (in the past 10 years...).
> > >
> > > On Thu, Jan 17, 2019 at 8:19 PM 张铎(Duo Zhang) wrote:
> > >
> > >> There are still lots of small new features which we want to
> > >> integrate into branch-2, so I'm -1 on making releases directly from
> > >> branch-2. Backporting everything at once before a release is a pain,
> > >> I'd say; I've tried this many times recently, as we have to follow up
> > >> the community version... Let's make a branch-2.2 when we want to
> > >> release 2.2.0, and maybe also retire branch-2.0?
> > >>
> > >> For the stable pointer, I think 2.1.x may be a good candidate?
> > >> Though we know that we may still have some bugs in AMv2, we all
> > >> know that AMv1 in all the branch-1.x lines also has lots of bugs;
> > >> that's why hbck is very important.
> > >>
> > >> And also +1 on making progress on HBCK2; we need to port the useful
> > >> features of HBCK1 to HBCK2. No software can guarantee that it has
> > >> no bugs, so FWIW we should have a way to fix broken clusters.
> > >>
> > >> Sean Busbey wrote on Fri, Jan 18, 2019 at 11:47 AM:
> > >>
> > >> > There are a few related topics I'd like to discuss and I figured
> > >> > this subject line is the most likely to get a bit of attention.
> > >> > :)
> > >> >
> > >> > First, I'd like us all to get on the same page wrt the current
> > >> > state of branch-2. Personally, I don't think it can be released
> > >> > as-is with a 2.y version because folks can't rolling upgrade from
> > >> > 2.0 or 2.1 to it due to the current implementation of
> > >> > HBASE-20881. As Duo has mentioned a couple of times, folks have
> > >> > to ensure there are no region transitions around during the
> > >> > upgrade. I think that will be prohibitive for folks looking to
> > >> > upgrade. What do other folks think?
> > >> >
> > >> > Second, I think our recent discussions around the need for
> > >> > shifting to more minor releases for HBase 1.y also applies to the
> > >> > 2.y branches.
> > >> > branch-2 hasn't had a release since 2.1.0 came out in July 2018.
> > >> > That's a scary long amount of time. I think it contributes to us
> > >> > ending up with changes like the above since it's easy to think
> > >> > about the branch as something that has a lot of time before the
> > >> > next release.
> > >> >
> > >> > Personally, I'd like to see us skip making minor-release specific
> > >> > branches for a bit unless a CVE fix or something comes up.
> > >> > Ideally, that would mean we work towards a 2.2.0 release directly
> > >> > from branch-2 and then 2.2.1, etc. When we have a feature that's
> > >> > ready to 

[jira] [Created] (HBASE-21742) master can create bad procedures during abort, making entire cluster unusable

2019-01-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-21742:


 Summary: master can create bad procedures during abort, making 
entire cluster unusable
 Key: HBASE-21742
 URL: https://issues.apache.org/jira/browse/HBASE-21742
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


A small HDFS hiccup causes the master and the meta RS to fail together. The 
master goes first:
{noformat}
2019-01-18 08:09:46,790 INFO  [KeepAlivePEWorker-311] 
zookeeper.MetaTableLocator: Setting hbase:meta (replicaId=0) location in 
ZooKeeper as meta-rs,17020,1547824792484
...
2019-01-18 10:01:16,904 ERROR [PEWorker-11] master.HMaster: * ABORTING 
master master,17000,1547604554447: FAILED [blah] *
...
2019-01-18 10:01:17,087 INFO  [master/master:17000] 
assignment.AssignmentManager: Stopping assignment manager
{noformat}
A bunch of stuff keeps happening, including procedure retries, which are also 
suspect, but not the point here:
{noformat}
2019-01-18 10:01:21,598 INFO  [PEWorker-3] procedure2.TimeoutExecutorThread: 
ADDED pid=104031, state=WAITING_TIMEOUT:REGION_STATE_TRANSITION_CLOSE, ... 
{noformat}

Then the meta RS decides it's time to go:
{noformat}
2019-01-18 10:01:25,319 INFO  [RegionServerTracker-0] 
master.RegionServerTracker: RegionServer ephemeral node deleted, processing 
expiration [meta-rs,17020,1547824792484]
...
2019-01-18 10:01:25,463 INFO  [RegionServerTracker-0] 
assignment.AssignmentManager: Added meta-rs,17020,1547824792484 to dead servers 
which carryingMeta=false, submitted ServerCrashProcedure pid=104313
{noformat}
This SCP gets persisted, so when the next master starts, it waits forever for 
meta to be onlined, while there's no SCP with meta=true to online it.
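
To make the sequence above easier to spot in a master log, here is a minimal sketch of a log check. It is not part of HBase; the function name and the two regular expressions are hypothetical and are derived only from the two log lines quoted in this report. It assumes each log entry sits on a single line, as in the original log file, and it only flags the symptom (an SCP submitted with carryingMeta=false for the server last recorded as the hbase:meta location).

```python
import re

# Hypothetical patterns, modelled only on the log lines quoted in this report.
META_LOCATION = re.compile(
    r"Setting hbase:meta \(replicaId=\d+\) location in ZooKeeper as (\S+)"
)
SCP_SUBMIT = re.compile(
    r"Added (\S+) to dead servers which carryingMeta=(true|false), "
    r"submitted ServerCrashProcedure pid=(\d+)"
)


def find_suspect_scps(log_text):
    """Return (server, pid) pairs for SCPs submitted with carryingMeta=false
    for the server that was last logged as the hbase:meta location."""
    meta_server = None
    suspects = []
    for line in log_text.splitlines():
        location = META_LOCATION.search(line)
        if location:
            # Remember the most recent server recorded as hosting meta.
            meta_server = location.group(1)
            continue
        scp = SCP_SUBMIT.search(line)
        if scp and scp.group(1) == meta_server and scp.group(2) == "false":
            suspects.append((scp.group(1), int(scp.group(3))))
    return suspects
```

Running this over the master log before restarting would at least show whether the persisted SCP is one of these; it says nothing about why carryingMeta was computed as false.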

The only way around this is to delete the procv2 WAL - master has all the 
information here, as it often does in bugs I've found recently, but some split 
brain procedures cause it to get stuck one way or another.

I will file a separate bug about that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21741) Add a note in "HFile Tool" section regarding 'seqid=0'

2019-01-18 Thread Sakthi (JIRA)
Sakthi created HBASE-21741:
--

 Summary: Add a note in "HFile Tool" section regarding 'seqid=0'
 Key: HBASE-21741
 URL: https://issues.apache.org/jira/browse/HBASE-21741
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Sakthi
Assignee: Sakthi


In a few parts of the HFile the seqid is irrelevant, such as:
* firstKey=Optional[row0/cf:column/1547846312435/Put/seqid=0]
* lastKey=Optional[row9/cf:column/1547846312490/Put/seqid=0]

Let's add a note to the 'HFile Tool' section of the doc saying that seqid=0 in 
such cases means the seqid is irrelevant, because the key is a 'KeyOnlyKeyValue'.
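
For readers of such a note, a small sketch of how the printed key breaks down may help. The helper below is hypothetical (it is not an HBase API), and the field layout row/family:qualifier/timestamp/type/seqid=N is inferred only from the two example keys above. It splits from the right, so a '/' inside the row is tolerated, but a '/' inside the qualifier would not be.

```python
from typing import NamedTuple


class PrintedKey(NamedTuple):
    row: str
    column: str       # "family:qualifier" as printed
    timestamp: int
    key_type: str     # e.g. "Put"
    seqid: int


def parse_printed_key(text):
    """Split a key as printed in the examples above into its fields.

    Layout is an assumption taken from those examples only.
    """
    # Drop the Optional[...] wrapper if it is present.
    if text.startswith("Optional[") and text.endswith("]"):
        text = text[len("Optional["):-1]
    # Split from the right so that '/' characters in the row survive.
    row, column, timestamp, key_type, seqid = text.rsplit("/", 4)
    return PrintedKey(row, column, int(timestamp), key_type,
                      int(seqid.split("=", 1)[1]))
```

With this, `parse_printed_key("Optional[row0/cf:column/1547846312435/Put/seqid=0]").seqid` is 0, which is the value the proposed doc note would describe as a placeholder rather than a real sequence id.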





Re: Requesting access to hbase slack channel

2019-01-18 Thread Peter Somogyi
Sent.

On Fri, Jan 18, 2019 at 7:48 AM Abhishek Gupta  wrote:

> Hi,
>
> I would like an invitation too abhila...@gmail.com
>
> Thanks
>
> On Fri, Jan 18, 2019 at 11:38 AM Manjeet Singh wrote:
>
> > Done
> >
> > On Fri, 18 Jan 2019, 11:24 Buchi Reddy Busi Reddy wrote:
> >
> > > Can you also invite mailtobu...@gmail.com please?
> > >
> > > On Thu, Jan 17, 2019 at 8:34 PM Manjeet Singh <manjeet.chand...@gmail.com> wrote:
> > >
> > > > Seems someone else already did it
> > > >
> > > > Manjeet
> > > >
> > > > On Wed, 16 Jan 2019, 17:33 Nihal Jain wrote:
> > > > > Hi
> > > > >
> > > > > Could you please invite me: nihaljain...@gmail.com?
> > > > >
> > > > > Regards,
> > > > > Nihal
> > > > >
> > > > > On Wed, 16 Jan, 2019, 2:54 PM Peter Somogyi wrote:
> > > > >
> > > > > > Sent invitation to madhurpan...@gmail.com.
> > > > > >
> > > > > > On Wed, Jan 16, 2019 at 7:27 AM Madhur Pant <madhurpan...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Team,
> > > > > > >
> > > > > > > I was wondering if I could get access to the HBase user slack
> > > > > > > channel
> > > > > > >
> > > > > > > https://apache-hbase.slack.com
> > > > > > >
> > > > > > > It says here <https://issues.apache.org/jira/browse/HBASE-16413>
> > > > > > > that I should email you guys :)
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Madhur Pant
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Requesting access to slack channel

2019-01-18 Thread Balazs Meszaros
Invitation has been sent.

Best,
Balazs

On Fri, Jan 18, 2019 at 6:48 AM aman goyal  wrote:

> Hi Team,
>
> Pls provide access to hbase user slack channel
> https://apache-hbase.slack.com
>
> Pls send invitation to aman...@gmail.com
>
> Thanks,
> Aman
>


[jira] [Created] (HBASE-21740) NPE happens while shutdown the RS

2019-01-18 Thread lujie (JIRA)
lujie created HBASE-21740:
-

 Summary: NPE happens while shutdown the RS
 Key: HBASE-21740
 URL: https://issues.apache.org/jira/browse/HBASE-21740
 Project: HBase
  Issue Type: Bug
Reporter: lujie


While shutting down a NM, we hit this NPE:
{code:java}
2019-01-18 16:52:05,500 INFO [Thread-4] regionserver.HRegionServer: STOPPED: Shutdown hook
2019-01-18 16:52:05,896 INFO [regionserver/hadoop15:16020] regionserver.MetricsRegionServerWrapperImpl: Computing regionserver metrics every 5000 milliseconds
2019-01-18 16:52:05,978 INFO [regionserver/hadoop15:16020.Chore.1] hbase.ScheduledChore: Chore: CompactedHFilesCleaner was stopped
2019-01-18 16:52:05,996 ERROR [regionserver/hadoop15:16020] regionserver.HRegionServer: Failed init
java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.startServices(HRegionServer.java:1978)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1572)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:975)
    at java.lang.Thread.run(Thread.java:745)
2019-01-18 16:52:06,011 ERROR [regionserver/hadoop15:16020] regionserver.HRegionServer: * ABORTING region server hadoop15,16020,1547801516426: Unhandled: Region server startup failed *
java.io.IOException: Region server startup failed
    at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:3392)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1591)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:975)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.startServices(HRegionServer.java:1978)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1572)
    ... 2 more
{code}


