Re: About how features are integrated to different HBase versions
I agree with Andrew that we can't both have maintenance releases and expect every feature in ongoing branch-1 releases to be in branches-2.y. Tracking when features become available across major versions fits well in the "upgrade paths" section of the ref guide. We've just gotten into the habit of filling it in only when a big release is coming up.

On Fri, Jan 18, 2019, 23:46 张铎(Duo Zhang) wrote:

> Then we must have an upgrade path? For example, 1.5.x can only be upgraded
> to 2.2.x if you want all the features to still be there?
>
> Maybe we should have a release timeline for the first release of each
> minor release line? Then when users want to upgrade, they can choose a minor
> release which was released later than their current one.
[jira] [Created] (HBASE-21746) RegionMover.stripServer will return the last server if the target server does not exist
Duo Zhang created HBASE-21746:

    Summary: RegionMover.stripServer will return the last server if the target server does not exist
    Key: HBASE-21746
    URL: https://issues.apache.org/jira/browse/HBASE-21746
    Project: HBase
    Issue Type: Bug
    Reporter: Duo Zhang

It should return null in this case.
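The bug pattern described in the issue can be sketched as follows. This is a simplified illustration, not the actual RegionMover code: the class name StripServerSketch, the use of plain "host:port" strings, and the method signature are all assumptions made for the example. The point is only that the loop variable must not leak out as the return value when nothing matched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the behavior the issue asks for; not HBase code.
class StripServerSketch {
    /**
     * Removes the first entry equal to target from servers and returns it.
     * Returns null when target is not in the list.
     */
    static String stripServer(List<String> servers, String target) {
        Iterator<String> it = servers.iterator();
        while (it.hasNext()) {
            String candidate = it.next();
            if (candidate.equals(target)) {
                it.remove();
                return candidate;
            }
        }
        // No match: return null instead of the last server the loop visited.
        return null;
    }

    public static void main(String[] args) {
        List<String> servers =
            new ArrayList<>(Arrays.asList("rs1:16020", "rs2:16020"));
        System.out.println(stripServer(servers, "rs3:16020"));
        System.out.println(servers);
    }
}
```

Returning from inside the loop on a match, with an explicit null after it, keeps the "not found" case from silently reusing whatever the loop saw last.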
Re: About how features are integrated to different HBase versions
Then we must have an upgrade path? For example, 1.5.x can only be upgraded to 2.2.x if you want all the features to still be there?

Maybe we should have a release timeline for the first release of each minor release line? Then when users want to upgrade, they can choose a minor release which was released later than their current one.

On Sat, Jan 19, 2019 at 13:15, Andrew Purtell wrote:

> Also I think branch-1 releases will be done on a monthly cadence
> independent of any branch-2 releases. This is because there are different
> RMs at work with different needs and schedules.
>
> I can certainly help out some with branch-2 releasing if you need it,
> FWIW.
>
> It may also help if we begin talking about 1.x and 2.x as separate
> "products". This can help avoid confusion about features in 1.5 not in 2.1
> but in 2.2. For all practical purposes they are separate products. Some of
> our community develop and run branch-1. Others develop and run branch-2.
> There is some overlap but the overlap is not total. The concerns will
> diverge a bit. I think this is healthy. Everyone is attending to what they
> need. Let's figure out how to make it work.
Re: About how features are integrated to different HBase versions
Also, I think branch-1 releases will be done on a monthly cadence, independent of any branch-2 releases. This is because there are different RMs at work with different needs and schedules.

I can certainly help out some with branch-2 releasing if you need it, FWIW.

It may also help if we begin talking about 1.x and 2.x as separate "products". This can help avoid confusion about features that are in 1.5, not in 2.1, but in 2.2. For all practical purposes they are separate products. Some of our community develop and run branch-1. Others develop and run branch-2. There is some overlap, but the overlap is not total. The concerns will diverge a bit. I think this is healthy. Everyone is attending to what they need. Let's figure out how to make it work.

> On Jan 18, 2019, at 9:04 PM, Andrew Purtell wrote:
>
> Also please be prepared to support forward evolution and maintenance of
> branch-1 for, potentially, years. Because it is used in production and will
> continue to be for a long time. Features may end up in 1.6.0 that only
> appear in 2.3 or 2.4. And in 1.7 that only appear in 2.5 or 2.6. This
> shouldn't be confusing. We just need to document it. JIRA helps some, release
> notes can help a lot more. Maybe in the future a feature to version matrix in
> the book.
Re: About how features are integrated to different HBase versions
Also, please be prepared to support forward evolution and maintenance of branch-1 for, potentially, years, because it is used in production and will continue to be for a long time. Features may end up in 1.6.0 that only appear in 2.3 or 2.4, and in 1.7 that only appear in 2.5 or 2.6. This shouldn't be confusing. We just need to document it. JIRA helps some; release notes can help a lot more. Maybe in the future, a feature-to-version matrix in the book.

> On Jan 18, 2019, at 8:59 PM, Andrew Purtell wrote:
>
> This can't work, because we can put things into a new minor that cannot go
> into a patch release. If you say instead 2.2.0 must have everything in 1.5.0,
> it can work. The alignment of features should happen at the minor releases.
> If we can also have alignment in patch releases too, that would be great, but
> can't be mandatory.
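The feature-to-version matrix suggested in this message could look something like the sketch below. It is only an illustration under stated assumptions: the class FeatureMatrixSketch, its methods, and the feature name "feature-X" are hypothetical, and the sketch assumes that once a feature lands in a minor release it stays in every later minor of the same major line. The recorded versions mirror the fixVersion example "3.0, 2.2, 1.5" mentioned elsewhere in the thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical feature-to-version matrix; names and data are illustrative only.
class FeatureMatrixSketch {
    // feature name -> (major version -> first minor on that major line with the feature)
    private final Map<String, Map<Integer, Integer>> firstMinorByFeature = new HashMap<>();

    /** Records that a feature first shipped in release major.firstMinor.0. */
    void record(String feature, int major, int firstMinor) {
        firstMinorByFeature
            .computeIfAbsent(feature, k -> new HashMap<>())
            .put(major, firstMinor);
    }

    /** True if the major.minor line is at or after the first minor carrying the feature. */
    boolean has(String feature, int major, int minor) {
        Map<Integer, Integer> lines = firstMinorByFeature.get(feature);
        if (lines == null) {
            return false;
        }
        Integer firstMinor = lines.get(major);
        return firstMinor != null && minor >= firstMinor;
    }

    public static void main(String[] args) {
        FeatureMatrixSketch matrix = new FeatureMatrixSketch();
        matrix.record("feature-X", 1, 5);
        matrix.record("feature-X", 2, 2);
        matrix.record("feature-X", 3, 0);
        System.out.println("1.5: " + matrix.has("feature-X", 1, 5));
        System.out.println("2.1: " + matrix.has("feature-X", 2, 1));
        System.out.println("2.2: " + matrix.has("feature-X", 2, 2));
    }
}
```

Keying the matrix by major line matches the idea of treating 1.x and 2.x as separate products: each line answers "which minor first had this?" independently.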
Re: About how features are integrated to different HBase versions
This can't work, because we can put things into a new minor that cannot go into a patch release. If you say instead that 2.2.0 must have everything in 1.5.0, it can work. The alignment of features should happen at the minor releases. If we can also have alignment in patch releases, that would be great, but it can't be mandatory.

> On Jan 18, 2019, at 7:12 PM, 张铎(Duo Zhang) wrote:
>
> Please see the red words carefully, I explicitly mentioned that, the newer
> version should be released LATER, if you want to get all the features.
>
> For example, you release 2.1.0 yesterday, 1.5.0 today, and then 2.1.1
> tomorrow, it is OK that 2.1.0 does not have all feature which 1.5.0 has,
> but 2.1.1 should have all features which 1.5.0 has.
Re: About how features are integrated to different HBase versions
And yes, we should document this. Do we also need a web page that lists all the releases on a timeline, so that users can easily tell which version is safe to upgrade to?

On Sat, Jan 19, 2019 at 11:12 AM, 张铎(Duo Zhang) wrote:

> Please see the red words carefully, I explicitly mentioned that, the newer
> version should be released LATER, if you want to get all the features.
>
> For example, you release 2.1.0 yesterday, 1.5.0 today, and then 2.1.1
> tomorrow, it is OK that 2.1.0 does not have all feature which 1.5.0 has,
> but 2.1.1 should have all features which 1.5.0 has.
Re: About how features are integrated to different HBase versions
Please see the red words carefully. I explicitly mentioned that the newer version should be released LATER if you want to get all the features.

For example, if you release 2.1.0 yesterday, 1.5.0 today, and then 2.1.1 tomorrow, it is OK that 2.1.0 does not have all the features which 1.5.0 has, but 2.1.1 should have all the features which 1.5.0 has.

On Sat, Jan 19, 2019 at 10:23 AM, Sergey Shelukhin wrote:

> Consider that we actually cannot guarantee this without a time machine,
> because some "newer" versions are already released.
>
> If we backport to 1.5 now, we cannot do anything about 2.0.0, 2.1.0,
> 2.1.2, etc. because they are already released... if the user upgrades from
> 1.5 to 2.0.1 for example, they will lose the feature no matter what.
> The only way to ensure is to
> - always update to latest dot version,
> - also for us to make sure we never release before releasing every "later"
> dot release (e.g. we cannot release 1.5 before 2.1.3, 2.0.N, etc. so
> there's the latest release for every line).
> - and also for us to make sure that every single dot line actually has a
> release - when e.g. 2.0.X line is abandoned that may not happen, so the
> latest version of 2.0.X will precede latest 1.Y because 1.Y may still be
> active (like as far as I recall 0.94 was getting dot releases even when
> 0.96 was abandoned) - so even if the user goes from 1.Y to the latest 2.0.X
> they will lose the feature.
>
> I think this is kind of expected... I agree that it needs to be
> documented. To an extent it already is in JIRA where fixVersion may be
> "3.0, 2.2, 1.5", but it makes sense to document explicitly.
RE: About how features are integrated to different HBase versions
Consider that we actually cannot guarantee this without a time machine, because some "newer" versions are already released.

If we backport to 1.5 now, we cannot do anything about 2.0.0, 2.1.0, 2.1.2, etc. because they are already released. If the user upgrades from 1.5 to 2.0.1, for example, they will lose the feature no matter what.

The only way to ensure this is:
- for users to always update to the latest dot version,
- for us to make sure we never release before releasing every "later" dot release (e.g. we cannot release 1.5 before 2.1.3, 2.0.N, etc., so that there is a latest release for every line),
- and for us to make sure that every single dot line actually has a release. When e.g. the 2.0.X line is abandoned, that may not happen, so the latest version of 2.0.X will precede the latest 1.Y because 1.Y may still be active (as far as I recall, 0.94 was getting dot releases even when 0.96 was abandoned). So even if the user goes from 1.Y to the latest 2.0.X, they will lose the feature.

I think this is kind of expected... I agree that it needs to be documented. To an extent it already is in JIRA, where fixVersion may be "3.0, 2.2, 1.5", but it makes sense to document it explicitly.

-----Original Message-----
From: 张铎(Duo Zhang)
Sent: Friday, January 18, 2019 5:50 PM
To: HBase Dev List
Subject: About how features are integrated to different HBase versions

I think we had a good discussion on HBASE-21034, where a feature was backported to branch-1, but then folks thought that we should not backport it to branch-2.1 and branch-2.0, as usually we should not add new features to minor release lines.

I think the reason why we do not want the feature in branch-2.1 and branch-2.0 is reasonable, but this will introduce another problem. Later, we will release a 1.5.0 which has the feature, but when a user later upgrades from 1.5.0 to 2.1.x or 2.0.x, they will find that the feature is gone, even though the 2.1.x or 2.0.x was released after 1.5.0, because we did not port the feature to these two branches. This will be very confusing to users, I'd say.

So I think we should guarantee that a higher version of HBase will always contain all the features of a lower version which was released earlier, unless explicitly mentioned (for example, DLR).

And this implies that, once we set up a new major release and make a new release on its first minor release line, the develop branch for the previous major release will be useless. As said above, we usually do not want to port any new features to the minor release lines of the new major release, so new features should not be ported to the previous major release either; otherwise we will break the guarantee above. And this also means that we could just use the 'develop' branch to make new releases.
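The proposed guarantee ("a higher version released later contains everything in a lower version released earlier") can be read as a filter over a release timeline. The sketch below shows that reading only; it is not an agreed policy, and other messages in the thread argue the rule can only hold at minor-release granularity. The class UpgradeTimelineSketch, the Release type, and the integer "order" field (a made-up position in the timeline standing in for a release date) are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed "higher version AND released later" rule.
class UpgradeTimelineSketch {
    static final class Release {
        final int major;
        final int minor;
        final int patch;
        final int order; // made-up position in the release timeline; larger means later

        Release(int major, int minor, int patch, int order) {
            this.major = major;
            this.minor = minor;
            this.patch = patch;
            this.order = order;
        }

        /** Compares version numbers only, ignoring release order. */
        boolean higherThan(Release other) {
            if (major != other.major) {
                return major > other.major;
            }
            if (minor != other.minor) {
                return minor > other.minor;
            }
            return patch > other.patch;
        }

        @Override
        public String toString() {
            return major + "." + minor + "." + patch;
        }
    }

    /** Releases that are both a higher version than current and were released after it. */
    static List<Release> candidates(Release current, List<Release> timeline) {
        List<Release> result = new ArrayList<>();
        for (Release r : timeline) {
            if (r.higherThan(current) && r.order > current.order) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Mirrors the thread's example: 2.1.0 "yesterday", 1.5.0 "today", 2.1.1 "tomorrow".
        Release r210 = new Release(2, 1, 0, 1);
        Release r150 = new Release(1, 5, 0, 2);
        Release r211 = new Release(2, 1, 1, 3);
        System.out.println(candidates(r150, Arrays.asList(r210, r150, r211)));
    }
}
```

With the timeline from the thread's example, 2.1.0 is excluded for 1.5.0 because it was released earlier, which is exactly the case the "time machine" objection is about.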
Re: [DISCUSS] Moving towards a branch-2 line that can get the 'stable' pointer.
https://issues.apache.org/jira/browse/HBASE-21745
[jira] [Created] (HBASE-21745) Make HBCK2 be able to fix issues other than region assignment
Duo Zhang created HBASE-21745:

Summary: Make HBCK2 be able to fix issues other than region assignment
Key: HBASE-21745
URL: https://issues.apache.org/jira/browse/HBASE-21745
Project: HBase
Issue Type: Umbrella
Components: hbase-operator-tools, hbck2
Reporter: Duo Zhang

This is what [~apurtell] posted on the mailing list. HBCK2 should support:
{quote}
- Rebuild meta from region metadata in the filesystem, aka offline meta rebuild.
- Fix assignment errors (undeployed regions, double assignments (yes, should not be possible), etc)
- Fix region holes, overlaps, and other errors in the region chain
- Fix failed split and merge transactions that have failed to roll back due to some bug (related to previous)
- Enumerate store files to determine file level corruption and sideline corrupt files
- Fix hfile link problems (dangling / broken)
{quote}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
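As an illustration of the "region holes, overlaps" item in the list above, here is a minimal sketch of how a region chain can be scanned for gaps and overlaps. It is not HBCK2 code. It assumes regions are given as (start_key, end_key) byte-string pairs, with an empty key meaning "start of table" or "end of table", and the function name `check_region_chain` is hypothetical.

```python
def check_region_chain(regions):
    """Report holes and overlaps in a list of (start_key, end_key) pairs.

    An empty start key means "from the beginning of the table" and an
    empty end key means "to the end of the table" (assumed convention).
    Returns ("hole", from_key, to_key) and ("overlap", start_key) tuples.
    """
    problems = []
    # Everything below covered_end is covered so far; None means the
    # chain already extends to the end of the key space.
    covered_end = b""
    for start, end in sorted(regions):
        if covered_end is None or start < covered_end:
            problems.append(("overlap", start))
        elif start > covered_end:
            problems.append(("hole", covered_end, start))
        if covered_end is not None:
            covered_end = None if end == b"" else max(covered_end, end)
    if covered_end is not None:
        # Nothing reaches the end of the table.
        problems.append(("hole", covered_end, b""))
    return problems


# Made-up example chain: a gap between "d" and "e", and a region
# starting at "f" that overlaps the one ending at "g".
example = [(b"", b"b"), (b"b", b"d"), (b"e", b"g"), (b"f", b"")]
print(check_region_chain(example))
```

Detecting the problem is the easy half; deciding how to repair it (which region to keep, how to fill a hole) is where a tool like HBCK2 needs operator input.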
Re: [DISCUSS] Moving towards a branch-2 line that can get the 'stable' pointer.
OK, the original issue is HBCK2 for AMv2, but here we need to do more, not only for AMv2.

Let me open a new issue and post what Andrew said above there.
About how features are integrated to different HBase versions
I think we had a good discussion on HBASE-21034, where a feature was backported to branch-1, but then folks thought that we should not backport it to branch-2.1 and branch-2.0, as we usually should not add new features to minor release lines. I think the reason for not wanting the feature in branch-2.1 and branch-2.0 is reasonable, but it introduces another problem.

Later we will release a 1.5.0 which has the feature, but when a user then upgrades from 1.5.0 to 2.1.x or 2.0.x, they will find that the feature is gone, even though the 2.1.x or 2.0.x was released after 1.5.0, because we did not port the feature to these two branches. This will be very confusing to users, I'd say.

So I think we should guarantee that a higher version of HBase will always contain all the features of a lower version that was released earlier, unless explicitly mentioned (for example, DLR). This implies that once we set up a new major release and make a release on its first minor release line, the development branch for the previous major release becomes useless: as said above, we usually do not want to port new features to the minor release lines of the new major release, so new features should not be ported to the previous major release either, otherwise we break the guarantee above. This also means that we could just use the 'develop' branch to make new releases.
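The guarantee proposed above can be phrased as a check over a release timeline. The sketch below is only an illustration: the `Release` record, the `guarantee_violations` function, the feature name, and all release dates are made up for the example and do not reflect the real HBase release history.

```python
from datetime import date
from typing import NamedTuple


class Release(NamedTuple):
    version: tuple        # e.g. (2, 1, 3)
    released: date        # release date (hypothetical in this example)
    features: frozenset   # feature identifiers shipped in the release


def guarantee_violations(releases):
    """List (older, newer, missing) triples that break the proposed rule.

    Rule as stated in the message: a release with a higher version that
    was released later must contain every feature of the lower one.
    """
    violations = []
    for old in releases:
        for new in releases:
            if new.version > old.version and new.released > old.released:
                missing = old.features - new.features
                if missing:
                    violations.append((old.version, new.version, sorted(missing)))
    return violations


# Hypothetical timeline: the feature lands in 1.5.0 and 2.2.0 but not 2.1.3.
timeline = [
    Release((1, 5, 0), date(2019, 3, 1), frozenset({"feature-x"})),
    Release((2, 1, 3), date(2019, 4, 1), frozenset()),
    Release((2, 2, 0), date(2019, 5, 1), frozenset({"feature-x"})),
]
print(guarantee_violations(timeline))
```

With these made-up dates the only violation is the pair 1.5.0 and 2.1.3, which matches the scenario described in the message.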
Re: [DISCUSS] Moving towards a branch-2 line that can get the 'stable' pointer.
OK, let me find the original HBCK2 issue and see how we can make progress on it.

BTW, on scan performance, Zheng Hu has done work in this issue that gets about 40% of the performance back for the 100% scan case on YCSB:

https://issues.apache.org/jira/browse/HBASE-21657
[jira] [Created] (HBASE-21744) timeout for server list refresh calls
Sergey Shelukhin created HBASE-21744:

Summary: timeout for server list refresh calls
Key: HBASE-21744
URL: https://issues.apache.org/jira/browse/HBASE-21744
Project: HBase
Issue Type: Bug
Reporter: Sergey Shelukhin

Not sure why yet, but we are seeing a case, when the cluster is in an overall bad state, where after an RS dies and deletes its znode, the notification appears to be lost, so the master doesn't detect the failure. ZK itself appears to be healthy and doesn't report anything special.
After some other change is made to the server list, the master rescans the list and picks up the stale notification. It might make sense to add a config that would trigger a refresh if one hasn't happened for a while (e.g. 1 minute).
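The suggested "refresh if it hasn't happened for a while" behaviour could look roughly like the sketch below. It is not HBase code: the class `ServerListRefreshWatchdog`, its method names, and the 60-second default are assumptions made for illustration, and the actual rescan of the server list in ZooKeeper is left to the caller.

```python
import time


class ServerListRefreshWatchdog:
    """Tracks when the server list was last refreshed (hypothetical helper)."""

    def __init__(self, max_age_seconds=60.0, clock=time.monotonic):
        self._max_age = max_age_seconds
        self._clock = clock              # injectable so the logic can be tested
        self._last_refresh = clock()

    def record_refresh(self):
        """Call whenever the server list has been rescanned for any reason."""
        self._last_refresh = self._clock()

    def refresh_overdue(self):
        """True if no rescan has happened within the configured window."""
        return self._clock() - self._last_refresh >= self._max_age


# Usage with a fake clock so the example does not need to sleep.
fake_now = [0.0]
watchdog = ServerListRefreshWatchdog(60.0, clock=lambda: fake_now[0])
fake_now[0] = 61.0
if watchdog.refresh_overdue():
    # A real master would rescan the server list here (not shown).
    watchdog.record_refresh()
```

A periodic chore could call `refresh_overdue()` and force a rescan, so a lost notification would be picked up within the configured window instead of waiting for an unrelated change.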
Re: [DISCUSS] Moving towards a branch-2 line that can get the 'stable' pointer.
Lars was testing tip of branch-2 with Phoenix and said scans were 50% slower than branch-1. I’ll try and get him to provide more details. Anyway after hbck2 is complete issues like that will come out in the testing we’d do as part of sanity checking a move of the pointer.
[jira] [Created] (HBASE-21743) stateless assignment
Sergey Shelukhin created HBASE-21743:

Summary: stateless assignment
Key: HBASE-21743
URL: https://issues.apache.org/jira/browse/HBASE-21743
Project: HBase
Issue Type: Bug
Reporter: Sergey Shelukhin

Running HBase for only a few weeks, we found dozen(s?) of bugs with assignment that all seem to have the same nature: split brain between 2 procedures; or between a procedure and master startup (meta replica bugs); or a procedure and master shutdown (HBASE-21742); or a procedure and something else (when SCP had an incorrect region list persisted, don't recall the bug#).

To me, it starts to look like a pattern where, as in AMv1, where concurrent interactions were unclear and hard to reason about, despite the cleaner individual pieces in AMv2 the problem of unclear concurrent interactions has been preserved, and in fact increased, because of the operation state persistence and isolation.

Procedures are great for multi-step operations that need rollback and stuff like that, e.g. creating a table or snapshot, or even region splitting. However, I'm not so sure about assignment.

We have the persisted information: region state in meta (incl. transition states like opening or closing), and the server list as the WAL directory list. Procedure state is not any more reliable than those (we can argue that a meta update can fail, but so can a procv2 WAL flush, so we have to handle cases of out-of-date information regardless). So we don't need any extra state to decide on assignment, whether for recovery or balancing. In fact, as mentioned in some bugs, deleting the procv2 WAL is often the best way to recover the cluster, because the master can already figure out what to do without additional state.

I think there should be an option for stateless assignment that does that.
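To make the "no extra state" argument concrete, here is a deliberately simplified sketch of deriving assignment work from the two persisted inputs named above: region state in meta and the list of live servers. The state names, the `regions_to_assign` function, and the data layout are assumptions for illustration only; real assignment also has to handle disabled tables, in-flight transitions, and balancing.

```python
def regions_to_assign(region_states, live_servers):
    """Pick regions that need (re)assignment using only persisted inputs.

    region_states: {region_name: (state, server_or_None)} as read from meta
    live_servers:  set of server names currently considered alive

    Simplifying assumption: a region needs assignment exactly when the
    server recorded for it is missing or not alive, whatever its state.
    """
    return sorted(
        region
        for region, (_state, server) in region_states.items()
        if server is None or server not in live_servers
    )


# Made-up snapshot of meta: rs2 has died, r4 was never assigned.
meta_snapshot = {
    "r1": ("OPEN", "rs1"),
    "r2": ("OPEN", "rs2"),
    "r3": ("OPENING", "rs1"),
    "r4": ("OFFLINE", None),
}
print(regions_to_assign(meta_snapshot, {"rs1"}))
```

The point of the sketch is that the decision is recomputed from current inputs on every pass, so there is no separately persisted procedure state that can drift out of sync with them.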
Re: [DISCUSS] Moving towards a branch-2 line that can get the 'stable' pointer.
I agree with the sentiment around HBCK2. I think these kinds of recovery tools are essential before marking something stable.

I also remember when we did testing around HBase 2.x/2.1 that we were getting perf degradations and couldn't seem to get performance to be as good as we were getting in the 1.x line.

- Zach

On Thu, Jan 17, 2019 at 11:06 PM Pankaj kr wrote:

> Yeah, HBCK2/ OfflineMetaRepair tools are really required to migrate old
> version data to HBase-2. We have use cases where we are using these tools
> to rebuild the meta for further region assignment.
> Similar discussion is going on HBASE-21665, after fixing the NPE and
> rebuilding the meta, master doesn't assign the regions as we skip the empty
> regions while loading meta during master startup.
>
> A big +1 from my side on this...
>
> Regards,
> Pankaj
>
> -----Original Message-----
> From: 张铎(Duo Zhang) [mailto:palomino...@gmail.com]
> Sent: 18 January 2019 11:55
> To: HBase Dev List
> Subject: Re: [DISCUSS] Moving towards a branch-2 line that can get the
> 'stable' pointer.
>
> So the first priority is to make progress on HBCK2? If we all agree, let's
> start to work.
>
> On Fri, Jan 18, 2019 at 12:31 PM, Andrew Purtell wrote:
>
> > Sorry, let me add... Check all the boxes on that list and I'm +1 for
> > moving the stable pointer (modulo some time to pound on the candidate
> > to really put it through its paces, like two weeks of chaos...)
> >
> > On Thu, Jan 17, 2019 at 8:28 PM Andrew Purtell wrote:
> >
> > > I do not believe we should move the stable pointer to any 2.x until
> > > HBCK2 is feature complete. We can discuss what that milestone should
> > > look like.
> > > At a minimum, I think we need:
> > >
> > >    - Rebuild meta from region metadata in the filesystem, aka offline
> > >    meta rebuild.
> > >    - Fix assignment errors (undeployed regions, double assignments (yes,
> > >    should not be possible), etc)
> > >    - Fix region holes, overlaps, and other errors in the region chain
> > >    - Fix failed split and merge transactions that have failed to roll
> > >    back due to some bug (related to previous)
> > >    - Enumerate store files to determine file level corruption and
> > >    sideline corrupt files
> > >    - Fix hfile link problems (dangling / broken)
> > >
> > > This is a list of the real problems I have had to fix in production
> > > at least once (in the past 10 years...).
> > >
> > > On Thu, Jan 17, 2019 at 8:19 PM 张铎(Duo Zhang) wrote:
> > >
> > >> There are still lots of small new features which we want to
> > >> integrate into branch-2, so I'm -1 on making releases directly
> > >> from branch-2. Backporting at once before release is a pain I'd
> > >> say, I've tried this many times recently, as we have to follow up
> > >> the community version... Let's make a branch-2.2 when we want to
> > >> release 2.2.0, and maybe also retire the branch-2.0?
> > >>
> > >> For the stable pointer, I think 2.1.x may be a good candidate?
> > >> We know that we may still have some bugs in AMv2, but we all
> > >> know that the AMv1 in all the branch-1.x also has lots of bugs,
> > >> that's why hbck is very important.
> > >>
> > >> And also +1 on making progress on HBCK2, we need to port the useful
> > >> features of HBCK1 to HBCK2. No software can guarantee that
> > >> there is no bug, so FWIW we should have a way to fix broken
> > >> clusters.
> > >>
> > >> On Fri, Jan 18, 2019 at 11:47 AM, Sean Busbey wrote:
> > >>
> > >> > There are a few related topics I'd like to discuss and I figured
> > >> > this subject line is the most likely to get a bit of attention.
> > >> > :)
> > >> >
> > >> > First, I'd like us all to get on the same page wrt the current
> > >> > state of branch-2. Personally, I don't think it can be released
> > >> > as-is with a 2.y version because folks can't rolling upgrade from
> > >> > 2.0 or 2.1 to it due to the current implementation of
> > >> > HBASE-20881. As Duo has mentioned a couple of times, folks have
> > >> > to ensure there are no region transitions around during the
> > >> > upgrade. I think that will be prohibitive for folks looking to
> > >> > upgrade. What do other folks think?
> > >> >
> > >> > Second, I think our recent discussions around the need for
> > >> > shifting to more minor releases for HBase 1.y also applies to the
> > >> > 2.y branches.
> > >> > branch-2 hasn't had a release since 2.1.0 came out in July 2018.
> > >> > That's a scary long amount of time. I think it contributes to us
> > >> > ending up with changes like the above since it's easy to think
> > >> > about the branch as something that has a lot of time before the
> > >> > next release.
> > >> >
> > >> > Personally, I'd like to see us skip making minor-release specific
> > >> > branches for a bit unless a CVE fix or something comes up.
> > >> > Ideally, that would mean we work towards a 2.2.0 release directly
> > >> > from branch-2 and then 2.2.1, etc. When we have a feature that's
> > >> > ready to
[jira] [Created] (HBASE-21742) master can create bad procedures during abort, making entire cluster unusable
Sergey Shelukhin created HBASE-21742:

Summary: master can create bad procedures during abort, making entire cluster unusable
Key: HBASE-21742
URL: https://issues.apache.org/jira/browse/HBASE-21742
Project: HBase
Issue Type: Bug
Reporter: Sergey Shelukhin

Some small HDFS hiccup causes master and meta RS to fail together.
Master goes first:
{noformat}
2019-01-18 08:09:46,790 INFO [KeepAlivePEWorker-311] zookeeper.MetaTableLocator: Setting hbase:meta (replicaId=0) location in ZooKeeper as meta-rs,17020,1547824792484
...
2019-01-18 10:01:16,904 ERROR [PEWorker-11] master.HMaster: * ABORTING master master,17000,1547604554447: FAILED [blah] *
...
2019-01-18 10:01:17,087 INFO [master/master:17000] assignment.AssignmentManager: Stopping assignment manager
{noformat}
A bunch of stuff keeps happening, including procedure retries, which is also suspect, but not the point here:
{noformat}
2019-01-18 10:01:21,598 INFO [PEWorker-3] procedure2.TimeoutExecutorThread: ADDED pid=104031, state=WAITING_TIMEOUT:REGION_STATE_TRANSITION_CLOSE, ...
{noformat}
Then the meta RS decides it's time to go:
{noformat}
2019-01-18 10:01:25,319 INFO [RegionServerTracker-0] master.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [meta-rs,17020,1547824792484]
...
2019-01-18 10:01:25,463 INFO [RegionServerTracker-0] assignment.AssignmentManager: Added meta-rs,17020,1547824792484 to dead servers which carryingMeta=false, submitted ServerCrashProcedure pid=104313
{noformat}
This SCP gets persisted, so when the next master starts, it waits forever for meta to be onlined, while there's no SCP with meta=true to online it.

The only way around this is to delete the procv2 WAL. The master has all the information here, as it often does in bugs I've found recently, but some split-brain procedures cause it to get stuck one way or another.

I will file a separate bug about that.
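The stuck condition described above can be stated as a small predicate. This is a sketch only, not master code: `meta_recovery_stuck` and its arguments are hypothetical, with persisted ServerCrashProcedures modelled as (server, carrying_meta) pairs.

```python
def meta_recovery_stuck(meta_server, live_servers, crash_procedures):
    """True if meta sits on a dead server and no SCP will bring it back.

    meta_server:      server recorded in ZooKeeper as hosting hbase:meta
    live_servers:     set of server names currently alive
    crash_procedures: iterable of (server, carrying_meta) for persisted SCPs
    """
    if meta_server in live_servers:
        return False
    return not any(
        server == meta_server and carrying_meta
        for server, carrying_meta in crash_procedures
    )


# The situation from the logs above: the SCP for the meta RS was
# persisted with carryingMeta=false.
meta_rs = "meta-rs,17020,1547824792484"
print(meta_recovery_stuck(meta_rs, set(), [(meta_rs, False)]))
```

A startup check along these lines could, for example, let a new master notice the inconsistency and schedule recovery for meta instead of waiting indefinitely; whether that is the right fix is for the issue to decide.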
[jira] [Created] (HBASE-21741) Add a note in "HFile Tool" section regarding 'seqid=0'
Sakthi created HBASE-21741:

Summary: Add a note in "HFile Tool" section regarding 'seqid=0'
Key: HBASE-21741
URL: https://issues.apache.org/jira/browse/HBASE-21741
Project: HBase
Issue Type: Improvement
Components: documentation
Reporter: Sakthi
Assignee: Sakthi

In a few parts of the HFile the seqid is irrelevant, such as:
* firstKey=Optional[row0/cf:column/1547846312435/Put/seqid=0]
* lastKey=Optional[row9/cf:column/1547846312490/Put/seqid=0]

Let's add a note to the 'HFile Tool' section of the doc saying that seqid=0 in such cases means the seqid is irrelevant, because the key is a 'KeyOnlyKeyValue'.
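To make the fields in those two key strings explicit, here is a minimal sketch that splits a printed key into row, family:qualifier, timestamp, type and seqid. It is not HBase code: the class name is hypothetical, the field order is read off the two examples above, and it assumes a numeric timestamp and an optional `Optional[...]` wrapper exactly as shown there.

```java
import java.util.Arrays;

// Hypothetical holder for the fields visible in a printed key such as
// row0/cf:column/1547846312435/Put/seqid=0
final class HFileKeyString {
    final String row;
    final String column;   // family:qualifier
    final long timestamp;
    final String type;
    final long seqId;

    private HFileKeyString(String row, String column, long timestamp, String type, long seqId) {
        this.row = row;
        this.column = column;
        this.timestamp = timestamp;
        this.type = type;
        this.seqId = seqId;
    }

    static HFileKeyString parse(String printed) {
        String key = printed;
        // Strip a leading "name=Optional[" and the trailing "]" if present.
        int open = key.indexOf("Optional[");
        if (open >= 0 && key.endsWith("]")) {
            key = key.substring(open + "Optional[".length(), key.length() - 1);
        }
        String[] parts = key.split("/");
        int n = parts.length;
        if (n < 5 || !parts[n - 1].startsWith("seqid=")) {
            throw new IllegalArgumentException("Unexpected key format: " + printed);
        }
        // Read fields from the right so a '/' inside the row does not break parsing.
        long seqId = Long.parseLong(parts[n - 1].substring("seqid=".length()));
        String type = parts[n - 2];
        long timestamp = Long.parseLong(parts[n - 3]);
        String column = parts[n - 4];
        String row = String.join("/", Arrays.copyOfRange(parts, 0, n - 4));
        return new HFileKeyString(row, column, timestamp, type, seqId);
    }

    public static void main(String[] args) {
        HFileKeyString k = parse("firstKey=Optional[row0/cf:column/1547846312435/Put/seqid=0]");
        System.out.println(k.row + " " + k.column + " " + k.timestamp + " " + k.type + " " + k.seqId);
    }
}
```

Parsed this way, both example keys yield a seqId of 0, which per the issue should be read as "not applicable" for these key-only entries rather than as a real sequence id.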
Re: Requesting access to hbase slack channel
Sent.

On Fri, Jan 18, 2019 at 7:48 AM Abhishek Gupta wrote:
> Hi,
>
> I would like an invitation too abhila...@gmail.com
>
> Thanks
>
> On Fri, Jan 18, 2019 at 11:38 AM Manjeet Singh wrote:
> > Done
> >
> > On Fri, 18 Jan 2019, 11:24 Buchi Reddy Busi Reddy wrote:
> > > Can you also invite mailtobu...@gmail.com please?
> > >
> > > On Thu, Jan 17, 2019 at 8:34 PM Manjeet Singh <manjeet.chand...@gmail.com> wrote:
> > > > Seems someone else already did it
> > > >
> > > > Manjeet
> > > >
> > > > On Wed, 16 Jan 2019, 17:33 Nihal Jain wrote:
> > > > > Hi
> > > > >
> > > > > Could you please invite me: nihaljain...@gmail.com?
> > > > >
> > > > > Regards,
> > > > > Nihal
> > > > >
> > > > > On Wed, 16 Jan, 2019, 2:54 PM Peter Somogyi wrote:
> > > > > > Sent invitation to madhurpan...@gmail.com.
> > > > > >
> > > > > > On Wed, Jan 16, 2019 at 7:27 AM Madhur Pant <madhurpan...@gmail.com> wrote:
> > > > > > > Hi Team,
> > > > > > >
> > > > > > > I was wondering if I could get access to the HBase user slack channel
> > > > > > >
> > > > > > > https://apache-hbase.slack.com
> > > > > > >
> > > > > > > It says here <https://issues.apache.org/jira/browse/HBASE-16413> that I
> > > > > > > should email you guys :)
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Madhur Pant
Re: Requesting access to slack channel
Invitation has been sent.

Best,
Balazs

On Fri, Jan 18, 2019 at 6:48 AM aman goyal wrote:
> Hi Team,
>
> Pls provide access to hbase user slack channel
> https://apache-hbase.slack.com
>
> Pls send invitation to aman...@gmail.com
>
> Thanks,
> Aman
[jira] [Created] (HBASE-21740) NPE happens while shutdown the RS
lujie created HBASE-21740:

Summary: NPE happens while shutdown the RS
Key: HBASE-21740
URL: https://issues.apache.org/jira/browse/HBASE-21740
Project: HBase
Issue Type: Bug
Reporter: lujie

While shutting down a NM, we hit this NPE:
{code:java}
2019-01-18 16:52:05,500 INFO [Thread-4] regionserver.HRegionServer: STOPPED: Shutdown hook
2019-01-18 16:52:05,896 INFO [regionserver/hadoop15:16020] regionserver.MetricsRegionServerWrapperImpl: Computing regionserver metrics every 5000 milliseconds
2019-01-18 16:52:05,978 INFO [regionserver/hadoop15:16020.Chore.1] hbase.ScheduledChore: Chore: CompactedHFilesCleaner was stopped
2019-01-18 16:52:05,996 ERROR [regionserver/hadoop15:16020] regionserver.HRegionServer: Failed init
java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.startServices(HRegionServer.java:1978)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1572)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:975)
    at java.lang.Thread.run(Thread.java:745)
2019-01-18 16:52:06,011 ERROR [regionserver/hadoop15:16020] regionserver.HRegionServer: * ABORTING region server hadoop15,16020,1547801516426: Unhandled: Region server startup failed *
java.io.IOException: Region server startup failed
    at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:3392)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1591)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:975)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.startServices(HRegionServer.java:1978)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1572)
    ... 2 more
{code}