https://issues.apache.org/jira/browse/HBASE-21745
张铎(Duo Zhang) <palomino...@gmail.com> wrote on Sat, Jan 19, 2019 at 9:51 AM:

> OK, the original issue is HBCK2 for AMv2, but here we need to do more, not
> only for AMv2.
>
> Let me open a new issue and post what Andrew said above there.
>
> 张铎(Duo Zhang) <palomino...@gmail.com> wrote on Sat, Jan 19, 2019 at 9:26 AM:
>
>> OK, let me find the original HBCK2 issue and see how we can make progress
>> on it.
>>
>> BTW, on scan performance, Zheng Hu has done some work to get about 40%
>> performance back in this issue for the 100% scan case on YCSB:
>>
>> https://issues.apache.org/jira/browse/HBASE-21657
>>
>> Andrew Purtell <apurt...@apache.org> wrote on Sat, Jan 19, 2019 at 8:14 AM:
>>
>>> Lars was testing tip of branch-2 with Phoenix and said scans were 50%
>>> slower than branch-1. I’ll try and get him to provide more details.
>>> Anyway, after hbck2 is complete, issues like that will come out in the
>>> testing we’d do as part of sanity checking a move of the pointer.
>>>
>>> On Fri, Jan 18, 2019 at 4:02 PM Zach York <zyork.contribut...@gmail.com>
>>> wrote:
>>>
>>> > I agree with the sentiment around HBCK2. I think these kinds of recovery
>>> > tools are essential before marking something stable.
>>> >
>>> > I also remember when we did testing around HBase 2.x/2.1 that we were
>>> > getting perf degradations and couldn't seem to get performance to be as
>>> > good as we were getting in the 1.x line.
>>> >
>>> > - Zach
>>> >
>>> > On Thu, Jan 17, 2019 at 11:06 PM Pankaj kr <pankaj...@huawei.com> wrote:
>>> >
>>> > > Yeah, HBCK2/OfflineMetaRepair tools are really required to migrate old
>>> > > version data to HBase-2. We have use cases where we are using these
>>> > > tools to rebuild the meta for further region assignment.
>>> > > A similar discussion is going on in HBASE-21665: after fixing the NPE
>>> > > and rebuilding the meta, the master doesn't assign the regions, as we
>>> > > skip the empty regions while loading meta during master startup.
>>> > >
>>> > > A big +1 from my side on this...
>>> > >
>>> > > Regards,
>>> > > Pankaj
>>> > >
>>> > > -----Original Message-----
>>> > > From: 张铎(Duo Zhang) [mailto:palomino...@gmail.com]
>>> > > Sent: 18 January 2019 11:55
>>> > > To: HBase Dev List <dev@hbase.apache.org>
>>> > > Subject: Re: [DISCUSS] Moving towards a branch-2 line that can get the
>>> > > 'stable' pointer.
>>> > >
>>> > > So the first priority is to make progress on HBCK2? If we all agree,
>>> > > let's start to work.
>>> > >
>>> > > Andrew Purtell <apurt...@apache.org> wrote on Fri, Jan 18, 2019 at 12:31 PM:
>>> > >
>>> > > > Sorry, let me add... Check all the boxes on that list and I'm +1 for
>>> > > > moving the stable pointer (modulo some time to pound on the candidate
>>> > > > to really put it through its paces, like two weeks of chaos...)
>>> > > >
>>> > > > On Thu, Jan 17, 2019 at 8:28 PM Andrew Purtell <apurt...@apache.org>
>>> > > > wrote:
>>> > > >
>>> > > > > I do not believe we should move the stable pointer to any 2.x until
>>> > > > > HBCK2 is feature complete. We can discuss what that milestone should
>>> > > > > look like. At a minimum, I think we need:
>>> > > > >
>>> > > > > - Rebuild meta from region metadata in the filesystem, aka offline
>>> > > > >   meta rebuild.
>>> > > > > - Fix assignment errors (undeployed regions, double assignments (yes,
>>> > > > >   should not be possible), etc)
>>> > > > > - Fix region holes, overlaps, and other errors in the region chain
>>> > > > > - Fix failed split and merge transactions that have failed to roll
>>> > > > >   back due to some bug (related to previous)
>>> > > > > - Enumerate store files to determine file level corruption and
>>> > > > >   sideline corrupt files
>>> > > > > - Fix hfile link problems (dangling / broken)
>>> > > > >
>>> > > > > This is a list of the real problems I have had to fix in production
>>> > > > > at least once (in the past 10 years...).
>>> > > > >
>>> > > > > On Thu, Jan 17, 2019 at 8:19 PM 张铎(Duo Zhang) <palomino...@gmail.com>
>>> > > > > wrote:
>>> > > > >
>>> > > > >> There are still lots of small new features which we want to
>>> > > > >> integrate into branch-2, so I'm -1 on making releases directly from
>>> > > > >> branch-2. Backporting at once before release is a pain I'd say, I've
>>> > > > >> tried this many times recently, as we have to follow up the
>>> > > > >> community version... Let's make a branch-2.2 when we want to release
>>> > > > >> 2.2.0, and maybe also retire the branch-2.0?
>>> > > > >>
>>> > > > >> For the stable pointer, I think 2.1.x may be a good candidate?
>>> > > > >> We know that we may still have some bugs in AMv2, but we also all
>>> > > > >> know that AMv1 in all of the branch-1.x lines has lots of bugs
>>> > > > >> too; that's why hbck is very important.
>>> > > > >>
>>> > > > >> And also +1 on making progress on HBCK2; we need to port the useful
>>> > > > >> features of HBCK1 to HBCK2. No software can guarantee that there
>>> > > > >> are no bugs, so FWIW we should have a way to fix broken clusters.
>>> > > > >>
>>> > > > >> Sean Busbey <bus...@apache.org> wrote on Fri, Jan 18, 2019 at 11:47 AM:
>>> > > > >>
>>> > > > >> > There are a few related topics I'd like to discuss and I figured
>>> > > > >> > this subject line is the most likely to get a bit of attention.
>>> > > > >> > :)
>>> > > > >> >
>>> > > > >> > First, I'd like us all to get on the same page wrt the current
>>> > > > >> > state of branch-2. Personally, I don't think it can be released
>>> > > > >> > as-is with a 2.y version because folks can't do a rolling upgrade
>>> > > > >> > from 2.0 or 2.1 to it due to the current implementation of
>>> > > > >> > HBASE-20881. As Duo has mentioned a couple of times, folks have
>>> > > > >> > to ensure there are no region transitions around during the
>>> > > > >> > upgrade.
>>> > > > >> > I think that will be prohibitive for folks looking to upgrade.
>>> > > > >> > What do other folks think?
>>> > > > >> >
>>> > > > >> > Second, I think our recent discussions around the need for
>>> > > > >> > shifting to more minor releases for HBase 1.y also apply to the
>>> > > > >> > 2.y branches. branch-2 hasn't had a release since 2.1.0 came out
>>> > > > >> > in July 2018. That's a scary long amount of time. I think it
>>> > > > >> > contributes to us ending up with changes like the above since
>>> > > > >> > it's easy to think about the branch as something that has a lot
>>> > > > >> > of time before the next release.
>>> > > > >> >
>>> > > > >> > Personally, I'd like to see us skip making minor-release specific
>>> > > > >> > branches for a bit unless a CVE fix or something comes up.
>>> > > > >> > Ideally, that would mean we work towards a 2.2.0 release directly
>>> > > > >> > from branch-2 and then 2.2.1, etc. When we have a feature that's
>>> > > > >> > ready to backport from the master branch for a release we then
>>> > > > >> > update branch-2's version to be 2.3.0.
>>> > > > >> >
>>> > > > >> > Or maybe we try to set a regular cadence for feature releases by
>>> > > > >> > having branch-2 release a new minor, two months of new
>>> > > > >> > maintenance releases, followed by a new minor. That would mean
>>> > > > >> > after the last of the maintenance releases we'd have a window of
>>> > > > >> > a few weeks where we can all decide which features in master are
>>> > > > >> > mature enough to backport for the new minor release.
>>> > > > >> >
>>> > > > >> > Lastly, what would it take for folks to feel confident moving the
>>> > > > >> > 'stable' pointer to an HBase 2.y? Is there a major gap still on
>>> > > > >> > assignment stability? Is it a more thorough look at performance?
>>> > > > >> > More time to ensure HBCK2 has good coverage of failure modes that
>>> > > > >> > need it?
>>> > > > >
>>> > > > > --
>>> > > > > Best regards,
>>> > > > > Andrew
>>> > > > >
>>> > > > > Words like orphans lost among the crosstalk, meaning torn from
>>> > > > > truth's decrepit hands
>>> > > > >    - A23, Crosstalk