bq. Our team is preparing to upgrade our production cluster from 0.98 to 1.1.3.

@Jacob I'd suggest picking up at least 1.1.4 because of HBASE-14460 (an important fix for a performance regression). Just a note unrelated to the topic here (sorry for the interruption, guys...)

Best Regards,
Yu

On 5 November 2016 at 14:08, Dima Spivak <[email protected]> wrote:

> FYI, once INFRA-12849 is done, I plan on having ITBLL (w/ 1 billion linked
> list nodes) running on a nightly basis in GCE via clusterdock. I think we'd
> trust our testing more if we didn't always have to go through the exercise
> of validating whether a failure we see is new or not and having a history
> of such for different branches should help with that.
>
> On Friday, November 4, 2016, Andrew Purtell <[email protected]> wrote:
>
> > That wasn't my question. At all.
> >
> > > On Nov 4, 2016, at 7:27 PM, Ted Yu <[email protected]> wrote:
> > >
> > > I looked at AssignmentManager#onRegionMerge() between branch-1.1
> > > and branch-1.2
> > >
> > > AFAICT, there is no obvious divergence.
> > >
> > > Later on, I plan to compare the diff between output for 'git log
> > > hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java'
> > > and see which JIRAs were unique to branch-1.2
> > >
> > > Cheers
> > >
> > >> On Fri, Nov 4, 2016 at 6:37 PM, Andrew Purtell <[email protected]> wrote:
> > >>
> > >> I'm not deeply familiar with the AssignmentManager. I see when we process
> > >> split rollbacks in onRegionSplit() we only call regionOffline() on
> > >> daughters if they are known to exist. However when processing merge
> > >> rollbacks in the else case of onRegionMerge() we unconditionally call
> > >> regionOffline() on the parent-being-merged. Shouldn't that likewise be
> > >> conditional on regionStates holding a state for the parent-being-merged?
> > >> Pardon if I've missed something.
> > >>
> > >>
> > >> On Fri, Nov 4, 2016 at 5:05 PM, Andrew Purtell <[email protected]> wrote:
> > >>
> > >>> Thanks. Yes I have been eyeing HBASE-16093. There might be another corner
> > >>> case there.
> > >>>
> > >>>
> > >>> On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling <[email protected]> wrote:
> > >>>
> > >>>>>
> > >>>>> The behavior: Looks like failed split/compaction rollback: row(s) in
> > >>>>> META without HRegionInfo, regions deployed without valid meta entries (at
> > >>>>> first), regions on HDFS without valid meta entries (later, after RS
> > >>>>> carrying them are killed by chaos), holes in the region chain leading to
> > >>>>> timeouts and job failure.
> > >>>>>
> > >>>>>
> > >>>> The empty regioninfo in meta sounds like HBASE-16093, though that fix is in
> > >>>> 1.2. Interested to see if there are other problems around splits though.
> > >>>> Do you have a JIRA yet for tracking?
> > >>>>
> > >>>>
> > >>>>>
> > >>>>> You'll know you have found it when on the ITBLL console its meta scanner
> > >>>>> starts complaining about rows in meta without serialized HRegionInfo.
> > >>>>>
> > >>>>>
> > >>>> Will keep an eye out for this in our ITBLL runs here.
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> Best regards,
> > >>>
> > >>> - Andy
> > >>>
> > >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >>> (via Tom White)
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> Best regards,
> > >>
> > >> - Andy
> > >>
> > >> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > >> (via Tom White)
> > >>
> > >
> >
>
> --
> -Dima
>
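
For readers following the quoted question about merge rollbacks, here is a toy Python model of the asymmetry being described. It is only a sketch under stated assumptions: `RegionStates`, `rollback_split`, `rollback_merge_unconditional`, and `rollback_merge_guarded` are hypothetical names invented for illustration and are not HBase's AssignmentManager code. The guarded variant shows what the suggested condition would look like, not what any HBase branch actually does.

```python
# Toy model only: every class and function here is hypothetical and is NOT
# taken from HBase. It mirrors the behaviour described in the quoted thread.


class RegionStates:
    """Minimal stand-in for a table of known region states."""

    def __init__(self):
        self._states = {}
        self.offlined = []  # records each region_offline() call, in order

    def update(self, region, state):
        self._states[region] = state

    def is_known(self, region):
        return region in self._states

    def region_offline(self, region):
        # Record the call so the difference between rollback paths is visible.
        self.offlined.append(region)
        self._states[region] = "OFFLINE"


def rollback_split(states, daughters):
    """Split rollback as described: offline only daughters known to exist."""
    for daughter in daughters:
        if states.is_known(daughter):
            states.region_offline(daughter)


def rollback_merge_unconditional(states, merge_parent):
    """Merge rollback as described: offline the parent-being-merged always."""
    states.region_offline(merge_parent)


def rollback_merge_guarded(states, merge_parent):
    """The variant the question proposes: offline only if a state is held."""
    if states.is_known(merge_parent):
        states.region_offline(merge_parent)
```

In this model, calling `rollback_merge_unconditional` on an empty `RegionStates` creates an `OFFLINE` entry for a region that was never tracked, while `rollback_merge_guarded` leaves the table untouched, which matches the behaviour of `rollback_split`.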
