FYI, once INFRA-12849 is done, I plan on having ITBLL (w/ 1 billion linked
list nodes) running on a nightly basis in GCE via clusterdock. I think we'd
trust our testing more if we didn't always have to go through the exercise
of validating whether a failure we see is new or not, and having a history
of such runs for different branches should help with that.

On Friday, November 4, 2016, Andrew Purtell <[email protected]>
wrote:

> That wasn't my question. At all.
>
> > On Nov 4, 2016, at 7:27 PM, Ted Yu <[email protected]> wrote:
> >
> > I looked at AssignmentManager#onRegionMerge() between branch-1.1
> > and branch-1.2
> >
> > AFAICT, there is no obvious divergence.
> >
> > Later on, I plan to compare the output of
> > 'git log hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java'
> > between the two branches and see which JIRAs were unique to branch-1.2
> >
> > Cheers
> >
> >> On Fri, Nov 4, 2016 at 6:37 PM, Andrew Purtell <[email protected]> wrote:
> >>
> >> I'm not deeply familiar with the AssignmentManager. I see when we process
> >> split rollbacks in onRegionSplit() we only call regionOffline() on
> >> daughters if they are known to exist. However when processing merge
> >> rollbacks in the else case of onRegionMerge() we unconditionally call
> >> regionOffline() on the parent-being-merged. Shouldn't that likewise be
> >> conditional on regionStates holding a state for the parent-being-merged?
> >> Pardon if I've missed something.
> >>
> >>
> >> On Fri, Nov 4, 2016 at 5:05 PM, Andrew Purtell <[email protected]> wrote:
> >>
> >>> Thanks. Yes I have been eyeing HBASE-16093. There might be another
> >>> corner case there.
> >>>
> >>>
> >>> On Fri, Nov 4, 2016 at 4:41 PM, Gary Helmling <[email protected]> wrote:
> >>>
> >>>>>
> >>>>> The behavior: Looks like failed split/compaction rollback: row(s) in
> >>>>> META without HRegionInfo, regions deployed without valid meta entries
> >>>>> (at first), regions on HDFS without valid meta entries (later, after RS
> >>>>> carrying them are killed by chaos), holes in the region chain leading
> >>>>> to timeouts and job failure.
> >>>>>
> >>>>>
> >>>> The empty regioninfo in meta sounds like HBASE-16093, though that fix
> >>>> is in 1.2.  Interested to see if there are other problems around splits
> >>>> though.
> >>>> Do you have a JIRA yet for tracking?
> >>>>
> >>>>
> >>>>>
> >>>>> You'll know you have found it when on the ITBLL console its meta
> >>>>> scanner starts complaining about rows in meta without serialized
> >>>>> HRegionInfo.
> >>>>>
> >>>>>
> >>>> Will keep an eye out for this in our ITBLL runs here.
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>>
> >>>   - Andy
> >>>
> >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >>> (via Tom White)
> >>>
> >>
> >>
> >>
> >> --
> >> Best regards,
> >>
> >>   - Andy
> >>
> >> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >> (via Tom White)
> >>
>


-- 
-Dima
