There was a 47-second gap in the region server log (where the calls to
subList() might have happened):


   1. 2014-05-29 19:09:02,257 INFO
   org.apache.hadoop.hbase.regionserver.compactions.CompactSelection: Deleting
   the expired store file by compaction:
   hdfs://cluster/hbase/IngestProcessing/bf754ed8764ca705a2acc0058e13b69c/data/22b41ad9388f488cb672cca3de0614e9
   whose maxTimeStamp is -1 while the max expired timestamp is 1401318542257
   2. 2014-05-29 19:09:49,324 INFO
   org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
   -6708632874853984071 lease expired on region
   WorldcatCrossref,4333961705,1334582131683.90c82e6c71dd99f21a18df41df28e5b0.


Good practice would be to clear the sublist that is no longer needed,
instead of assigning the result of subList() back to the same member
variable.
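
To illustrate, here is a minimal sketch in Java. The class and method names
(SubListTrim, dropFirst) and the file names are made up for the example and
are not taken from the HBase code base. Each subList() call returns a view
over the list it was called on, so reassigning the result to the same
variable stacks one view on top of another, and some operations on the
outermost view can then recurse through every layer. Clearing the unwanted
range through a temporary view modifies the backing list in place and leaves
no nesting behind.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SubListTrim {
    // Hypothetical helper (not from HBase): drops the first n elements in place.
    public static void dropFirst(List<String> files, int n) {
        // subList() returns a view; clearing the view removes that range
        // from the backing list, and the view itself is then discarded.
        files.subList(0, n).clear();
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>(Arrays.asList("f1", "f2", "f3", "f4"));

        // Pattern to avoid: every reassignment wraps the previous view in a
        // new view, so the nesting depth grows with each call.
        // files = files.subList(1, files.size());

        dropFirst(files, 1);
        System.out.println(files);
    }
}
```

The subList(from, to).clear() idiom is the one described in the
java.util.List javadoc for removing a range of elements.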

Cheers

On Fri, May 30, 2014 at 9:52 AM, Andrew Purtell <[email protected]> wrote:

> Maybe we can kill the zookeeper connection in the abort handler.
>
>
> On Fri, May 30, 2014 at 9:38 AM, Buckley,Ron <[email protected]> wrote:
>
> > Thanks Ted. I should have seen that.
> >
> > I finally had to 'kill -9' the RS, as I couldn't get it to shut down any
> > other way.
> >
> > It seems like the Region Server shouldn't have kept telling ZooKeeper
> > that all was well, even though it was trying to abort with a fatal error.
> >
> >
> > -----Original Message-----
> > From: Ted Yu [mailto:[email protected]]
> > Sent: Friday, May 30, 2014 12:11 PM
> > To: [email protected]
> > Subject: Re: Region Server hung during shutdown after StackOverflow error
> >
> > Looking at the StackOverflowError in pastebin, the cause was too many
> > calls to subList().
> > J-D fixed a similar bug in HBASE-10312.
> >
> > I searched for '\.subList(' in 0.94 codebase but haven't pinpointed which
> > class was the source of such calls.
> >
> > Will dig deeper when I have time.
> >
> > Cheers
> >
> >
> > On Fri, May 30, 2014 at 8:24 AM, Buckley,Ron <[email protected]> wrote:
> >
> > > An interesting case happened on our dev HBase cluster overnight. (We're
> > > running HBase 0.94.15 from CDH 4.6.0.)
> > >
> > > A region server took a StackOverflowError, apparently during a minor
> > > compaction.
> > >
> > > The region server is trying to shut down with a Fatal, but is now hung
> > > during shutdown.
> > >
> > > The particularly troublesome thing is that the RS is alive enough to
> > > keep zookeeper happy.
> > >
> > > So, the regions aren't moving off, but our apps can't get to them
> > > because the RS is mostly dead.
> > >
> > > I put some of the details on pastebin.
> > >
> > > JStack -> http://pastebin.com/hnLtaG54
> > > Outfile -> http://pastebin.com/5F1UcGjg
> > > Logfile -> http://pastebin.com/TBL1YSZM
> > >
> > >
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
