Re: PENDING_CLOSE for too long

Geoff Hendrey Sat, 29 Oct 2011 19:09:15 -0700

Stuart -

Have you disabled splitting? I believe you can work around the PENDING_CLOSE 
issue by presplitting your table and disabling splitting. It worked for us.
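
A minimal sketch of the presplitting half, assuming row keys are spread uniformly over a 4-byte big-endian prefix (for example, hashed keys). The class and method names are hypothetical and the key layout is an assumption, not something stated in this thread. It only computes the region boundaries:

```java
import java.nio.ByteBuffer;

// Hypothetical helper: computes evenly spaced split keys for a presplit table.
// Assumes row keys start with a uniformly distributed 4-byte big-endian prefix.
class PresplitKeys {

    // Returns numRegions - 1 boundaries dividing the unsigned 32-bit key space.
    static byte[][] splitKeys(int numRegions) {
        if (numRegions < 2) {
            throw new IllegalArgumentException("need at least 2 regions");
        }
        byte[][] keys = new byte[numRegions - 1][];
        long span = 1L << 32; // size of the unsigned 32-bit key space
        for (int i = 1; i < numRegions; i++) {
            long boundary = span * i / numRegions;
            // The int cast keeps the low 32 bits; putInt writes them big-endian.
            keys[i - 1] = ByteBuffer.allocate(4).putInt((int) boundary).array();
        }
        return keys;
    }

    public static void main(String[] args) {
        // Print the boundaries for a 4-region table as hex strings.
        for (byte[] key : splitKeys(4)) {
            StringBuilder hex = new StringBuilder();
            for (byte b : key) {
                hex.append(String.format("%02x", b));
            }
            System.out.println(hex);
        }
    }
}
```

The resulting array is the kind of argument that HBaseAdmin.createTable(HTableDescriptor, byte[][]) takes. One common way to keep regions from splitting afterwards is to set a very large maximum file size on the table (HTableDescriptor.setMaxFileSize, or hbase.hregion.max.filesize in hbase-site.xml), though that may not be exactly how it was done here.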


Sent from my iPhone

On Oct 29, 2011, at 4:19 PM, "Ted Yu" <[email protected]> wrote:

> In 0.92 (to be released in 2 weeks), you can expect improvement in this
> regard.
> See HBASE-3368.
> 
> Geoff:
> Can you publish your tool on HBASE JIRA ?
> 
> Thanks
> 
> On Sat, Oct 29, 2011 at 2:35 PM, Geoff Hendrey <[email protected]> wrote:
> 
> > Sure. I posted the code many weeks back for a tool that will repair holes
> > in .META.
> >
> > If you search the list, you should find it. I'll send you the
> > latest code for that. Maybe I made some fixes after I posted the code.
> > Please ping me if I forget. I've used it to repair huge tables (and fixed
> > subtle bugs in the process) so I'm confident it works.
> >
> > No matter what anyone tells me, I know hbase is horribly broken for the
> > use case of doing bulk writes from an mr job. It shits the bed every time
> > you pass a certain scale. For this reason we've completely rewritten our
> > code so that we use bulkloading. It's way more efficient and always works.
> >
> > Please ping me until I send you the code. Otherwise I will forget.
> >
> > Sent from my iPhone
> >
> > On Oct 29, 2011, at 1:39 PM, "Stuart Smith" <[email protected]> wrote:
> >
> > > Hello Geoff,
> > >
> > >   I usually don't show up here, since I use CDH, and good form means I
> > > should stay on CDH-users,
> > > But!
> > >   I've been seeing the same issues for months:
> > >
> > >  - PENDING_CLOSE too long, master tries to reassign - I see a
> > > continuous stream of these.
> > >  - WrongRegionExceptions due to overlapping regions & holes in the
> > > regions.
> > >
> > > I just spent all day yesterday cribbing off of St.Ack's check_meta.rb
> > > script to write a java program to fix up overlaps & holes in an offline
> > > fashion (hbase down, directly on hdfs), and will start testing next week
> > > (cross my fingers!).
> > >
> > > It seems like the pending close messages can be ignored?
> > > And once I test my tool, and confirm I know a little bit about what I'm
> > > doing, maybe we could share notes?
> > >
> > > Take care,
> > >   -stu
> > >
> > >
> > >
> > > ________________________________
> > > From: Geoff Hendrey <[email protected]>
> > > To: [email protected]
> > > Cc: [email protected]
> > > Sent: Saturday, September 3, 2011 12:11 AM
> > > Subject: RE: PENDING_CLOSE for too long
> > >
> > > "Are you having trouble getting to any of your data out in tables?"
> > >
> > > depends what you mean. We see corruptions from time to time that prevent
> > > us from getting data, one way or another. Today's corruption was regions
> > > with duplicate start and end rows. We fixed that by deleting the
> > > offending regions from HDFS, and running add_table.rb to restore the
> > > meta. The other common corruption is the holes in ".META." that we
> > > repair with a little tool we wrote. We'd love to learn why we see these
> > > corruptions with such regularity (seemingly much higher than others on
> > > the list).
> > >
> > > We will implement the timeout you suggested, and see how it goes.
> > >
> > > Thanks,
> > > Geoff
> > >
> > > -----Original Message-----
> > > From: [email protected] [mailto:[email protected]] On Behalf Of
> > > Stack
> > > Sent: Friday, September 02, 2011 10:51 PM
> > > To: [email protected]
> > > Cc: [email protected]
> > > Subject: Re: PENDING_CLOSE for too long
> > >
> > > Are you having trouble getting to any of your data out in tables?
> > >
> > > To get rid of them, try restarting your master.
> > >
> > > Before you restart your master, do "HBASE-4126  Make timeoutmonitor
> > > timeout after 30 minutes instead of 3"; i.e. set
> > > "hbase.master.assignment.timeoutmonitor.timeout" to 1800000 in
> > > hbase-site.xml.
> > >
> > > St.Ack
> > >
> > > On Fri, Sep 2, 2011 at 1:40 PM, Geoff Hendrey <[email protected]>
> > > wrote:
> > > > In the master logs, I am seeing "regions in transition timed out" and
> > > > "region has been PENDING_CLOSE for too long, running forced unassign".
> > > > Both of these log messages occur at INFO level, so I assume they are
> > > > innocuous. Should I be concerned?
> > > >
> > > >
> > > >
> > > > -geoff
> > > >
> > > >
> >
