Re: HRegionServer: Failed openScanner

stack Fri, 15 May 2009 14:16:40 -0700

Andy's basic point is that you've probably overloaded your setup.  What are
you hoping to achieve with this one-machine setup?


Have you followed the 'Getting Started' section in the hbase documentation?  It
has some configuration you need.  Enable the configurations suggested on the
troubleshooting page too if you want to rule out lack of resources or incorrect
timeouts as the cause.  You should enable DEBUG too.  It will make your logs
richer in detail and will help with the diagnosis.
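
Even before DEBUG is on, the WARN lines quoted further down this thread can be summarised to see how long the region server stalled. What follows is only a minimal sketch in Python and is not part of HBase: the name find_stalls and the two regular expressions are assumptions written against the three log lines pasted in this thread (unwrapped to one line each, as they would appear in the log file), and other HBase versions may word these messages differently.

```python
import re

# Sample lines copied from the region server log quoted in this thread,
# unwrapped so each log entry sits on one line as it would in the file.
SAMPLE_LOG = """\
2009-05-15 00:55:51,568 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 189138ms, ten times longer than scheduled: 10000
2009-05-15 00:55:52,334 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 188348ms, ten times longer than scheduled: 3000
2009-05-15 00:55:53,090 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master for 189261 milliseconds - retrying
"""

# Patterns are assumptions derived from the message wording above.
SLEEPER = re.compile(
    r"^(\S+ \S+) WARN \S+Sleeper: We slept (\d+)ms, "
    r"ten times longer than scheduled: (\d+)"
)
REPORT = re.compile(
    r"^(\S+ \S+) WARN \S+HRegionServer: "
    r"unable to report to master for (\d+) milliseconds"
)


def find_stalls(log_text):
    """Return (timestamp, kind, stalled_ms) for each stall warning found."""
    stalls = []
    for line in log_text.splitlines():
        m = SLEEPER.match(line)
        if m:
            stalls.append((m.group(1), "sleeper", int(m.group(2))))
            continue
        m = REPORT.match(line)
        if m:
            stalls.append((m.group(1), "master-report", int(m.group(2))))
    return stalls


if __name__ == "__main__":
    for ts, kind, ms in find_stalls(SAMPLE_LOG):
        print(f"{ts}  {kind:<13}  stalled {ms / 1000:.1f}s")
```

To run it against a real log, pass the file contents to find_stalls instead of SAMPLE_LOG. Several stalls of about the same length at about the same time, as here, suggest the whole JVM was paused rather than one thread being slow.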

Thanks,
St.Ack


On Fri, May 15, 2009 at 11:58 AM, Sasha Dolgy <[email protected]> wrote:

> Hi Andy,
> I've sent you an email with a link to a tar file with the logs.  To be
> honest, for the most part this is default out of the box.  To this point
> this is the first problem with over 150k writes to HBase.  After I stopped /
> started HBase again everything is going fine...
>
> I haven't looked at the troubleshooting page yet, because well, I'm not
> quite sure what to troubleshoot.  Finding it hard to identify an actual
> problem....other than seeing stack traces and it not working.
>
> -sd
>
> On Fri, May 15, 2009 at 7:54 PM, Andrew Purtell <[email protected]> wrote:
>
> > This is almost surely resource overcommitment as the cause: CPU and/or
> > memory, leading to thread starvation. We observe the JVM scheduler is
> > unfair at high load, and swap, especially if JVM heap is paged out when a
> > GC cycle happens, can also be similarly deadly. Given other details in
> > this thread, I suspect swap. What JVM options are you running with? Have
> > you looked at the GC-related tips on the troubleshooting page up on the
> > wiki?
> > http://wiki.apache.org/hadoop/Hbase/Troubleshooting
> >
> > Best regards,
> >
> >   - Andy
> >
> >
> >
> >
> > ________________________________
> > From: Sasha Dolgy <[email protected]>
> > To: [email protected]
> > Sent: Friday, May 15, 2009 11:38:01 AM
> > Subject: Re: HRegionServer: Failed openScanner
> >
> > In the region server logs I see messages from the 14th:
> > 2009-05-14 22:47:28,840 INFO org.apache.hadoop.hbase.regionserver.HRegion: starting  compaction on region syslog,,1242260881586
> > 2009-05-14 22:47:43,976 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region syslog,,1242260881586 in 15sec
> >
> > then no log entries until the 15th when the error happens:
> >
> > 2009-05-15 00:55:51,568 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 189138ms, ten times longer than scheduled: 10000
> > 2009-05-15 00:55:52,334 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 188348ms, ten times longer than scheduled: 3000
> > 2009-05-15 00:55:53,090 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master for 189261 milliseconds - retrying
> > 2009-05-15 00:55:56,789 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP: safeMode=false
> > 2009-05-15 00:55:57,249 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
> > org.apache.hadoop.hbase.NotServingRegionException: .META.,,1
> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2076)
> >        at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1710)
> >        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >        at java.lang.reflect.Method.invoke(Method.java:597)
> >        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
> >        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
> >
> >
> >
> > On Fri, May 15, 2009 at 7:32 PM, Sasha Dolgy <[email protected]> wrote:
> >
> > > Ok, I'll go take a look.  They are both on the local server so network
> > > issues shouldn't be a cause.  Cheers though, I'll go look at the JIRA
> > > link.  If I find anything else I'll post here.
> > > thanks
> > > -sd
> > >
> > > On Fri, May 15, 2009 at 6:18 PM, Andrew Purtell <[email protected]> wrote:
> > >
> > >> The region server hosting META could not communicate with the master
> > >> for a very long time. Some kind of network issue? Any entries in the
> > >> region server logs above this one
> > >>
> > >> > 2009-05-15 00:55:53,090 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master for 189261 milliseconds - retrying
> > >>
> > >> which may be relevant? Anything about sleeping too long?
> > >>
> > >> Related, there were some bugs that I am aware of that prevented
> > >> recovery if META in particular goes away, but they should be fixed for
> > >> 0.20 as of https://issues.apache.org/jira/browse/HBASE-1362 .
> > >>
> > >>   - Andy
> > >>
> > >
> >
> >
> >
> >
> >
>
>
>
> --
> Sasha Dolgy
> [email protected]
>
