If they are both on the same server, then the issue is almost surely swapping
or CPU overcommitment. That said, any relevant entries from your logs may shed
more light on this incident.
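
If it helps when going through the logs, here is a minimal sketch of how one
might pull every "unable to report to master" warning out of a region server
log, along with its timestamp and stall duration. It assumes each warning sits
on a single line in the log file (the copy quoted below is wrapped by the mail
client) and that the line starts with the timestamp format shown there. The
function name and the min_ms parameter are made up for illustration; they are
not part of HBase.

```python
import re

# Matches the warning quoted in this thread, assuming it is on one log line:
#   <timestamp> WARN ...HRegionServer: unable to report to master for N milliseconds
REPORT_STALL = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) WARN .*"
    r"unable to report to master for (\d+) milliseconds"
)


def find_report_stalls(lines, min_ms=0):
    """Return (timestamp, stall_ms) for each matching warning of at least min_ms.

    `lines` is any iterable of log lines; `min_ms` is a hypothetical filter
    threshold, not an HBase setting.
    """
    stalls = []
    for line in lines:
        match = REPORT_STALL.search(line)
        if match and int(match.group(2)) >= min_ms:
            stalls.append((match.group(1), int(match.group(2))))
    return stalls
```

To use it, open the region server log (wherever it lives on your install) and
pass the file object in, e.g. find_report_stalls(open(path), min_ms=60000).
Stalls that line up with heavy swap or CPU activity on the box would support
the overcommitment theory.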
Best regards,
- Andy
________________________________
From: Sasha Dolgy <[email protected]>
To: [email protected]
Sent: Friday, May 15, 2009 11:32:56 AM
Subject: Re: HRegionServer: Failed openScanner
OK, I'll go take a look. They are both on the local server, so network
issues shouldn't be the cause. Cheers though, I'll go look at the JIRA link.
If I find anything else I'll post here.
thanks
-sd
On Fri, May 15, 2009 at 6:18 PM, Andrew Purtell <[email protected]> wrote:
> The region server hosting META could not communicate with the master for a
> very long time. Some kind of network issue? Any entries in the region server
> logs above this one
>
> > 2009-05-15 00:55:53,090 WARN
> > org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to
> > master for 189261 milliseconds - retrying
>
> which may be relevant? Anything about sleeping too long?
>
> Relatedly, I am aware of some bugs that prevented recovery if META in
> particular goes away, but they should be fixed for 0.20 as of
> https://issues.apache.org/jira/browse/HBASE-1362 .
>
> - Andy
>