Hi Jimmy,
HBase is built from the latest sources of the 0.90 branch (0.90.7-SNAPSHOT);
I had the same problem with 0.90.4. Hadoop is 0.20.2 from Cloudera CDH3u1.
This failure happens during large M/R jobs. I have 10 servers, and usually
no more than one fails like this, sometimes none. One thing worth mentioning
is that the table it is trying to write to has over 5000 regions.

-eran

On Wed, Mar 28, 2012 at 16:17, Jimmy Xiang <[email protected]> wrote:
> Which version of HDFS and HBase are you using?
>
> When the problem happens, can you access HDFS, for example, from
> hadoop dfs?
>
> Thanks,
> Jimmy
>
> On Wed, Mar 28, 2012 at 4:28 AM, Eran Kutner <[email protected]> wrote:
> > Hi,
> >
> > We have region servers sporadically stopping under load, supposedly due
> > to errors writing to HDFS. Things like:
> >
> > 2012-03-28 00:37:11,210 WARN org.apache.hadoop.hdfs.DFSClient: Error
> > while syncing
> > java.io.IOException: All datanodes 10.1.104.10:50010 are bad. Aborting..
> >
> > It's happening with a different region server and data node every time,
> > so it's not a problem with one specific server, and there doesn't seem
> > to be anything really wrong with either of them. I've already increased
> > the file descriptor limit, the datanode xceiver count and the datanode
> > handler count. Any idea what could be causing these errors?
> >
> > A more complete log is here: http://pastebin.com/wC90xU2x
> >
> > Thanks.
> >
> > -eran
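
For the record, the kind of check Jimmy is suggesting can be done from the
hadoop dfs shell; a minimal smoke test from one of the affected region
servers would look something like this (the paths here are arbitrary
examples, not anything specific to our setup):

  # list the HBase root directory, then round-trip a small file
  hadoop dfs -ls /hbase
  hadoop dfs -put /etc/hosts /tmp/dfs-smoke-test
  hadoop dfs -cat /tmp/dfs-smoke-test
  hadoop dfs -rm /tmp/dfs-smoke-test

If those hang or fail while a region server is aborting, the problem is in
HDFS itself; if they work, the failure is specific to the write pipelines
the region server already has open.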
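P.S. To be specific about what I raised: the xceiver and handler settings
are the standard hdfs-site.xml knobs on the datanodes. The values below are
only illustrative of the kind of increase usually suggested for HBase, not
necessarily what we are running:

  <!-- hdfs-site.xml on each datanode -->
  <property>
    <!-- note the historical misspelling of "xceivers" in the property name -->
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
  <property>
    <name>dfs.datanode.handler.count</name>
    <value>10</value>
  </property>

The file descriptor limit is raised outside Hadoop (typically in
/etc/security/limits.conf for the user running the datanode) and can be
verified with "ulimit -n" as that user.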
