Any chance we can see what happened before that, too? Usually you'd see a lot more HDFS log spam before getting the message that all the datanodes are bad.

J-D

On Wed, Mar 28, 2012 at 4:28 AM, Eran Kutner <[email protected]> wrote:
> Hi,
>
> We have region server sporadically stopping under load due supposedly to
> errors writing to HDFS. Things like:
>
> 2012-03-28 00:37:11,210 WARN org.apache.hadoop.hdfs.DFSClient: Error while
> syncing
> java.io.IOException: All datanodes 10.1.104.10:50010 are bad. Aborting..
>
> It's happening with a different region server and data node every time, so
> it's not a problem with one specific server and there doesn't seem to be
> anything really wrong with either of them. I've already increased the file
> descriptor limit, datanode xceivers and data node handler count. Any idea
> what can be causing these errors?
>
>
> A more complete log is here: http://pastebin.com/wC90xU2x
>
> Thanks.
>
> -eran
