On Wed, 24 Nov 2010 10:30:09 +0100 Erik Forsberg <forsb...@opera.com> wrote:
> Hi!
>
> I'm having some trouble with Map/Reduce jobs failing due to HDFS
> errors. I've been digging around the logs trying to figure out what's
> happening, and I see the following in the datanode logs:
>
> 2010-11-19 10:27:01,059 WARN
> org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in
> BlockReceiver.lastNodeRun: java.io.IOException: No temporary
> file /opera/log4/hadoop/dfs/data/tmp/blk_-8143694940938019938 for
> block blk_-8143694940938019938_6144372 at
> <snip>
>
> What would be the possible causes of such exceptions?

It seems the cause of this was my puppetd not being able to detect that
the datanode was already running, which made it try to start a second
datanode. That in turn seems to cause the tmp directories to be cleaned
before the second datanode finds out that the storage directories are
locked. Some kind of race condition, I would guess, since it only
happens on systems under high load.

More details here:
https://groups.google.com/a/cloudera.org/group/cdh-user/browse_frm/thread/d4572d2d1191be91#

\EF
-- 
Erik Forsberg <forsb...@opera.com>
Developer, Opera Software - http://www.opera.com/
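P.S. In case it helps anyone else hitting this: the check puppet was
missing is essentially "is there already a live datanode pid?". Below
is a rough sketch of that guard, nothing more -- the pid file path and
init script name are assumptions and differ between Hadoop packagings,
so adjust for your layout:

#!/usr/bin/env python
# Sketch of a "don't start a second datanode" guard.
# PID_FILE and INIT_SCRIPT are assumed paths -- adjust for your setup.
import errno
import os
import subprocess

PID_FILE = "/var/run/hadoop/hadoop-hadoop-datanode.pid"  # assumed location
INIT_SCRIPT = "/etc/init.d/hadoop-0.20-datanode"         # assumed name

def datanode_running():
    """Return True if PID_FILE names a process that is still alive."""
    try:
        with open(PID_FILE) as f:
            pid = int(f.read().strip())
    except (IOError, ValueError):
        return False          # no pid file, or garbage in it
    try:
        os.kill(pid, 0)       # signal 0: existence check, sends nothing
    except OSError as e:
        # EPERM means the process exists but belongs to another user
        return e.errno == errno.EPERM
    return True

if __name__ == "__main__":
    if datanode_running():
        print("datanode already running, not starting a second one")
    else:
        subprocess.check_call([INIT_SCRIPT, "start"])

In puppet itself the equivalent is making sure the service resource's
status check (hasstatus/status/pattern) really matches the running
datanode, but how to do that depends on your manifests.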