Have you set hadoop.tmp.dir away from /tmp as well? If hadoop.tmp.dir is set somewhere under /scratch rather than /tmp, then I'm not sure why Hadoop would be writing to /tmp.
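One quick way to check what the task trackers actually resolve is to load the configuration the same way the daemons do and print the relevant keys. Here's a rough sketch (the class name is just an example, and it assumes the 0.19.1 hadoop-core jar is on your compile classpath); if you run it on a task tracker node via something like `bin/hadoop PrintLocalDirs`, it should pick up the same conf directory the daemon sees:

    import org.apache.hadoop.conf.Configuration;

    // Diagnostic sketch: print the values Hadoop resolves from
    // hadoop-default.xml / hadoop-site.xml on the classpath, to see
    // whether the /scratch overrides are loaded or it is falling back
    // to the /tmp defaults.
    public class PrintLocalDirs {
        public static void main(String[] args) {
            // new Configuration() loads hadoop-default.xml, then hadoop-site.xml
            Configuration conf = new Configuration();
            System.out.println("hadoop.tmp.dir   = " + conf.get("hadoop.tmp.dir"));
            System.out.println("mapred.local.dir = " + conf.get("mapred.local.dir"));
            System.out.println("mapred.child.tmp = " + conf.get("mapred.child.tmp"));
        }
    }

If mapred.local.dir still prints as something under /tmp on a task tracker node, then that node isn't picking up the hadoop-site.xml you edited (wrong conf dir on that machine, or a stale copy).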
Hope this helps!

Alex

On Wed, Apr 15, 2009 at 2:37 PM, Jim Twensky <[email protected]> wrote:

> Alex,
>
> Yes, I bounced the Hadoop daemons after I changed the configuration files.
>
> I also tried setting $HADOOP_CONF_DIR to the directory where my
> hadoop-site.xml file resides, but it didn't work. However, I'm sure that
> HADOOP_CONF_DIR is not the issue, because other properties that I changed
> in hadoop-site.xml seem to be properly set. Also, here is a section from
> my hadoop-site.xml file:
>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/scratch/local/jim/hadoop-${user.name}</value>
> </property>
> <property>
>   <name>mapred.local.dir</name>
>   <value>/scratch/local/jim/hadoop-${user.name}/mapred/local</value>
> </property>
>
> I also created /scratch/local/jim/hadoop-jim/mapred/local on each task
> tracker, since I know directories that do not exist are ignored.
>
> When I manually ssh to the task trackers, I can see that the directory
> /scratch/local/jim/hadoop-jim/dfs is automatically created, so it seems
> like hadoop.tmp.dir is set properly. However, Hadoop still creates
> /tmp/hadoop-jim/mapred/local and uses that directory for the local
> storage.
>
> I'm starting to suspect that mapred.local.dir is overwritten to a default
> value of /tmp/hadoop-${user.name} somewhere inside the binaries.
>
> -jim
>
> On Tue, Apr 14, 2009 at 4:07 PM, Alex Loddengaard <[email protected]> wrote:
>
> > First, did you bounce the Hadoop daemons after you changed the
> > configuration files? I think you'll have to do this.
> >
> > Second, I believe 0.19.1 has hadoop-default.xml baked into the jar. Try
> > setting $HADOOP_CONF_DIR to the directory where hadoop-site.xml lives.
> > For whatever reason, your hadoop-site.xml (and the hadoop-default.xml
> > you tried to change) are probably not being loaded. $HADOOP_CONF_DIR
> > should fix this.
> >
> > Good luck!
> >
> > Alex
> >
> > On Mon, Apr 13, 2009 at 11:25 AM, Jim Twensky <[email protected]> wrote:
> >
> > > Thank you Alex, you are right. There are quotas on the systems that
> > > I'm working on. However, I tried to change mapred.local.dir as
> > > follows:
> > >
> > > --inside hadoop-site.xml:
> > >
> > > <property>
> > >   <name>mapred.child.tmp</name>
> > >   <value>/scratch/local/jim</value>
> > > </property>
> > > <property>
> > >   <name>hadoop.tmp.dir</name>
> > >   <value>/scratch/local/jim</value>
> > > </property>
> > > <property>
> > >   <name>mapred.local.dir</name>
> > >   <value>/scratch/local/jim</value>
> > > </property>
> > >
> > > and observed that the intermediate map outputs are still being written
> > > under /tmp/hadoop-jim/mapred/local.
> > >
> > > I'm confused at this point, since I also tried setting these values
> > > directly inside hadoop-default.xml and that didn't work either. Is
> > > there any other property that I'm supposed to change? I tried
> > > searching for "/tmp" in the hadoop-default.xml file but couldn't find
> > > anything else.
> > >
> > > Thanks,
> > > Jim
> > >
> > > On Tue, Apr 7, 2009 at 9:35 PM, Alex Loddengaard <[email protected]> wrote:
> > >
> > > > The getLocalPathForWrite function that throws this Exception assumes
> > > > that you have space on the disks that mapred.local.dir is configured
> > > > on. Can you verify with `df` that those disks have space available?
> > > > You might also try moving mapred.local.dir off of /tmp if it's
> > > > configured to use /tmp right now; I believe some systems have quotas
> > > > on /tmp.
> > > >
> > > > Hope this helps.
> > > >
> > > > Alex
> > > >
> > > > On Tue, Apr 7, 2009 at 7:22 PM, Jim Twensky <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm using Hadoop 0.19.1 and I have a very small test cluster with
> > > > > 9 nodes, 8 of them being task trackers. I'm getting the following
> > > > > error, and my jobs keep failing when map processes start hitting
> > > > > 30%:
> > > > >
> > > > > org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for
> > > > > taskTracker/jobcache/job_200904072051_0001/attempt_200904072051_0001_m_000000_1/output/file.out
> > > > >     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:335)
> > > > >     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
> > > > >     at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:61)
> > > > >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1209)
> > > > >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:867)
> > > > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> > > > >     at org.apache.hadoop.mapred.Child.main(Child.java:158)
> > > > >
> > > > > I googled many blogs and web pages, but I could neither understand
> > > > > why this happens nor find a solution. What does that error message
> > > > > mean, and how can I avoid it? Any suggestions?
> > > > >
> > > > > Thanks in advance,
> > > > > -jim
