So it's not stuck at just 16%; where it stalls depends on the task:
2008-10-30 13:58:29,702 INFO org.apache.hadoop.mapred.TaskTracker: attempt_200810301345_0001_r_000000_0 0.25675678% reduce > copy (57 of 74 at 13.58 MB/s) >

2008-10-30 13:58:29,357 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(attempt_200810301345_0001_m_000048_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200810301345_0001/attempt_200810301345_0001_m_000048_0/output/file.out.index in any of the configured local directories
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
        at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2402)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

I'm out of ideas about what the problem could be...


On Oct 30, 2008, at 12:35 PM, Scott Whitecross wrote:

I'm growing very frustrated with a simple cluster setup. I can get the cluster set up on two machines, but I run into trouble when trying to extend the installation to 3 or more boxes. I keep seeing the errors below. It seems the reduce tasks can't get access to the data.

I can't seem to figure out how to fix this error. What amazes me is that the file-not-found issues appear on the master box as well as on the slaves. What causes the reduce tasks to fail to find the map output via localhost?

Setup/Errors:

My basic setup comes from: http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster) (Michael Noll's setup). I've put the following in my /etc/hosts file:

127.0.0.1       localhost
10.1.1.12       master
10.1.1.10       slave
10.1.1.13       slave1

And I have set up passwordless (transparent) ssh to all boxes, and it works. All boxes can see each other, etc.
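(For reference, the ssh setup amounts to roughly the following, run on the master as the user that starts Hadoop; I'm assuming a shared "hadoop" account on every box, as in Michael Noll's tutorial, so adjust the user name if yours differs:)

        # generate a passphrase-less key for the hadoop user (once, on the master)
        ssh-keygen -t rsa -P ""
        # authorize the key locally and on each slave
        cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
        ssh-copy-id hadoop@slave
        ssh-copy-id hadoop@slave1
        # each of these should now log in without a password prompt
        ssh master
        ssh slave
        ssh slave1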

My base level hadoop-site.xml is:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
       <property>
               <name>hadoop.tmp.dir</name>
               <value>/opt/hadoop-datastore</value>
       </property>
       <property>
               <name>fs.default.name</name>
               <value>hdfs://master:54310</value>
       </property>
       <property>
               <name>mapred.job.tracker</name>
               <value>master:54311</value>
       </property>
       <property>
               <name>dfs.replication</name>
               <value>3</value>
       </property>
</configuration>


Errors:

WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(attempt_200810301206_0004_m_000001_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200810301206_0004/attempt_200810301206_0004_m_000001_0/output/file.out.index in any of the configured local directories
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
        ...

and in the userlog of the attempt:

2008-10-30 12:28:00,806 WARN org.apache.hadoop.mapred.ReduceTask: java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_200810301206_0004&map=attempt_200810301206_0004_m_000001_0&reduce=0
        at sun.reflect.GeneratedConstructorAccessor3.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

