I'm growing very frustrated with a simple cluster setup. I can get the cluster set up on two machines, but run into trouble when trying to extend the installation to three or more boxes. I keep seeing the errors below; it seems the reduce tasks can't get access to the map output data.

I can't seem to figure out how to fix this error. What amazes me is that the file-not-found issues appear on the master box as well as on the slaves. What causes the reduce tasks to fail to find the map output via localhost?

Setup/Errors:

My basic setup comes from: http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster) (Michael Noll's setup). I've put the following in my /etc/hosts file:

127.0.0.1       localhost
10.1.1.12       master
10.1.1.10       slave
10.1.1.13       slave1

I have also set up transparent (passwordless) ssh to all boxes, and it works; all boxes can see each other, etc.
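
For reference, the ssh setup was done roughly like this, following the tutorial (assuming a dedicated hadoop user; the copy step is repeated for each of master, slave, and slave1):

ssh-keygen -t rsa -P ""                         # passphrase-less key for the hadoop user
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@slave   # repeat for slave1 (and master itself)
ssh slave                                       # verify login works with no password prompt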

My base level hadoop-site.xml is:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/opt/hadoop-datastore</value>
        </property>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://master:54310</value>
        </property>
        <property>
                <name>mapred.job.tracker</name>
                <value>master:54311</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
</configuration>
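
One thing I'm not sure about: I have not set mapred.local.dir explicitly, so I believe it falls back to its default of ${hadoop.tmp.dir}/mapred/local. If it needs to be pinned down explicitly, I assume it would look something like the following (the path is just my guess based on the hadoop.tmp.dir above):

        <property>
                <name>mapred.local.dir</name>
                <value>/opt/hadoop-datastore/mapred/local</value>
        </property>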


Errors:

WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(attempt_200810301206_0004_m_000001_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200810301206_0004/attempt_200810301206_0004_m_000001_0/output/file.out.index in any of the configured local directories
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:359)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
        ...
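
The path in that message should live under mapred.local.dir on the tasktracker node, so I can check each box for the missing file with something like this (the path assumes my hadoop.tmp.dir above and the default mapred.local.dir):

find /opt/hadoop-datastore/mapred/local/taskTracker/jobcache -name file.out.index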

and in the userlog of the attempt:

2008-10-30 12:28:00,806 WARN org.apache.hadoop.mapred.ReduceTask: java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_200810301206_0004&map=attempt_200810301206_0004_m_000001_0&reduce=0
        at sun.reflect.GeneratedConstructorAccessor3.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
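
Since the fetch URL points at http://localhost:50060, I can also double-check what each box resolves its own hostname to, in case one of the slaves maps its name to 127.0.0.1 (run on every node):

hostname                  # what this box thinks it is called
getent hosts `hostname`   # the address that name resolves to (should not be 127.0.0.1)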
