bhavin pandya-3 wrote:
>
> Hi,
>
> I am trying to configure Nutch and Hadoop on 2 nodes, but while trying
> to fetch I am getting the exception below. (I sometimes get the same
> exception while injecting new seeds.)
>
> 2009-10-06 14:56:51,609 WARN mapred.ReduceTask - java.io.FileNotFoundException: http://127.0.0.1:50060/mapOutput?job=job_200910061454_0001&map=attempt_200910061454_0001_m_000000_0&reduce=3
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>     at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1345)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1339)
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:993)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1293)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1231)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1144)
>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1084)
> Caused by: java.io.FileNotFoundException: http://127.0.0.1:50060/mapOutput?job=job_200910061454_0001&map
>
>
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_200910061454_0001/attempt_200910061454_0001_m_000000_0/output/file.out.index in any of the configured local directories
>     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:381)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
>     at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2840)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
>     at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>     at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
>     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
>     at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>     at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
>     at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>     at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>     at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
>     at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>     at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
>     at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
>     at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
>     at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
>
>
> And then there are continuous messages in hadoop.log like:
> 2009-10-06 15:56:43,918 WARN mapred.ReduceTask - attempt_200910061538_0005_r_000001_0 adding host 127.0.0.1 to penalty box, next contact in 150 seconds
>
>
> Here is my hadoop-site.xml content:
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://crawler1.mydomain.com:9000/</value>
>   <description>
>     The name of the default file system. Either the literal string
>     "local" or a host:port for NDFS.
>   </description>
> </property>
>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>crawler1.mydomain.com:9001</value>
>   <description>
>     The host and port that the MapReduce job tracker runs at. If
>     "local", then jobs are run in-process as a single map and
>     reduce task.
>   </description>
> </property>
>
> <property>
>   <name>mapred.map.tasks</name>
>   <value>2</value>
>   <description>
>     Define mapred.map.tasks to be the number of slave hosts.
>   </description>
> </property>
>
> <property>
>   <name>mapred.reduce.tasks</name>
>   <value>2</value>
>   <description>
>     Define mapred.reduce.tasks to be the number of slave hosts.
>   </description>
> </property>
>
> <property>
>   <name>dfs.name.dir</name>
>   <value>/nutch/filesystem/name</value>
> </property>
>
> <property>
>   <name>dfs.data.dir</name>
>   <value>/nutch/filesystem/data</value>
> </property>
>
> <property>
>   <name>mapred.system.dir</name>
>   <value>/nutch/filesystem/mapreduce/system</value>
> </property>
>
> <property>
>   <name>mapred.local.dir</name>
>   <value>/nutch/filesystem/mapreduce/local</value>
> </property>
>
> <property>
>   <name>dfs.replication</name>
>   <value>1</value>
> </property>
>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/tmp</value>
>   <description>A base for other temporary directories.</description>
> </property>
>
> netstat -antp shows a program listening on port 50060:
> tcp        0      0 0.0.0.0:50090    0.0.0.0:*    LISTEN    12855/java
> tcp        0      0 0.0.0.0:50060    0.0.0.0:*    LISTEN    13014/java
> tcp        0      0 0.0.0.0:50030    0.0.0.0:*    LISTEN    12923/java
> tcp        0      0 0.0.0.0:50010    0.0.0.0:*    LISTEN    12765/java
> tcp        0      0 0.0.0.0:50075    0.0.0.0:*    LISTEN    12765/java
>
> masters:
> crawler1.mydomain.com
>
> slaves:
> crawler1.mydomain.com
> crawler2.mydomain.com
>
>
> It works perfectly with a single-machine configuration.
> I am using Nutch 1.0. Any pointers?
>
> Thanks.
> - Bhavin
>
>
Please try the following configuration change:

<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
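
Note that a new dfs.replication value only takes effect after the daemons
are restarted, and only for files written from then on. As a rough sketch
(assuming the stock Nutch 1.0 layout, where the bundled Hadoop scripts
live under bin/), you could apply and verify the change like this:

bin/stop-all.sh
# copy the edited conf/hadoop-site.xml to both nodes, then:
bin/start-all.sh

# raise the replication of data already in HDFS (existing files
# otherwise keep their old factor):
bin/hadoop dfs -setrep -R 2 /

# confirm that blocks now report replication 2:
bin/hadoop fsck / -files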