I ran into similar issues with firewalls and ended up turning them off
completely. That took care of some of the problems and let me figure out
that if DNS / hosts files aren't configured correctly, weird things
happen in the communication between daemons. I have a small cluster and
built a hosts file that I copied everywhere, including my workstation
for HDFS browsing. This made things run much more smoothly; hope that
helps.
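
For reference, what I mean is a minimal /etc/hosts along these lines
(the IPs and the "master" name here are made up; substitute your own,
and make sure the file is identical on every node and on the
workstation):

```
192.168.0.10   master
192.168.0.11   koeln
```

Also worth checking that each machine's own hostname does not resolve to
127.0.0.1 in that file, or the daemons may end up listening only on
loopback.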

-Daniel
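
P.S. A quick way to check whether the relevant ports are actually
reachable from a given machine is a small socket probe like the sketch
below (the host/port pairs are just examples; use the hostnames and the
ports from your own hadoop-site.xml):

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and DNS failures alike.
        return False

if __name__ == "__main__":
    # Hypothetical host/port pairs -- substitute the hostnames and the
    # ports from your own hadoop-site.xml (e.g. 50010 for the datanode).
    for host, port in [("master", 9000), ("koeln", 50010)]:
        state = "reachable" if can_connect(host, port) else "refused/unreachable"
        print("%s:%d %s" % (host, port, state))
```

Run it on every node as well as on the client; a port that is open
locally but refused from another machine points at the firewall.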

On Wed, Jun 18, 2008 at 12:53 PM, Alexander Arimond <
[EMAIL PROTECTED]> wrote:

>
> Got a similar error when running a MapReduce job on the master machine.
> The map phase is fine and in the end the right results are in my output
> folder, but the reduce hangs at 17% for a very long time. I found this
> in one of the task logs a few times:
>
> ...
> 2008-06-18 17:31:02,297 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0: Got 0 new map-outputs & 0 obsolete
> map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-06-18 17:31:02,297 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 Got 0 known map output location(s);
> scheduling...
> 2008-06-18 17:31:02,297 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 Scheduled 0 of 0 known outputs (0 slow
> hosts and 0 dup hosts)
> 2008-06-18 17:31:03,276 WARN org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 copy failed:
> task_200806181716_0001_m_000001_0 from koeln
> 2008-06-18 17:31:03,276 WARN org.apache.hadoop.mapred.ReduceTask:
> java.net.ConnectException: Connection refused
>        at java.net.PlainSocketImpl.socketConnect(Native Method)
>        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>        at
> java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>        at java.net.Socket.connect(Socket.java:519)
>        at sun.net.NetworkClient.doConnect(NetworkClient.java:152)
>        at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
>        at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
>        at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
>        at sun.net.www.http.HttpClient.New(HttpClient.java:306)
>        at sun.net.www.http.HttpClient.New(HttpClient.java:323)
>        at
> sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:788)
>        at
> sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:729)
>        at
> sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:654)
>        at
> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:977)
>        at
> org.apache.hadoop.mapred.MapOutputLocation.getFile(MapOutputLocation.java:139)
>        at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:815)
>        at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:764)
>
> 2008-06-18 17:31:03,276 INFO org.apache.hadoop.mapred.ReduceTask: Task
> task_200806181716_0001_r_000000_0: Failed fetch #7 from
> task_200806181716_0001_m_000001_0
> 2008-06-18 17:31:03,276 INFO org.apache.hadoop.mapred.ReduceTask: Failed to
> fetch map-output from task_200806181716_0001_m_000001_0 even after
> MAX_FETCH_RETRIES_PER_MAP retries...  reporting to the JobTracker
> 2008-06-18 17:31:03,276 WARN org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 adding host koeln to penalty box, next
> contact in 150 seconds
> 2008-06-18 17:31:03,277 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 Need 1 map output(s)
> 2008-06-18 17:31:03,317 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0: Got 0 new map-outputs & 0 obsolete
> map-outputs from tasktracker and 1 map-outputs from previous failures
> 2008-06-18 17:31:03,317 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 Got 1 known map output location(s);
> scheduling...
> 2008-06-18 17:31:03,317 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 Scheduled 0 of 1 known outputs (1 slow
> hosts and 0 dup hosts)
> 2008-06-18 17:31:08,336 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 Need 1 map output(s)
> 2008-06-18 17:31:08,337 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0: Got 0 new map-outputs & 0 obsolete
> map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-06-18 17:31:08,337 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 Got 1 known map output location(s);
> scheduling...
> 2008-06-18 17:31:08,337 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 Scheduled 0 of 1 known outputs (1 slow
> hosts and 0 dup hosts)
> 2008-06-18 17:31:13,356 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200806181716_0001_r_000000_0 Need 1 map output(s)
> ...
>
>
> Did I forget to open some ports? I opened 50010 for the datanode and
> the ports for the DFS and the jobtracker as specified in hadoop-site.xml.
> If it's a firewall problem, wouldn't hadoop notice that at startup, i.e.
> wouldn't the connections be refused right away?
>
>
>
> On Wed, 2008-06-18 at 11:32 +0200, Alexander Arimond wrote:
> > Thank you. I first tried the put from the master machine, which leads
> > to the error; the put from the slave machine works. I guess you're
> > right about the configuration parameters. It seems a bit strange to
> > me, because the firewall settings and the hadoop-site.xml on both
> > machines are identical.
> >
> > On Tue, 2008-06-17 at 14:08 -0700, Konstantin Shvachko wrote:
> > > Looks like the client machine from which you call -put cannot
> > > connect to the data-nodes. It could be a firewall issue or wrong
> > > configuration parameters used for the client.
> > >
> > > Alexander Arimond wrote:
> > > > hi,
> > > >
> > > > i'm new to hadoop and just testing it at the moment.
> > > > I set up a cluster with 2 nodes and they seem to be running
> > > > normally; the log files of the namenode and the datanodes don't
> > > > show errors, and the firewall should be set up correctly.
> > > > But when I try to upload a file to the DFS I get the following
> > > > message:
> > > >
> > > > [EMAIL PROTECTED]:~/hadoop$ bin/hadoop dfs -put file.txt file.txt
> > > > 08/06/12 14:44:19 INFO dfs.DFSClient: Exception in
> > > > createBlockOutputStream java.net.ConnectException: Connection refused
> > > > 08/06/12 14:44:19 INFO dfs.DFSClient: Abandoning block
> > > > blk_5837981856060447217
> > > > 08/06/12 14:44:28 INFO dfs.DFSClient: Exception in
> > > > createBlockOutputStream java.net.ConnectException: Connection refused
> > > > 08/06/12 14:44:28 INFO dfs.DFSClient: Abandoning block
> > > > blk_2573458924311304120
> > > > 08/06/12 14:44:37 INFO dfs.DFSClient: Exception in
> > > > createBlockOutputStream java.net.ConnectException: Connection refused
> > > > 08/06/12 14:44:37 INFO dfs.DFSClient: Abandoning block
> > > > blk_1207459436305221119
> > > > 08/06/12 14:44:46 INFO dfs.DFSClient: Exception in
> > > > createBlockOutputStream java.net.ConnectException: Connection refused
> > > > 08/06/12 14:44:46 INFO dfs.DFSClient: Abandoning block
> > > > blk_-8263828216969765661
> > > > 08/06/12 14:44:52 WARN dfs.DFSClient: DataStreamer Exception:
> > > > java.io.IOException: Unable to create new block.
> > > > 08/06/12 14:44:52 WARN dfs.DFSClient: Error Recovery for block
> > > > blk_-8263828216969765661 bad datanode[0]
> > > >
> > > >
> > > > I don't know what that means and didn't find anything about it.
> > > > I hope somebody can help with that.
> > > >
> > > > Thank you!
> > > >
> > > >
> > >
>
