Re: nutch-2.0-fetcher fails in reduce stage

Tejas Patil Wed, 17 Oct 2012 20:42:25 -0700

Hi Alex,

This may be related solely to the Hadoop cluster configuration and the
node capacity. It can happen, for example, when a client is killed while
it has some files open for write (see [1]).


As per [2], can you try the following:
- Reduce the number of tasks per tasktracker to suit the host with
the fewest cores; this can fix the problem
- Increase the Java heap size and the nodes
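
For illustration, the two suggestions above could look like the following mapred-site.xml fragment. This is only a sketch, assuming a Hadoop 1.x style MapReduce setup with tasktrackers; the values are placeholders chosen for the example, not recommendations taken from [2], and should be sized to the cores and memory of the weakest host.

```xml
<!-- mapred-site.xml: illustrative values only (assumptions, not tuned settings) -->
<configuration>
  <!-- Maximum number of map tasks a tasktracker runs at once -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <!-- Maximum number of reduce tasks a tasktracker runs at once -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>
  <!-- JVM options, including heap size, for each child task -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>
</configuration>
```

The heap of the daemons themselves is set separately, through HADOOP_HEAPSIZE in hadoop-env.sh and HBASE_HEAPSIZE in hbase-env.sh. The tasktrackers need a restart for the slot limits to take effect.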

[1]
http://mail-archives.apache.org/mod_mbox/hadoop-common-user/200905.mbox/%[email protected]%3E
[2]
https://groups.google.com/a/cloudera.org/forum/?fromgroups=#!topic/cdh-user/8IAVsg2E-2k

thanks,
Tejas Patil

On Tue, Oct 16, 2012 at 11:41 PM, <[email protected]> wrote:

>
> Hello,
>
> Today I closely followed all the HBase and Hadoop logs. When map reached
> 100%, reduce was at 33%. Then, when reduce reached 66%, I saw the
> following error in Hadoop's datanode log:
>
> 2012-10-16 22:44:54,634 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(127.0.0.1:50010, storageID=DS-179532189-192.168.1.4-50010-1349640973409, infoPort=50075, ipcPort=50020):DataXceiver
> java.io.EOFException: while trying to read 65557 bytes
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:268)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:312)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:376)
>     at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:532)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:398)
>     at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:107)
>     at java.lang.Thread.run(Thread.java:662)
>
> And HBase's regionserver stopped without any errors. I do not see any
> errors in the HBase master or Hadoop namenode logs.
>
>
> @Lewis
> I am not sure what you mean by a configuration to run behind a proxy. I
> closely followed the HBase configuration at
> http://hbase.apache.org/book/configuration.html
>
> box1 -- a local Fedora Linux box with a dynamic IP
> box2 -- a dedicated Fedora server with a static IP
>
> On box2 the fetcher runs without any errors, but the generated set is
> 100,000 times smaller than the set on box1.
>
> Thanks in advance.
> Alex.
>
>
>
> -----Original Message-----
> From: Lewis John Mcgibbney <[email protected]>
> To: user <[email protected]>
> Sent: Tue, Oct 16, 2012 2:40 am
> Subject: Re: nutch-2.0-fetcher fails in reduce stage
>
> Hi Alex,
>
> I've seen similar exceptions numerous times [0] when running the Gora
> test suite against HBase; however, this _always_ occurred against an
> HBase version other than the officially supported version of HBase
> (which is 0.90.4) when behind a local proxy, so I am immediately
> tempted to speculate that this may be the source of the problem.
>
> On Tue, Oct 16, 2012 at 3:50 AM, <[email protected]> wrote:
>
> >         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>
> > org.apache.gora.util.GoraException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> > Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface
> > to master/192.168.1.4:60020 after attempts=1
>
> The above two slices of the stack would also indicate that this is the
> case.
>
> > bin/nutch inject works fine. Also, I have a different Linux box. Fetcher with
> > the same config runs fine, but the generated set is much smaller than on the
> > first Linux box.
>
> I don't really understand this very well; it is quite ambiguous. Can
> you clearly distinguish between box1 and box2, and say which one works
> and which one doesn't? Also, how are your HBase configurations set up
> across these boxes, and how are you running Nutch?
>
> > Any ideas how to fix this issue, and what is the benefit of running the
> > fetcher in pseudo-distributed mode versus the local one?
>
> Finally, is your Nutch deployment configured to run behind a proxy? I
> know there is no mention of this, but maybe there is more to this than
> simply disabling iptables! I am not, however, HBase-literate enough to
> comment further on what configuration causes this, so I've
> copied in the user@ gora list as well.
>
> @user@
>
> The original thread for this topic can be found below [1]
>
> [0] http://www.mail-archive.com/[email protected]/msg00485.html
> [1] http://www.mail-archive.com/[email protected]/msg07823.html
>
> hth
>
> Lewis
