Hi Doug,

Unfortunately my core limit was 0 at that time. I have been running another configuration with a smaller number of threads since last night and it has not crashed yet. As soon as it fails I will send you more detailed log information.
But this is what usually happens. I've been using three machines through DFS:

Machine A: namenode, jobtracker, datanode, tasktracker
Machine B: datanode, tasktracker
Machine C: datanode, tasktracker

Once one task fails with the bad file descriptor exception, that machine tries to reconnect a couple of times (for about 20 seconds), then gets disconnected from the jobtracker, and the jobtracker only sees two machines after that. After a while something strange happens in the map/reduce status report, like this:

map 80% ---> reduce 13%
map 81% ---> reduce 13%
map 82% ---> reduce 13%
map 5%  ---> reduce 13%
map 6%  ---> reduce 13%

That is, the map status goes down, and after a couple more iterations the jobtracker fails completely.

By the way, the fetching speed using Hadoop does not exceed 12 pages/sec, but I used to fetch at almost 50 pages/sec using Nutch 0.7 with almost the same configuration.

Thanks,
Mike

On 2/24/06, Doug Cutting <[EMAIL PROTECTED]> wrote:
> It looks like the child JVM is silently exiting. The "error reading
> child output" just shows that the child's standard output has been
> closed, and the "child error" says the JVM exited with non-zero.
>
> Perhaps you can get a core dump by setting 'ulimit -c' to something big.
> JVM core dumps can be informative.
>
> This doesn't look like something that should kill a crawl, though. Are
> you using a tasktracker & jobtracker, or running things with a "local"
> jobtracker? With a tasktracker this task would be retried. Are you
> seeing this? Does a given task consistently fail when retried?
>
> Doug
>
> Mike Smith wrote:
> > I have been getting this exception during fetching for almost a month. This
> > exception stops the whole crawl. It happens on and off! Any idea? We are
> > really stuck with this problem.
> >
> > I am using 3 datanodes and 1 namenode.
> >
> > 060223 173809 task_m_b8ibww fetching http://www.heartcenter.com/94fall.pdf
> > 060223 173809 task_m_b8ibww fetching http://www.medinfo.co.uk/conditions/tenosynovitis.html
> > 060223 173809 task_m_b8ibww fetching http://www.boncholesterol.com/whatsnew/index.shtml
> > 060223 173809 task_m_b8ibww fetching http://www.drcranton.com/hrt/promise_of_longevity.htm
> > 060223 173809 task_m_b8ibww fetching http://www.drcranton.com/hrt/promise_of_longevity.htm
> > 060223 173809 task_m_b8ibww Error reading child output
> > java.io.IOException: Bad file descriptor
> >         at java.io.FileInputStream.readBytes(Native Method)
> >         at java.io.FileInputStream.read(FileInputStream.java:194)
> >         at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(StreamDecoder.java:411)
> >         at sun.nio.cs.StreamDecoder$CharsetSD.implRead(StreamDecoder.java:453)
> >         at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:183)
> >         at java.io.InputStreamReader.read(InputStreamReader.java:167)
> >         at java.io.BufferedReader.fill(BufferedReader.java:136)
> >         at java.io.BufferedReader.readLine(BufferedReader.java:299)
> >         at java.io.BufferedReader.readLine(BufferedReader.java:362)
> >         at org.apache.hadoop.mapred.TaskRunner.logStream(TaskRunner.java:170)
> >         at org.apache.hadoop.mapred.TaskRunner.access$100(TaskRunner.java:29)
> >         at org.apache.hadoop.mapred.TaskRunner$1.run(TaskRunner.java:137)
> > 060223 173809 task_r_3h1pex 0.16666667% reduce > copy >
> > 060223 173809 Server connection on port 50050 from xxxxxx: exiting
> > 060223 173809 Server connection on port 50050 from xxxxxx: exiting
> > 060223 173809 task_m_b8ibww Child Error
> > java.io.IOException: Task process exit with nonzero status.
> >         at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:144)
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:97)
> > 060223 173812 task_m_b8ibww done; removing files.
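For context, the failing frame in the trace above is TaskRunner.logStream, which copies the child task JVM's stdout into the tasktracker's log. A minimal sketch of that pattern (an illustration under assumptions, not the actual Hadoop source; the ChildOutputLogger class and the echo command are made up) looks like this:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class ChildOutputLogger {
    // Copies a child process's stdout line by line into our own log.
    static void logStream(Process child) {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(child.getInputStream()));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line); // stand-in for the task log file
            }
        } catch (IOException e) {
            // If the child JVM dies and its pipe is torn down while a read
            // is in flight, readLine() can fail much like in the trace above.
            System.err.println("Error reading child output: " + e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical child command, purely for illustration.
        logStream(Runtime.getRuntime().exec(new String[] {"echo", "hello"}));
    }
}

The trace suggests this read loop runs in a separate thread (TaskRunner$1.run), which would explain why "Error reading child output" appears just before "Child Error": the logging thread notices the dead descriptor before the parent observes the nonzero exit status, matching Doug's reading that the child JVM is silently exiting.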
