I just got it again and noticed something strange. One of the task
trackers reports that it has finished task_m_4b1l34, even though, as
you can see in the log below, that task is not done yet.

Then it gets a new task, task_r_1f4jb0, and completes it immediately.
When that task is done, all 4 other trackers die with that dreadful
message :) ....

The story is here (I removed all the parsing messages to keep this
message shorter):

060220 102853 task_m_4b1l34 0.039386094% 93 pages, 0 errors, 0.6
pages/s, 98 kb/s,
060220 102853 Task task_m_4b1l34 is done.
060220 102853 task_r_1f4jb0 0.2% reduce > copy >
060220 102853 Server connection on port 50050 from 127.0.0.1: exiting
060220 102853 Server connection on port 50050 from 127.0.0.1: exiting
060220 102854 task_r_1f4jb0 0.2% reduce > copy >
060220 102854 task_r_1f4jb0 Got 1 map output locations.
060220 102855 task_r_1f4jb0  Child starting
060220 102855 Server connection on port 50050 from 127.0.0.1: starting
060220 102855 task_r_1f4jb0  Client connection to 0.0.0.0:50050:
starting
060220 102856 task_r_1f4jb0
parsing /tmp/hadoop/mapred/local/taskTracker/task_r_1f4jb0/job.xml
060220 102856 task_r_1f4jb0  Client connection to 212.143.22.185:9000:
starting
060220 102856 task_r_1f4jb0  Using URL normalizer:
org.apache.nutch.net.RegexUrlNormalizer
060220 102856 task_r_1f4jb0  loading
file:/home/nutchuser/trunk/conf/regex-normalize.xml
060220 102856 task_r_1f4jb0  Plugins: looking
in: /tmp/hadoop/mapred/local/taskTracker/task_r_1f4jb0/work/plugins
060220 102857 Server connection on port 50050 from 127.0.0.1: starting
060220 102857 task_r_1f4jb0  Client connection to 0.0.0.0:50050:
starting
060220 102857 task_r_1f4jb0  Plugin Auto-activation mode: [true]
060220 102857 task_r_1f4jb0  Registered Plugins:
060220 102857 task_r_1f4jb0     CyberNeko HTML Parser (lib-nekohtml)
060220 102857 task_r_1f4jb0     Site Query Filter (query-site)
060220 102857 task_r_1f4jb0     Http / Https Protocol Plug-in
(protocol-httpclient)
060220 102857 task_r_1f4jb0     Html Parse Plug-in (parse-html)
060220 102857 task_r_1f4jb0     Jakarta Commons HTTP Client
(lib-commons-httpclient)
060220 102857 task_r_1f4jb0     Basic Indexing Filter (index-basic)
060220 102857 task_r_1f4jb0     Text Parse Plug-in (parse-text)
060220 102857 task_r_1f4jb0     Regex URL Filter (urlfilter-regex)
060220 102857 task_r_1f4jb0     Basic Query Filter (query-basic)
060220 102857 task_r_1f4jb0     HTTP Framework (lib-http)
060220 102857 task_r_1f4jb0     Speedbit Parse Filter plugin
(parse-speedbit)
060220 102857 task_r_1f4jb0     URL Query Filter (query-url)
060220 102857 task_r_1f4jb0     Speedbit Query Filter (query-speedbit)
060220 102857 task_r_1f4jb0     the nutch core extension points
(nutch-extensionpoints)
060220 102857 task_r_1f4jb0     More Indexing Filter (index-more)
060220 102857 task_r_1f4jb0     Speedbit Indexing Filter
(index-speedbit)
060220 102857 task_r_1f4jb0  Registered Extension-Points:
060220 102857 task_r_1f4jb0     Nutch Protocol
(org.apache.nutch.protocol.Protocol)
060220 102857 task_r_1f4jb0     Nutch URL Filter
(org.apache.nutch.net.URLFilter)
060220 102857 task_r_1f4jb0     HTML Parse Filter
(org.apache.nutch.parse.HtmlParseFilter)
060220 102857 task_r_1f4jb0     Nutch Online Search Results Clustering
Plugin (org.apache.nutch.clustering.OnlineClusterer)
060220 102857 task_r_1f4jb0     Nutch Indexing Filter
(org.apache.nutch.indexer.IndexingFilter)
060220 102857 task_r_1f4jb0     Nutch Content Parser
(org.apache.nutch.parse.Parser)
060220 102857 task_r_1f4jb0     Ontology Model Loader
(org.apache.nutch.ontology.Ontology)
060220 102857 task_r_1f4jb0     Nutch Analysis
(org.apache.nutch.analysis.NutchAnalyzer)
060220 102857 task_r_1f4jb0     Nutch Query Filter
(org.apache.nutch.searcher.QueryFilter)
060220 102857 task_r_1f4jb0  found resource regex-urlfilter.txt at
file:/home/nutchuser/trunk/conf/regex-urlfilter.txt
060220 102857 task_r_1f4jb0 0.75000536% reduce > reduce
060220 102858 task_r_1f4jb0 0.75769955% reduce > reduce
060220 102859 task_r_1f4jb0 0.77046275% reduce > reduce
060220 102900 task_r_1f4jb0 0.7867212% reduce > reduce
060220 102902 task_r_1f4jb0 0.80398625% reduce > reduce
060220 102903 task_r_1f4jb0 0.8100066% reduce > reduce
060220 102904 task_r_1f4jb0 0.8282599% reduce > reduce
060220 102905 task_r_1f4jb0 0.8453781% reduce > reduce
060220 102906 task_r_1f4jb0 0.8640073% reduce > reduce
060220 102907 task_r_1f4jb0 0.8819447% reduce > reduce
060220 102908 task_r_1f4jb0 0.89916956% reduce > reduce
060220 102909 task_r_1f4jb0 0.91789067% reduce > reduce
060220 102910 task_r_1f4jb0 0.93869305% reduce > reduce
060220 102911 task_r_1f4jb0 0.9614557% reduce > reduce
060220 102912 task_r_1f4jb0 0.98324126% reduce > reduce
060220 102915 task_r_1f4jb0 1.0% reduce > reduce
060220 102915 Task task_r_1f4jb0 is done.
060220 102916 Server connection on port 50050 from 127.0.0.1: exiting
060220 102916 Server connection on port 50050 from 127.0.0.1: exiting
060220 102945 task_m_1dgza done; removing files.
060220 102948 task_m_4b1l34 done; removing files.
060220 102951 task_m_8t33q5 done; removing files.
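
In case it helps anyone check their own tasktracker logs for the same
pattern, here is a minimal sketch that flags tasks reported as done while
their last logged progress was still below 1.0. It assumes the two line
shapes seen in the log above (a progress line "<date> <time> <task> <n>%"
and a completion line "Task <task> is done."), and that the number before
"%" is a fraction where 1.0 means complete. The function name
find_early_done and the threshold parameter are made up for illustration;
this is not part of Hadoop or Nutch.

```python
import re

# Progress line, e.g. "060220 102853 task_m_4b1l34 0.039386094% 93 pages, ..."
PROGRESS = re.compile(r"^\d{6} \d{6} (task_[mr]_\w+) (\d+(?:\.\d+)?)% ")
# Completion line, e.g. "060220 102853 Task task_m_4b1l34 is done."
DONE = re.compile(r"^\d{6} \d{6} Task (task_[mr]_\w+) is done\.")


def find_early_done(lines, threshold=1.0):
    """Return (task_id, last_progress) for tasks marked done below threshold.

    Assumption: the logged value is a fraction (1.0 == complete), as the
    reduce task in the log above ends at "1.0%".
    """
    last_progress = {}   # task id -> most recent progress value seen
    suspicious = []
    for line in lines:
        match = PROGRESS.match(line)
        if match:
            last_progress[match.group(1)] = float(match.group(2))
            continue
        match = DONE.match(line)
        if match:
            task = match.group(1)
            progress = last_progress.get(task)
            # Only flag tasks for which a progress line was seen at all.
            if progress is not None and progress < threshold:
                suspicious.append((task, progress))
    return suspicious
```

Feeding it the lines of a tasktracker log (for example via
open(path).read().splitlines()) should list task_m_4b1l34 for the log
above and leave task_r_1f4jb0 alone, since that one reached 1.0 first.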

All the other trackers got:

060220 102946 task_m_11tcmy Child Error
java.io.IOException: Task process exit with nonzero status.
        at
org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:144)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:97)


Hope it helps.

Gal.





On Sun, 2006-02-19 at 12:27 -0800, Mike Smith wrote:
> Hi,
> 
> This problem is a killer! I've been struggling with this for about a month!
> It doesn't happen all the time, but because of this problem the largest crawl I
> could ever complete is about 1 million pages.  I have three machines: 3
> datanodes, 1 data replicate, 1 job tracker. Here is what I get:
> 
> nameserver tasktracker log file:
> 
> 060219 142405 task_r_125kgt 0.14583334% reduce > copy >
> 060219 142406 task_r_125kgt 0.14583334% reduce > copy >
> 060219 142407 task_m_grycae  Error running child
> 060219 142407 task_m_grycae java.io.IOException: timed out waiting for
> response
> 060219 142407 task_m_grycae     at org.apache.hadoop.ipc.Client.call(
> Client.java:303)
> 060219 142407 task_m_grycae     at org.apache.hadoop.ipc.RPC$Invoker.invoke(
> RPC.java:141)
> 060219 142407 task_m_grycae     at
> org.apache.hadoop.mapred.$Proxy0.progress(Unknown
> Source)
> 060219 142407 task_m_grycae     at
> org.apache.hadoop.mapred.Task.reportProgress(Task.java:112)
> 060219 142407 task_m_grycae     at org.apache.hadoop.mapred.Task$1.setStatus
> (Task.java:93)
> 060219 142407 task_m_grycae     at
> org.apache.nutch.fetcher.Fetcher.reportStatus(Fetcher.java:276)
> 060219 142407 task_m_grycae     at org.apache.nutch.fetcher.Fetcher.run(
> Fetcher.java:325)
> 060219 142407 task_m_grycae     at org.apache.hadoop.mapred.MapTask.run(
> MapTask.java:129)
> 060219 142407 task_m_grycae     at
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:637)
> 060219 142407 task_m_grycae 0.825607% 108745 pages, 5259 errors,
> 15.6pages/s, 2418 kb/s,
> 060219 142407 task_r_125kgt 0.14583334% reduce > copy >
> 060219 142408 task_m_grycae  Parent died.  Exiting task_m_grycae
> 060219 142408 task_r_125kgt 0.14583334% reduce > copy >
> 060219 142408 Server connection on port 50050 from xxxxxx: exiting
> 060219 142408 Server connection on port 50050 from xxxxxx: exiting
> 060219 142408 task_m_grycae Child Error
> java.io.IOException: Task process exit with nonzero status.
>         at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:144)
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:97)
> 060219 142411 task_m_grycae done; removing files.
> 060219 142413 task_r_125kgt 0.14583334% reduce > copy >
> 
> 
> One of the datanode tasktracker log files:
> 
> 060219 142611 task_m_2yfbgf  fetching
> http://codex.wordpress.org/Managing_Plugins
> 060219 142611 task_m_2yfbgf  fetching
> http://www.scubaboard.com/cms/search.php
> 060219 142611 task_m_2yfbgf Error reading child output
> java.io.IOException: Bad file descriptor
>         at java.io.FileInputStream.readBytes(Native Method)
>         at java.io.FileInputStream.read(FileInputStream.java:194)
>         at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(StreamDecoder.java
> :411)
>         at sun.nio.cs.StreamDecoder$CharsetSD.implRead(StreamDecoder.java
> :453)
>         at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:183)
>         at java.io.InputStreamReader.read(InputStreamReader.java:167)
>         at java.io.BufferedReader.fill(BufferedReader.java:136)
>         at java.io.BufferedReader.readLine(BufferedReader.java:299)
>         at java.io.BufferedReader.readLine(BufferedReader.java:362)
>         at org.apache.hadoop.mapred.TaskRunner.logStream(TaskRunner.java
> :170)
>         at org.apache.hadoop.mapred.TaskRunner.access$100(TaskRunner.java
> :29)
>         at org.apache.hadoop.mapred.TaskRunner$1.run(TaskRunner.java:137)
> 060219 142611 task_m_2yfbgf 0.019530244% 2170 pages, 61 errors,
> 12.3pages/s, 1975 kb/s,
> 060219 142612 Server connection on port 50051 from xxxxxx: exiting
> 060219 142612 Server connection on port 50051 from xxxxxx: exiting
> 060219 142612 task_m_2yfbgf Child Error
> java.io.IOException: Task process exit with nonzero status.
>         at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:144)
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:97)
> 060219 142615 task_m_2yfbgf done; removing files.
> 
> The other datanode looks fine.
> 
> 
> Thanks, Mike
> 
> 
> On 2/16/06, Doug Cutting <[EMAIL PROTECTED]> wrote:
> >
> > Gal Nitzan wrote:
> > > During fetch, all tasktrackers abort the fetch with:
> > >
> > > task_m_b45ma2 Child Error
> > > java.io.IOException: Task process exit with nonzero status.
> > >         at
> > > org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:144)
> > >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:97)
> > >
> >
> > What's reported just before this in this tasktracker's log?
> >
> > What's reported around this time in the jobtracker's log?
> >
> > Doug
> >

