[Nutch-general] Re: Map Reduce Errors

Stefan Groschupf Tue, 13 Dec 2005 12:05:05 -0800

I do see stuff about CRCs being ignored sometimes at the end of an operation. Is there a setting for this?


Isn't it in the nutch-default.xml?

I have also just learned that the box I've been using as the jobtracker and NDFS name node has a wonky system timer. So maybe that is the problem. I'm currently getting a test running using only the other two machines.

Well, this may be the problem. I'm not sure how Java handles the system current millis behind the scenes. Since you got a timeout exception, this could be the cause.
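
A minimal sketch of why an unreliable system timer could produce spurious timeouts. It assumes the timeout is computed from wall-clock readings such as System.currentTimeMillis(); the thread does not confirm how org.apache.nutch.ipc.Client actually measures it. The class and method names below are hypothetical and are not Nutch code.

```java
/** Illustration only: a deadline check driven by wall-clock readings. */
class TimeoutClockSketch {

    /** True if more than timeoutMs elapsed between the two clock readings. */
    static boolean timedOut(long startMs, long nowMs, long timeoutMs) {
        return nowMs - startMs > timeoutMs;
    }

    /** Elapsed milliseconds from System.nanoTime(), which does not follow wall-clock adjustments. */
    static long elapsedMonotonicMs(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }

    public static void main(String[] args) {
        long timeoutMs = 10_000L;         // hypothetical timeout value
        long sentAt = 1_000_000L;         // hypothetical clock reading when the call was sent
        long afterJump = sentAt + 60_000; // the clock jumped forward 60 s in the meantime

        // The check fires even if very little real time has passed.
        System.out.println("timed out after forward jump: "
                + timedOut(sentAt, afterJump, timeoutMs));

        // A backward jump has the opposite effect: the timeout is delayed.
        System.out.println("timed out after backward jump: "
                + timedOut(sentAt, sentAt - 60_000, timeoutMs));

        // Measuring with nanoTime avoids depending on the wall clock.
        long startNanos = System.nanoTime();
        System.out.println("monotonic elapsed ms: " + elapsedMonotonicMs(startNanos));
    }
}
```

If the timeouts disappear when the machine with the bad timer is left out of the cluster, that would be consistent with this explanation, though it would not prove it.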

Stefan

-Matt



Stefan Groschupf wrote:
Hmm, sounds strange, but I'm interested in digging to find the source of the problem, since I'm very much interested in getting 0.8 stable asap.
However, finding the source of such a problem is a pain in the neck.
Do you use the latest sources from svn?
Do you ignore CRC errors? Doug mentioned that he often notices problems with this.
Stefan


Am 13.12.2005 um 20:28 schrieb Matt Zytaruk:
I don't think the network settings are the problem, as I have been able to parse other segments using map reduce with no problem. If it was the network configuration, wouldn't it never work? However, things do not seem to be stable, as some operations in NDFS will error, and then I do the same thing 5 minutes later and it works fine. Same with other things: some crawls work fine, others throw exceptions and crash (I actually had a crawl crash with the same problem as below). This is using 3 Opteron boxes running Suse Linux.
-Matt Zytaruk

Stefan Groschupf wrote:
Looks like a problem with the TCP/IP communication.
Are any firewalls running on the boxes? Are any ports closed?
Are the DNS names configured correctly?
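
A rough sketch of how the DNS and port questions above could be checked from one of the slave boxes using only standard java.net calls. The class name ConnectivityCheck and its helper methods are hypothetical; the default host and port are simply the JobTracker address that appears in the log further down in this thread.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

/** Illustration only: checks name resolution and TCP reachability of a host and port. */
class ConnectivityCheck {

    /** Returns the resolved IP address, or null if the name cannot be resolved. */
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolved host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults taken from the "Lost connection to JobTracker" log line in this thread.
        String host = args.length > 0 ? args[0] : "crawler-d-01.internal.wavefire.ca";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8050;

        System.out.println("resolved address: " + resolve(host));
        System.out.println("port reachable:   " + canConnect(host, port, 5000));
    }
}
```

Running it on each slave against the JobTracker host would show whether every box resolves the name to the same address and can open the port. It only tests a single connection attempt, so it would not catch intermittent failures.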

Is your job tracker running stably?

Stefan

Am 13.12.2005 um 19:46 schrieb Matt Zytaruk:
Hello all, I've been trying to parse a segment of data (probably around 500k pages) I previously fetched, and every time I try, I get an error. Below is the error given by the slaves. The master gives a similar error. This usually happens late in the reduce phase, but has also happened during the map phase once. Any ideas what might be going on here? Network issues? Bugs in the tracker?
Thanks for any help you might be able to give.
-matt zytaruk

Slaves:

060102 200647 task_m_bvkze5 Child Error
java.io.IOException: Task process exit with nonzero status.
       at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
       at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
060102 200833 task_m_bvkze5 done; removing files.
060102 200855 Client connection to 64.141.15.126:8050: closing
java.lang.reflect.UndeclaredThrowableException
       at $Proxy0.pollForClosedTask(Unknown Source)
       at org.apache.nutch.mapred.TaskTracker.offerService(TaskTracker.java:241)
       at org.apache.nutch.mapred.TaskTracker.run(TaskTracker.java:268)
       at org.apache.nutch.mapred.TaskTracker.main(TaskTracker.java:633)
Caused by: java.io.IOException: timed out waiting for response
       at org.apache.nutch.ipc.Client.call(Client.java:296)
       at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
       ... 4 more
060102 201229 Lost connection to JobTracker [crawler-d-01.internal.wavefire.ca/64.141.15.126:8050]. Retrying...

Master:

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
       at $Proxy0.getJobStatus(Unknown Source)
       at org.apache.nutch.mapred.JobClient.getJob(JobClient.java:272)
       at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:295)
       at org.apache.nutch.crawl.ParseSegment.parse(ParseSegment.java:91)
       at org.apache.nutch.crawl.ParseSegment.main(ParseSegment.java:110)
Caused by: java.io.IOException: timed out waiting for response
       at org.apache.nutch.ipc.Client.call(Client.java:296)
       at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net

