Teruhiko Kurosaka wrote:
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]]
Sent: 2006-10-24 18:27

Some versions of 0.8 had a bug: if you ran with the "local" FS and jobtracker, it would generate too many parts of the fetchlist and then process only one randomly selected part. If that's the case, and you are indeed running in "local" mode, try setting the number of map and reduce tasks in your hadoop-site.xml to 1.

Thank you for the information.
I'm doing an intranet crawl using protocol-http.  I guess
you mean protocol-file by "local"? Or do you mean setting
"local" for fs.default.name in hadoop-site.xml?

I meant setting the fs.default.name to local, and mapred.job.tracker to local.
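
For reference, "local" mode as described here would look roughly like the following in hadoop-site.xml. This is only a sketch using the standard Hadoop property layout; the two property names and the value "local" come from the message above, and everything else in your existing file is assumed to stay as it is.

```xml
<?xml version="1.0"?>
<!-- Sketch of hadoop-site.xml for "local" mode (no DFS, no separate jobtracker). -->
<configuration>
  <!-- Use the local filesystem instead of a distributed one. -->
  <property>
    <name>fs.default.name</name>
    <value>local</value>
  </property>
  <!-- Run map/reduce jobs in-process rather than via a jobtracker. -->
  <property>
    <name>mapred.job.tracker</name>
    <value>local</value>
  </property>
</configuration>
```

If neither property is overridden in your hadoop-site.xml, it is worth checking which values are actually in effect before assuming you are in "local" mode.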


Could you also explain what you mean by "setting the
number of map"  and how?

Set mapred.map.tasks to 1, and mapred.reduce.tasks to 1. You can achieve the same effect for "generate" if you use -numFetchers 1.
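
As a sketch of the workaround, the two task settings could be added to hadoop-site.xml like this. The property names and the value 1 are taken from the reply above; the surrounding <configuration> element is the usual Hadoop layout, and any other properties you already have in that file should remain in place.

```xml
<?xml version="1.0"?>
<!-- Sketch of hadoop-site.xml entries limiting jobs to one map and one reduce task. -->
<configuration>
  <!-- A single map task, so only one fetchlist part is generated. -->
  <property>
    <name>mapred.map.tasks</name>
    <value>1</value>
  </property>
  <!-- A single reduce task, to match. -->
  <property>
    <name>mapred.reduce.tasks</name>
    <value>1</value>
  </property>
</configuration>
```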


--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

