Ken van Mulder wrote:
> First is that the fetcher slows down over time and continues to use
> more and more memory as it goes (which I think is eventually hanging
> the process).
What parser plugins do you have enabled? These are usually the
culprits. Try sending 'kill -QUIT' to the process to see what the
various threads are doing, both at the start and later, once it has
slowed down and grown.
> Second problem is trying to use the crawl. I've tried with a
> seeds/url file containing 4, 2000, and then 100k urls in it. Using:
>
> $ bin/nutch crawl seeds
>
> Which goes through its processing and completes, but doesn't visit
> any of the urls in the seeds file. What am I missing to get it to
> actually do the crawl?
Are you using NDFS? If so, the seeds directory needs to be stored in
NDFS. Use 'bin/nutch ndfs -put seeds seeds'.
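Putting the whole sequence together, it would look roughly like the following; the '-ls' check is an assumption about the NDFS shell, so drop it if your build doesn't support it:

```shell
# Copy the local seeds directory into NDFS (paths are relative
# to the NDFS working directory):
bin/nutch ndfs -put seeds seeds

# Optionally confirm the seeds landed in NDFS (assumed -ls flag):
bin/nutch ndfs -ls

# Now the crawl reads its seed list from NDFS rather than from
# the local filesystem:
bin/nutch crawl seeds
```

When the crawl completes without fetching anything, an empty or missing seed directory on the filesystem Nutch is actually reading from is the usual cause.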
Doug