Re: at the end of fetching, hung threads

2009-11-17 Thread Julien Nioche
https://issues.apache.org/jira/browse/NUTCH-719 could be relevant as well. 2009/11/16 MilleBii mille...@gmail.com: Just apply the following patch. https://issues.apache.org/jira/browse/NUTCH-721 2009/11/15 MilleBii mille...@gmail.com: Yes had it in the past and one needs to apply a certain

Re: MergeSegments - java.lang.OutOfMemoryError

2009-11-17 Thread Subhojit Roy
We once encountered a similar issue that was solved by increasing the swap space on our Linux machine. Did you try doing that? -sroy On Sun, Nov 8, 2009 at 10:01 AM, kevin chen kevinc...@bdsing.com wrote: Hi, I have been using a trunk version of Nutch since Jul 2007. It's been running fine

Re: crawling / data aggregation - is nutch the right tool?

2009-11-17 Thread no spam
This is exactly what I want to do, extract a selective portion. I'd love to see that code example and how it's wired up. Thanks, Mark

Re: crawling / data aggregation - is nutch the right tool?

2009-11-17 Thread no spam
This was a great write-up by Andrzej Bialecki about the future of Nutch, and for small crawls he summed it up here: - Nutch is too complex and too heavy for those that need to crawl up to a few thousand pages. Now that the Droids project exists it's probably not worth the effort to attempt a

total hits after dedup

2009-11-17 Thread Fadzi Ushewokunze
Hi all, What's the best way to get the total hit count returned, excluding deduped documents? At the moment the Nutch bean returns only the full total.

Nutch 0.19.2 and Ganglia 3.1.3

2009-11-17 Thread John Martyniak
Has anybody else had any trouble running nutch 0.19.2 with Ganglia 3.1.3? I was surfing through Jira and it seems that there were some issues but they have been resolved. Any thoughts would be helpful. Thank you, -John John Martyniak President/CEO Before Dawn Solutions, Inc. 9457 S.

Re: Nutch 0.19.2 and Ganglia 3.1.3

2009-11-17 Thread Dennis Kubes
Nutch is currently at 1.0. Maybe you mean Hadoop 0.19.2? If so, that would be better addressed to the Hadoop mailing list. Dennis John Martyniak wrote: Has anybody else had any trouble running nutch 0.19.2 with Ganglia 3.1.3? I was surfing through Jira and it seems that there were some

Re: Nutch 0.19.2 and Ganglia 3.1.3

2009-11-17 Thread John Martyniak
Yep, that was my mistake. Sorry about that everyone. Nutch 1.0, on Hadoop 0.19.2, Ganglia 3.1.3. -John On Nov 17, 2009, at 8:03 PM, Dennis Kubes wrote: Nutch is currently at 1.0. Maybe you mean Hadoop 0.19.2? If so that would be better addressed to the Hadoop mailing list. Dennis