date:20091117

Re: at the end of fetching, hung threads

2009-11-17 Thread Julien Nioche

https://issues.apache.org/jira/browse/NUTCH-719 could be relevant as well 2009/11/16 MilleBii mille...@gmail.com Just apply the following patch. https://issues.apache.org/jira/browse/NUTCH-721 2009/11/15 MilleBii mille...@gmail.com Yes had it in the past and one needs to apply a certain

Re: MergeSegments - java.lang.OutOfMemoryError

2009-11-17 Thread Subhojit Roy

We had encountered a similar issue once that got solved by increasing the swap space on our Linux machine. Did you try doing that? -sroy On Sun, Nov 8, 2009 at 10:01 AM, kevin chen kevinc...@bdsing.com wrote: Hi, I have been using a trunk version of nutch since Jul 2007. It's been running fine

Re: crawling / data aggregation - is nutch the right tool?

2009-11-17 Thread no spam

This is exactly what I want to do, extract a selective portion. I'd love to see that code example and how it's wired up. Thanks, Mark

Re: crawling / data aggregation - is nutch the right tool?

2009-11-17 Thread no spam

This was a great write up by Andrzej Bialecki about the future of Nutch and for small crawls he summed it up here: - Nutch is too complex and too heavy for those that need to crawl up to a few thousand pages. Now that the Droids project exists it's probably not worth the effort to attempt a

total hits after dedup

2009-11-17 Thread Fadzi Ushewokunze

Hi all, What's the best way to get the total hit count returned excluding deduped documents? At the moment nutch bean returns only the full total.

Nutch 0.19.2 and Ganglia 3.1.3

2009-11-17 Thread John Martyniak

Has anybody else had any trouble running nutch 0.19.2 with Ganglia 3.1.3? I was surfing through Jira and it seems that there were some issues but they have been resolved. Any thoughts would be helpful. Thank you, -John John Martyniak President/CEO Before Dawn Solutions, Inc. 9457 S.

Re: Nutch 0.19.2 and Ganglia 3.1.3

2009-11-17 Thread Dennis Kubes

Nutch is currently at 1.0. Maybe you mean Hadoop 0.19.2? If so that would be better addressed to the Hadoop mailing list. Dennis John Martyniak wrote: Has anybody else had any trouble running nutch 0.19.2 with Ganglia 3.1.3? I was surfing through Jira and it seems that there were some

Re: Nutch 0.19.2 and Ganglia 3.1.3

2009-11-17 Thread John Martyniak

Yep, that was my mistake. Sorry about that everyone. Nutch 1.0, on Hadoop 0.19.2, Ganglia 3.1.3. -John On Nov 17, 2009, at 8:03 PM, Dennis Kubes wrote: Nutch is currently at 1.0. Maybe you mean Hadoop 0.19.2? If so that would be better addressed to the Hadoop mailing list. Dennis

8 matches
