Crawler Behavior (2 questions)

2005-05-26 Thread Ian Reardon
I have been crawling rather large sites (larger than 10k pages) with the crawl command. It seems like it crawls all the pages twice. Is that normal? I thought it was just removing the segments, but it looks like it crawls all the pages, does some update to the DB, and then crawls them again.

Re: 2 questions

2005-05-04 Thread Leonardo Barbosa
: Tue, 3 May 2005 17:25:14 -0300 Subject: Re: 2 questions I have one answer and one question. :-) A: I created a Servlet that returns the search result as XML. And another that creates the cluster cache. So one can navigate in the cluster without re-querying. This servlet has a thread

Re: 2 questions

2005-05-03 Thread Leonardo Barbosa
. -byron -Original Message- From: Richard Anderson [EMAIL PROTECTED] To: nutch-user@incubator.apache.org Date: Mon, 02 May 2005 12:13:35 -0400 Subject: Re: 2 questions I have the same interest. Where can we find the todo list for the webapp? --Rick Vincent wrote

Re: 2 questions

2005-05-03 Thread Byron Miller
as patches, many more people will be able to use and test it :) Look forward to your contributions! -Original Message- From: Leonardo Barbosa [EMAIL PROTECTED] To: nutch-user@incubator.apache.org Date: Tue, 3 May 2005 17:25:14 -0300 Subject: Re: 2 questions I have one answer and one question

2 questions

2005-05-02 Thread Vincent
I am finally getting my feet wet and building some nice searches. I am on the tail end of a big webapp project, so I have not had the time to look into the source of the nutch tools; please excuse the first question. 1. Can searches be done from non-servlet-based web servers (i.e. Apache httpd) by posting

Re: 2 questions

2005-05-02 Thread Richard Anderson
I have the same interest. Where can we find the todo list for the webapp? --Rick Vincent wrote: I am finally getting my feet wet and building some nice searches. On the tail end of a big webapp project so I have not had the time to look into the source of the nutch tools, so excuse the first

Re: 2 questions

2005-05-02 Thread Byron Miller
the comments field to post comments or upload attachments/diffs. -byron -Original Message- From: Richard Anderson [EMAIL PROTECTED] To: nutch-user@incubator.apache.org Date: Mon, 02 May 2005 12:13:35 -0400 Subject: Re: 2 questions I have the same interest. Where can we find the todo list