I have been crawling rather large sites (larger than 10k pages) with
the crawl command. It seems like it crawls all the pages twice. Is
that normal? I thought it was just removing the segments but it looks
like it crawls all the pages, does some update to the DB and then
crawls them again.
Date: Tue, 3 May 2005 17:25:14 -0300
Subject: Re: 2 questions
I have one answer and one question. :-)
A: I created a Servlet that returns the search result as XML,
and another that creates the cluster cache, so one can navigate in the
cluster without re-querying. This servlet has a thread
.
-byron
-Original Message-
From: Richard Anderson [EMAIL PROTECTED]
To: nutch-user@incubator.apache.org
Date: Mon, 02 May 2005 12:13:35 -0400
Subject: Re: 2 questions
I have the same interest. Where can we find the todo list for the
webapp?
--Rick
Vincent wrote
as patches, many
more people will be able to use and test it :)
Look forward to your contributions!
-Original Message-
From: Leonardo Barbosa [EMAIL PROTECTED]
To: nutch-user@incubator.apache.org
Date: Tue, 3 May 2005 17:25:14 -0300
Subject: Re: 2 questions
I have one answer and one question
I am finally getting my feet wet and building some nice searches.
On the tail end of a big webapp project so I have not had the time
to look into the source of the nutch tools, so excuse the first
question.
1. Can searches be done from non-servlet-based web servers (i.e. Apache
httpd) by posting
I have the same interest. Where can we find the todo list for the webapp?
--Rick
Vincent wrote:
I am finally getting my feet wet and building some nice searches.
On the tail end of a big webapp project so I have not had the time
to look into the source of the nutch tools, so excuse the first
the comments
field to post comments or upload attachments/diffs.
-byron