Hi Jotta,

There is nothing wrong with that, as long as you don't need to change your
parsing and/or indexing filters: re-parsing or re-indexing requires the data
stored in the segments. People used to keep (and merge) the segments back
when search was provided by Nutch itself, in order to serve the cached
version of a page. That is no longer necessary.

Julien

On 1 December 2011 15:52, jotta <[email protected]> wrote:

> Hi,
>
> After crawling I'm removing segments from my crawldb. I'm doing this because
> I'm sending fetched urls into Solr and I think that there is no need to keep
> files in two places (nutch crawldb and solr index).
>
> But maybe I'm wrong and there are some cases when crawldb's segments are
> indispensable?
>
> -----
> Regards,
> Jotta
>
> PS. Sorry for my English :)
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Removing-crawldb-segments-tp3551875p3551875.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
