Hi Jotta,

Nothing wrong with that, as long as you don't need to change your parsing and/or indexing filters: without the segments you would have to refetch the pages in order to reparse or reindex them. Back when search was provided by Nutch itself, people used to keep (and merge) the segments in order to serve the cached version of a page. That is no longer necessary.

Julien

On 1 December 2011 15:52, jotta <[email protected]> wrote:
> Hi,
>
> After crawling I'm removing segments from my crawldb. I'm doing this because
> I'm sending fetched urls into Solr and I think that there is no need to keep
> files in two places (nutch crawldb and solr index).
>
> But maybe I'm wrong and there are some cases when crawldb's segments are
> indispensable?
>
> -----
> Regards,
> Jotta
>
> PS. Sorry for my English :)
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Removing-crawldb-segments-tp3551875p3551875.html
> Sent from the Nutch - User mailing list archive at Nabble.com.

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com

