I think Matthias's idea is the better way: dedup the segments. Segments older than 30 days can still contain unchanged pages that do not exist in the newer segments.
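
To make the idea concrete, here is a rough sketch in Python. It is not Nutch code and does not use the Nutch API: the function `urls_to_keep` and the in-memory mapping of segment names to URL sets are hypothetical, and I assume segment names sort chronologically. It only shows which pages in each old segment have no newer copy and would therefore have to be kept.

```python
from typing import Dict, Set


def urls_to_keep(segments: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each segment, return the URLs that no newer segment contains.

    `segments` maps a segment name to the set of URLs fetched in it.
    Assumption: names sort chronologically (for example timestamp names).
    """
    seen_in_newer: Set[str] = set()
    keep: Dict[str, Set[str]] = {}
    # Walk from newest to oldest so the newest copy of a URL always wins.
    for name in sorted(segments, reverse=True):
        keep[name] = segments[name] - seen_in_newer
        seen_in_newer |= segments[name]
    return keep
```

With an old segment holding pages a, b and c and a new segment holding only a and b, the old segment would be reduced to page c instead of being deleted completely.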

Stefan Groschupf wrote:

Segment folders older than 30 days can be deleted.
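
A minimal sketch of how such folders could be picked out, in Python. It assumes the segment folders are named by their creation timestamp in the form `yyyyMMddHHmmss`; the function `expired_segments` is hypothetical and only returns the names, it deletes nothing.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def expired_segments(names: Iterable[str], now: datetime,
                     max_age_days: int = 30) -> List[str]:
    """Return the segment names whose timestamp is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    expired = []
    for name in names:
        try:
            # Assumption: folder names look like 20050331112900.
            created = datetime.strptime(name, "%Y%m%d%H%M%S")
        except ValueError:
            continue  # not a timestamp-named folder, leave it alone
        if created < cutoff:
            expired.append(name)
    return sorted(expired)
```

The returned list can be reviewed by hand before anything is removed.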

On 31.03.2005 at 11:29, [EMAIL PROTECTED] wrote:

Dear Nutch Users!

I have a question about continuous use of Nutch:
- When I refetch pages (after 30 days), I think that if a page has been modified, the new version is put into a new segment while the old version stays in the old segment directory. Does the db then grow larger and larger with every new page version? Is this a real problem, or do I only think it is one?


Best Regards,
   Ferenc
