I think Matthias's idea is the better way: dedup the segments. Segments
older than 30 days can still contain unchanged pages that do not exist
in the new segments.
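
To illustrate the point, here is a minimal sketch in plain Python. It does not use Nutch or its dedup tool. The function name, the segment names, and the URL sets are all made up for the example. It only shows which pages would be lost if the old segments were deleted without checking them against the new ones.

```python
def urls_lost_by_deleting(old_segments, new_segments):
    """Return the URLs that appear only in the old segments.

    Both arguments map a segment name to the set of URLs stored in it.
    This is a hypothetical in-memory model, not how Nutch stores segments.
    """
    old_urls = set().union(*old_segments.values())
    new_urls = set().union(*new_segments.values())
    return old_urls - new_urls


# Hypothetical segment contents (segment name -> URLs fetched into it).
old_segments = {
    "20050225101500": {"http://example.com/a", "http://example.com/b"},
}
new_segments = {
    "20050331112900": {"http://example.com/a"},
}

lost = urls_lost_by_deleting(old_segments, new_segments)
print(sorted(lost))  # ['http://example.com/b']
```

In this example http://example.com/b exists only in the old segment, so deleting that segment would remove it from the index.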
Stefan Groschupf wrote:
Segment folders older than 30 days can be deleted.
On 31.03.2005 at 11:29, [EMAIL PROTECTED] wrote:
Dear Nutch Users!
I have a question about continuous use of Nutch:
- When I refetch pages (after 30 days), I think that if a page has been
modified, it will be put into a new segment, while the old version
stays in the old segments dir. So will the db grow larger and larger
with every new page version? Is this a real problem, or do I only
think it is a problem?
Best regards,
Ferenc