We frequently recrawl URLs (adaptive fetch interval from 3 to 30 days), so
there seems to be no harm in deleting segments older than a month.
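
For the cleanup itself, here is a minimal sketch of how segments older than
a month could be found. It assumes the segments live on a local filesystem
and that each segment directory keeps the default timestamp name
(yyyyMMddHHmmss) it got when it was generated. The function names
old_segments and delete_segments, and the 30-day default, are only
illustrative and not part of Nutch; on HDFS the deletion step would have
to be done differently.

```python
from datetime import datetime, timedelta
from pathlib import Path
import shutil


def old_segments(segments_dir, max_age_days=30, now=None):
    """Return segment directories whose timestamp name is older than max_age_days.

    Assumption: directories are named yyyyMMddHHmmss (e.g. 20210407052400).
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for entry in sorted(Path(segments_dir).iterdir()):
        if not entry.is_dir():
            continue
        try:
            created = datetime.strptime(entry.name, "%Y%m%d%H%M%S")
        except ValueError:
            continue  # not named like a segment, leave it alone
        if created < cutoff:
            stale.append(entry)
    return stale


def delete_segments(paths, dry_run=True):
    """Delete the given directories; with dry_run=True only print them."""
    for path in paths:
        print(("would delete " if dry_run else "deleting ") + str(path))
        if not dry_run:
            shutil.rmtree(path)
```

Calling delete_segments(old_segments("crawl/segments")) only lists the
candidates; passing dry_run=False actually removes them. The path
crawl/segments is a placeholder for wherever the segments are stored.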

Thank you.

On Wed, Apr 7, 2021 at 5:24 AM Markus Jelsma <[email protected]>
wrote:

> Hello Abhay,
>
> You only need to keep or merge old segments if you 'quickly' need to
> reindex the data, and are unable to start with a fresh crawl. If you
> frequently recrawl all URLs, e.g. within a month, then segments older than
> a month can safely be removed.
>
> You can also do daily and monthly merges, like we do. This makes it possible
> to revisit old data for research, in case websites change layout, or are no
> longer customers and are not being crawled anymore.
>
> Regards,
> Markus
>
> On Tue, Apr 6, 2021 at 9:54 PM Abhay Ratnaparkhi <
> [email protected]> wrote:
>
> > Hello,
> >
> > I have a large number of segments occupying disk space. Is it a good
> > strategy to delete old segments, or is it better to merge them?
> >
> > Thank you
> > Abhay
> >
>
