Hi,

Can someone please explain how the fetcher behaves with respect to
modified/unmodified content, in the current trunk version?

My requirement is basically this: I have one page (the seed URL) that
links to other URLs. The links on this page change on a daily basis.
I want Nutch to keep refetching this page, since it changes regularly,
but not refetch the outlinks on this page, since they are more or less
static.

I have set both "db.fetch.interval.default" and "db.fetch.interval.max"
to a high value of approximately 1 year and am using the
DefaultFetchSchedule class. Does this imply that even for pages which
have been modified, the next fetch would be after a year? Or do I need
to use the AdaptiveFetchSchedule?
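
For reference, the settings described above would look roughly like
this in nutch-site.xml. This is only a sketch, not a copy of my actual
file: I am assuming the intervals are given in seconds, and I use
31536000 (365 days) to stand in for "approximately 1 year". I am also
assuming that db.fetch.schedule.class is the property that selects the
schedule implementation.

```xml
<configuration>
  <!-- Assumed value: 365 days in seconds, standing in for "approx. 1 year" -->
  <property>
    <name>db.fetch.interval.default</name>
    <value>31536000</value>
  </property>
  <!-- Same assumed value for the maximum interval -->
  <property>
    <name>db.fetch.interval.max</name>
    <value>31536000</value>
  </property>
  <!-- Schedule implementation currently in use -->
  <property>
    <name>db.fetch.schedule.class</name>
    <value>org.apache.nutch.crawl.DefaultFetchSchedule</value>
  </property>
</configuration>
```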

I would be really thankful if someone could help me with my fetcher
settings.

Regards,
Chris
