Matt Kangas wrote:
#2 should be a pluggable/hookable parameter. "high-scoring" sounds like a reasonable default basis for choosing recrawl intervals, but I'm sure that nearly everyone will think of a way to improve upon that for their particular system. E.g. "high-scoring" ain't gonna cut it for my needs. (0.5 wink ;)
In NUTCH-61, Andrzej has a pluggable FetchSchedule. That looks like a good idea.
http://issues.apache.org/jira/browse/NUTCH-61

Doug
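
To make the "pluggable" idea concrete, here is a rough sketch of what a hookable recrawl-interval policy could look like. It is not the NUTCH-61 patch or its FetchSchedule API; the names PageInfo, RecrawlPolicy, ScoreBasedPolicy and NewsFirstPolicy, and all the interval values, are made up for illustration.

```java
// Hypothetical page record; not Nutch's actual page class.
final class PageInfo {
    final String url;
    final float score;

    PageInfo(String url, float score) {
        this.url = url;
        this.score = score;
    }
}

// Hypothetical plug-in point: a policy that picks the recrawl interval for a page.
interface RecrawlPolicy {
    long intervalSeconds(PageInfo page);
}

// Default policy (assumed): high-scoring pages are recrawled sooner.
final class ScoreBasedPolicy implements RecrawlPolicy {
    private final long baseSeconds; // interval for a page with score <= 1
    private final long minSeconds;  // never recrawl more often than this

    ScoreBasedPolicy(long baseSeconds, long minSeconds) {
        this.baseSeconds = baseSeconds;
        this.minSeconds = minSeconds;
    }

    @Override
    public long intervalSeconds(PageInfo page) {
        // Clamp the score so low or zero scores do not inflate the interval.
        float score = Math.max(page.score, 1.0f);
        long interval = (long) (baseSeconds / score);
        return Math.max(interval, minSeconds);
    }
}

// A site-specific alternative that ignores score entirely.
final class NewsFirstPolicy implements RecrawlPolicy {
    @Override
    public long intervalSeconds(PageInfo page) {
        // Recrawl news pages hourly, everything else every 30 days.
        return page.url.contains("/news/") ? 3600L : 30L * 24 * 3600;
    }
}
```

The point of the interface is that the crawler only ever calls intervalSeconds(), so someone for whom "high-scoring" doesn't cut it can swap in their own policy without touching the rest of the system.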
