Byron Miller wrote:
I think it sounds like a great idea. You got my vote :)
A vote is good, but not enough :-) I'm trying to solicit comments on the proposed algorithm, and some confirmation that the way of handling unchanged content in Fetcher will work the way I suspect...
I know it would make a world of difference for mozdex and other big installations (not to mention the poor sites that Nutch is pounding).
--- Andrzej Bialecki <[EMAIL PROTECTED]> wrote:
Hi,
The other day, reading the searchenginewatch forum, I
came to the conclusion that Nutch is currently rather careless about
bandwidth: it always fetches pages once their getNextFetchTime()
has arrived, regardless of whether the pages have actually changed.
--
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)
_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
