Byron Miller wrote:

I think it sounds like a great idea. You got my vote :)

A vote is good, but not enough :-) I'm trying to solicit comments on the proposed algorithm, and some confirmation that the way of handling unchanged content in Fetcher will work the way I suspect...


I know it would make a world of difference for mozdex and other big installations (not to mention the poor sites that Nutch is pounding).


--- Andrzej Bialecki <[EMAIL PROTECTED]> wrote:

Hi,

Reading the searchenginewatch forum the other day, I
came to the conclusion that Nutch is currently rather careless about
bandwidth - it always fetches pages once their getNextFetchTime()
has arrived, no matter whether the pages have really changed or not.
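
To make the behaviour described above concrete, here is a minimal sketch. It is not Nutch's actual code: the `Page` class and the `isDue` and `selectForFetch` helpers are hypothetical, and only the name getNextFetchTime() comes from the message. It shows a page being selected as soon as its next fetch time has passed, with no check of whether its content changed.

```java
import java.util.ArrayList;
import java.util.List;

public class FetchDueSketch {

    // Hypothetical stand-in for a page record; only getNextFetchTime()
    // is named in the message, everything else is assumed for illustration.
    public static final class Page {
        private final String url;
        private final long nextFetchTime; // epoch milliseconds

        public Page(String url, long nextFetchTime) {
            this.url = url;
            this.nextFetchTime = nextFetchTime;
        }

        public String getUrl() {
            return url;
        }

        public long getNextFetchTime() {
            return nextFetchTime;
        }
    }

    // A page is due once its next fetch time has arrived.
    // Whether the content has changed is never consulted.
    public static boolean isDue(Page page, long now) {
        return page.getNextFetchTime() <= now;
    }

    // Collect the URLs of every page that is due at time `now`.
    public static List<String> selectForFetch(List<Page> pages, long now) {
        List<String> due = new ArrayList<>();
        for (Page page : pages) {
            if (isDue(page, now)) {
                due.add(page.getUrl());
            }
        }
        return due;
    }
}
```

Under this rule an unchanged page costs the same bandwidth as a changed one each time it comes due, which is the waste the message points at.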


--
Best regards,
Andrzej Bialecki

-------------------------------------------------
Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
-------------------------------------------------
FreeBSD developer (http://www.freebsd.org)



_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
