Andrzej Bialecki wrote:
Olaf Thiele wrote:
1. Exceptions, as far as I know, should not be used
to exchange regular information, just exceptional states.

True. However, if you look at Fetcher.FetcherThread.run() you'll see that that's precisely the method it uses now. I was just following the trend... erhm, minimizing the patches, that is... ;-)

I agree that this is a misuse of exceptions. Would someone like to improve this design? A patch would be appreciated.
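
As a starting point for such a patch, here is a minimal sketch of reporting routine fetch outcomes as return values rather than exceptions. It is not the existing Fetcher code: the class FetchOutcome, its Status values, and fromHttpCode() are all hypothetical names made up for illustration, and the mapping from HTTP codes is only an assumption about what "regular information" might cover.

```java
import java.util.Objects;

/** Hypothetical sketch: routine fetch outcomes carried as values, not exceptions. */
final class FetchOutcome {
    /** Illustrative status codes; these names are assumptions, not Nutch's. */
    enum Status { SUCCESS, RETRY_LATER, GONE }

    final Status status;
    final String message;

    FetchOutcome(Status status, String message) {
        this.status = Objects.requireNonNull(status);
        this.message = message;
    }

    /** Stand-in for a protocol call: maps an HTTP response code to an outcome. */
    static FetchOutcome fromHttpCode(int code) {
        if (code >= 200 && code < 300) {
            return new FetchOutcome(Status.SUCCESS, "ok");
        }
        if (code == 404 || code == 410) {
            return new FetchOutcome(Status.GONE, "page removed");
        }
        // Everything else is treated as a temporary condition in this sketch.
        return new FetchOutcome(Status.RETRY_LATER, "http " + code);
    }
}
```

The caller would then switch on the status for the expected cases and keep try/catch only for genuinely exceptional failures.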


would be better. Furthermore, there should be a minimum, otherwise
a page could be fetched continuously.

Yes, that's true - 1 day would probably be a sensible default minimum...

There should also be a maximum. Segments older than the maximum may be discarded, as all of their pages should have been refetched by now.
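
To make the minimum and maximum concrete, a rough sketch follows. The 1-day minimum is the default suggested above; the 30-day maximum is purely an assumed placeholder, and RefetchPolicy, clamp(), and segmentExpired() are hypothetical names, not existing Nutch code.

```java
import java.time.Duration;

/** Hypothetical sketch: keep a page's refetch interval within fixed bounds. */
final class RefetchPolicy {
    /** Minimum suggested in this thread. */
    static final Duration MIN_INTERVAL = Duration.ofDays(1);
    /** Assumed value for illustration only; no maximum is named in the thread. */
    static final Duration MAX_INTERVAL = Duration.ofDays(30);

    /** Returns the proposed interval, limited to [MIN_INTERVAL, MAX_INTERVAL]. */
    static Duration clamp(Duration proposed) {
        if (proposed.compareTo(MIN_INTERVAL) < 0) {
            return MIN_INTERVAL;
        }
        if (proposed.compareTo(MAX_INTERVAL) > 0) {
            return MAX_INTERVAL;
        }
        return proposed;
    }

    /** A segment older than the maximum interval holds only pages due for refetch. */
    static boolean segmentExpired(Duration segmentAge) {
        return segmentAge.compareTo(MAX_INTERVAL) > 0;
    }
}
```

With these assumed bounds, a proposed interval of one hour would be raised to one day, and a segment 45 days old would be a candidate for discarding.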


This approach requires that you store the content checksum in the index - fairly small cost for the gain it gives.

The content checksum is already stored in the Page and in the index.
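
For illustration, comparing a stored checksum against freshly fetched content could look roughly like the sketch below. It uses the JDK's MessageDigest with MD5 as an assumption about the checksum algorithm; ContentChecksum, md5(), and unchanged() are hypothetical names and do not refer to the actual Page or index code.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: detect unchanged content by comparing checksums. */
final class ContentChecksum {
    /** Computes an MD5 digest of the raw content (algorithm choice is an assumption). */
    static byte[] md5(byte[] content) {
        try {
            return MessageDigest.getInstance("MD5").digest(content);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** True if the fetched content hashes to the checksum stored earlier. */
    static boolean unchanged(byte[] storedChecksum, byte[] fetchedContent) {
        return MessageDigest.isEqual(storedChecksum, md5(fetchedContent));
    }
}
```

A page whose content is unchanged could then have its refetch interval lengthened, and a changed page could have it shortened.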

Doug


_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
