[EMAIL PROTECTED] wrote:
Andrzej,
Does the blockAddr(url) also apply when Nutch is used as an Intranet crawler?
Is the number of concurrent threads different in the Intranet setting?
Thanks
Isabelle
Not automatically. I posted a warning about this a while ago. Currently
it is up to you to remember to set threads.per.host greater than or
equal to the number of fetcher threads; otherwise you will get frequent
"http.max.retry limit exceeded" errors.
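As a rough sketch only, the override in the site configuration file
(nutch-site.xml) might look like the fragment below. The full property
names fetcher.threads.fetch and fetcher.threads.per.host and the value 10
are assumptions for illustration; check nutch-default.xml in your version
for the exact names and defaults.

```xml
<!-- Place inside the root element of nutch-site.xml. -->
<!-- Assumed property name: total number of fetcher threads. -->
<property>
  <name>fetcher.threads.fetch</name>
  <value>10</value>
</property>

<!-- Assumed property name: threads allowed to hit one host at once. -->
<!-- Set it to at least the fetcher thread count for an intranet crawl. -->
<property>
  <name>fetcher.threads.per.host</name>
  <value>10</value>
</property>
```

With both values equal, no fetcher thread should have to wait on the
per-host limit when every URL points at the same intranet server.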
A patch that adds an option for switching the plugins into "intranet"
mode would be welcome.
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com