Hi,
We're finding that one or two domains are generating excessive
retries, and that's slowing our fetch process down by hours.
Any general guidance on how to fix the problem? We've raised our max
retries setting from 1 to 3, I believe, and we're still seeing the problem.
Here are some example URLs:
http://www.ama.ab.ca/cps/rde/xchg/SID-53ED365B-D426F221/ama/web/travel_Group-Travel.htm
http://www.ama.ab.ca/cps/rde/xchg/SID-53ED365B-CEF90BB0/ama/web/everything_auto_driver_ed.htm
http://www.ama.ab.ca/cps/rde/xchg/SID-53ED365B-DEA4DDC2/ama/web/everything_auto_Vehicle-Safety.htm
http://www.plentyoffish.com/personals/3147onlinedating.htm
http://www.plentyoffish.com/personals/1032onlinedating27.htm
http://www.ama.ab.ca/cps/rde/xchg/SID-53ED365B-CBE7B5E0/ama/web/insurance_Insurance-News.htm
Also, it seems like we're trying to fetch hundreds of thousands of pages
from some of these domains. Shouldn't it be limiting the number of pages
from a specific domain? (I guess that's two questions.)
Offhand, it looks like the ama.ab.ca URLs have a session variable in them
(the SID-... path segment). My first guess is that it may be part of the
problem. These two domains make up something like 80-90% of our retries.
Clearly we need to stop the excessive retries, and at the same time be a bit
more polite to those domains.
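To show what I mean by the session variable, here's a rough Python sketch.
It is not taken from our crawler: the regex and the function name
strip_session_segment are my own assumptions based on the ama.ab.ca URLs
above, and I haven't checked that the stripped URLs still resolve on that
site. It only shows how URLs that differ by session ID collapse to one key.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed pattern: a path segment such as "SID-53ED365B-D426F221",
# guessed from the example URLs; the real format may differ.
SESSION_SEGMENT = re.compile(r"/SID-[0-9A-F]+-[0-9A-F]+(?=/)", re.IGNORECASE)


def strip_session_segment(url: str) -> str:
    """Return the URL with any SID-style path segment removed."""
    parts = urlsplit(url)
    cleaned_path = SESSION_SEGMENT.sub("", parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, cleaned_path, parts.query, parts.fragment)
    )


urls = [
    "http://www.ama.ab.ca/cps/rde/xchg/SID-53ED365B-D426F221/ama/web/travel_Group-Travel.htm",
    "http://www.ama.ab.ca/cps/rde/xchg/SID-53ED365B-CEF90BB0/ama/web/travel_Group-Travel.htm",
    "http://www.plentyoffish.com/personals/3147onlinedating.htm",
]

# URLs that differ only in the session segment map to the same key.
unique_keys = {strip_session_segment(u) for u in urls}
```

If the session IDs are what's inflating the page counts, comparing the
number of raw URLs against the number of keys per domain should show it.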
Thanks,
Glenn