Stefan Groschupf wrote:
- "job.setPartitionerClass(PartitionUrlByHost.class);" in the generate method
Yes, this line is the one you need to change. The other stuff can be as it is for now.
I don't recommend this change. It makes your crawler impolite, since multiple fetch tasks may then hit the same host at once. Perhaps you simply need to increase http.max.delays? What is this set to?
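
For reference, raising that property would look roughly like the fragment below. This is only a sketch: it assumes the override goes in conf/nutch-site.xml, and the value 100 is an arbitrary illustration, not a recommendation. Pick a value that suits how many URLs per host you are fetching.

```xml
<!-- Assumed location: conf/nutch-site.xml, inside the top-level configuration element -->
<property>
  <name>http.max.delays</name>
  <!-- Illustrative value only: how many times a fetcher thread will wait
       on a busy host before giving up on the page for this run -->
  <value>100</value>
</property>
```

This leaves the host-based partitioning in place, so each host is still fetched by a single task.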
Doug