Hello,

I'm running nutch-0.9 on a 3-node Hadoop cluster. The generator seems to be ignoring the "generate.max.per.host" parameter that I set in nutch-site.xml (I set it to 2000). When I run the generator in local mode on a single node, it does limit the number of URLs per host to 2000. However, in distributed mode, when I check the generated fetchlist, there are hosts with far more than 2000 URLs.
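For reference, the override in nutch-site.xml looks roughly like the following. This is a sketch that assumes the usual Hadoop-style property layout; only the property name and the value 2000 come from my setup, and the description text is just illustrative wording.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Cap on the number of URLs per host in a single fetchlist. -->
  <property>
    <name>generate.max.per.host</name>
    <value>2000</value>
    <!-- Illustrative description only, not copied from any shipped file. -->
    <description>Maximum number of URLs per host in one fetchlist.</description>
  </property>
</configuration>
```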
Does anyone have pointers on what might be causing this? I looked in the log files and found nothing relevant. Any help is greatly appreciated.

Thanks,
Sandeep
