Hi,

In our Nutch configuration, we have 5 sites registered in regex-urlfilter.txt. In the seed file, we loaded the URLs extracted from each site's sitemap (grouped by domain). In nutch-site.xml we set the following properties:

<property>
  <name>generate.max.count</name>
  <value>72</value>
  <description>The maximum number of urls in a single fetchlist. -1 if unlimited. The urls are counted according to the value of the parameter generator.count.mode.
  </description>
</property>
<property>
  <name>generate.count.mode</name>
  <value>domain</value>
  <description>Determines how the URLs are counted for generator.max.count. Default value is 'host' but can be 'domain'. Note that we do not count per IP in the new version of the Generator.
  </description>
</property>

Additionally, in the crawl script we replaced the line:

    sizeFetchlist=`expr $numSlaves \* 50000`

with the following line:

    sizeFetchlist=`expr $numSlaves \* 360`

With this configuration, we get the following messages in the log when the crawl process starts:

    Host or domain site1 has more than 72 URLs for all 1 segments. Additional URLs won't be included in the fetchlist.
    Host or domain site2 has more than 72 URLs for all 1 segments. Additional URLs won't be included in the fetchlist.
    Host or domain site3 has more than 72 URLs for all 1 segments. Additional URLs won't be included in the fetchlist.
    Host or domain site4 has more than 72 URLs for all 1 segments. Additional URLs won't be included in the fetchlist.

So we see that only four of the five sites are being fetched:

    -activeThreads=50, spinWaiting=48, fetchQueues.totalSize=277, fetchQueues.getQueueCount=4

These four sites are the ones with the most URLs in the seed file. After some time, the fetchQueues.getQueueCount value drops to 1, and only the site with the most URLs in the seed file is still being fetched:

    -activeThreads=50, spinWaiting=49, fetchQueues.totalSize=43, fetchQueues.getQueueCount=1

What is the correct configuration to fetch URLs simultaneously from every site configured in the regex-urlfilter.txt file?

Thanks,
Paul Escobar Mossos

