Hello people,

Can someone explain to me how the generator generates the fetch lists?

In particular:

I don't understand why it generates fetch lists with very different numbers of URLs.

Sometimes it generates > 25k URLs and sometimes only > 1k.

In every case there were more than 25k unfetched URLs in the crawldb, so I was expecting it to always generate ~25k URLs. But as I said before, sometimes it generates only ~1k.

In my nutch-site.xml I have defined the following values:

<property>
  <name>generate.max.count</name>
  <value>-1</value>
  <description>The maximum number of urls in a single
  fetchlist.  -1 if unlimited. The urls are counted according
  to the value of the parameter generator.count.mode.
  </description>
</property>

Any ideas?

Thanks
