Jack,

That is the maximum number of outlinks per HTML page. All your example pages have fewer than 100 outlinks, right?

Stefan

On 07.09.2005 at 18:43, Jack Tang wrote:

Hi All,

Here is the "db.max.outlinks.per.page" property and its description in nutch-default.xml:

<property>
  <name>db.max.outlinks.per.page</name>
  <value>100</value>
  <description>The maximum number of outlinks that we'll process for a page.</description>
</property>

I don't think the description is right. Say my crawler feeds are:

http://www.a.com/index.php (90 outlinks)
http://www.b.com/index.jsp (80 outlinks)
http://www.c.com/index.html (50 outlinks)

and the number of crawler threads is 30. Do you think the remaining URLs ((80 - 10) outlinks + 50 outlinks) will be fetched?

I think the description should be "The maximum number of outlinks in one fetching phase."

Regards
/Jack
--
Keep Discovering ... ...
http://www.jroller.com/page/jmars

---------------------------------------------------------------
company: http://www.media-style.com
forum: http://www.text-mining.org
blog: http://www.find23.net
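
To illustrate Stefan's point that the limit applies to each page separately, here is a minimal Java sketch. It is not Nutch's actual code: the class OutlinkLimitSketch, the methods limitOutlinks and fakeOutlinks, and the generated link URLs are all hypothetical, and the sketch assumes only that outlinks beyond the limit are dropped page by page. The three feed URLs and their outlink counts are the ones from Jack's example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not taken from the Nutch source.
final class OutlinkLimitSketch {

    // Mirrors the default value of db.max.outlinks.per.page quoted in the thread.
    static final int MAX_OUTLINKS_PER_PAGE = 100;

    // Keeps at most maxPerPage outlinks of ONE page, in their original order.
    static List<String> limitOutlinks(List<String> outlinks, int maxPerPage) {
        int kept = Math.min(outlinks.size(), maxPerPage);
        return new ArrayList<>(outlinks.subList(0, kept));
    }

    // Builds n placeholder outlinks for a page (made-up URLs for the demo).
    static List<String> fakeOutlinks(String page, int n) {
        List<String> links = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            links.add(page + "/link" + i);
        }
        return links;
    }

    public static void main(String[] args) {
        String[] pages = {
            "http://www.a.com/index.php",
            "http://www.b.com/index.jsp",
            "http://www.c.com/index.html"
        };
        int[] counts = {90, 80, 50};

        int total = 0;
        for (int i = 0; i < pages.length; i++) {
            // The limit is checked per page, so no quota carries over
            // from one page to the next.
            List<String> kept =
                limitOutlinks(fakeOutlinks(pages[i], counts[i]), MAX_OUTLINKS_PER_PAGE);
            total += kept.size();
            System.out.println(pages[i] + " keeps " + kept.size() + " outlinks");
        }
        System.out.println("total kept: " + total);
    }
}
```

Under this per-page reading, each of the three pages stays below the limit of 100, so all 90 + 80 + 50 = 220 outlinks are kept for processing. A page with, say, 150 outlinks would be cut to 100, independently of the other pages and of the number of fetcher threads.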
