Here is the output from that:

TOTAL urls: 297
retry 0: 297
min score: 0.0
avg score: 0.023377104
max score: 2.009
status 2 (db_fetched): 295
status 5 (db_redir_perm):
reinhard schwab wrote:
>
> try
>
> bin/nutch readdb crawl/crawldb -stats
>
> are there any unfetched pages?
>
> nutchcase wrote:
>> My crawl always stops at depth=3. It gets documents but does not
>> continue any further.
>> Here is my nutch-site.xml
>>
>> <?xml version="1.0"?>
>> <configuration>
>>   <property>
>>     <name>http.agent.name</name>
>>     <value>nutch-solr-integration</value>
>>   </property>
>>   <property>
>>     <name>generate.max.per.host</name>
>>     <value>1000</value>
>>   </property>
>>   <property>
>>     <name>plugin.includes</name>
>>     <value>protocol-http|urlfilter-(crawl|regex)|parse-html|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
>>   </property>
>>   <property>
>>     <name>db.max.outlinks.per.page</name>
>>     <value>1000</value>
>>   </property>
>> </configuration>

--
View this message in context: http://www.nabble.com/crawl-always-stops-at-depth%3D3-tp25981603p25998652.html
Sent from the Nutch - User mailing list archive at Nabble.com.
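To answer reinhard's question mechanically, one can scan the `readdb -stats` output for a `status 1 (db_unfetched)` line; if it is absent, every known URL has been fetched or redirected, so the generator has nothing new to emit at the next depth. A minimal sketch (the helper `unfetched_count` is hypothetical, not part of Nutch; the sample lines are taken from the stats output quoted above):

```python
import re

def unfetched_count(stats_text: str) -> int:
    """Return the db_unfetched count from `bin/nutch readdb ... -stats`
    output, or 0 if no such status line is present."""
    m = re.search(r"status 1 \(db_unfetched\):\s*(\d+)", stats_text)
    return int(m.group(1)) if m else 0

# Sample lines from the stats output in this thread: no db_unfetched
# status appears, so nothing remains for the next generate/fetch cycle.
sample = """TOTAL urls: 297
retry 0: 297
status 2 (db_fetched): 295
"""
print(unfetched_count(sample))  # -> 0
```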