Try: bin/nutch readdb crawl/crawldb -stats
Are there any unfetched pages?

nutchcase schrieb:
> My crawl always stops at depth=3. It gets documents but does not continue any
> further.
> Here is my nutch-site.xml
>
> <?xml version="1.0"?>
> <configuration>
>   <property>
>     <name>http.agent.name</name>
>     <value>nutch-solr-integration</value>
>   </property>
>   <property>
>     <name>generate.max.per.host</name>
>     <value>1000</value>
>   </property>
>   <property>
>     <name>plugin.includes</name>
>     <value>protocol-http|urlfilter-(crawl|regex)|parse-html|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
>   </property>
>   <property>
>     <name>db.max.outlinks.per.page</name>
>     <value>1000</value>
>   </property>
> </configuration>
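For reference, a readdb stats run looks roughly like the sketch below (paths and counts are from my setup, adjust crawl/crawldb to wherever your crawldb lives). If db_unfetched is 0 after the third round, the crawl has simply run out of links that pass your urlfilter rules, which would explain why it stops:

bin/nutch readdb crawl/crawldb -stats
# Typical output includes per-status counts along the lines of:
#   TOTAL urls: ...
#   status 1 (db_unfetched): ...
#   status 2 (db_fetched):   ...
#   status 3 (db_gone):      ...

If everything is fetched, double-check that your urlfilter-regex rules aren't excluding the outlinks you expect to be followed in later rounds.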