Try

bin/nutch readdb crawl/crawldb -stats

Are there any unfetched pages left in the CrawlDb?
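A minimal sketch of that check (assuming a standard Nutch 1.x layout with the CrawlDb at crawl/crawldb; the dump directory name is just an example):

```shell
# Print CrawlDb statistics; the status counts show how many pages are
# in each state (db_unfetched, db_fetched, db_gone, ...).
bin/nutch readdb crawl/crawldb -stats

# If db_unfetched is already 0 after depth=3, the generator simply has
# nothing left to schedule. To inspect individual records, dump the
# CrawlDb to a plain-text directory and grep it:
bin/nutch readdb crawl/crawldb -dump crawldb_dump
```

If there are unfetched pages but they never get generated, the usual suspects are the URL filter rules (regex-urlfilter.txt / crawl-urlfilter.txt) silently excluding the new outlinks, or limits like generate.max.per.host kicking in.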

nutchcase schrieb:
> My crawl always stops at depth=3. It gets documents but does not continue any
> further.
> Here is my nutch-site.xml
> <?xml version="1.0"?>
> <configuration>
> <property>
> <name>http.agent.name</name>
> <value>nutch-solr-integration</value>
> </property>
> <property>
> <name>generate.max.per.host</name>
> <value>1000</value>
> </property>
> <property>
> <name>plugin.includes</name>
> <value>protocol-http|urlfilter-(crawl|regex)|parse-html|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
> </property>
> <property>
> <name>db.max.outlinks.per.page</name>
> <value>1000</value>
> </property>
> </configuration>
