Hello,

We use apache-nutch-1.10 in combination with solr-5.3.0.
Our problem is that if we get a 404 status while recrawling, the status
of the document switches to "db_gone" and the document gets deleted in Solr.
Shouldn't it be possible with "db.fetch.retry.max" to set the document
temporarily to "db_unfetched" and only after a few retries to "db_gone"?


Thanks,
Axel

-- 
M. Sc. Axel Schöner
Hochschule Kaiserslautern in Zweibrücken
FB Informatik / MST
Amerikastraße 1
D-66482 Zweibrücken
phone: 0631-3724 5544
email: [email protected]
http://hs-kl.de/~axel.schoener/
-------------------------------------------------------------