Hi, is there a way to make Solr indexing more tolerant when the crawling is done by Nutch? Nutch does its job and after some time (let's say 1h) it's done. The results are then pushed to Solr, but if one SolrDocument (out of >1000) has an error - e.g. a date field with an invalid value - the whole operation is aborted and lost, nothing is committed (that is what I see now with Nutch 1.2 - maybe I missed a configuration option ...).
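(A workaround, independent of any Nutch option, is to push documents one at a time and catch failures per document instead of per batch. A minimal sketch of that pattern in Java - `indexDocument` here is a hypothetical stand-in for a real SolrJ `server.add(doc)` call, which would need a running Solr instance; the date validation stands in for Solr's field-type check.)

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class TolerantIndexer {
    // Hypothetical stand-in for a real SolrJ server.add(doc) call;
    // throws if the document's "date" field is not a valid ISO date.
    static void indexDocument(Map<String, String> doc) {
        LocalDate.parse(doc.get("date"));
    }

    // Index each document individually; skip and record failures
    // instead of aborting the whole batch on the first bad document.
    static List<String> indexAll(List<Map<String, String>> docs) {
        List<String> failed = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            try {
                indexDocument(doc);
            } catch (DateTimeParseException e) {
                failed.add(doc.get("id"));
                System.err.println("Skipping doc " + doc.get("id")
                        + ": " + e.getMessage());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = List.of(
                Map.of("id", "1", "date", "2010-11-01"),
                Map.of("id", "2", "date", "not-a-date"), // invalid date field
                Map.of("id", "3", "date", "2010-11-02"));
        List<String> failed = indexAll(docs);
        System.out.println("Indexed " + (docs.size() - failed.size())
                + " docs, skipped " + failed.size() + ": " + failed);
    }
}
```

The two valid documents are indexed and the bad one is reported, rather than the whole run being lost.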
Is it possible to configure Nutch to skip only those documents and continue pushing the rest to Solr, issuing a warning or error that names the documents that failed? thx Torsten -- Please do not send me Word or PowerPoint attachments. See http://www.gnu.org/philosophy/no-word-attachments.de.html "Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect." -- Linus Torvalds

