Continue indexing doc after error
I need to index documents from a CSV file that will have 1000s of rows and 100+ columns. To help the user loading the file, I must return useful errors when indexing fails (schema violations). I'm using SolrJ to read the file line by line, build each document, and index/commit. This approach lets me index the docs that have no schema validation errors and skip over the docs that do.

However, I really want to report errors field by field. As the user makes corrections to the file, this would prevent the same doc from failing multiple times when several of its fields are bad. I have not seen a configuration setting that tells Solr to keep processing the doc after it encounters the first error and report back all the field errors (multiple exceptions). Does anyone know if that's possible?

Using Solr 4.8.1

--
View this message in context: http://lucene.472066.n3.nabble.com/Continue-indexing-doc-after-error-tp4145081.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Continue indexing doc after error
I think what you want is what's described in https://issues.apache.org/jira/browse/SOLR-445. This has not been committed because it still doesn't work with SolrCloud. Hoss gave me the hint to look at DistributingUpdateProcessorFactory to solve the problem described in the last comments, but I haven't had time to get back to this yet.

On Tue, Jul 1, 2014 at 1:37 PM, tedsolr tsm...@sciquest.com wrote:
I have not seen a configuration setting that tells solr to keep indexing the doc after it encounters the first error, reporting back all the field errors (multiple exceptions). Does anyone know if that's possible?
Re: Continue indexing doc after error
Thank you. That's a useful link. It may not be quite what I'm looking for, though, as it appears to deal with bulk loads of docs and returns one error for each bad doc. My question is more about getting all the errors for a single doc. I'm probably taking a performance hit by adding docs one at a time. I haven't tested very large files yet (1M+ rows).
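
For the single-doc case described above, one possible workaround is to check every column of a row on the client before the document is sent to Solr, and collect all failures instead of stopping at the first. The following is only a minimal sketch of that idea, not a Solr or SolrJ feature: the class RowValidator, its methods, the field names and the error messages are all hypothetical, and the checks only approximate what the schema would enforce (an int field, and a date field in Solr's ISO-8601 UTC form).

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical client-side pre-check; it mirrors the schema by hand and is not part of Solr. */
class RowValidator {

    /** A rule pairs a check with the message reported when the check fails. */
    private static final class Rule {
        final Predicate<String> check;
        final String message;

        Rule(Predicate<String> check, String message) {
            this.check = check;
            this.message = message;
        }
    }

    // LinkedHashMap keeps errors in the order the rules were registered.
    private final Map<String, Rule> rules = new LinkedHashMap<>();

    RowValidator rule(String field, Predicate<String> check, String message) {
        rules.put(field, new Rule(check, message));
        return this;
    }

    /** Returns one message per failing field, so a row with three bad fields yields three errors. */
    List<String> validate(Map<String, String> row) {
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, Rule> entry : rules.entrySet()) {
            String value = row.get(entry.getKey());
            if (value == null || value.isEmpty()) {
                continue; // assumption: blank cells are treated as "field not supplied"
            }
            if (!entry.getValue().check.test(value)) {
                errors.add(entry.getKey() + ": " + entry.getValue().message
                        + " (got '" + value + "')");
            }
        }
        return errors;
    }

    static boolean isInt(String s) {
        try {
            Integer.parseInt(s.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Accepts values such as 2014-07-01T13:37:00Z. */
    static boolean isUtcDate(String s) {
        try {
            Instant.parse(s.trim());
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        RowValidator validator = new RowValidator()
                .rule("quantity", RowValidator::isInt, "expected an integer")
                .rule("created", RowValidator::isUtcDate, "expected an ISO-8601 UTC date");

        Map<String, String> row = new LinkedHashMap<>();
        row.put("quantity", "abc");
        row.put("created", "07/01/2014");

        // Both bad fields are reported in one pass over the row.
        for (String error : validator.validate(row)) {
            System.out.println(error);
        }
    }
}
```

Rows that come back with an empty error list could then be turned into documents and added in batches rather than one at a time, which may also reduce the performance cost mentioned above. The trade-off is that the rules duplicate the schema and have to be kept in sync with it by hand.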