Hi Otis,

I was just looking at how to implement that, but was hoping for a cleaner method. It seems I would have to parse the error as text to find the field that caused it, remove or mangle that field, and then try re-adding the document, which seems less than ideal.
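To make that concrete, a rough sketch of the catch, strip, and retry workaround follows. It is not code from the actual indexer: `BadFieldRetry`, `DocSink`, and `addDroppingBadFields` are hypothetical names, a plain `Map` stands in for the Solr document, and the regex assumes the message keeps the "Error adding field '<name>'=" shape seen in the log quoted below. A real SolrJ `add` call also declares checked exceptions, which this sketch leaves out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BadFieldRetry {
    // Assumed message shape, taken from the log in this thread:
    //   ERROR: [doc=docidstr] Error adding field 'f_floatfield'=''
    private static final Pattern BAD_FIELD =
            Pattern.compile("Error adding field '([^']+)'=");

    /** Hypothetical stand-in for whatever actually sends the document. */
    interface DocSink {
        void add(Map<String, Object> doc);
    }

    /** Pulls the offending field name out of the error text, if there is one. */
    static Optional<String> extractBadField(String message) {
        if (message == null) {
            return Optional.empty();
        }
        Matcher m = BAD_FIELD.matcher(message);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /**
     * Adds a copy of the document; when the error names a field that is
     * present, drops that field and retries. Rethrows once maxRetries
     * removals have been made or when the error names no known field.
     * Returns the copy that was finally accepted.
     */
    static Map<String, Object> addDroppingBadFields(
            Map<String, Object> doc, DocSink sink, int maxRetries) {
        Map<String, Object> copy = new HashMap<>(doc);
        for (int attempt = 0; ; attempt++) {
            try {
                sink.add(copy);
                return copy;
            } catch (RuntimeException e) {
                Optional<String> bad = extractBadField(e.getMessage());
                if (attempt >= maxRetries
                        || !bad.isPresent()
                        || !copy.containsKey(bad.get())) {
                    throw e; // nothing left that is safe to strip
                }
                copy.remove(bad.get());
            }
        }
    }
}
```

The retry cap and the rethrow on an unrecognized field keep this from looping forever, but matching on message text is brittle and would break if the wording of the error ever changed.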
I would think there would be a flag or an easy way to override the add method that would just drop (or set to default value) any field that didn't meet expectations.

Thanks for the suggestion,
Aaron

On Mon, Sep 24, 2012 at 9:24 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:
> Hi Aaron,
>
> You could catch the error on the client, fix/clean/remove, and retry, no?
>
> Otis
> --
> Search Analytics - http://sematext.com/search-analytics/index.html
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Mon, Sep 24, 2012 at 9:21 PM, Aaron Daubman <daub...@gmail.com> wrote:
>> Greetings,
>>
>> Is there a way to configure more graceful handling of field formatting
>> exceptions when indexing documents?
>>
>> Currently, there is a field being generated in some documents that I
>> am indexing that is supposed to be a float but some times slips
>> through as an empty string. (I know, fix the docs, but sometimes bad
>> values slip through, and it would be nice to handle them in a more
>> forgiving manner).
>>
>> Here's an example of the exception - when this happens, the entire doc
>> is thrown out due to the one malformed field:
>> ---snip---
>> ERROR org.apache.solr.core.SolrCore -
>> org.apache.solr.common.SolrException: ERROR: [doc=docidstr] Error
>> adding field 'f_floatfield'=''
>> ...
>> Caused by: java.lang.NumberFormatException: empty String
>>
>> 00:56:46,288 [SI] WARN com.company.IndexerThread - BAD DOC:
>> a82a2f6a6a42ad3c98a05ddb3f2c382c
>> 01:02:12,713 [SI] ERROR org.apache.solr.core.SolrCore -
>> org.apache.solr.common.SolrException: ERROR:
>> [doc=6ff90020f9ec0f6dd623e9879c3e024d] Error adding field
>> 'f_afloatfield'=''
>>     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:333)
>>     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
>>     at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:157)
>>     at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
>>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
>>     at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:142)
>>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>>     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:121)
>>     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:106)
>>     at com.company.IndexerThread.run(IndexerThread.java:55)
>>     at java.lang.Thread.run(Thread.java:722)
>> Caused by: java.lang.NumberFormatException: empty String
>>     at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1011)
>>     at java.lang.Float.parseFloat(Float.java:452)
>>     at org.apache.solr.schema.TrieField.createField(TrieField.java:410)
>>     at org.apache.solr.schema.SchemaField.createField(SchemaField.java:103)
>>     at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:203)
>>     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:286)
>>     ... 12 more
>>
>> 01:02:12,713 [SI] WARN com.company.IndexerThread - BAD DOC:
>> 6ff90020f9ec0f6dd623e9879c3e024d
>> ---snip---
>>
>> In my thinking (and for this situation), it would be much better to
>> just ignore the malformed field and keep the doc - is there any way to
>> configure this or enable this behavior instead?
>>
>> Thanks,
>> Aaron