Hi Aaron,

You could catch the error on the client, fix or remove the bad field, and resend the document, no?
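A rough sketch of the "clean" step, in case it helps. It assumes the document is held as a plain Map before it is turned into a SolrInputDocument, and that you know which fields are floats. FloatFieldSanitizer, sanitize and isFloat are made-up names for illustration, not Solr API:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not part of Solr): drops float fields whose values
// would fail Float.parseFloat, so the rest of the document can be indexed.
final class FloatFieldSanitizer {

    private FloatFieldSanitizer() {}

    /** Returns a copy of doc without the float fields that cannot be parsed. */
    static Map<String, Object> sanitize(Map<String, Object> doc, Set<String> floatFields) {
        Map<String, Object> clean = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            if (floatFields.contains(e.getKey()) && !isFloat(e.getValue())) {
                continue; // skip only the malformed field, keep everything else
            }
            clean.put(e.getKey(), e.getValue());
        }
        return clean;
    }

    /** True if the value is already numeric or is a string that parses as a float. */
    static boolean isFloat(Object value) {
        if (value instanceof Number) {
            return true;
        }
        if (value == null) {
            return false;
        }
        try {
            Float.parseFloat(value.toString());
            return true;
        } catch (NumberFormatException ex) {
            // The empty string ends up here, as in the trace quoted below.
            return false;
        }
    }
}
```

You could run this either up front on every document or only in the catch block around SolrServer.add, then copy the cleaned map into a SolrInputDocument with addField and retry the add.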
Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html

On Mon, Sep 24, 2012 at 9:21 PM, Aaron Daubman <daub...@gmail.com> wrote:
> Greetings,
>
> Is there a way to configure more graceful handling of field formatting
> exceptions when indexing documents?
>
> Currently, there is a field being generated in some documents that I
> am indexing that is supposed to be a float but some times slips
> through as an empty string. (I know, fix the docs, but sometimes bad
> values slip through, and it would be nice to handle them in a more
> forgiving manner).
>
> Here's an example of the exception - when this happens, the entire doc
> is thrown out due to the one malformed field:
> ---snip---
> ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: ERROR: [doc=docidstr] Error adding field 'f_floatfield'=''
> ...
> Caused by: java.lang.NumberFormatException: empty String
>
> 00:56:46,288 [SI] WARN com.company.IndexerThread - BAD DOC: a82a2f6a6a42ad3c98a05ddb3f2c382c
> 01:02:12,713 [SI] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: ERROR: [doc=6ff90020f9ec0f6dd623e9879c3e024d] Error adding field 'f_afloatfield'=''
>     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:333)
>     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
>     at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:157)
>     at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79)
>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
>     at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:142)
>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
>     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:121)
>     at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:106)
>     at com.company.IndexerThread.run(IndexerThread.java:55)
>     at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.NumberFormatException: empty String
>     at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1011)
>     at java.lang.Float.parseFloat(Float.java:452)
>     at org.apache.solr.schema.TrieField.createField(TrieField.java:410)
>     at org.apache.solr.schema.SchemaField.createField(SchemaField.java:103)
>     at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:203)
>     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:286)
>     ... 12 more
>
> 01:02:12,713 [SI] WARN com.company.IndexerThread - BAD DOC: 6ff90020f9ec0f6dd623e9879c3e024d
> ---snip---
>
> In my thinking (and for this situation), it would be much better to
> just ignore the malformed field and keep the doc - is there any way to
> configure this or enable this behavior instead?
>
> Thanks,
> Aaron