https://issues.apache.org/jira/browse/NUTCH-1016
The patch applies to a Nutch 1.3 checkout.
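
For anyone who cannot apply the patch right away, the general idea of filtering such characters out before the document reaches Solr can be sketched as below. This is only an illustration, not the code from the NUTCH-1016 patch: the class name `NonCharFilter` and the method name `stripNonCharCodepoints` are made up for this example. It assumes that dropping Unicode noncharacters (such as U+FFFF) and control characters other than tab, newline and carriage return is acceptable for the indexed text.

```java
public final class NonCharFilter {

    private NonCharFilter() {}

    /**
     * Hypothetical helper: returns the input with Unicode noncharacters
     * and most C0 control characters removed.
     */
    public static String stripNonCharCodepoints(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            // Walk by code point so supplementary characters stay intact.
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);

            // Noncharacters: U+FFFE and U+FFFF in every plane, plus U+FDD0..U+FDEF.
            boolean nonChar = (cp & 0xFFFE) == 0xFFFE
                    || (cp >= 0xFDD0 && cp <= 0xFDEF);

            // Control characters below U+0020, except tab, newline, carriage return.
            boolean badControl = cp < 0x20 && cp != '\t' && cp != '\n' && cp != '\r';

            if (!nonChar && !badControl) {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+FFFF and U+0000 are dropped; the accented character is kept.
        String dirty = "caf\u00e9\uffff ok\u0000";
        System.out.println(stripNonCharCodepoints(dirty));
    }
}
```

A filter like this would be applied to each field value just before it is added to the Solr document. Where exactly that happens depends on the indexing code in use.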

> Hello,
> 
> I have been trying to index some segments using solrindex with nutch 1.3
> and solr 3.1.
> 
> Most of the time the indexing goes well, but sometimes I get an error:
> 
> SEVERE: java.lang.RuntimeException: [was class
> java.io.CharConversionException] Invalid UTF-8 character 0xffff at char
> #172317, byte #175887)
> 
> There is an interesting thread on the solr lists, but it doesn't really
> address the root issue:
> http://lucene.472066.n3.nabble.com/Solr-3-1-indexing-error-Invalid-UTF-8-character-0xffff-td3113191.html
> 
> How can I fix this? Is this something that could be filtered out by the
> solrindex class, or alternatively be filtered out with Perl or sed? It seems
> like this is a problem with Solr. Has anyone else experienced this? Is
> this fixed in Solr 3.2?
> 
> Thanks in advance,
> 
> Jason
