Hello,

I have been trying to index some segments using solrindex with Nutch 1.3 and Solr 3.1. Most of the time the indexing goes well, but sometimes I get this error:

SEVERE: java.lang.RuntimeException: [was class java.io.CharConversionException] Invalid UTF-8 character 0xffff at char #172317, byte #175887)

There is an interesting thread on the Solr list, but it doesn't really address the root issue:

http://lucene.472066.n3.nabble.com/Solr-3-1-indexing-error-Invalid-UTF-8-character-0xffff-td3113191.html

How can I fix this? Could the offending character be filtered out by the solrindex class, or alternatively be stripped beforehand with perl or sed?

This seems to be a problem with Solr. Has anyone else experienced it? Is it fixed in Solr 3.2?

Thanks in advance,
Jason
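
P.S. To make the filtering question concrete, here is a rough sketch of the kind of filter I have in mind. It assumes the 0xffff comes from the text of a crawled page and that each field value could be cleaned before it is sent to Solr. The class and method names (InvalidCharFilter, stripInvalidXmlChars) are made up for illustration and are not part of Nutch or Solr; where such a call would belong in solrindex is exactly what I am unsure about. It keeps only code points allowed by the XML 1.0 Char production, so U+FFFE, U+FFFF, most control characters, and unpaired surrogates are dropped.

```java
// Hypothetical helper, not part of Nutch or Solr.
final class InvalidCharFilter {

    private InvalidCharFilter() {}

    // Returns a copy of the input with every code point removed that is
    // not a legal XML 1.0 character (this includes U+FFFF and U+FFFE).
    static String stripInvalidXmlChars(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            int cp = input.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            // Advance by one or two chars depending on the code point.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    private static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }
}
```

The idea would be to pass each string field through stripInvalidXmlChars before the document is added, which silently drops the bad characters instead of failing the whole batch. Whether silently dropping them is acceptable is a separate question.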

