I'm indexing some mail archives, and across the various formats,
encodings, etc., some messages contain invalid control characters.
doc.setField( body, content.toString() );
In the Solr logs, I get:
[java] SEVERE: java.io.IOException: Illegal character ((CTRL-CHAR, code 22))
[java] at
On Sat, Dec 13, 2008 at 1:45 PM, Ryan McKinley <ryan...@gmail.com> wrote:
Is there any standard way to escape invalid XML control characters?
Not that I know of... it's a shame that XML can't carry the full Unicode range.
Good reason to get a binary or JSON indexing interface at some point...
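
In the meantime, one possible workaround is to drop the offending characters on the client before the field value is set. The following is only a rough sketch, not anything Solr ships: the class and method names (XmlCharFilter, stripInvalidXmlChars) are made up for illustration, and it assumes that silently discarding the characters is acceptable for a mail archive. It keeps only code points allowed by the XML 1.0 Char production (tab, LF, CR, U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF).

```java
// Hypothetical helper (not part of Solr or SolrJ): removes code points
// that XML 1.0 cannot represent, so the update request stays well-formed.
final class XmlCharFilter {

    private XmlCharFilter() {}

    // True if the code point matches the XML 1.0 "Char" production.
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every invalid code point dropped.
    // Unpaired surrogates fall in U+D800..U+DFFF and are dropped as well.
    static String stripInvalidXmlChars(String in) {
        if (in == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            int cp = in.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            // Advance by one or two chars depending on the code point width.
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

With something like this in place, the call from the original message would become doc.setField( body, XmlCharFilter.stripInvalidXmlChars( content.toString() ) ). Replacing the bad characters with a space instead of dropping them may be preferable if word boundaries matter for tokenizing.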
I