On 6/20/06, Mike Richmond <[EMAIL PROTECTED]> wrote:
I have an application that I recently ported to Solr, and I am running into a few problems with the XML responses from Solr. The response to one Solr query contained XML data that was not properly escaped (no CDATA section and no entity substitution). In particular, the "summary" field contains '<' characters. An example of such a response can be found here: http://www.willetts.com/mike/response.xml
Hmmm, that is interesting... I haven't seen that before. I'll try to duplicate it with your example "summary" field.
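For reference, here is a minimal sketch of the entity substitution a "summary" value would need before being written as XML character data. This is not Solr's actual XMLWriter code; the class and method names (XmlEscapeSketch, escapeCharData) are hypothetical and only illustrate the technique.

```java
final class XmlEscapeSketch {
    private XmlEscapeSketch() {}

    /**
     * Replaces the characters that are significant in XML character data
     * with their predefined entities. Hypothetical helper, not Solr code.
     */
    static String escapeCharData(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A field value containing '<' is safe to embed once escaped.
        String summary = "a < b && c > d";
        System.out.println("<str name=\"summary\">" + escapeCharData(summary) + "</str>");
    }
}
```

If the raw '<' characters still show up in the response after a step like this, the field is probably being written through a path that skips escaping.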
On another note: I also noticed that I get non-UTF-8 characters in the response even though the encoding declaration at the top of the XML document specifies UTF-8.
Are you using the bundled version of Jetty? People have been having problems with international characters with that. You might try using Tomcat.
I did not see anywhere in the XMLWriter code that checked the encoding of the output. Is this by design, or am I missing something?
By design... XMLWriter writes Java characters and strings, and the servlet container handles encoding to UTF-8.
-Yonik
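To illustrate that division of labor, here is a small sketch using only the JDK. The class EncodingSketch and its methods are hypothetical stand-ins: writeBody plays the role of response-writing code that only sees a java.io.Writer, and render plays the role of the container that picks the charset when it wraps the output stream. It does not reproduce Solr's or any container's real code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingSketch {
    private EncodingSketch() {}

    // Stand-in for the response writer: it emits chars and never sees bytes.
    static void writeBody(Writer out, String text) throws IOException {
        out.write("<str>");
        out.write(text);
        out.write("</str>");
    }

    // Stand-in for the container: it decides how chars become bytes.
    static byte[] render(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, charset)) {
            writeBody(out, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café"
        // Same chars, different byte counts depending on the charset chosen.
        System.out.println(render(text, StandardCharsets.UTF_8).length);      // 16
        System.out.println(render(text, StandardCharsets.ISO_8859_1).length); // 15
    }
}
```

The point of the sketch is that the writer code is identical in both calls. If the bytes on the wire do not match the encoding named in the XML declaration, the mismatch comes from whichever layer constructs the Writer.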