On 9/22/06, Walter Underwood <[EMAIL PROTECTED]> wrote:
> This might be a Solr bug. Solr should be able to accept XML in any
> of the required encodings (ASCII, Latin 1, UTF-8, and UTF-16).
> Getting XML content types exactly right is tricky, see RFC 3023.

Right now Solr pays attention to the Content-type in the HTTP headers
(it lets the servlet container handle charset conversions), and ignores
any charset declaration in the XML itself.

What I think might be ideal: If there is a charset definition in the
Content-type header, then let the servlet container handle it by
requesting a Reader.  If there isn't a charset definition, request a
byte-oriented InputStream from the container and let the XML parser
try to figure out the encoding.
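
A rough sketch of that idea in plain Java, not Solr's actual code. To
keep it runnable without a servlet container, the Content-type value
and the request body are passed in directly; in a servlet they would
come from request.getContentType(), and the two branches would
correspond to request.getReader() and request.getInputStream(). The
class name and the charsetOf/parse helpers are made up for
illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CharsetAwareXmlParse {

    // Hypothetical helper: returns the charset parameter of a Content-type
    // header value, or null if the header has no charset parameter.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = p.substring(8).trim();
                // Strip optional surrounding quotes: charset="utf-8"
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // Hypothetical helper: picks a character stream or a byte stream
    // depending on whether the header declares a charset.
    static Document parse(String contentType, InputStream body) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        String charset = charsetOf(contentType);
        if (charset != null) {
            // Header declares a charset: decode the bytes with it and hand the
            // parser a character stream. The parser then does not use the
            // encoding declaration inside the XML for decoding.
            Reader reader = new InputStreamReader(body, Charset.forName(charset));
            return builder.parse(new InputSource(reader));
        }
        // No charset in the header: hand over raw bytes and let the parser
        // detect the encoding from the BOM or the XML declaration.
        return builder.parse(body);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<add><doc>caf\u00e9</doc></add>";
        byte[] latin1 = xml.getBytes(StandardCharsets.ISO_8859_1);
        // No charset in the header, so the parser reads the XML declaration.
        Document doc = parse("text/xml", new ByteArrayInputStream(latin1));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

The header wins whenever it names a charset, which matches what Solr
does today; the only new behavior is the fallback to the parser's own
detection when the header is silent.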

-Yonik
