On 2/1/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: The problem is that the charset isn't defined to be UTF-8 in the
: headers, so the bytes are assumed to be latin-1.
:
: Is this a problem we can fix in solr, or is it purely container config?

umm... we already fixed this the best way i know how in SOLR-35 ... all of
the JSPs that have forms should have this in them...

<%@ page contentType="text/html; charset=utf-8" pageEncoding="UTF-8"%>

...is resin not respecting that?

The form that gets sent to the browser is in UTF-8, and the browser
correctly sends back UTF-8 in the POST body.  *But* the browser doesn't
include a charset in the request's Content-Type header, so it's up to the
container to guess.  By default, Resin seems to pick latin-1.

It seems like we should assume UTF-8 if no charset is sent for a text
content type.
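
Roughly what I have in mind, as a standalone sketch rather than actual
Solr or Resin code.  The class and method names (CharsetDefault,
effectiveCharset, decodeParam) are made up for illustration, and it only
handles a form-encoded value that has already been pulled out of the body:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Locale;

// Hypothetical helper, not part of Solr: pick the charset for a request body.
class CharsetDefault {

    // Returns the charset named in the Content-Type header value,
    // or "UTF-8" when the header is missing or carries no charset parameter.
    static String effectiveCharset(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String value = p.substring("charset=".length()).trim();
                    // The parameter value may be quoted: charset="utf-8"
                    if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                        value = value.substring(1, value.length() - 1);
                    }
                    if (!value.isEmpty()) {
                        return value;
                    }
                }
            }
        }
        return "UTF-8"; // assumed default, instead of the container's latin-1 guess
    }

    // Percent-decodes one form-encoded value using the effective charset.
    static String decodeParam(String raw, String contentType) {
        try {
            return URLDecoder.decode(raw, effectiveCharset(contentType));
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset in: " + contentType, e);
        }
    }
}
```

With this, decodeParam("caf%C3%A9", "application/x-www-form-urlencoded")
gives back the four-character word with e-acute at the end, whereas decoding
the same bytes as ISO-8859-1 yields five characters with the usual
two-character mojibake in place of the accent.  Inside a servlet container
the equivalent would presumably be a filter that calls
request.setCharacterEncoding("UTF-8") when request.getCharacterEncoding()
returns null, before any parameters are read.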

-Yonik
