Jeff Schnitzer wrote:
> I wrote up a quick blurb on the issues surrounding character encoding
> on the Resteasy list recently:
>
> http://sourceforge.net/mailarchive/message.php?msg_name=540eb7210908281001r6aafaa55u78615debb704e4c1%40mail.gmail.com

Good blurb!

> The main problem is that POSTed form data will be sent by the browser
> in whatever charset encoding was used on the host page, and this
> information is not sent along with the request.  So the server must
> guess... and that usually means going with the platform default.
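
To make the guessing problem concrete, here is a minimal sketch in plain Java (no servlet container involved). It assumes Java 10 or later for the Charset overloads of URLEncoder and URLDecoder; the class and method names are made up for illustration. It encodes a value the way a browser would for a UTF-8 page, then decodes it once with the right charset and once with a wrongly guessed ISO-8859-1:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Resin or Resteasy.
class FormCharsetMismatch {

    // Percent-encode a form value using the charset of the host page,
    // roughly what a browser does for application/x-www-form-urlencoded.
    static String browserEncode(String value, Charset pageCharset) {
        return URLEncoder.encode(value, pageCharset);
    }

    // Decode the value using whatever charset the server assumes.
    // The request itself does not say which one the browser used.
    static String serverDecode(String encoded, Charset assumedCharset) {
        return URLDecoder.decode(encoded, assumedCharset);
    }

    public static void main(String[] args) {
        String original = "M\u00fcller"; // "Mueller" with u-umlaut
        String onTheWire = browserEncode(original, StandardCharsets.UTF_8);

        System.out.println("on the wire:        " + onTheWire);
        System.out.println("decoded as UTF-8:   "
                + serverDecode(onTheWire, StandardCharsets.UTF_8));
        // A server guessing ISO-8859-1 turns the two UTF-8 bytes
        // of the umlaut into two separate characters.
        System.out.println("decoded as Latin-1: "
                + serverDecode(onTheWire, StandardCharsets.ISO_8859_1));
    }
}
```

The bytes on the wire are the same in both cases; only the server's assumption differs, which is why the wrong guess goes unnoticed until non-ASCII input shows up.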

On the page you're referring to:

   * If there is an "acceptcharset" element on the form, it should
     submit with that encoding. I've never tested this.
   * Otherwise the browser will submit with whatever encoding the page
     was rendered in.

That's not quite what the HTML spec says:

   accept-charset = charset list [CI]
    This attribute specifies the list of character encodings for input
    data that is accepted by the server processing this form. The value
    is a space- and/or comma-delimited list of charset values. The client
    must interpret this list as an exclusive-or list, i.e., the server is
    able to accept any single character encoding per entity received.

    The default value for this attribute is the reserved string
    "UNKNOWN". User agents may interpret this value as the character
    encoding that was used to transmit the document containing this FORM
    element.

http://www.w3.org/TR/html401/interact/forms.html#adef-accept-charset

So the spec says "must" and "may" where your blurb says "should" and
"will". But I've never tested it either. And tests would have to be run
against multiple browser implementations, not against the spec :-/ Maybe
worth trying, though.
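
For what it's worth, here is a rough sketch in Java of how a client could read the attribute as the spec text above describes it: split the value on spaces and/or commas, and fall back to the document's encoding when the value is the default "UNKNOWN". Picking the first supported charset in the list is purely my assumption; the spec only says the server accepts any single one of them. The class and method names are hypothetical, and real browsers may well behave differently.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from any browser or library.
class AcceptCharsetChooser {

    // Split an accept-charset value on spaces and/or commas.
    static List<String> parseAcceptCharset(String attributeValue) {
        List<String> names = new ArrayList<>();
        if (attributeValue == null) {
            return names;
        }
        for (String token : attributeValue.trim().split("[\\s,]+")) {
            if (!token.isEmpty()) {
                names.add(token);
            }
        }
        return names;
    }

    // Assumed policy: use the first charset this JVM supports; if there
    // is none, or the value is "UNKNOWN", use the document's charset.
    static Charset chooseSubmissionCharset(String attributeValue,
                                           Charset documentCharset) {
        for (String name : parseAcceptCharset(attributeValue)) {
            if (name.equalsIgnoreCase("UNKNOWN")) {
                continue; // reserved default value, not a charset name
            }
            try {
                if (Charset.isSupported(name)) {
                    return Charset.forName(name);
                }
            } catch (IllegalCharsetNameException e) {
                // Syntactically invalid name: skip it and try the next.
            }
        }
        return documentCharset;
    }

    public static void main(String[] args) {
        Charset page = Charset.forName("ISO-8859-1");
        System.out.println(chooseSubmissionCharset("UTF-8, ISO-8859-1", page));
        System.out.println(chooseSubmissionCharset("UNKNOWN", page));
    }
}
```

This only models the "may" branch as one possible reading; it says nothing about what any actual implementation does, which is exactly what would need testing.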

-- 
Michael Ludwig


