I am seeing behavior in Resin where Unicode characters are being replaced by
HTML entity references in the page response.
For example, when the Unicode character ç (ccedil, if it's not appearing
correctly) appears in the JSP source, the servlet source compiled from it
contains the escaped Unicode reference "\u00e7". Then, when Resin serves the
page, the character is transformed a second time and shows up in the page
source as the HTML entity reference "&ccedil;" (with semicolon, of course).
Another example: using JSF, a JSP source will contain a Faces tag that gets a
string from a backing bean. If the string contains the Unicode character ç
(ccedil), when Resin serves the page it will transform the character into the
HTML entity reference.
Does anyone know if there is some setting that is causing this entity reference
transformation to occur? Is it possible to configure Resin to leave the
original Unicode character unmolested?
I have messed with a few "character-encoding" and "encoding" settings in
various places in resin.conf, but I may be missing something. I suppose I can
say that the page is "correctly" served as UTF-8 encoded -- the Content-Type
header specifies UTF-8 -- but if Resin replaces every character beyond the
first 128 code points with an HTML entity reference anyway, that is a bit of
a moot point.
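
To make the difference between these forms concrete, here is a minimal Java sketch that is independent of Resin. The class name CedillaEncodingDemo and the helper utf8Hex are made up for illustration, and "&ccedil;" is assumed to be the entity form that ends up in the page. It shows that the "\u00e7" escape in the servlet source is still the single character ç, which encodes to two bytes in UTF-8, whereas the entity reference is eight plain ASCII bytes, so the declared encoding makes no difference to it.

```java
import java.nio.charset.StandardCharsets;

public class CedillaEncodingDemo {

    // Returns the UTF-8 encoding of s as space-separated lowercase hex bytes.
    public static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The escape as it appears in the generated servlet source:
        // the compiler turns it back into the one character U+00E7.
        String escaped = "\u00e7";
        // Assumed entity form seen in the served page source.
        String entity = "&ccedil;";

        System.out.println(escaped.length());  // 1
        System.out.println(utf8Hex(escaped));  // c3 a7
        System.out.println(utf8Hex(entity));   // 26 63 63 65 64 69 6c 3b
    }
}
```

So a response that really carried the original character would contain the bytes c3 a7 at that position, which is one way to tell the two cases apart when looking at the raw response.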
resin-interest mailing list