On 9/24/06, Gary VanMatre <[EMAIL PROTECTED]> wrote:
>From: "Craig McClanahan" <[EMAIL PROTECTED]>
>
> On 9/24/06, Gary VanMatre wrote:
> >
> > [snip]
> > > Does clay use request's characterEncoding when loading those files? This
> > > way one could set the request's characterEncoding in a filter and then
> > > clay would use this when loading those files. As for now setting utf-8
> > > encoding for the request object in a filter doesn't work (the characters
> > > do not display correctly).
> >
> > No, it loads the file as an InputStream and then writes it to a
> > StringBuffer.
> > This might be the problem area. You can find the code in the loadTemplate
> > method here:
> >
> > http://svn.apache.org/viewvc/shale/framework/trunk/shale-clay/src/main/java/org/apache/shale/clay/config/ClayTemplateParser.java?view=markup
>
> This will definitely cause the problem observed here. Treating the input as
> a byte stream effectively means it expects 7-bit ASCII encoding in the
> template files.

Thanks for taking a look...

> For templates that are in XML syntax, we could deal with this by using an
> XML parser to read the input, which will automatically obey any encoding
> declaration specified inside the file.

I'd like to do this for XHTML template parsing in the future but it will require considerable refactoring. The clay markup parser will read both old school html and well-formed XML. It's not a validating parser but gets the job done (almost in this case).

> For non-XML templates, it's probably
> best to declare a default encoding like UTF-8 (perhaps with an override via
> context init parameter) to cover localized cases. You'll need to use a
> Reader instead of an input stream for this.

That's a good tip. Do you think we can use the encoding type of the request, #{request.characterEncoding}?
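
For reference, the Reader-based suggestion quoted above could look roughly like the sketch below. This is only an illustration under stated assumptions, not the actual ClayTemplateParser code: the class name TemplateReaderSketch, its loadTemplate method, and the configuredEncoding argument are all hypothetical. The argument stands in for whatever value an override (for example a context init parameter, whose name the thread does not specify) would supply; null or blank falls back to UTF-8.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Shale or Clay.
class TemplateReaderSketch {

    // Assumed default, following the "default encoding like UTF-8" suggestion.
    static final Charset DEFAULT_ENCODING = StandardCharsets.UTF_8;

    // Use the configured override when present, otherwise the default.
    static Charset resolveEncoding(String configuredEncoding) {
        if (configuredEncoding == null || configuredEncoding.trim().isEmpty()) {
            return DEFAULT_ENCODING;
        }
        return Charset.forName(configuredEncoding.trim());
    }

    // Decode the template bytes through a Reader so that multi-byte
    // characters become proper Java chars instead of one char per byte.
    static String loadTemplate(InputStream in, String configuredEncoding) {
        StringBuilder buffer = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, resolveEncoding(configuredEncoding))) {
            char[] chunk = new char[1024];
            int count;
            while ((count = reader.read(chunk)) != -1) {
                buffer.append(chunk, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }
}
```

The point of the sketch is only that the charset is chosen once, when the template is read, and is independent of anything on the request.
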
We'll probably want to go over to the dev list for a deep dive on this (and open a JIRA issue), but I don't think this is a very good idea. Ideally a template (which by definition can only be in one encoding) should be viewable in a response in any requested encoding.

I guess we have two encoding considerations. The type of encoding of the template file and the target encoding of the page. How do you think we should handle this?
JSP has a precedent that seems to make sense, but will need adaptation:

* JSP pages declare their pageEncoding attribute (we'd likely need something external for templates), and (when compiled) are internally converted into Unicode characters (Java's internal representation of strings).

* When a request is received, the requested encoding is used on the Writer, which converts from Unicode to the appropriate encoding.

In this way, a JSP source page can be rendered in any encoding that is desired. The trick for Clay is going to be how to declare the encoding of the templates themselves.
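
The two steps of the JSP precedent described above (decode the source once into Unicode, then encode on output with whatever the response asks for) can be sketched in plain Java as follows. This is a rough illustration only: TranscodeSketch and render are hypothetical names, and a ByteArrayOutputStream stands in for the servlet response stream. How Clay would actually learn the template's encoding is the open question in this thread and is simply passed in as a parameter here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical illustration; not JSP or Clay code.
class TranscodeSketch {

    static byte[] render(byte[] templateBytes,
                         Charset templateEncoding,
                         Charset responseEncoding) {
        // Step 1: decode using the template's declared encoding. The result
        // is a Java String, which is Unicode regardless of the source bytes.
        String text = new String(templateBytes, templateEncoding);

        // Step 2: encode on the way out using the response's encoding.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, responseEncoding)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With this split, the same template bytes can be served as UTF-8 to one request and ISO-8859-1 to another, as long as the characters exist in the target encoding.
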
> Craig

Gary
Craig
