On 9/24/06, Gary VanMatre <[EMAIL PROTECTED]> wrote:

>From: "Craig McClanahan" <[EMAIL PROTECTED]>
>
> On 9/24/06, Gary VanMatre wrote:
> >
> > [snip]
> > > Does clay use request's characterEncoding when loading those files? This
> > > way one could set the request's characterEncoding in a filter and then
> > > clay would use this when loading those files. As for now setting utf-8
> > > encoding for the request object in a filter doesn't work (the characters
> > > do not display correct).
> >
> > No, it loads the file as an InputStream and then writes it to a
> > StringBuffer.
> > This might be the problem area. You can find the code in the loadTemplate
> > method here:
> >
>
> > http://svn.apache.org/viewvc/shale/framework/trunk/shale-clay/src/main/java/org/apache/shale/clay/config/ClayTemplateParser.java?view=markup
>
>
> This will definitely cause the problem observed here. Treating the input as
> a byte stream effectively means it expects 7-bit ASCII encoding in the
> template files.

Thanks for taking a look...

>
> For templates that are in XML syntax, we could deal with this by using an
> XML parser to read the input, which will automatically obey any encoding
> declaration specified inside the file.

I'd like to do this for XHTML template parsing in the future but it will
require considerable refactoring.  The clay markup parser will read both
old-school HTML and well-formed XML.  It's not a validating parser but gets
the job done (almost in this case).
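
For illustration only, here is a minimal sketch of the XML-parser idea
discussed above: a standard JAXP parser handed raw bytes picks up the
encoding from the XML declaration on its own. The class and method names
(XmlEncodingSketch, rootText) are hypothetical and are not part of Clay.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical sketch, not Clay code: shows that a JAXP parser given a raw
// byte stream honors the encoding named in the XML declaration.
public class XmlEncodingSketch {

    // Parses the bytes as XML and returns the text of the root element.
    static String rootText(byte[] xmlBytes) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // No charset is passed here; the parser reads it from the
            // <?xml ... encoding="..."?> declaration in the bytes.
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse template", e);
        }
    }
}
```

This only covers well-formed XML input; it says nothing about how the
existing Clay markup parser would need to change.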


> For non-XML templates, it's probably
> best to declare a default encoding like UTF-8 (perhaps with an override via
> context init parameter) to cover localized cases. You'll need to use a
> Reader instead of an input stream for this.
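
A rough sketch of the Reader approach described above, assuming a UTF-8
default that a configured value can override. The names here
(TemplateReaderSketch, resolveEncoding, readTemplate) are made up for
illustration; in a web app the configured value might come from a context
init parameter, whose name would still have to be decided.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical sketch, not Clay code: decode a template through a Reader
// with an explicit charset instead of treating bytes as characters.
public class TemplateReaderSketch {

    static final String DEFAULT_ENCODING = "UTF-8";

    // Falls back to the default when no override is configured.
    static String resolveEncoding(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_ENCODING;
        }
        return configured.trim();
    }

    // Reads the whole stream as text, decoding with the given encoding.
    static String readTemplate(InputStream in, String encoding) {
        StringBuilder buffer = new StringBuilder();
        try (Reader reader =
                 new InputStreamReader(in, Charset.forName(encoding))) {
            char[] chunk = new char[1024];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                buffer.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }
}
```

Usage would look something like
readTemplate(stream, resolveEncoding(overrideOrNull)).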

That's a good tip.  Do you think we can use the encoding type of the
request, #{request.characterEncoding}?


We'll probably want to go over to the dev list for a deep dive on this (and
open a JIRA issue), but I don't think this is a very good idea.  Ideally a
template (which by definition can only be in one encoding) should be
viewable in a response in any requested encoding.

I guess we have two encoding considerations.  The type of encoding of the
template file and the target encoding of the page.  How do you think we
should handle this?


JSP has a precedent that seems to make sense, but will need adaptation:

* JSP pages declare their pageEncoding attribute (we'd likely
 need something external for templates), and (when compiled)
 are internally converted into Unicode characters (Java's internal
 representation of strings).

* When a request is received, the requested encoding is used on
 the Writer, which converts from Unicode to the appropriate encoding.

In this way, a JSP source page can be rendered in any encoding that is
desired.  The trick for Clay is going to be how to declare the encoding of
the templates themselves.
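
As a sketch of the two-step model above (decode the source with its declared
encoding, then encode the output with whatever the response asks for),
something along these lines would do. TranscodeSketch and render are
hypothetical names, and how the template's encoding gets declared is exactly
the open question, so it is simply passed in here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical sketch, not Clay or JSP code: one template encoding in,
// any response encoding out, with Java's Unicode strings in between.
public class TranscodeSketch {

    static byte[] render(byte[] templateBytes,
                         String templateEncoding,
                         String responseEncoding) {
        // Step 1: decode the template bytes into Unicode characters.
        String unicode =
            new String(templateBytes, Charset.forName(templateEncoding));

        // Step 2: the Writer converts Unicode to the response encoding.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(
                 out, Charset.forName(responseEncoding))) {
            writer.write(unicode);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

The two encodings are independent of each other, which is why tying the
template encoding to the request's characterEncoding would be limiting.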


> Craig

Gary



Craig
