Taking one step at a time (what am I not seeing?):
- suppose a SAX stream (producing XHTML) has, before serialization, an @href holding a euro sign (Unicode character \u20AC)


- I hear you guys saying that Xalan will recognize the URI-type attribute and serialize this character as %E2%82%AC regardless of the chosen output encoding (I didn't catch it, but I am assuming that the output encoding is set to UTF-8 anyway, and matches the form-encoding setting)

- so we get an HTML page out telling the browser it is UTF-8 encoded
- so the browser would apply UTF-8 encoding to form values (and names) if this were about a form, but it's about this ready-made @href


- now this @href already has this same encoding (thx Xalan) in place, so things should work the same as for the form (as long as the mentioned euro sign is strictly in the parameter values)
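To make the %E2%82%AC above concrete, here is a minimal sketch of percent-encoding the euro sign as UTF-8. It uses java.net.URLEncoder as a stand-in: that class does form-style encoding and is not what Xalan runs internally, and the class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of Xalan or Cocoon.
class EuroHrefEncoding {

    // Percent-encodes a parameter value using the UTF-8 bytes of each character.
    static String encodeValueUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String euro = "\u20AC";
        // prints someurl?foo=%E2%82%AC
        System.out.println("someurl?foo=" + encodeValueUtf8(euro));
    }
}
```

The three escapes are simply the three bytes (E2 82 AC) of the euro sign in UTF-8.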


So assuming all this reasoning is ok, what could never work is this:

- change your form-encoding (and the matching serialization setting) to anything other than UTF-8, because then request params in forms and pre-built ones in URLs get encoded differently and we have no way to tell them apart on Cocoon's side
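A small sketch of why mixing encodings cannot work: the same character ends up as different escape sequences depending on the charset, and the request itself does not say which one was used. Again java.net.URLEncoder is only a stand-in for what the browser and the serializer do, and the names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class, not part of Cocoon.
class MixedEncodingDemo {

    // Percent-encodes a value using the bytes of the given charset.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9"; // e with acute accent

        // A pre-built @href, escaped as UTF-8 during serialization.
        System.out.println("href: foo=" + encode(eAcute, "UTF-8"));      // foo=%C3%A9

        // A form submitted from a page served as ISO-8859-1.
        System.out.println("form: foo=" + encode(eAcute, "ISO-8859-1")); // foo=%E9
    }
}
```

Both requests arrive as plain percent-escapes, so the receiving side has to pick one charset for decoding and will get one of the two wrong.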


You're right.


It's sad news for Tuomo, but I can't see why it wouldn't just work if (and only if)


- this is about parameter values and NOT about URLs or parameter names (because there we *need* to do some work)

Yes, I was talking about parameter values all the time, but didn't show it clearly enough in the example. It should be:


<a href="someurl?foo=��" foo="��">��</a>

Where foo's value in the href gets UTF-8 encoded by Xalan during serialization, no matter what the settings are anywhere.

- container-encoding is traditionally set to ISO-8859-1 (unless you are using a container like Jetty, where you can modify its internal behaviour)

Mine is set to ISO-8859-1.

- form-encoding is strictly kept to 'UTF-8' (thx for the lesson) and the serializer follows that (meta http-equiv and all)

These don't help either, since the UTF-8 encoded parameter values are read in as ISO-8859-1 and the result is invalid. If these parameter values are then stored, for example, in a database, there are several '?' marks where the special characters should appear.
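To illustrate what happens to such a value, here is a minimal sketch that decodes the UTF-8 bytes of the euro sign as ISO-8859-1, the way a container set to ISO-8859-1 would. It also shows that the original bytes are still recoverable by re-encoding and decoding again, which is a commonly used workaround and not something proposed in this thread. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Cocoon or the servlet API.
class MisdecodedParam {

    // Simulates a container reading UTF-8 encoded bytes as ISO-8859-1.
    static String misdecodeAsLatin1(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Recovers the original text: ISO-8859-1 maps every byte to exactly one
    // char, so the raw bytes can be restored and decoded as UTF-8.
    static String recoverUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String euro = "\u20AC";
        String garbled = misdecodeAsLatin1(euro);
        System.out.println(garbled.length());                 // 3 chars instead of 1
        System.out.println(recoverUtf8(garbled).equals(euro)); // true
    }
}
```

The recovery step only works if every parameter really was UTF-8 encoded; applied to a value that was sent as ISO-8859-1 it would damage it instead.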


Maybe I just have to send the parameters within a form (as Joerg did), which is not very practical when you only need to do a simple HTTP GET with parameters. Or I use an XSL stylesheet which converts all the special characters in parameter values to ISO-8859-1 before Xalan serialization. This works, but is also impractical, since I have to write a long xsl:choose section. Doing it this way also decreases the performance of my application.

Can we come up with a better solution?

Thank you guys for taking interest in this issue.

-Tuomo



regards,
-marc=
--
Marc Portier                            http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at                http://blogs.cocoondev.org/mpo/
[EMAIL PROTECTED]                              [EMAIL PROTECTED]

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


