taking one step at a time (what am I not seeing?):
- suppose a SAX stream (producing XHTML) has, before serialization, an @href holding a euro sign (Unicode character \u20AC)
- I hear you guys saying that Xalan will recognize the URI-type attribute and serialize this character as %E2%82%AC regardless of the chosen output encoding (I didn't catch it, but I am assuming that the output encoding is set to UTF-8 anyway and matches the form-encoding setting)
- so we get an HTML page out telling the browser it is UTF-8 encoded
- so the browser would apply UTF-8 encoding to form values (and names) if this were about a form, but it's about this ready-made @href
- now this @href already has this same encoding in place (thx Xalan), so things should work the same as for the form (as long as the mentioned euro sign is strictly in the parameter values)
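To make the byte-level picture concrete, here is a minimal sketch in plain Java. It is not Xalan's serializer code: `java.net.URLEncoder` is used only as a stand-in (it does form-style encoding, so for instance a space becomes `+`), and the class and method names are made up for illustration. It shows that the euro sign percent-encoded via UTF-8 gives the %E2%82%AC mentioned above, and that another charset gives different bytes for the same character.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class, for illustration only; not part of Xalan or Cocoon.
public class HrefEncodingSketch {

    // Percent-encodes a parameter value using the bytes of the given charset.
    static String encodeWith(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 and ISO-8859-1 are always available on a JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String euro = "\u20AC";   // euro sign
        String aUml = "\u00E4";   // a-umlaut, present in both charsets

        // The euro sign is three bytes in UTF-8: E2 82 AC.
        System.out.println("someurl?foo=" + encodeWith(euro, "UTF-8"));

        // The same character encodes differently depending on the charset,
        // which is why form encoding and pre-built URLs must agree.
        System.out.println(encodeWith(aUml, "UTF-8"));       // two bytes
        System.out.println(encodeWith(aUml, "ISO-8859-1"));  // one byte
    }
}
```

The first line printed is `someurl?foo=%E2%82%AC`; the a-umlaut comes out as `%C3%A4` in UTF-8 and `%E4` in ISO-8859-1.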
So assuming all this reasoning is ok, what could never work is this:
- change your form-encoding (and the matching serialization setting) to anything other than UTF-8, because then request params in forms and pre-built ones in URLs get encoded differently, and we have no way to tell them apart on Cocoon's side
You're right.
It's sad news for Tuomo, but I can't see why it wouldn't just work if (and only if)
- this is about parameter values and NOT about URLs or parameter names (because there we *need* to do some work)
Yes, I was talking about parameter values all the time, but didn't show it clearly enough in the example. It should be:
<a href="someurl?foo=��" foo="��">��</a>
where foo's value gets UTF-8 encoded by Xalan during serialization, no matter what the settings are anywhere else.
- container-encoding is traditionally set to ISO-8859-1 (unless you are using a container like Jetty, where you can modify its internal behaviour)
Mine is set to ISO-8859-1.
- form-encoding is strictly kept to 'utf-8' (thx for the lesson) and the serializer follows that (meta http-equiv and all)
These don't help either, since the UTF-8 encoded parameter values are read in as ISO-8859-1 and the result is invalid. If these parameter values are then stored, for example, in a database, there are several '?' marks where the special characters should appear.
Maybe I just have to send the parameters within a form (as Joerg did), which is not very practical when you only need to do a simple HTTP GET with parameters. Or I could use an XSL stylesheet that converts all the special characters in parameter values to ISO-8859-1 before Xalan serialization. This works, but is also impractical, since I have to write a long xsl:choose section. Doing it this way also hurts the performance of my application.
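The mismatch can be reproduced outside the container with a small Java sketch. This is only an illustration under assumptions: `java.net.URLDecoder` stands in for whatever the servlet container does internally, and the class and method names are hypothetical, not Cocoon API. It decodes the UTF-8 escaped euro sign as ISO-8859-1 (giving three wrong characters) and then shows that, because ISO-8859-1 maps every byte to exactly one character, the original bytes can still be taken back out and re-read as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only; not part of Cocoon.
public class ContainerDecodingSketch {

    // Decodes a percent-encoded value, interpreting the bytes in the given charset.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Undoes a wrong ISO-8859-1 decoding by re-reading the same bytes as UTF-8.
    // Assumption: the value really was decoded as ISO-8859-1 and not touched since.
    static String latin1ToUtf8(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "%E2%82%AC";                    // euro sign, UTF-8 escaped

        String wrong = decode(raw, "ISO-8859-1");    // what the container hands over
        System.out.println(wrong.length());          // 3 characters instead of 1

        String recovered = latin1ToUtf8(wrong);
        System.out.println(recovered.equals("\u20AC"));  // true
    }
}
```

Whether such a re-decoding step belongs in the application, or is better handled by a container or Cocoon setting, is a separate question; the sketch only shows that the information is not lost at the point where the ISO-8859-1 decoding happens.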
Can we come up with a better solution?
Thank you guys for taking interest in this issue.
-Tuomo
regards,
-marc=
--
Marc Portier http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at http://blogs.cocoondev.org/mpo/
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
