Hello, I have a problem with non-English letters in URLs. I suspect it's a bug, but I'm not sure where exactly the problem is. So I think it's best to give a description of what I'm doing:
I want to create text buttons automatically with Batik. First, I draw the graphic in StarOffice and save it as an .svg file. The Cocoon pipeline then reads this file, a simple XSLT script replaces the text in the .svg file with part of the URL, and Batik renders the result as a JPEG. The corresponding pipeline definition looks like this:

<map:pipeline>
  <map:match pattern="xxx/auto-img/*/*.jpg">
    <map:generate src="xxx/auto-img/{1}.svg"/>
    <map:transform src="xxx/auto-img/auto-img.xsl" type="xslt">
      <map:parameter name="text" value="{2}"/>
    </map:transform>
    <map:serialize type="svg2jpeg"/>
  </map:match>
</map:pipeline>

The auto-img.xsl is a dead simple script consisting of the well-known XSLT copy rule and one other rule that replaces the text "REPLACE" with "{$text}".

Result: The URL xxx/auto-img/button/Hello%20World.jpg delivers a fancy graphical button based on button.svg, saying "Hello World".

I use this from another style sheet, which reads elements like

<menu href="target.html">description</menu>

and translates them into:

<a href="target.html">
  <img src="xxx/auto-img/button/description.jpg"/>
</a>

Result: Nice-looking graphical menus with very little effort. All of this actually works, and took me only 1.5 hours. :-)))

But the problem is that it doesn't work with non-ASCII letters. E.g. if my text contains German umlauts (the vowels a, o, u with two dots on them), the resulting button displays two arbitrary characters in place of each umlaut.

I suspect that what goes wrong is the URL encoding (i.e. encoding 'special' characters as %xx escape sequences). I think at some point the string gets URL-encoded using UTF-8, but is decoded elsewhere in some 8-bit character set. UTF-8 encodes an umlaut as two bytes, so an 8-bit decoder would turn each byte into a character of its own. Thus, I get two garbage characters where I expected my umlaut.

My questions are:

- How does Cocoon encode URLs? As UTF-8, with %xx escapes?
- I would think this to be a common problem. Are there URL-encoding/decoding methods available in XSLT that I could use to solve the problem manually? (I checked the XSLT standard, and it doesn't have this.
  I also checked the library of extension functions on the Xalan page.)
- How can I find out where the encoding (or decoding) actually goes wrong? So far, I can only see the outcome, but I don't know how to debug this.

Thanks for all answers...

Sincerely,
Daniel

P.S.: I noticed that in my setup Batik has terrible kerning problems: all characters are of equal width! 'i' and 'l' leave huge gaps, and 'm' overlaps with the following letters. Is this a known problem with Batik, or is something maybe wrong with my environment (e.g. fonts)?
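
To illustrate the mismatch I suspect, here is a minimal sketch in plain Java (it is not Cocoon code). The class name UmlautUrlDemo and its helper methods are made up for this example. It only assumes that one side percent-encodes the text as UTF-8 and the other side decodes it as ISO-8859-1, which is my guess and not something I have verified in Cocoon.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: shows what happens when a UTF-8 percent-encoded
// umlaut is decoded with an 8-bit charset (ISO-8859-1 is an assumption).
final class UmlautUrlDemo {

    // Percent-encodes text with the named charset.
    // Note: URLEncoder does form encoding, so a space becomes '+', not %20.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Decodes %xx escapes, interpreting the bytes in the named charset.
    static String decode(String escaped, String charset) {
        try {
            return URLDecoder.decode(escaped, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Lists the UTF-16 code units of a string, e.g. "U+00FC", so the result
    // can be inspected without depending on the console's own encoding.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append("U+").append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00FC"; // u with two dots

        // Encoding side: UTF-8 turns the umlaut into two bytes, C3 and BC.
        String escaped = encode(umlaut, "UTF-8");
        System.out.println("escaped:        " + escaped);

        // Decoding side, mismatched: each byte becomes a character of its own.
        String wrong = decode(escaped, "ISO-8859-1");
        System.out.println("as ISO-8859-1:  " + codePoints(wrong));

        // Decoding side, matched: the original single character comes back.
        String right = decode(escaped, "UTF-8");
        System.out.println("as UTF-8:       " + codePoints(right));
    }
}
```

If the two garbage characters on the button are U+00C3 followed by another Latin-1 character, that would fit this pattern. Printing the code points of the "text" parameter at each pipeline stage in the same way might show where the string first changes.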