Hi guys,

In the laborious process of 'unicodizing' my application, I'm coming across quite a few "funny" issues. In the context of a web application, URI-encoded Unicode strings are absolutely awful. It's actually nothing new: I had the same problem with the 8-bit ISO-8859-15 character set, where spaces are turned into %20 and so on.
The way I got around this was to build a lossy table mapping ISO-8859-15 to US-ASCII, and then to apply a few simple regexes, so that a sentence like "Le rêve du café" gets turned into "le-reve-du-cafe". Not only does this give cool-looking URIs, it's also useful for building search engines that actually match 'café', 'cafe', 'CäFê', etc.

The problem is that, as you can imagine, things get trickier on a character set wider than Latin-1, especially when you realize that 'Dingbats' is part of Unicode ;-)

Ideally I would like to write a CPAN module, Unicode::Transliterate, modular enough to dynamically import transliteration tables from any charset to any other charset, and eventually to take the target language into account (for example, the Japanese word 'roku' might actually sound better written 'lok' when read in French). I would like to know if you have any suggestions on how I should do that.

As for the interface, I was thinking of:

    package Unicode::Transliterate
    +@ Unicode::Transliterate new ('transliterator1', 'transliterator2', etc)
    +  Unicode::String process (Unicode::String $string)

where 'transliteratorX' is the name of a transliteration class to use (e.g. Unicode::Transliterate::ISO_8859_15::ASCII for the ISO-8859-15 to ASCII transliteration table).

As for the implementation, I'm still trying to get my head around doing something that isn't too slow without having to go XS.

Any ideas or suggestions?

Cheers,
--
Jean-Michel Hiver - Software Director
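
For reference, the "lossy table plus a few simple regexes" approach described above could be sketched roughly as follows in pure Perl. This is only an illustration under stated assumptions, not the code from the application: the sub names (strip_accents, to_ascii, slugify, process) and the %FALLBACK table are hypothetical, and the hand-built ISO-8859-15 table is swapped for Unicode decomposition (NFKD from the core Unicode::Normalize module) followed by stripping the combining marks. The chain of code refs stands in for the 'transliteratorX' class names of the proposed interface.

```perl
use strict;
use warnings;
use utf8;                          # string literals in this file are UTF-8
use Unicode::Normalize qw(NFKD);   # core module since Perl 5.8

# Hypothetical fallback table for characters that have no decomposition.
my %FALLBACK = (
    "\x{00DF}" => 'ss',    # ß
    "\x{00E6}" => 'ae',    # æ
    "\x{0153}" => 'oe',    # œ
    "\x{20AC}" => 'EUR',   # €
);

# Step 1: decompose accented letters, then drop the combining marks.
sub strip_accents {
    my ($s) = @_;
    $s = NFKD($s);
    $s =~ s/\p{Mn}+//g;
    return $s;
}

# Step 2: apply the lossy table; any other non-ASCII character becomes a space.
sub to_ascii {
    my ($s) = @_;
    $s =~ s/([^\x00-\x7F])/exists $FALLBACK{$1} ? $FALLBACK{$1} : ' '/ge;
    return $s;
}

# Step 3: lowercase, collapse runs of non-alphanumerics into single dashes,
# and trim dashes from both ends.
sub slugify {
    my ($s) = @_;
    $s = lc $s;
    $s =~ s/[^a-z0-9]+/-/g;
    $s =~ s/^-+|-+$//g;
    return $s;
}

# Run a string through a list of steps (code refs), in order.
sub process {
    my ($s, @steps) = @_;
    $s = $_->($s) for @steps;
    return $s;
}

print process('Le rêve du café', \&strip_accents, \&to_ascii, \&slugify), "\n";
```

Keeping each step as a separate sub means a search index could run only the first two steps (plus lc) while URIs get the full chain. Characters outside the fallback table are simply dropped here, which is exactly where per-charset and per-language tables would have to plug in.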