On Wed, Nov 14, 2007 at 11:28:50PM +0100, Arrigo Marchiori wrote:
> > Personally, I'm very much in favor of switching PmWiki's default
> > to utf8 -- it will bring us some huge benefits -- but the big
> > obstacle is that migrating existing sites from the old iso-8859-1
> > default to a utf8 default may be somewhat complicated and/or
> > problematic.  Thus I'm seeking comments and opinions.
>
> In Italian we use some accented characters (à, è, é, ì, ...) I think
> a charset change would be a major step for every PmWiki-based Italian
> site. Same thing for French, German...
>
> As a regular UTF-8 user (you can see it by this e-mail :-) I
> personally think that the whole Internet should switch to UTF-8. But
> I'm seeing also not a very good support of this charset, on some
> systems. I'm afraid that some web servers or FTP clients may not
> accept filenames encoded in UTF-8. I hope someone can contradict me!

Actually, I tend to run into the reverse situation, where a number
of operating systems (notably Mac OS X, but also some versions of
Windows) will accept filenames encoded in UTF-8 but not in another
character set.  That's one reason I'm keen to switch.  :-)

> > So, what I'm seeing at the moment is that if we switch to using
> > utf8 by default, admins of existing sites have to be notified
> > somehow that the default has changed and told how to configure
> > the site to continue using iso-8859-1, or given a procedure to
> > somehow convert the site's pages to utf8.  And once someone
> > starts the utf8 conversion, it can get a bit messy to try to
> > convert back.
>
> Yes, I think that a big red label should be in the upgrade
> instructions, with pointer to a recipe or something that explains how
> to convert page text. I don't know about page names, though... :-/

We'd have a recipe to take care of the conversion.  It's not
difficult to write; it's just a pain if any unexpected errors
occur.  The first step would undoubtedly be to ensure a complete
backup of the wiki.d/ directory.  :-)

> I suggest to do the first test with the PmWiki localized
> documentation: that's a good ready-made example of foreign language
> text! :-)

Indeed.

> About how to implement a charset conversion, the only idea I have is
> to use something like html_entity_decode(htmlentities(text)). I'm
> afraid that the filenames' conversion could only be left to each site
> admin.

As I mentioned, the steps of the actual conversion aren't all that
difficult -- PHP provides utf8_encode() and utf8_decode() functions
that automatically convert between iso-8859-1 and utf-8.  The hard
part is knowing _when_ a conversion is needed, and when things should
be left alone.

> These were my two cents.

Thanks, very helpful!

Pm
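
For what it's worth, here is a minimal sketch of one way to approach
the "knowing _when_" problem.  It is written in Python rather than PHP
purely for illustration, and the function name to_utf8 is hypothetical,
not part of PmWiki or any recipe.  It assumes that text which does not
decode as valid UTF-8 is iso-8859-1; that is a heuristic, not a
guarantee, since a few iso-8859-1 byte sequences also happen to be
valid UTF-8.

```python
def to_utf8(raw):
    """Return (utf8_bytes, converted) for page text of unknown charset.

    Hypothetical helper: text that already decodes as UTF-8 is left
    alone; anything else is assumed (not proven) to be iso-8859-1.
    """
    try:
        raw.decode("utf-8")  # pure ASCII also passes this check
        return raw, False
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 input is iso-8859-1, the old default.
        return raw.decode("iso-8859-1").encode("utf-8"), True


if __name__ == "__main__":
    old_page = "perché".encode("iso-8859-1")
    once, changed = to_utf8(old_page)
    twice, changed_again = to_utf8(once)
    # Running the check a second time leaves converted text alone.
    print(changed, changed_again, once == twice)
```

The point of the check is that the conversion can be re-run safely:
text that has already been converted is detected and skipped instead
of being encoded twice.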
