On Thu, 19 Feb 2009 21:38:04 +0100 Sam Geeraerts <[email protected]> wrote:
> Karl Goetz wrote:
> > On Sun, 15 Feb 2009 11:41:36 +0100
> > Sam Geeraerts <[email protected]> wrote:
> >
> >>> There's also a PmWiki recipe to convert input on the fly [2], but
> >>> I think it's only useful if the content is already in UTF-8. It
> >>> seems intended to catch input from a browser that is forced to
> >>> another encoding (or one that can't handle UTF-8).
> >>>
> >>> [1] http://www.pmwiki.org/wiki/Cookbook/UTF-8
> >>> [2] http://www.pmwiki.org/wiki/Cookbook/UTF8Conv
> >>>
> >
> > We seem to have two options with PmWiki when it comes to charset to
> > use. Here's a snippet from our config:
> >
> > $WikiTitle = 'PmWiki';
> > $Charset = 'ISO-8859-1';
> > $HTTPHeaders = array(
> >   "Expires: Tue, 01 Jan 2002 00:00:00 GMT",
> >   "Cache-Control: no-store, no-cache, must-revalidate",
> >   "Content-type: text/html; charset=ISO-8859-1;");
> > $CacheActions = array('browse','diff','print');
> >
> > I can change either or both of these, but I'm not sure what the
> > consequences would be ...
> > kk
> >
>
> Grmbl, Charset is not documented (yet) [1]. I would have added a
> placeholder as suggested, but I'm not sure if I'm supposed to do that
> in [1] or in [2].

Thanks for doing this research for us.

> Anyway, I grepped through the code and it looks like Charset is the
> encoding in the meta-element (or xml declaration). So both Charset
> and HTTPHeaders should be changed after a conversion. I don't know
> much about PHP, but it seems more sensible to reuse Charset in
> HTTPHeaders. If that is valid then a bug report is in order.

It would make more sense.

> I also stumbled upon some interesting comments to consider before
> using UTF-8 (in scripts/xlpage-utf-8.php):
>
> This script configures PmWiki to use utf-8 in page content and
> pagenames. There are some unfortunate side effects about PHP's
> utf-8 implementation, however. First, since PHP doesn't have a
> way to do pattern matching on upper/lowercase UTF-8 characters,
> WikiWords are limited to the ASCII-7 set, and all links to page
> names with UTF-8 characters have to be in double brackets.
> Second, we have to assume that all non-ASCII characters are valid
> in pagenames, since there's no way to determine which UTF-8
> characters are "letters" and which are punctuation.

That sounds like a problem. Guess we'll have to look into encodings
that are not utf-8.

I have a test wiki here, but it's a different version to the current
live site (I'll need to set up a copy of the live site before testing
all these settings out).
kk

> [1] http://pmwiki.org/wiki/PmWiki/Variables
> [2] http://pmwiki.org/wiki/PmWiki/I18nVariables

-- 
Karl Goetz, (Kamping_Kaiser / VK5FOSS)
Debian user / gNewSense contributor
http://www.kgoetz.id.au
No, I won't join your social networking group
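
For reference, the idea quoted above of reusing Charset in HTTPHeaders could be sketched roughly as follows. This is an untested config fragment, not something confirmed against the PmWiki code: it assumes the settings live in a local config file that PHP evaluates in global scope, and that nothing later in PmWiki overrides the Content-type header. The other values are copied from the snippet quoted above.

```php
<?php
// Hypothetical config excerpt: an untested sketch, not PmWiki's documented method.
$WikiTitle = 'PmWiki';

// Set the encoding once. Change it only after the page content itself
// has been converted to the new encoding.
$Charset = 'ISO-8859-1';

$HTTPHeaders = array(
  "Expires: Tue, 01 Jan 2002 00:00:00 GMT",
  "Cache-Control: no-store, no-cache, must-revalidate",
  // Double-quoted strings interpolate variables in PHP, so this header
  // picks up whatever $Charset is set to above.
  "Content-type: text/html; charset=$Charset;");

$CacheActions = array('browse','diff','print');
```

With this arrangement only the $Charset line would need editing after a conversion, so the HTTP header and the meta-element should not drift apart.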
_______________________________________________
gNewSense-users mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/gnewsense-users
