--- Christian Biesinger <[EMAIL PROTECTED]> wrote:
> Hi,

Hi Christian.

> so you may remember that some time ago, I checked in
> a patch to change the encoding that AP_DiskStringSet
> uses to whatever XAP_App::getDefaultEncoding uses
> (or something like that, can't remember what exactly
> I did :) ).
>
> Anyway, it looks like this broke non-US-ASCII
> characters in the statusbar, because of this piece
> of code in ap_StatusBar.cpp, line 493, in
> AP_StatusBar::setStatusMessage(const char * pBuf,
> int redraw)
> UT_UCS4_strcpy_char(bufUCS,pBuf);
>
> That function just uses the encoding that the
> default constructor of mbtowc thinks is good as the
> source encoding. That seems to be ISO-8859-1 for me.
> However, due to the patch I mentioned above, that
> string is already in UTF-8.
> This means that the statusbar will not display
> special characters (like, but not limited to, german
> umlauts) correctly. Instead, it will show characters
> looking like undecoded UTF-8 (like Ì)
>
> So... the question is:
> What's the best way for fixing this?
> Should UT_UCS4_strcpy_char take an additional (maybe
> optional) argument, specifying the charset to
> convert from? AP_StatusBar would pass the result of
> XAP_App::getDefaultEncoding to it, and this would
> work...

There is only 1 way to fix this. We are software
engineers here. Guesswork is not and has not ever been
an integral part of what engineers do. What we should
do is *find out* the *correct encoding* for the
destination we send a string to, *always*, and use that
encoding.

I've said it before and I'll say it again, having a
default constructor for mbtowc and wctomb is just
begging for bugs. There should never be a time when we
convert an encoding without knowing what encoding we
want. Would you go to a money changer without knowing
what currency or exchange rate you want? How on earth
we're supposed to do better than Microsoft when we
leave these things open to chance time after time is
completely beyond me.
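
To make the failure concrete, here is a minimal sketch of a
conversion that cannot be called without naming the source
encoding. Everything in it is an assumption for illustration: the
helper to_ucs4 and the enum SourceEncoding are hypothetical and do
not exist in AbiWord, and the UTF-8 decoder is deliberately
simplified (it checks lead and continuation bytes but does not
reject overlong forms or surrogates).

```cpp
#include <stdexcept>
#include <string>

// Hypothetical tag for the encoding of the input bytes (not an AbiWord type).
enum class SourceEncoding { Latin1, Utf8 };

// Decode a byte string into UCS-4 code points.
// There is deliberately no default for `enc`: the caller has to say
// what the bytes are, so the conversion never guesses.
std::u32string to_ucs4(const std::string& bytes, SourceEncoding enc) {
    std::u32string out;

    if (enc == SourceEncoding::Latin1) {
        // ISO-8859-1 maps each byte directly to the same code point.
        for (unsigned char c : bytes) {
            out.push_back(static_cast<char32_t>(c));
        }
        return out;
    }

    // Simplified UTF-8 decoding (no overlong or surrogate checks).
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (b0 < 0x80) {
            cp = b0;
        } else if ((b0 & 0xE0) == 0xC0) {
            cp = b0 & 0x1F;
            extra = 1;
        } else if ((b0 & 0xF0) == 0xE0) {
            cp = b0 & 0x0F;
            extra = 2;
        } else if ((b0 & 0xF8) == 0xF0) {
            cp = b0 & 0x07;
            extra = 3;
        } else {
            throw std::invalid_argument("invalid UTF-8 lead byte");
        }

        if (i + extra >= bytes.size()) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (b & 0x3F);
        }

        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

Given the two bytes 0xC3 0xBC (UTF-8 for ü), decoding as Utf8
yields the single code point U+00FC, while decoding the same bytes
as Latin1 yields U+00C3 and U+00BC. The second result is the kind
of "undecoded UTF-8" Christian reports in the statusbar.
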
So maybe some people think encoding is a hard problem -
in that case look through the code or ask on the list
before writing code and committing it when it's all
based on guesswork. Sorry I got into a rant (:

Now the encoding needed by the status bar will depend
on the OS. There should be functions in the
EncodingManager these days to give the encoding of the
OS and the encoding of the GUI. I think the GUI
encoding is currently covered by something like
defaultSystemEncoding.

Experience on XP code has shown that the user can often
set an encoding for himself. On Unix this is via the
$LANG environment variable.

The system will usually have an encoding it likes to
use for its own stuff. This varies from system to
system. On QNX, BeOS, and OS X this seems to be UTF-8.
On Windows this can be set in the Control Panel right
next to where the user can set his preferred locale.
There are APIs to get both. In a Win32 Unicode build
(which we don't yet support but which we need), this
will always be UCS-2 or UTF-16.

With the old Gnome and GTK, the GUI used an ISO
encoding, maybe depending on the default language.
With the new Gnome and GTK, the GUI *always* uses
UTF-8. So the statusbar also must use UTF-8.

Perhaps it is now a good idea to add a new GUIEncoding
to the other encodings in the EncodingManager to make
it more obvious which one to use - especially since it
appears with new GTK/Gnome that it may be different
from the system encoding.

Sorry for grumbling. We still have encoding problems
popping up relatively frequently and also have wrong
fixes going in fairly often. I hope this has gone a
little way toward clearing up some of the confusion and
should at least shed light on solving this one
immediate problem.

Andrew Dunbar.
Mr i18n (:

> Other ideas?
>
> (Should I put this in bugzilla instead?)

=====
http://linguaphile.sourceforge.net/cgi-bin/translator.pl
http://www.abisource.com
