--- Christian Biesinger <[EMAIL PROTECTED]> wrote:
> Andrew Dunbar wrote:
> > In this case we're converting
> > from ISO-8859-X to UTF-8, then converting from
> > ISO-8859-X to UTF-8 again, getting the source
> > encoding wrong the second time. Argh! ):
>
> Yeah, this is what we're doing. (though the first
> time, it might've been from utf-8 or whatever
> the .strings file is in. anyway, that code is
> correct).
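
To make the double conversion described above concrete, here is a minimal sketch of what happens to a single accented character. It is not AbiWord code: Python is used only for brevity, the helper name latin1_to_utf8 is hypothetical, and ISO-8859-1 is assumed as the concrete "ISO-8859-X".

```python
# Hypothetical illustration of the double-conversion bug; not AbiWord code.
original = "caf\u00e9"  # "cafe" with an acute accent on the final e

# The text as it would be stored in an ISO-8859-1 source.
latin1_bytes = original.encode("iso-8859-1")  # b'caf\xe9'


def latin1_to_utf8(data: bytes) -> bytes:
    """Convert bytes that are assumed to be ISO-8859-1 into UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


# First conversion: the assumed source encoding is correct.
once = latin1_to_utf8(latin1_bytes)

# Second conversion: the input is already UTF-8 but is treated as
# ISO-8859-1 again, so each byte of the two-byte sequence is re-encoded.
twice = latin1_to_utf8(once)

print(once)   # b'caf\xc3\xa9'          (valid UTF-8 for the original text)
print(twice)  # b'caf\xc3\x83\xc2\xa9'  (displays as "caf" + U+00C3 + U+00A9)
```

The second result is still valid UTF-8, so nothing fails; the text just displays as an A-tilde followed by a copyright sign in place of the accented letter.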

The strings file can be in any encoding, but I believe
Dom has had them, or asked for them, all to be changed
to UTF-8. Regardless, the XML parsing code *always*
returns the strings in UTF-8 so that callers never
need to think about encodings.

> Anyway, so the question is: What charset is the
> string passed to setStatusMessage (the char*
> version) in?

On which platform? I'm sorry, but I don't really have
access to the source right now.

> That function wants an UCS-4 string and has a char*.

It would seem odd for an AbiWord GUI function to want
UCS-4. I would've thought document functions would
want UCS-4 and GUI functions would want UTF-8. Win32
Unicode functions want UCS-2/UTF-16, but that doesn't
seem to be what you're asking.

> If it is always UTF-8, that function could just use
> UT_convert (I think that's what it's called, might
> be UT_iconv, can't remember) from UTF-8 to UCS-4 and
> everyone would be happy. Alternatively, if it's
> always XAP_App::getDefaultEncoding() that would be
> fine too, because that could be used instead of
> UTF-8.

Well, as always, I would recommend tracing through the
code to find out the correct answer. But my assumption
is that we are passed .strings values in UTF-8 and
that XP GUI functions ought to take UTF-8, since GTK2,
QNX, BeOS, OS X (and KDE), and Pango all use UTF-8 for
their GUI strings. Windows is the exception and so
ought to be handled in the Win32 layer. This may not
reflect the current code though.

> Now, before I dig into the code, does anyone know
> what encoding the strings passed to setStatusMessage
> are supposed to be in?

I don't know what they are, but they should be UTF-8
unless somebody can present a good argument otherwise.

> -biesi, who really wishes Abiword would use a string
> class for _all kinds of strings_ which also stored
> the string's encoding.

I actually started work on this way back when, but
there were just too many places where people were
making strings without knowing or caring what encoding
they were in, and the stored encoding name just ended
up being wrong half the time. What we really need is
one encoding used internally at all times, converted
to clearly stated encodings at the endpoints where the
various GUIs, APIs, etc. need it.

Andrew.

> --
> Fiat iustitia, pereat mundus.
>

=====
http://linguaphile.sourceforge.net/cgi-bin/translator.pl
http://www.abisource.com
