Hi Mikhail,

I agree totally with Eike: UTF-8 is the way to go. As far as I know, most or all other platforms (e.g. Solaris, glibc) use UTF-8 for all locales.
A good example of the advantage of this is what happened at the Euro switchover. ISO 8859-1 doesn't have the Euro symbol, so any platform using this charset had to change to one which did. UTF-8 future-proofs us, and it also makes maintenance easier, as Eike said.

Peter

Eike Rathke wrote:
> Hi Mikhail,
>
> On Wed, Feb 02, 2005 at 12:26:15 -0500, Mikhail Teterin wrote:
>
>>[Dear developers! This is my conversation with Eike regarding an encoding used
>> for the translation files in OOo.
>
> To clarify: this was about i18npool's *.xml locale data files, not
> resource files.
>
>> I'm advocating the use of 8-bit native
>> charsets, while Eike insists on using UTF-8 for all. Eike suggested, I take
>> this to your list.]
>>
>>>>So, the same for computers, but harder for people. Sounds like my way is
>>>>better. UTF-8 only makes sense, when charsets need to be mixed -- not in
>>>>this case.
>>
>>>Changing encodings would also make use of ref="..." references harder,
>>>one would always have to check that encodings match, and changing
>>>encoding of one file might affect others, which is not a desirable
>>>situation.
>>
>>Sorry, I don't understand this. Can you explain?
>
> The locale data files use a ref="..." mechanism to refer data of other
> locales, for example the gl_ES.xml contains
>
> <LC_CTYPE ref="es_ES"/>
> <LC_COLLATION ref="en_US"/>
> <LC_SEARCH ref="en_US"/>
> <LC_INDEX ref="es_ES"/>
> <LC_CURRENCY ref="es_ES"/>
> <LC_TRANSLITERATION ref="en_US"/>
> <LC_NumberingLevel ref="en_US"/>
> <LC_OutLineNumberingLevel ref="en_US"/>
>
> Now if gl_ES.xml and en_US.xml or es_ES.xml used different encodings
> this might not work anymore if also replaceTo="..." was used (it isn't
> in the case of gl_ES) and the maintainer copied an encoded
> replaceFrom="..." value from the referred file without noticing it was
> a different encoding. This may sound hypothetical but it is possible and
> can be prevented by sticking to one encoding only.
>
>>>>The uniformity here is hardly advantageous -- these files are, by their
>>>>very nature, maintained by different people,
>>>
>>>which in itself, viewed in context of ref="..." uses, almost forbids any
>>>other encoding than UTF-8
>>
>>Why? Western Europeans will use iso8859-1, Eastern -- some KOI8 derivative,
>>etc. They will almost never need to cooperate -- within one file
>
> Yes, almost never. Which makes it a perfect candidate for always be
> prepared for it.
>
>>>Installation of the GNU recode package should be always possible, even
>>>on the oldest machine.
>>
>>Everything is possible, of course. I maintain, that gratuitious use of UTF is
>>inelegant -- if the file format allows to stick to 8-bit encodings, using a
>>multibyte one is wrong.
>
> Now please take a look at my situation as a maintainer of all these
> files, if I would have to switch back and forth between encodings for
> each and every file I edit it would soon annoy me.
>
>>If I can not `vi' it, it ain't a text-file :-)
>
> Use vim, that handles utf-8 ;-)
>
> Eike
>
> P.S.: Please consider to subscribe to the mailing lists you're posting to.
> By doing so you won't miss replies that are directed to the list only.
> Please reply only to the list, not to my personal account. Thanks.
>

--
Peter Nugent, Software Engineer,
Sun Microsystems Ireland Ltd,
Hamilton House, East Point Business Park, Dublin 3, Ireland.
Tel +353.1.8199522
Email: [EMAIL PROTECTED]
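
To make the two encoding points in this thread concrete (the Euro sign missing from ISO 8859-1, and a value copied between files that use different 8-bit charsets), here is a minimal Python sketch. It is only an illustration: the helper name can_encode and the sample string are made up for this note and are not taken from the i18npool locale data or its tooling.

```python
# Illustrative sketch only: the helper and sample text are hypothetical,
# not part of the i18npool locale data files or their build tools.

def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

euro = "\u20ac"  # EURO SIGN

# ISO 8859-1 has no Euro sign; ISO 8859-15 and UTF-8 do.
latin1_ok = can_encode(euro, "iso-8859-1")
latin9_ok = can_encode(euro, "iso-8859-15")
utf8_ok = can_encode(euro, "utf-8")

# A value written in a KOI8-R file, then pasted into a file that is
# read as ISO 8859-1: the bytes are identical, the characters are not.
original = "\u0440\u0443\u0431"           # three Cyrillic letters
raw = original.encode("koi8-r")           # bytes as stored in the KOI8-R file
misread = raw.decode("iso-8859-1")        # decodes without error, wrong text
same_charset = raw.decode("koi8-r")       # correct only with the right charset

print(latin1_ok, latin9_ok, utf8_ok)
print(misread == original, same_charset == original)
```

Note that the mismatched decode raises no error, which is why a copied value can go unnoticed; with every file in UTF-8 there is only one way to read the bytes.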
