Hi Mikhail
I agree totally with Eike: UTF-8 is the way to go.
As far as I know, most or all other platforms (e.g. Solaris, glibc)
support UTF-8 for all locales.

A good example of the advantage of this is what happened during the Euro
switchover. ISO 8859-1 doesn't have the Euro symbol, so any platform
using this charset had to change to one that did.
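
To make the Euro point concrete, here is a small Python sketch (mine, not
anything from the OOo tree; the helper name try_encode is made up for
illustration). It only checks whether a charset can represent the Euro
sign at all:

```python
# Illustrative only: check which charsets can represent the Euro sign.
EURO = "\u20ac"  # EURO SIGN


def try_encode(text, charset):
    """Return the encoded bytes, or None if the charset cannot represent text."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None


latin1 = try_encode(EURO, "iso-8859-1")   # no code point for the Euro sign
latin9 = try_encode(EURO, "iso-8859-15")  # the charset many platforms moved to
utf8 = try_encode(EURO, "utf-8")          # UTF-8 can encode any Unicode character

for name, result in (("iso-8859-1", latin1),
                     ("iso-8859-15", latin9),
                     ("utf-8", utf8)):
    print(name, "->", result)
```

With UTF-8 a newly added symbol needs no charset switch at all, which is
the future-proofing I mean.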

UTF-8 future-proofs us, and it also makes maintenance easier, as Eike said.
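
As a rough illustration of the maintenance hazard Eike describes (a value
copied between files that use different 8-bit charsets), here is another
Python sketch. The sample character and the helper name decode_or_none are
my own assumptions, not taken from the locale data files:

```python
# Illustrative only: the same byte means different things in different
# 8-bit charsets, so a copied value can change silently.
original = "\u00f1"                   # LATIN SMALL LETTER N WITH TILDE
raw = original.encode("iso-8859-1")   # a single byte in ISO 8859-1


def decode_or_none(data, charset):
    """Return the decoded text, or None if data is invalid in that charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


as_latin1 = decode_or_none(raw, "iso-8859-1")  # round-trips to the original
as_koi8r = decode_or_none(raw, "koi8-r")       # decodes, but to another letter
as_utf8 = decode_or_none(raw, "utf-8")         # rejected: not valid UTF-8

# With a single agreed encoding the round trip is always safe.
utf8_round_trip = original.encode("utf-8").decode("utf-8")
```

The 8-bit mismatch goes unnoticed because every byte is valid in both
charsets, whereas a stray 8-bit byte in a UTF-8 file is rejected outright.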

Peter


Eike Rathke wrote:
> Hi Mikhail,
> 
> On Wed, Feb 02, 2005 at 12:26:15 -0500, Mikhail Teterin wrote:
> 
> 
>>[Dear developers! This is my conversation with Eike regarding an encoding
>> used for the translation files in OOo.
> 
> 
> To clarify: this was about i18npool's *.xml locale data files, not
> resource files.
> 
> 
>> I'm advocating the use of 8-bit native
>> charsets, while Eike insists on using UTF-8 for all. Eike suggested, I take
>> this to your list.]
>>
>>
>>>>So, the same for computers, but harder for people. Sounds like my way is
>>>>better. UTF-8 only makes sense, when charsets need to be mixed -- not in
>>>>this case.
>>
>>>Changing encodings would also make use of ref="..." references harder,
>>>one would always have to check that encodings match, and changing
>>>encoding of one file might affect others, which is not a desirable
>>>situation.
>>
>>Sorry, I don't understand this. Can you explain?
> 
> 
> The locale data files use a ref="..." mechanism to refer to data of other
> locales, for example the gl_ES.xml contains
> 
> <LC_CTYPE ref="es_ES"/>
> <LC_COLLATION ref="en_US"/>
> <LC_SEARCH ref="en_US"/>
> <LC_INDEX ref="es_ES"/>
> <LC_CURRENCY ref="es_ES"/>
> <LC_TRANSLITERATION ref="en_US"/>
> <LC_NumberingLevel ref="en_US"/>
> <LC_OutLineNumberingLevel ref="en_US"/>
> 
> Now if gl_ES.xml and en_US.xml or es_ES.xml used different encodings
> this might not work anymore if also replaceTo="..." was used (it isn't
> in the case of gl_ES) and the maintainer copied an encoded
> replaceFrom="..." value from the referred file without noticing it was
> a different encoding. This may sound hypothetical but it is possible and
> can be prevented by sticking to one encoding only.
> 
> 
> 
>>>>The uniformity here is hardly advantageous -- these files are, by their
>>>>very nature, maintained by different people,
>>>
>>>which in itself, viewed in context of ref="..." uses, almost forbids any
>>>other encoding than UTF-8
>>
>>Why? Western Europeans will use iso8859-1, Eastern -- some KOI8 derivative, 
>>etc. They will almost never need to cooperate -- within one file
> 
> 
> Yes, almost never. Which makes it a perfect candidate for always being
> prepared for it.
> 
> 
> 
>>>Installation of the GNU recode package should be always possible, even
>>>on the oldest machine.
>>
>>Everything is possible, of course. I maintain that gratuitous use of UTF is
>>inelegant -- if the file format allows sticking to 8-bit encodings, using a
>>multibyte one is wrong.
> 
> 
> Now please take a look at my situation as a maintainer of all these
> files: if I had to switch back and forth between encodings for
> each and every file I edit, it would soon annoy me.
> 
> 
>>If I can not `vi' it, it ain't a text-file :-)
> 
> 
> Use vim, that handles utf-8 ;-)
> 
>   Eike
> 
> P.S.: Please consider subscribing to the mailing lists you're posting to.
> By doing so you won't miss replies that are directed to the list only.
> Please reply only to the list, not to my personal account. Thanks.
> 

-- 
Peter Nugent,
Software Engineer,
Sun Microsystems Ireland Ltd,
Hamilton House,
East Point Business Park,
Dublin 3,
Ireland.
Tel +353.1.8199522
Email: [EMAIL PROTECTED]

