bkml wrote:
> 
> On Mar 6, 2006, at 4:26 AM, Peter Nixon wrote:
> 
>> I'm with Daniel on this one. We most definitely should use the correct, 
>> internationally accepted codes. To do anything else is simply insanity.
> 
> I didn't say we shouldn't use them.
> 
> The issue is about which abstraction layer should use technical terms, 
> and which abstraction layer should use plain language.

So, given your examples of "US-English" and "JP-Nihongo", why are you 
using an ISO 3166-1 country code, but not an ISO 639-1 language code? 
Surely "JP" is just as ambiguous to us dumb humans as "JA" would be for 
the language.

JP-Japanese, or JP-Nihongo?
RU-Russian, or RU-Russkiy? (or even "Russkiy" written in Cyrillic, which 
I won't type here, because believe it or not, not all mail readers will 
handle Unicode. How's that for a crazy idea?)

So are you just gonna anglicise the names of the languages? Or maybe 
transliterate them (don't forget that there are usually several methods 
of transliteration for some languages - certainly for Russian). Maybe we 
should just completely bastardise them instead.

I really don't understand your objection to using simple codes like 
en_US, ja_JP, de_DE, de_CH etc. If English weren't my native language, 
and particularly if the Latin alphabet weren't my native alphabet, I'd 
be quite annoyed at having to write the name of my language in 
romanised/transliterated form, using Latin characters. Far more annoyed 
than I would be at having to consult an internationally recognised 
table of language codes.
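
For what it's worth, handling such a code is about ten lines of plain 
C. A rough sketch (split_locale is my own made-up name, nothing from 
the OpenPBX tree) that pulls the ISO 639-1 language part and the 
ISO 3166-1 country part out of an ll_CC code:

    #include <ctype.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical helper, not OpenPBX code: split "ll_CC" into its
     * ISO 639-1 language part and ISO 3166-1 country part.
     * Returns 0 on success, -1 if the input is not in ll_CC form. */
    static int split_locale(const char *locale,
                            char lang[3], char country[3])
    {
        if (strlen(locale) != 5 || locale[2] != '_')
            return -1;
        if (!islower((unsigned char)locale[0]) ||
            !islower((unsigned char)locale[1]))
            return -1;
        if (!isupper((unsigned char)locale[3]) ||
            !isupper((unsigned char)locale[4]))
            return -1;
        lang[0] = locale[0]; lang[1] = locale[1]; lang[2] = '\0';
        country[0] = locale[3]; country[1] = locale[4]; country[2] = '\0';
        return 0;
    }

    int main(void)
    {
        char lang[3], country[3];
        if (split_locale("ja_JP", lang, country) == 0)
            printf("language=%s, country=%s\n", lang, country);
        return 0;
    }

Plain ASCII in, plain ASCII out; none of the existing string handling 
has to change.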

I know from my time living abroad in St Petersburg (where lots of 
people don't speak English, let alone know the English alphabet) and 
Berlin (where they kindly convert a/o/u with umlauts to ae/oe/ue when 
corresponding with English speakers) that you're not going to win 
friends by imposing your ideals of written language upon them.

It's one thing to expect a non-english speaking person to know a few 
two-letter codes (or where to find them). It's a whole different thing 
to expect them to be fluent in romanisation and/or transliteration of 
their language.

This thread is starting to get way off the original topic, so I'll 
summarise for any late-comers.

The only way you can be truly sensitive to foreign languages is to 
write them in their native script. That would mean adopting Unicode, 
in one of its encodings: UTF-8, UTF-16, or the very wasteful UTF-32. 
Depending on which of them you use, this would require major 
retrofitting of all the string handling in OpenPBX that has to deal 
with those config files. You gave an example of the console spitting 
out the transcript of whatever file it was playing, in verbose mode. 
What character encoding are you planning to use for that, if the 
transcript uses non-Latin characters?

In ANSI C, the only Unicode encoding that is safe to use with 
functions like strncmp and strncpy is UTF-8, since it is byte-oriented 
and never puts a NUL byte inside a character. Even then, there may 
still be code that assumes one letter = one byte, which is not the 
case for Asian glyphs.
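
To make that one-letter-one-byte point concrete, here's a toy counter 
(again my own sketch, not OpenPBX code) that walks a UTF-8 string and 
counts code points by skipping continuation bytes:

    #include <stdio.h>
    #include <string.h>

    /* Toy example, assumes valid UTF-8: count code points by skipping
     * continuation bytes, which all match the bit pattern 10xxxxxx. */
    static size_t utf8_strlen(const char *s)
    {
        size_t count = 0;
        for (; *s; s++) {
            if (((unsigned char)*s & 0xC0) != 0x80)
                count++;
        }
        return count;
    }

    int main(void)
    {
        /* UTF-8 bytes for the five hiragana of "konnichiwa" */
        const char *hello = "\xE3\x81\x93\xE3\x82\x93\xE3\x81\xAB"
                            "\xE3\x81\xA1\xE3\x81\xAF";
        printf("bytes: %lu, code points: %lu\n",
               (unsigned long)strlen(hello),
               (unsigned long)utf8_strlen(hello));
        return 0;
    }

strlen reports 15 bytes there, but it's only 5 characters. Any code 
that equates the two will chop Asian text in half.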

I really, really strongly advise against throwing Unicode into the mix 
at this stage. Use Unicode in your GUIs if you like. But use recognised 
standards like ISO 3166-1 and ISO 639-1 internally. It will save you a 
lot of headaches.