Hi Reshat,

On Wed, Aug 30, 2006 at 15:11:13 -1000, "Reshat Sabiq (Reşat)" wrote:

> > Note that currently modifiers or ISO 15924 script codes as drafted
> > in RFC 3066bis are not supported yet. Furthermore I didn't find any
> > normative reference to 'iqte' regarding the Tatar language.
> If the modifiers are expected to be supported in the future, then
> perhaps absence of their support now shouldn't prevent the initiation
> of the project, as it is guaranteed to take a while.

Of course not, and as was already mentioned, creation of a native
language _site_ should normally be targeted to the language,
independent of the script it is written in. The concerns I expressed
are merely from a developer's point of view and are affected by
availability and support of _locales_, not just languages, because
that is what the code mostly deals with. There are technical
restrictions we currently have to live with until a considerable part
of the implementation has been changed to support script values, and
maybe variants and extensions, in locale designators according to the
upcoming RFC 3066bis. This will most certainly not happen in the very
near future.

> As far as iqte, yes i realize that it is not standardized. I was
> hoping that practices would be similar to the "freeform" qualification
> in the following statement on mozilla.org:
> "In some rare cases, we might need the dialect part as a third part
> (3- to 8-letter basically freeform part), we currently can imagine two
> cases there: ..."
> http://wiki.mozilla.org/L10n:Simple_locale_names

This three-part notation, language-region-dialect or
language-region-variant, is currently not supported by OOo.

As a side note, not directly related to the Tatar script problems, I'm
also not convinced that the approach taken by Mozilla in the case of

| 1. there's no ISO 639.2 code for some language that wants to do
| a localization
| (e.g. for Venetian Firefox team). In this case, we
| can use the generic
| identifier for the language family (romance:
| roa) from ISO 639.2 as the
| language code, and add an identifier for
| the specific language as the dialect
| (if one exists, we prefer to
| use the 3-letter SIL code). In the case of
| venetian, we end up with
| "roa-IT-vec" this way.

would actually be a good solution for document exchange. A language
_family_ designator is a collective code, so using it instead of a
real language designator could fool matching and fallback mechanisms.
Instead, for the rare cases where it would be necessary, I'd go for
ISO/DIS 639-3 where available. This would not strictly comply with
RFC 3066, because ISO/DIS 639-3 didn't exist at the time it was
written, so it mentions only 639-1 and 639-2 (and RFCs are not based on
drafts like a DIS anyway), but it would follow the same pattern of
<language>-<region>. This is also what Ethnologue recommends nowadays,
see http://www.ethnologue.com/codes/default.asp#standards

RFC 3066bis then would allow for scripts, variants and extensions, and
a subsequent RFC 3066ter will hopefully adopt the then ISO 639-3 codes
for primary language tags.

> > If RFC 3066bis was implemented, Tatar written in Latin script would
> > be 'tt-Latn', this would be independent of a region code. However,
> > to form a proper locale a region code is needed, with a few
> > exceptions we agreed on, Esperanto or Interlingua for example.
> Unfortunately, it's not so simple for Idil-Ural (Qazan) Tatar. There
> are currently about 6 different Latin alphabets in use,

oerghs..

> [...]
> I'm a believer in IQTElif because it is the only alphabet that
> provides similar orthography w/ Crimean Tatar (crh). Given the
> plethora of Latin alphabets, the iqte modifier was to make clear what
> alphabet is used.

All your points are taken; it is just not possible to use this with our
current implementation.

> P.P.S. If we made a team for tt alone, w/ no modifiers, frankly i
> wouldn't know what it would represent. Would it be Cyrillic, or Latin?

What is (a) the most widely used script and (b) the official form? The
official one is Cyrillic, as far as I have understood the matter, so
efforts should probably go in that direction.

> P.P.P.S. Thanks to Crimean Tatars and Ukraine, crh has none of these
> questions, and has one decent Latin alphabet. I guess we could start
> by making a team for crh, and i look forward to feedback on the rest.

At least 'crh' is a valid ISO 639-2 code..

  Eike

-- 
 PGP/OpenPGP/GnuPG encrypted mail preferred in all private communication.
 Key ID: 0x293C05FD - 997A 4C60 CE41 0149 0DB3 9E96 2F1A D073 293C 05FD
