On 04/21/10 10:10, Caolán McNamara wrote:
So..., how about we adopt a BCP-47 based approach. i.e.

a) Where we are currently describing locales as a string in "iso-format"
we use BCP-47. Currently valid locale strings get to remain valid.

"get to remain valid": Do all currently valid locale strings already adhere to the BCP-47 restrictions, so that they are automatically valid BCP-47 strings? Or do some currently valid locale strings violate those restrictions, so that we would have to bend BCP-47 to keep them valid?

b) Where we use a Locale structure, Language and Country stay the same,
but we specify a format for the remaining Variant field where it is a
BCP-based sequence of tags separated by '-'. The Variant field becomes
the equivalent BCP-47 locale string for the totality, minus the language
and region tags, plus that the first tag entry *must* be a Script Code
to ensure forward and backward conversion to an unambiguous BCP-47
string. In this scheme the script tag at the start of the Variant can
(and must) be empty to denote the default script.
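
To make scheme b) concrete, here is a minimal Python sketch of the round trip between a Language/Country/Variant structure and a BCP-47 string. The names Locale, locale_to_bcp47 and bcp47_to_locale are hypothetical and not taken from any existing code. The sketch assumes that a locale with neither script nor further subtags keeps an empty Variant, and it ignores extlang, grandfathered and private-use-only tags.

```python
from typing import NamedTuple


class Locale(NamedTuple):
    language: str
    country: str
    variant: str  # "<script>-<rest>"; the script part may be empty


def _is_alpha(s: str) -> bool:
    return s.isascii() and s.isalpha()


def _is_digit(s: str) -> bool:
    return s.isascii() and s.isdigit()


def locale_to_bcp47(loc: Locale) -> str:
    # The first '-'-separated entry of Variant is the (possibly empty) script.
    script, _, rest = loc.variant.partition("-")
    parts = [loc.language, script, loc.country, rest]
    return "-".join(p for p in parts if p)


def bcp47_to_locale(tag: str) -> Locale:
    subtags = tag.split("-")
    language, rest = subtags[0], subtags[1:]
    script = country = ""
    # script = 4ALPHA
    if rest and len(rest[0]) == 4 and _is_alpha(rest[0]):
        script = rest.pop(0)
    # region = 2ALPHA / 3DIGIT
    if rest and ((len(rest[0]) == 2 and _is_alpha(rest[0]))
                 or (len(rest[0]) == 3 and _is_digit(rest[0]))):
        country = rest.pop(0)
    # Assumption: no script and no remaining subtags means an empty Variant.
    variant = "-".join([script] + rest) if (script or rest) else ""
    return Locale(language, country, variant)
```

Under these assumptions "sr-Latn-RS" maps to Variant "Latn", and "ca-ES-valencia" maps to Variant "-valencia", where the leading '-' marks the empty (default) script.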

Is the requirement "that the first tag entry *must* be a Script Code to ensure forward and backward conversion to an unambiguous BCP-47 string" really necessary? A <langtag> w/o <language> and <region> parts would be

  [script] *("-" variant) *("-" extension) ["-" privateuse]

where the syntactic forms allowed for <script> are disjoint from those allowed for <variant>, <extension>, and <privateuse>.
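
To illustrate the disjointness, here is a small Python sketch that classifies the leading subtag of such a remainder using the RFC 5646 productions (script = 4ALPHA, variant = 5*8alphanum / DIGIT 3alphanum, extension starting with a singleton other than "x", privateuse starting with "x"). The function name classify_leading_subtag is hypothetical, and only the first subtag of each production is checked.

```python
import re

# Patterns for the *first* subtag of each production in RFC 5646.
LEADING_SUBTAG = {
    "script": re.compile(r"[A-Za-z]{4}"),
    "variant": re.compile(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}"),
    "extension": re.compile(r"[0-9A-WY-Za-wy-z]"),  # singleton, never "x"
    "privateuse": re.compile(r"[xX]"),
}


def classify_leading_subtag(subtag: str) -> list[str]:
    """Return every production whose first subtag could be `subtag`."""
    return [name for name, pattern in LEADING_SUBTAG.items()
            if pattern.fullmatch(subtag)]
```

For example, "Latn" is classified only as a script, "valencia" and "1996" only as variants, "u" only as an extension singleton, and "x" only as private use, so a parser can tell from the first subtag alone whether a script is present.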

In any event, we would also need rules for how to translate between the <privateuse> and <grandfathered> variations of <Language-Tag> and Language/Country/Variant locales.

Parsers that want to convert a Unix Locale into the above structure can
take, e.g.
aa_er.ut...@saaho

and make it into

Language = aa
Country = ER
Variant = -.ut...@saaho

to give a reversible scheme where the original Unix Locale string can be
reconstructed
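
The reversible mapping described above could look roughly like the following Python sketch. The function names unix_to_locale and locale_to_unix are hypothetical, the input is assumed to have the usual language[_territory][.codeset][@modifier] shape, and the locale string in the usage note is an illustrative one rather than the example from this mail.

```python
def unix_to_locale(unix_locale: str) -> tuple[str, str, str]:
    """Split a Unix locale string into (Language, Country, Variant)."""
    # Everything from the first '.' or '@' onwards is kept verbatim.
    cut = min((i for i in (unix_locale.find("."), unix_locale.find("@"))
               if i != -1), default=len(unix_locale))
    head, tail = unix_locale[:cut], unix_locale[cut:]
    language, _, country = head.partition("_")
    # Leading '-' stands for the empty (default) script entry.
    variant = "-" + tail if tail else ""
    return language, country, variant


def locale_to_unix(language: str, country: str, variant: str) -> str:
    """Rebuild the Unix locale string from a triple made by unix_to_locale."""
    # Only handles Variants with an empty script entry, as produced above.
    assert variant == "" or variant.startswith("-")
    head = language + ("_" + country if country else "")
    return head + variant[1:]
```

With this sketch, "sr_RS.UTF-8@latin" becomes Language "sr", Country "RS", Variant "-.UTF-8@latin", and locale_to_unix rebuilds the original string from that triple.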

Is reversibility necessary here? I ask because this makes the Variant contain data that does not adhere to the above BCP-47 <langtag> w/o <language> and <region> parts.

-Stephan
