On Thu, 2010-04-22 at 15:33 +0200, Eike Rathke wrote:
> Hi Caolan,

> ODF 1.2 introduces the attributes fo:script, number:script and
> table:script.
> 
> Additionally to *:script attributes ODF 1.2 already introduces
> number:rfc-language-tag, table:rfc-language-tag and
> style:rfc-language-tag to store BCP47 language tags if a locale can't be
> described as a combination of *:language *:country *:script

Neat, so no file format changes needed after all. Slots already
available.

> > a BCP-47 string of de-DE-1901 becomes
> >
> > Language = de
> > Country = DE
> > Variant = -1901
> 
> With the leading '-' indicating the default script?

Yup.
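
For illustration only (not code from the OOo tree), the mapping above could be sketched roughly like this. Assumptions: `Locale` is a plain stand-in struct rather than com::sun::star::lang::Locale, `splitTag` and `isAlpha` are hypothetical helpers, the Variant field holds the script code first and then '-' plus any remaining subtags (my reading of the examples in this thread), and extlang, numeric region and extension subtags are not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the Language/Country/Variant triple.
struct Locale {
    std::string Language;
    std::string Country;
    std::string Variant;
};

// True if s is non-empty and consists of ASCII letters only.
static bool isAlpha(const std::string& s) {
    for (unsigned char c : s)
        if (!std::isalpha(c)) return false;
    return !s.empty();
}

// Split a simple tag of the form language[-Script][-REGION][-variant...].
Locale splitTag(const std::string& tag) {
    std::vector<std::string> parts;
    std::istringstream in(tag);
    for (std::string p; std::getline(in, p, '-'); )
        parts.push_back(p);

    Locale loc;
    std::size_t i = 0;
    if (i < parts.size()) loc.Language = parts[i++];

    // A 4-letter alphabetic subtag right after the language is a script.
    std::string script;
    if (i < parts.size() && parts[i].size() == 4 && isAlpha(parts[i]))
        script = parts[i++];

    // A 2-letter alphabetic subtag is taken as the country.
    if (i < parts.size() && parts[i].size() == 2 && isAlpha(parts[i]))
        loc.Country = parts[i++];

    // Variant = script, then '-' + each remaining subtag. With no script
    // the field starts with '-', i.e. "default script".
    loc.Variant = script;
    for (; i < parts.size(); ++i)
        loc.Variant += "-" + parts[i];
    return loc;
}
```

With this sketch, de-DE-1901 yields Variant "-1901" and sr-Latn-RS yields Variant "Latn".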

> Why does it need to be reversible? Without that requirement we could
> drop information after Language-Country starting with '.', leaving
> 
> Language = sr
> Country = RS
> Variant = Latn
> 
> We should also prepare for transport of full BCP47 tags (see further
> down), having this mix of script and Unix locale in the Variant field
> somewhat makes me shudder.. I'd rather use the Variant here such that if
> the content starts with a capital ASCII letter and is 4 characters it is
> a script ISO 15924 code, else it is something different, to be defined.

This is covered in the other responses, where some hackery exists inside
rtl that currently fiddles with the Variant field.
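
For reference, the Variant check proposed in the quoted text (4 characters, starting with a capital ASCII letter) might look roughly like the sketch below. `variantIsScript` is a hypothetical name, and the test deliberately does no more than the quoted rule says; it does not validate the value against the ISO 15924 registry.

```cpp
#include <string>

// Heuristic from the quoted proposal: a Variant of exactly 4 characters
// whose first character is a capital ASCII letter is taken to be an
// ISO 15924 script code. Anything else is "something different".
bool variantIsScript(const std::string& variant) {
    return variant.size() == 4 && variant[0] >= 'A' && variant[0] <= 'Z';
}
```

Under this rule "Latn" counts as a script, while "-1901" and an empty Variant do not.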

> > 3. comphelper::Locale is very little used, it looks like a good idea to
> > move uses of it over to com::sun::star::lang::Locale and convert it to
> > some calls that operate on that instead and/or merge the unused bits
> > over to e.g. MSLangId.
> 

I logged a patch earlier to at least delete all the unused parts of
comphelper::Locale and strip it down to the small stub of it that's
actually used, mainly just one place in framework IIRC and in one other
minor location.

> Future perspective: the syntax of RFC 5646 allows more complicated
> language tags, not all can be fitted into Language/Country fields using
> ISO 639-2/3 and ISO 3166-1 codes. For these we'd have to use some
> notation to indicate the full BCP47 tag is to be used, having
> Language=x-bcp47 and Variant=full_bcp47_string might do. Of course this
> would affect all places that simply take the Language/Country fields as
> ISO codes.

> If an extended language subtag (extlang) came into play, the approach of
> concatenating Language-Country-Variant wouldn't work anymore if we said
> Variant had to start with the 4 letter script code or '-'.

I had imagined that the Language string in a Locale struct would contain
the extended language subtag, so that something like zh-cmn-Latn-CN
would appear as
Language = zh-cmn
Country = CN
Variant = Latn

But I guess this would break anyway, as Language is then no longer an
ISO 639 code, and without yet more processing it would end up as e.g.
fo:language="zh-cmn" in an .odt, which I guess is out. Rats.
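
For illustration only, the Language=x-bcp47 escape suggested in the quoted text could be read back along these lines. Assumptions: `Locale` is a plain stand-in struct, `toBcp47` is a hypothetical helper, and in the non-escaped case the Variant field is taken to hold the script code followed by '-' plus any further subtags, as in the examples earlier in this thread.

```cpp
#include <string>

// Hypothetical stand-in for the Language/Country/Variant triple.
struct Locale {
    std::string Language;
    std::string Country;
    std::string Variant;
};

// Rebuild a BCP 47 tag from the triple.
std::string toBcp47(const Locale& loc) {
    // Escape case: Variant carries the complete tag verbatim.
    if (loc.Language == "x-bcp47")
        return loc.Variant;

    // Otherwise split Variant into an optional leading script and the
    // rest, which starts with '-' (or is empty).
    std::string script;
    std::string rest = loc.Variant;
    if (!rest.empty() && rest[0] != '-') {
        std::string::size_type dash = rest.find('-');
        script = rest.substr(0, dash);
        rest = (dash == std::string::npos) ? std::string() : rest.substr(dash);
    }

    // BCP 47 order is language, script, region, then variants.
    std::string tag = loc.Language;
    if (!script.empty()) tag += "-" + script;
    if (!loc.Country.empty()) tag += "-" + loc.Country;
    return tag + rest;
}
```

Any code that treats Language and Country as plain ISO codes would still need a guard for the "x-bcp47" value before using them.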

C.

