On Thu, 2010-04-22 at 15:33 +0200, Eike Rathke wrote:
> Hi Caolan,
> ODF 1.2 introduces the attributes fo:script, number:script and
> table:script.
>
> Additionally to *:script attributes ODF 1.2 already introduces
> number:rfc-language-tag, table:rfc-language-tag and
> style:rfc-language-tag to store BCP47 language tags if a locale can't be
> described as a combination of *:language *:country *:script

Neat, so no file format changes needed after all. Slots already available.

> > a BCP-47 string of de-DE-1901 becomes
> >
> > Language = de
> > Country = DE
> > Variant = -1901
>
> With the leading '-' indicating the default script?

Yup.

> Why does it need to be reversible? Without that requirement we could
> drop information after Language-Country starting with '.', leaving
>
> Language = sr
> Country = RS
> Variant = Latn
>
> We should also prepare for transport of full BCP47 tags (see further
> down), having this mix of script and Unix locale in the Variant field
> somewhat makes me shudder.. I'd rather use the Variant here such that if
> the content starts with a capital ASCII letter and is 4 characters it is
> a script ISO 15924 code, else it is something different, to be defined.

This is covered in the other responses, where some hackery exists inside
rtl that currently fiddles with the Variant field.

> > 3. comphelper::Locale is very little used, it looks like a good idea to
> > move uses of it over to com::sun::star::lang::Locale and convert it to
> > some calls that operate on that instead and/or merge the unused bits
> > over to e.g. MSLangId.

I logged a patch earlier to at least delete all the unused parts of
comphelper::Locale and strip it down to the small stub of it that's
actually used, mainly just one place in framework IIRC and in one other
minor location.

> Future perspective: the syntax of RFC 5646 allows more complicated
> language tags, not all can be fitted into Language/Country fields using
> ISO 639-2/3 and ISO 3166-1 codes. For these we'd have to use some
> notation to indicate the full BCP47 tag is to be used, having
> Language=x-bcp47 and Variant=full_bcp47_string might do. Of course this
> would affect all places that simply take the Language/Country fields as
> ISO codes.
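
To make the two conventions quoted above concrete, here is a rough sketch
in Python (not the real C++ code). It assumes canonically cased tags, and
every name in it (Locale, to_locale, script_of) is made up for
illustration; nothing here exists in rtl or comphelper. Simple
language[-Script][-REGION] tags go into the three fields with the script in
Variant, anything richer falls back to Language=x-bcp47 with the full tag
in Variant, and the "capital ASCII letter plus 4 characters" test decides
whether Variant holds an ISO 15924 code.

```python
import re
from typing import NamedTuple


class Locale(NamedTuple):
    """Hypothetical stand-in for the three string fields of a Locale struct."""
    language: str
    country: str
    variant: str


# Assumption: only canonically cased language[-Script][-REGION] tags count
# as "simple"; everything else takes the x-bcp47 fallback path.
_SIMPLE = re.compile(r"([a-z]{2,3})(?:-([A-Z][a-z]{3}))?(?:-([A-Z]{2}))?")


def to_locale(tag: str) -> Locale:
    """Map a BCP 47 tag onto Language/Country/Variant (illustrative only)."""
    m = _SIMPLE.fullmatch(tag)
    if m:
        language, script, country = m.groups()
        return Locale(language, country or "", script or "")
    # Too complex for the ISO fields: carry the whole tag in Variant.
    return Locale("x-bcp47", "", tag)


def script_of(locale: Locale) -> str:
    """Return the ISO 15924 code held in Variant, or '' if it is anything else."""
    v = locale.variant
    is_script = len(v) == 4 and v.isascii() and v.isalpha() and v[0].isupper()
    return v if is_script else ""
```

With this, to_locale("sr-Latn-RS") fills Language=sr, Country=RS,
Variant=Latn, while de-DE-1901 takes the x-bcp47 path. Any caller that
reads Language as a plain ISO code would have to learn about x-bcp47, which
is the cost mentioned in the quoted text.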
> If an extended language subtag (extlang) came into play, the approach of
> concatenating Language-Country-Variant wouldn't work anymore if we said
> Variant had to start with the 4 letter script code or '-'.

I had imagined that the Language string in a Locale struct would contain
the extended language subtag, so that something like zh-cmn-Latn-CN would
appear as

Language = zh-cmn
Country = CN
Variant = Latn

But I guess after all that this would break anyway, as Language is then no
longer an ISO 639 code, and without yet-more-processing it would appear as
e.g. fo:language="zh-cmn" in e.g. an .odt, which I guess is out. Rats.

C.
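
For illustration only, the zh-cmn-Latn-CN split described above could be
sketched in Python as below. The function names are hypothetical and the
parsing assumes canonically cased tags of the form
language[-extlang][-Script][-REGION]; it is not how any existing
OpenOffice.org code works. The second helper shows where the idea falls
over: a Language value with an extlang attached no longer passes as a bare
ISO 639 code.

```python
import re
from typing import Tuple


def split_with_extlang(tag: str) -> Tuple[str, str, str]:
    """Split a tag into (Language, Country, Variant), keeping extlang in Language."""
    parts = tag.split("-")
    language = parts.pop(0)
    # Assumption: a three-letter lowercase subtag right after the primary
    # language is an extlang and is glued onto the Language field.
    if parts and re.fullmatch(r"[a-z]{3}", parts[0]):
        language += "-" + parts.pop(0)
    script = ""
    if parts and re.fullmatch(r"[A-Z][a-z]{3}", parts[0]):
        script = parts.pop(0)
    country = ""
    if parts and re.fullmatch(r"[A-Z]{2}", parts[0]):
        country = parts.pop(0)
    if parts:
        # Variants, extensions and the like are out of scope for this sketch.
        raise ValueError("unsupported subtags: " + "-".join(parts))
    return language, country, script


def is_plain_iso639(language: str) -> bool:
    """True if the Language field is a bare two- or three-letter code."""
    return re.fullmatch(r"[a-z]{2,3}", language) is not None
```

Here split_with_extlang("zh-cmn-Latn-CN") gives Language=zh-cmn, Country=CN,
Variant=Latn, and is_plain_iso639("zh-cmn") is false, so the value could not
be written out as-is where an ISO 639 code is expected.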