On Wednesday, May 4, 2011 at 7:53 PM, Phillips, Addison wrote:
Hi Marcos and Webapps,
This is a personal last call comment [chair hat off]. I realize that it is
late, but I was on vacation.
A developer recently sent me a code review implementing the latest spec and I
found some code that was utterly mystifying to me. It turns out that he
implemented rule 9.1.12 in the current spec, which defines
user_agent_locales. This text seems really odd to me.
What I expected from looking at step 8 in section 9.1.17 was to have a
language priority list containing the locale(s) to search for using the
Lookup algorithm from BCP 47. I expected that this list would usually consist
of a single locale identifier (language tag) representing the current Widget
runtime locale, although an implementation might occasionally use
Accept-Language or some other source for a full language priority list such
as shown in the examples. I read user_agent_locales to be the Widget's
version of a language priority list.
The odd thing is that the rules in 9.1.12 explode each language range, but
the BCP 47 Lookup algorithm already does this. For example, 9.1.12 would
convert the language priority list zh-Hans-CN,* to the user_agent_locales
string zh-Hans-CN,zh-Hans,zh,*. But this is unnecessary because the Lookup
algorithm's remove-from-right logic already does this (or, for xml:lang
matching, you can use prefix matching). If you do both the lookup algorithm
*and* the computation of user_agent_locales, you will search for the same
truncated tags more than once, which is wasted work.
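To make the remove-from-right behaviour concrete, here is a minimal sketch of the Lookup scheme as described in RFC 4647 section 3.4 (the matching half of BCP 47). The function names `truncations` and `lookup` and the `available` list are illustrative assumptions, not names taken from the Widgets spec, and the sketch skips details such as grandfathered tags.

```python
def truncations(language_range):
    """Yield a range, then progressively shorter prefixes of it.

    Follows the remove-from-right rule in RFC 4647 section 3.4: a
    single-character subtag (such as 'x') is dropped together with
    the subtag that followed it.
    """
    subtags = language_range.lower().split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()


def lookup(priority_list, available, default=None):
    """Return the first available tag matched by the priority list.

    'available' is a hypothetical list of the locales a package
    provides. The '*' range is skipped, as Lookup falls back to a
    default instead of treating it as a match.
    """
    available_by_key = {tag.lower(): tag for tag in available}
    for language_range in priority_list:
        if language_range == "*":
            continue
        for candidate in truncations(language_range):
            if candidate in available_by_key:
                return available_by_key[candidate]
    return default
```

With the priority list zh-Hans-CN,* and content available for en and zh-Hans, the search tries zh-hans-cn, then zh-hans, and stops there. No pre-exploded list is needed.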
If what you want is to define lookup by computing user_agent_locales and then
doing exact path matches for each element, then that's okay, I suppose,
although personally I would (and have) implemented it in memory rather than
by computing an exploded user_agent_locales list.
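For comparison, the explode-then-exact-match approach might look like the sketch below. The `explode` function is reconstructed only from the zh-Hans-CN,* example in this message, not from the text of 9.1.12, so the real rule may differ in details such as case handling. Both function names are hypothetical.

```python
def explode(priority_list):
    """Expand each range into itself plus its right-truncated prefixes.

    Assumption: duplicates are dropped and the original order and
    case are preserved, as in the zh-Hans-CN,* example.
    """
    exploded = []
    for language_range in priority_list:
        subtags = language_range.split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate not in exploded:
                exploded.append(candidate)
            subtags.pop()
    return exploded


def first_exact_match(user_agent_locales, available):
    """Return the first available tag equal to a listed locale.

    Comparison is case-insensitive and no truncation happens here,
    because the list is assumed to be exploded already.
    """
    available_by_key = {tag.lower(): tag for tag in available}
    for locale in user_agent_locales:
        match = available_by_key.get(locale.lower())
        if match is not None:
            return match
    return None
```

Under these assumptions, exploding zh-Hans-CN,* gives zh-Hans-CN,zh-Hans,zh,* and an exact match against en and zh-Hans selects zh-Hans, the same answer that remove-from-right truncation reaches without building the intermediate list.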
So my request/proposal would be to either (a) eliminate section 9.1.12 and
just specify BCP 47 Lookup or (b) specify the complete match algorithm using
the list computed in 9.1.12 and state that it is compatible with BCP 47
lookup. I would strongly prefer (a) personally. The internationalization work
in ECMAScript and at least one implementation do it that way.
I am all for using (a), so long as the result is the same as what we have
today (i.e., we don't break existing implementations or the test suite).