widgets PC rule 9.1.12...

2011-05-04 Thread Phillips, Addison
Hi Marcos and Webapps,

This is a personal last call comment [chair hat off]. I realize that it is 
late, but I was on vacation.

A developer recently sent me a code review implementing the latest spec and I 
found some code that was utterly mystifying to me. It turns out that he 
implemented rule 9.1.12 in the current spec, which defines 
user_agent_locales. This text seems really odd to me.

What I expected from looking at step 8 in section 9.1.17 was to have a 
language priority list containing the locale(s) to search for using the 
Lookup algorithm from BCP 47. I expected that this list would usually consist 
of a single locale identifier (language tag) representing the current Widget 
runtime locale, although an implementation might occasionally use 
Accept-Language or some other source for a full language priority list such as 
shown in the examples. I read user_agent_locales to be the Widget's version 
of a language priority list.

The odd thing is that the rules in 9.1.12 explode each language range, but 
the BCP 47 Lookup algorithm already does this. For example, 9.1.12 would 
convert the language priority list zh-Hans-CN,* to the user_agent_locales 
string zh-Hans-CN,zh-Hans,zh,*. But this is unnecessary because the Lookup 
algorithm's remove-from-right logic already does this (or, for xml:lang 
matching, you can use prefix matching). If you do both the lookup algorithm 
*and* the computation of user_agent_locales, you will do a bunch of unnecessary 
searching with lower efficiency.
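
To make the "explode" step concrete, here is a rough sketch of the expansion as I read it. It only mirrors the single zh-Hans-CN example above and is not the spec's actual rule text; the function name explode_ranges is made up for illustration, and details such as case folding are left out.

```python
def explode_ranges(priority_list):
    """Expand each language range by repeatedly dropping its last subtag.

    Hypothetical helper: it reproduces only the example in this message,
    not the full wording of rule 9.1.12.
    """
    result = []
    for language_range in priority_list:
        if language_range == "*":
            # The wildcard has no subtags to drop, so keep it as-is.
            if "*" not in result:
                result.append("*")
            continue
        subtags = language_range.split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate not in result:  # skip duplicates
                result.append(candidate)
            subtags.pop()  # remove the rightmost subtag
    return result


exploded = explode_ranges(["zh-Hans-CN", "*"])
print(",".join(exploded))  # zh-Hans-CN,zh-Hans,zh,*
```

Every entry after the first is something the Lookup fallback would have visited anyway, which is why running Lookup over this list repeats work.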

If what you want is to define lookup by computing user_agent_locales and then 
doing exact path matches for each element, then that's okay, I suppose, 
although personally I would (and have) implemented it in memory rather than by 
computing an exploded user_agent_locales list.
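
For what it's worth, an in-memory version can be quite small. The following is a minimal sketch of the Lookup fallback (RFC 4647, section 3.4) under some assumptions of mine: the function name lookup, the available_tags argument (standing in for whatever localized content the package actually contains), and the default parameter are all hypothetical, and the wildcard range is simply skipped.

```python
def lookup(priority_list, available_tags, default=None):
    """Return the first available tag matched by progressive truncation.

    Sketch of BCP 47 Lookup: each range is tried as-is, then with subtags
    removed from the right, before moving on to the next range.
    """
    # Tag comparison is case-insensitive; remember the original spelling.
    available = {tag.lower(): tag for tag in available_tags}
    for language_range in priority_list:
        if language_range == "*":
            continue  # the wildcard carries no information for Lookup
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in available:
                return available[candidate]
            subtags.pop()  # remove the rightmost subtag
            # A single-character subtag (e.g. "x") left at the end is
            # removed together with the subtag that followed it.
            if subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


print(lookup(["zh-Hans-CN", "*"], ["en", "zh-Hans", "zh"]))  # zh-Hans
```

No exploded list is ever built; the truncation happens while searching.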

So my request/proposal would be to either (a) eliminate section 9.1.12 and just 
specify BCP 47 Lookup or (b) specify the complete match algorithm using the 
list computed in 9.1.12 and state that it is compatible with BCP 47 lookup. I 
would strongly prefer (a) personally. The internationalization work in 
ECMAScript and at least one implementation do it that way.

Addison Phillips
Globalization Architect (Lab126)
Chair (W3C I18N WG)

Internationalization is not a feature.
It is an architecture.




Re: widgets PC rule 9.1.12...

2011-05-04 Thread Marcos Caceres

On Wednesday, May 4, 2011 at 7:53 PM, Phillips, Addison wrote: 
 So my request/proposal would be to either (a) eliminate section 9.1.12 and 
 just specify BCP 47 Lookup or (b) specify the complete match algorithm using 
 the list computed in 9.1.12 and state that it is compatible with BCP 47 
 lookup. I would strongly prefer (a) personally. The internationalization work 
 in ECMAScript and at least one implementation do it that way.
I am all for using (a), so long as the result is the same as what we have 
today (i.e., we don't break existing implementations and the test suite).