Re: Language Negotiation API

2013-07-17 Thread Zbigniew Braniecki



   	   
On July 15, 2013, 1:37 PM, Anne van Kesteren wrote:
 FWIW, exposing a new API because another API is broken in a particular
 implementation is a known anti-pattern. We should fix problems at the
 source.

Good point, but I believe that there are more potential sources of 
language tags passed to language negotiation, including programmatic 
composition, feeding from unknown sources (databases etc.), or even 
manual entry by the user.

Having a function that enables us to canonicalize a tag (even the simplest 
part of that, upper/lower case) allows us to use comparison operators 
(langTag1 == langTag2) or, in the localization case, to build a 
path to the resource on case-sensitive systems.
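
To illustrate the comparison and path-building cases, here is a minimal sketch. The `normalizeTagCase` helper and the `locales/<tag>/main.properties` layout are hypothetical, and the helper only normalizes the case of simple language-script-region tags; it does not validate anything and does not handle extensions or private-use subtags.

```javascript
// Hypothetical helper: normalizes only the case of a simple
// language[-script][-region] tag. No validation, no extension handling.
function normalizeTagCase(tag) {
  return tag.split('-').map(function (subtag, index) {
    if (index === 0) {
      return subtag.toLowerCase(); // language, e.g. "en"
    }
    if (subtag.length === 4) {
      // script, e.g. "Cyrl"
      return subtag.charAt(0).toUpperCase() + subtag.slice(1).toLowerCase();
    }
    if (subtag.length === 2) {
      return subtag.toUpperCase(); // region, e.g. "US"
    }
    return subtag.toLowerCase();
  }).join('-');
}

var fromUser = normalizeTagCase('EN-us');
var fromDatabase = normalizeTagCase('en-US');

// After normalization a plain equality check is enough.
var sameLocale = fromUser === fromDatabase;

// Assumed resource layout, safe on case-sensitive file systems.
var resourcePath = 'locales/' + normalizeTagCase('sr-cyrl-rs') + '/main.properties';
```

Without the normalization step, 'EN-us' and 'en-US' would compare as different strings and would point at different directories.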

Cheers,
g.

___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss


Re: Language Negotiation API

2013-07-16 Thread Ian Hickson
On Tue, 16 Jul 2013, Andy Earnshaw wrote:
 
 navigator.language isn't part of any stable specification

It's part of the HTML standard:

   http://whatwg.org/html/#language-preferences

...which is very stable at this point (there's basically no way that part 
of the spec can change in an incompatible fashion, since it's widely 
implemented; the only possible changes are those that approach reality 
more, and those that add features).


 and even the current HTML 5.1 draft doesn't specify that tags should be 
 returned in canonical form.  Do you think it would be a good idea to 
 raise an issue for this?

Fixed. (A change that approaches reality more.)

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: Language Negotiation API

2013-07-15 Thread Zbigniew Braniecki
 As for LookupAvailableLocales, there might be a problem with Zbigniew's
 vision of it as any tags would be returned without extensions. I'm not sure
 if this is something that we'd need to worry about, though.

No, that's good, because locales will be stored under names without them as 
well.

Cheers,
zb.


Re: Language Negotiation API

2013-07-15 Thread Andy Earnshaw
Would you expect to support the same locales as Intl constructors in your
library?  Can you safely make that assumption?

Canonicalisation makes sense because I would expect a library to
canonicalise the tag and then try and load the file containing relevant
data whether the native API supports it or not. Forgive me if I'm
misunderstanding something, I didn't have a look at your project in great
detail.

Andy
On 15 Jul 2013 16:49, Zbigniew Braniecki zbranie...@mozilla.com wrote:

  As for LookupAvailableLocales, there might be a problem with Zbigniew's
  vision of it as any tags would be returned without extensions. I'm not
  sure if this is something that we'd need to worry about, though.

 No, that's good, because locales will be stored under names without them
 as well.

 Cheers,
 zb.



Re: Language Negotiation API

2013-07-15 Thread Zbigniew Braniecki


- Original Message -
 Would you expect to support the same locales as Intl constructors in your
 library?

Yes.

  Can you safely make that assumption?

I'd have to think more about edge cases, but my initial reaction is - yes.


 Canonicalisation makes sense because I would expect a library to
 canonicalise the tag and then try and load the file containing relevant
 data whether the native API supports it or not. Forgive me if I'm
 misunderstanding something, I didn't have a look at your project in great
 detail.

There's no need to look at my project. All I'm asking is to talk about exposing 
the API for negotiating between locales provided by the application and locales 
requested by the user with the result being the list of available locales that 
the user wants sorted by the user preference.

That enables us to load locale 0 and fall back to locale 1, then to 
locale 2, etc.

The only crucial point here is that we need to operate on the list of available 
locales, not requested, because we will be selecting from the available ones.
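
The fallback order described above can be sketched as follows. This is only an illustration under stated assumptions: the `resources` map and the `getMessage` helper are hypothetical, and `chain` is assumed to be the already negotiated list of available locales.

```javascript
// Hypothetical in-memory resources, keyed by *available* locale.
var resources = {
  'de': { hello: 'Hallo' },
  'en-US': { hello: 'Hello', bye: 'Goodbye' }
};

// Walks the negotiated chain (locale 0, then 1, then 2, ...) and returns
// the first translation found, or undefined if no locale provides the key.
function getMessage(fallbackChain, key) {
  for (var i = 0; i < fallbackChain.length; i++) {
    var bundle = resources[fallbackChain[i]];
    if (bundle && Object.prototype.hasOwnProperty.call(bundle, key)) {
      return bundle[key];
    }
  }
  return undefined;
}

// Assumed result of negotiation: available locales in user preference order.
var chain = ['de', 'en-US'];

var hello = getMessage(chain, 'hello'); // served by locale 0
var bye = getMessage(chain, 'bye');     // locale 0 lacks it, locale 1 has it
```

The lookup only works because every entry in the chain is a key that actually exists in `resources`, which is why the list has to contain available locales rather than requested ones.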

Cheers,
g.


Re: Language Negotiation API

2013-07-15 Thread Anne van Kesteren
On Sun, Jul 14, 2013 at 5:20 AM, Andy Earnshaw andyearns...@gmail.com wrote:
 I certainly do, at least for Canonicalize-.  I've come across one user agent
 that returns `navigator.language` in non-canonical form which presented a
 small problem for data I had stored with canonical file names.  This was a
 WebKit based Smart TV platform from 2012, so it was fairly recent, there
 could be other platforms or frameworks that do the same.

FWIW, exposing a new API because another API is broken in a particular
implementation is a known anti-pattern. We should fix problems at the
source.


--
http://annevankesteren.nl/


Re: Language Negotiation API

2013-07-15 Thread Andy Earnshaw
On Mon, Jul 15, 2013 at 9:37 PM, Anne van Kesteren ann...@annevk.nl wrote:

 On Sun, Jul 14, 2013 at 5:20 AM, Andy Earnshaw andyearns...@gmail.com
 wrote:
  I certainly do, at least for Canonicalize-.  I've come across one user
  agent that returns `navigator.language` in non-canonical form which
  presented a small problem for data I had stored with canonical file
  names.  This was a WebKit based Smart TV platform from 2012, so it was
  fairly recent, there could be other platforms or frameworks that do the
  same.

 FWIW, exposing a new API because another API is broken in a particular
 implementation is a known anti-pattern. We should fix problems at the
 source.


Normally, I would agree.  However, I was just using my scenario as an
example for where exposing the API would have been useful for me.  I can
also think of a few other reasons:

 - Language tags can be in extlang form or canonical form.  Depending on
the source providing the language tag, it's not guaranteed to be the
canonical form (extlang form can reinstate extlang subtags that were
removed during canonicalisation).
 - The Internationalization API doesn't cover all aspects of its namesake,
like translation, or formatting of postal codes or telephone numbers, as a
few examples.  Developer libraries could augment Intl with this data, so it
would make lives easier if we exposed CanonicalizeLanguageTag to be used by
such libraries.
 - Canonicalisation has at least a couple of optional steps (like
normalising case or ordering variant subtags) so exposing a canonicalizing
method would give developers a way to achieve consistency with the
Internationalisation API.

navigator.language isn't part of any stable specification, and even the
current HTML 5.1 draft doesn't specify that tags should be returned in
canonical form.  Do you think it would be a good idea to raise an issue for
this?

Andy


Re: Language Negotiation API

2013-07-15 Thread Anne van Kesteren
On Mon, Jul 15, 2013 at 7:51 PM, Andy Earnshaw andyearns...@gmail.com wrote:
 navigator.language isn't part of any stable specification, and even the
 current HTML 5.1 draft doesn't specify that tags should be returned in
 canonical form.  Do you think it would be a good idea to raise an issue for
 this?

Filed https://www.w3.org/Bugs/Public/show_bug.cgi?id=22681


--
http://annevankesteren.nl/


Re: Language Negotiation API

2013-07-14 Thread Andy Earnshaw
On Sun, Jul 14, 2013 at 2:07 AM, Norbert Lindenberg 
ecmascr...@lindenbergsoftware.com wrote:

  CanonicalizeLanguageTag isn't even defined for non-structurally valid
 language tags. That's why I meant a combined IsStructurallyValidLanguageTag
 + CanonicalizeLanguageTag function is more useful than access to the bare
 CanonicalizeLanguageTag function.

 Correct. As currently specified, the CanonicalizeLanguageTag abstract
 operation assumes that its input is a String value that's a structurally
 valid language tag. An API cannot make such assumptions - it has to be
 ready to deal with any input, as well as the absence of input. It has to
 do something like the steps in CanonicalizeLocaleList 8.c.ii-iv before
 calling the current CanonicalizeLanguageTag.


You're both right, it assumes a string and doesn't check validity.  That
didn't occur to me, it's been a few months since my implementation.


 Before we get too much into spec details: Do others believe that exposing
 API as proposed by Zbigniew would be useful?


I certainly do, at least for Canonicalize-.  I've come across one user
agent that returns `navigator.language` in non-canonical form which
presented a small problem for data I had stored with canonical file names.
 This was a WebKit based Smart TV platform from 2012, so it was fairly
recent, there could be other platforms or frameworks that do the same.

As for LookupAvailableLocales, there might be a problem with Zbigniew's
vision of it as any tags would be returned without extensions.  I'm not
sure if this is something that we'd need to worry about, though.

Andy


Re: Language Negotiation API

2013-07-13 Thread Andy Earnshaw
Sorry g, forgot the Cc :-)

On Thu, Jul 11, 2013 at 11:52 PM, Zbigniew Braniecki zbranie...@mozilla.com
 wrote:

 ...



1) CanonicalizeLanguageTag [1]

 Because language tags come from developers and users, the ability to
 canonicalize them is crucial to us. ECMA 402 specifies this function and
 all we need is to expose it in the API.


I was thinking the same thing recently, at least for
CanonicalizeLanguageTag. I was working with a platform that gave me a
language tag in non-canonical form, meaning I had to either canonicalize it
or rename my language files to match the same non-canonical form.  Exposing
it as `Intl.canonicalizeLanguageTag(tag)` seems like a good idea.



 1.1) CanonicalizeLocaleList [2]

 That would also be nice to have :)


I don't think you could expose CanonicalizeLocaleList directly without
altering it to return an array; you'd have to do something similar to step
5 of LookupSupportedLocales.  I'm not sure we could change that function in
the spec without other abstract operations potentially being affected by a
tainted Array.prototype, so I guess you'd need to specify a new function.
In which case I'm wondering if maybe you'd be better off with
`Intl.canonicalizeTags(tags)`, which would cover both
CanonicalizeLanguageTag() and CanonicalizeLocaleList().

2) LookupAvailableLocales

 This function has an almost identical heuristic to LookupSupportedLocales
 [3], with a single difference in step d).

 Replace:
  - If *availableLocale* is not *undefined*, then append *locale* to the
 end of *subset*.
 with:
  - If *availableLocale* is not *undefined*, then append *availableLocale*
 to the end of *subset*.

 The reason behind this is that localization frameworks need to choose the
 available locales that most closely match the user preferences. If we used
 LookupSupportedLocales, we would receive the locales that the user
 requested, not the ones that are available on the system.
 As a result, for each of those we'd have to call BestAvailableLocale [4] to
 receive the tag name that we can pull resources for.
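
The one-step difference between the two lookups can be sketched as below. This is a simplified illustration, not the spec text: `bestAvailableLocale` follows the shape of BestAvailableLocale [4], `removeUnicodeExtensions` is a rough regex stand-in for the spec's extension removal, and the `returnAvailable` flag is a hypothetical switch between the existing and the proposed behaviour.

```javascript
// Rough stand-in: strips "-u-..." Unicode extension sequences.
function removeUnicodeExtensions(tag) {
  return tag.replace(/-u(-[a-z0-9]{2,8})+/gi, '');
}

// Truncates the tag subtag by subtag until an available locale matches.
function bestAvailableLocale(availableLocales, locale) {
  var candidate = locale;
  while (true) {
    if (availableLocales.indexOf(candidate) !== -1) {
      return candidate;
    }
    var pos = candidate.lastIndexOf('-');
    if (pos === -1) {
      return undefined;
    }
    // Drop a preceding singleton subtag (such as "-x") as well.
    if (pos >= 2 && candidate.charAt(pos - 2) === '-') {
      pos -= 2;
    }
    candidate = candidate.substring(0, pos);
  }
}

// returnAvailable === false: append the requested locale (LookupSupportedLocales).
// returnAvailable === true: append the matched available locale (the proposal).
function lookupLocales(availableLocales, requestedLocales, returnAvailable) {
  var subset = [];
  requestedLocales.forEach(function (locale) {
    var availableLocale = bestAvailableLocale(
      availableLocales, removeUnicodeExtensions(locale));
    if (availableLocale !== undefined) {
      subset.push(returnAvailable ? availableLocale : locale);
    }
  });
  return subset;
}

var available = ['de', 'en-US', 'fr'];
var requested = ['de-AT-u-co-phonebk', 'pl', 'en-US'];

var supported = lookupLocales(available, requested, false);
var availableSubset = lookupLocales(available, requested, true);
```

With these inputs, `supported` keeps the requested spelling 'de-AT-u-co-phonebk', while `availableSubset` contains 'de', which is the name a resource file would actually be stored under.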


You can at least work around this for a single locale with
Intl.NumberFormat(tag).resolvedOptions().locale.  If you're already using
the native localisation APIs, this might not be too much of a hindrance.
 What you're suggesting would need to be a function property of the
constructors, e.g. `Intl.NumberFormat.availableLocalesOf()`.  I'm not so
sure this approach makes sense, though; wouldn't you still have a problem
if your own API provided variant data where the system does not?
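
A minimal sketch of that single-locale workaround, assuming an environment with the Intl API; the `resolveNumberFormatLocale` wrapper name is made up for illustration, and the result depends on the locale data the host ships.

```javascript
// Asks an Intl constructor which locale it actually resolved the tag to.
function resolveNumberFormatLocale(tag) {
  return new Intl.NumberFormat(tag).resolvedOptions().locale;
}

// The returned tag is the one the implementation has data for, which may be
// shorter than (or different from) the requested tag.
var resolved = resolveNumberFormatLocale('en-US');
```

This reports what the host supports for NumberFormat, not what a library's own resource files cover, which is the mismatch raised above.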



 With that one change, we are actually going to receive the right set of
 language tags that we can then use to provide best language with fallbacks.

 An example implementation of this is the L20n localization framework [5],
 which copies Mozilla's ECMA 402 code to expose the required functions and
 uses a custom function called prioritizeLocales to build the final locale
 fallback chain.

 Comments? Feedback? Next steps? :)

 Cheers,
 g.
 --

 Mozilla (http://www.mozilla.org)

 [1] http://ecma-international.org/ecma-402/1.0/index.html#sec-6.2.3
 [2] http://ecma-international.org/ecma-402/1.0/index.html#sec-9.2.1
 [3] http://ecma-international.org/ecma-402/1.0/index.html#sec-9.2.6
 [4] http://ecma-international.org/ecma-402/1.0/index.html#sec-9.2.2
 [5] https://github.com/l20n/l20n.js/blob/master/lib/l20n/intl.js#L431



Re: Language Negotiation API

2013-07-13 Thread André Bargull

On Thu, Jul 11, 2013 at 11:52 PM, Zbigniew Braniecki zbraniecki at mozilla.com
wrote:

 [...]

 1) CanonicalizeLanguageTag [1]

 Because language tags come from developers and users, the ability to
 canonicalize them is crucial to us. ECMA 402 specifies this function and
 all we need is to expose it in the API.

I was thinking the same thing recently, at least for
CanonicalizeLanguageTag. I was working with a platform that gave me a
language tag in non-canonical form, meaning I had to either canonicalize it
or rename my language files to match the same non-canonical form.  Exposing
it as `Intl.canonicalizeLanguageTag(tag)` seems like a good idea.


Only exposing CanonicalizeLanguageTag does not seem useful to me without 
having access to IsStructurallyValidLanguageTag. Most likely a combined 
IsStructurallyValidLanguageTag + CanonicalizeLanguageTag function is 
necessary/wanted for most use cases.





 1.1) CanonicalizeLocaleList [2]

 That would also be nice to have :)

I don't think you could expose CanonicalizeLocaleList directly without
altering it to return an array; you'd have to do something similar to step
5 of LookupSupportedLocales.  I'm not sure we could change that function in
the spec without other abstract operations potentially being affected by a
tainted Array.prototype, so I guess you'd need to specify a new function.
In which case I'm wondering if maybe you'd be better off with
`Intl.canonicalizeTags(tags)`, which would cover both
CanonicalizeLanguageTag() and CanonicalizeLocaleList().


I don't see why you'd need to change CanonicalizeLocaleList at all. Just 
let it return the internal list as-is, and then define 
`Intl.canonicalizeLocaleList` like so:


Intl.canonicalizeLocaleList(locales):
1. Let canonicalizedLocaleList be the result of
   CanonicalizeLocaleList(locales).
2. ReturnIfAbrupt(canonicalizedLocaleList).
3. Return CreateArrayFromList(canonicalizedLocaleList).

(ReturnIfAbrupt and CreateArrayFromList are defined in ES6 as internal 
abstract operations.)
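
In script terms, the three steps could look roughly like the sketch below. Everything here is an assumption for illustration: `canonicalizeLocaleListAbstract` is a stand-in that models the internal List as a private array and skips validation and real canonicalization, and `createArrayFromList` only approximates the ES6 operation of the same name.

```javascript
// Approximation of CreateArrayFromList: each element is defined as an own
// data property, so setters that other scripts may have installed on
// Array.prototype are never invoked.
function createArrayFromList(list) {
  var array = [];
  for (var i = 0; i < list.length; i++) {
    Object.defineProperty(array, String(i), {
      value: list[i],
      writable: true,
      enumerable: true,
      configurable: true
    });
  }
  return array;
}

// Stand-in for the abstract CanonicalizeLocaleList. Validation and real
// canonicalization are elided; only duplicate removal is modelled.
function canonicalizeLocaleListAbstract(locales) {
  if (locales === undefined) {
    return [];
  }
  var input = typeof locales === 'string' ? [locales] : locales;
  var seen = [];
  for (var i = 0; i < input.length; i++) {
    var tag = String(input[i]);
    if (seen.indexOf(tag) === -1) {
      seen.push(tag);
    }
  }
  return seen;
}

function canonicalizeLocaleList(locales) {
  // Steps 1-2: a thrown exception plays the role of an abrupt completion.
  var internalList = canonicalizeLocaleListAbstract(locales);
  // Step 3: hand out a fresh array, never the internal list itself.
  return createArrayFromList(internalList);
}

var result = canonicalizeLocaleList(['en-US', 'de', 'en-US']);
```

The caller receives an ordinary array and can modify it freely without affecting the internal list.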


It also needs to be considered whether the duplicate removal in 
CanonicalizeLocaleList creates any issues for users of a potential 
`Intl.canonicalizeLocaleList` or `Intl.canonicalizeTags` function.



- André


Re: Language Negotiation API

2013-07-13 Thread Andy Earnshaw
On Sat, Jul 13, 2013 at 1:05 PM, André Bargull andre.barg...@udo.edu wrote:

  ...
 Only exposing CanonicalizeLanguageTag does not seem useful to me without
 having access to IsStructurallyValidLanguageTag. Most likely a combined
 IsStructurallyValidLanguageTag + CanonicalizeLanguageTag function is
 necessary/wanted for most use cases.


Hmm.  I'm not sure I'd agree it's necessary.
 IsStructurallyValidLanguageTag makes sense as an abstract function because
you need to throw accordingly when an invalid tag is passed to the
constructors or methods.  However, it's still the developer's
responsibility to make sure their tags are valid during the development
process.  Canonicalisation would still throw an error if the tag is invalid.


  I don't see why you'd need to change CanonicalizeLocaleList at all. Just
 let it return the internal list as-is, and then define
 `Intl.canonicalizeLocaleList` like so:


Lists are internal, they aren't part of the ECMAScript language.  It makes
no sense to return an internal list to ECMAScript code unless you intend to
go the whole hog and specify them with a constructor/prototype.


 It also needs to be considered whether the duplicate removal in
 CanonicalizeLocaleList creates any issues for users of a potential
 `Intl.canonicalizeLocaleList` or `Intl.canonicalizeTags` function.


Perhaps.  Are there any cases you can think of where removing duplicates
would be a problem?

Andy


Re: Language Negotiation API

2013-07-13 Thread André Bargull


On 7/13/2013 8:48 PM, Andy Earnshaw wrote:

 On Sat, Jul 13, 2013 at 1:05 PM, André Bargull andre.barg...@udo.edu
 wrote:

  ...
  Only exposing CanonicalizeLanguageTag does not seem useful to me
  without having access to IsStructurallyValidLanguageTag. Most likely
  a combined IsStructurallyValidLanguageTag + CanonicalizeLanguageTag
  function is necessary/wanted for most use cases.


 Hmm.  I'm not sure I'd agree it's necessary.
 IsStructurallyValidLanguageTag makes sense as an abstract function
 because you need to throw accordingly when an invalid tag is passed to
 the constructors or methods.  However, it's still the developer's
 responsibility to make sure their tags are valid during the development
 process.  Canonicalisation would still throw an error if the tag is invalid.


CanonicalizeLanguageTag isn't even defined for non-structurally valid 
language tags. That's why I meant a combined 
IsStructurallyValidLanguageTag + CanonicalizeLanguageTag function is 
more useful than access to the bare CanonicalizeLanguageTag function.





  I don't see why you'd need to change CanonicalizeLocaleList at all.
  Just let it return the internal list as-is, and then define
  `Intl.canonicalizeLocaleList` like so:


 Lists are internal, they aren't part of the ECMAScript language.  It
 makes no sense to return an internal list to ECMAScript code unless you
 intend to go the whole hog and specify them with a constructor/prototype.


The internal list structure is not returned to user code. Instead, a 
possible `Intl.canonicalizeLocaleList` function is a simple wrapper 
around CanonicalizeLocaleList that performs the necessary conversion from 
list to array. That's exactly the point of the algorithm steps in my 
previous mail.





  It also needs to be considered whether the duplicate removal in
  CanonicalizeLocaleList creates any issues for users of a potential
  `Intl.canonicalizeLocaleList` or `Intl.canonicalizeTags` function.


 Perhaps.  Are there any cases you can think of where removing duplicates
 would be a problem?


I thought about use cases when a user assumes the i-th element of the 
output array is the canonicalised value of the i-th element in the input 
array. I can't tell whether this is a valid use case - I've only 
implemented ECMA-402, so I know a bit about the spec, but never actually 
used it in an application...
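
A small sketch of the index mismatch in question. The `canonicalizeAndDedupe` helper is hypothetical and uses plain lower-casing as a crude stand-in for real canonicalization; only the duplicate removal matters here.

```javascript
// Hypothetical stand-in: lower-cases each tag (NOT real canonicalization)
// and drops duplicates, as CanonicalizeLocaleList does.
function canonicalizeAndDedupe(tags) {
  var seen = [];
  tags.forEach(function (tag) {
    var canonical = tag.toLowerCase();
    if (seen.indexOf(canonical) === -1) {
      seen.push(canonical);
    }
  });
  return seen;
}

var input = ['en-US', 'EN-us', 'de'];
var output = canonicalizeAndDedupe(input);

// The second input tag collapses into the first, so output[1] now belongs
// to input[2] and the arrays no longer line up by index.
var lengthsDiffer = output.length !== input.length;

// Index-preserving alternative: canonicalize each tag on its own.
var perElement = input.map(function (tag) {
  return tag.toLowerCase();
});
```

Code that needs a one-to-one mapping would have to canonicalize tags individually, which is an argument for also exposing a single-tag function.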





 Andy



- André


Re: Language Negotiation API

2013-07-13 Thread Norbert Lindenberg
On Jul 13, 2013, at 12:37 , André Bargull andre.barg...@udo.edu wrote:

 On 7/13/2013 8:48 PM, Andy Earnshaw wrote:
 On Sat, Jul 13, 2013 at 1:05 PM, André Bargull andre.barg...@udo.edu
 wrote:
 
   Only exposing CanonicalizeLanguageTag does not seem useful to me
   without having access to IsStructurallyValidLanguageTag. Most likely
   a combined IsStructurallyValidLanguageTag + CanonicalizeLanguageTag
   function is necessary/wanted for most use cases.
 
 
 Hmm.  I'm not sure I'd agree it's necessary.
 IsStructurallyValidLanguageTag makes sense as an abstract function
 because you need to throw accordingly when an invalid tag is passed to
 the constructors or methods.  However, it's still the developer's
 responsibility to make sure their tags are valid during the development
 process.  Canonicalisation would still throw an error if the tag is invalid.
 
 CanonicalizeLanguageTag isn't even defined for non-structurally valid 
 language tags. That's why I meant a combined IsStructurallyValidLanguageTag + 
 CanonicalizeLanguageTag function is more useful than access to the bare 
 CanonicalizeLanguageTag function.

Correct. As currently specified, the CanonicalizeLanguageTag abstract operation 
assumes that its input is a String value that's a structurally valid language 
tag. An API cannot make such assumptions - it has to be ready to deal with any 
input, as well as the absence of input. It has to do something like the steps 
in CanonicalizeLocaleList 8.c.ii-iv before calling the current 
CanonicalizeLanguageTag.
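
A minimal sketch of such a guarded entry point, under several assumptions: `checkedCanonicalize` is a hypothetical name, `ROUGH_TAG` is a deliberately crude approximation of IsStructurallyValidLanguageTag (the real check follows the full BCP 47 grammar), the step mapping in the comments is a reading of CanonicalizeLocaleList 8.c, and the actual canonicalization is passed in as a function because the abstract operation is not reachable from script.

```javascript
// Crude structural check: a 2-8 letter language subtag followed by
// alphanumeric subtags of 1-8 characters. Not the full BCP 47 grammar.
var ROUGH_TAG = /^[a-z]{2,8}(-[a-z0-9]{1,8})*$/i;

function isObject(value) {
  return (typeof value === 'object' && value !== null) ||
    typeof value === 'function';
}

function checkedCanonicalize(value, canonicalize) {
  // Reject anything that is neither a String nor an Object; this also
  // covers the absence of input (undefined).
  if (typeof value !== 'string' && !isObject(value)) {
    throw new TypeError('language tag must be a String or an Object');
  }
  // Convert to a string before inspecting it.
  var tag = String(value);
  // Refuse structurally invalid tags before canonicalizing.
  if (!ROUGH_TAG.test(tag)) {
    throw new RangeError('invalid language tag: ' + tag);
  }
  // Only now is it safe to call the canonicalization step.
  return canonicalize(tag);
}

// Placeholder canonicalizer for the example; it just lower-cases the tag.
function placeholderCanonicalize(tag) {
  return tag.toLowerCase();
}

var ok = checkedCanonicalize('en-US', placeholderCanonicalize);
```

Calling `checkedCanonicalize(undefined, placeholderCanonicalize)` throws a TypeError and `checkedCanonicalize('not a tag!', placeholderCanonicalize)` throws a RangeError, so the canonicalization step never sees input it is not defined for.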

Before we get too much into spec details: Do others believe that exposing API 
as proposed by Zbigniew would be useful?

Norbert