+1

> On 23/6/16 上午10:34, Naoto Sato wrote:

I see that you tested thoroughly :-)

Naoto

On 6/23/16 1:13 PM, Brent Christian wrote:
On 23/6/16 上午10:34, Naoto Sato wrote:
I reviewed webrev.04 thoroughly this time and comments below:

- Although the OSX API returns the cases as in the spec, BCP 47 language
tags are case insensitive, so it would be  better to check [a-z] in line
94/95 as well.

Done (and I've simplified the assert code to use isalpha() and isdigit()).

- Line 82: Does the API actually returns "zh-Hans_HK"? Underscore ('_')
is not a part of language tags.

Ah, thanks.  No, "zh-Hans_HK" is not returned by
CFLocaleCopyPreferredLanguages().  We do see it in the default switch
case when we call CFLocaleGetIdentifier(), which I guess was the source
of my confusion.

I've removed that comment line, and reworked the comment a little bit.

http://cr.openjdk.java.net/~bchristi/7131356/webrev.05/

Thanks,
-Brent

On 6/23/16 8:24 AM, Naoto Sato wrote:
On 6/22/16 9:48 PM, Brent Christian wrote:
On 6/22/16 3:58 PM, Naoto Sato wrote:
Hi Brent,

1. It seems that the display language in the new code seems to have
some
problems. I see (in es-419.txt):

---
locale[Default|Display|Format].getLanguage () is
user.language: it is
---

And in fact, display strings are no longer in Spanish in the new
version
(as the language is "is").

Strange - something in your local environment?  It looks right to me.

Never mind. Thanks to Chrome, it was automatically converted to English
:-) Funny thing was that the browser did not translate to English for
es-419.old.txt...


2. I think mapping language/script combination to a typical locale
is ok
to keep the compatibility (e.g., "sr-Latn" to "sr_CS", done by the
JRS,
right?) In the meantime, I would like to have "user.script" set to
"Latn" in such a case.

Is this something that I could file a follow-on issue for?  The
existing
code doesn't set "user.script" in these cases, either, and it looks
like
that would take some digging into java_props_md.c...

Fine by me.

Naoto


-Brent

On 6/22/16 2:45 PM, Brent Christian wrote:
On 6/21/16 3:27 PM, Naoto Sato wrote:
Actually, j.u.Locale class' "country" code is defined as ISO-3166
alpha-2 *or* UN M.49 numeric-3 area code, so if the OSX's underlying
setting is "es-419" for the preferred language, "user.country"
should be
"419".

OK, thanks - looks like another latent locale bug on Mac.  I've
uploaded
output comparing the old [1] and new[2] behavior WRT java.util.Locale
under Spanish (Latin America).  It looks correct to me, based on my
reading of Locale.toString()/getCountry()toLanguageTag().

The updated webrev is here:
http://cr.openjdk.java.net/~bchristi/7131356/webrev.04/

The code now works with ISO-3166 alpha-2 and UN M.49 numeric-3 region
designators.  The es-419 mapping is no longer needed in locale_str.h.

Thanks!
-Brent

1.
http://cr.openjdk.java.net/~bchristi/7131356/webrev.04/es-419.old.txt
2. http://cr.openjdk.java.net/~bchristi/7131356/webrev.04/es-419.txt

On 6/21/16 3:17 PM, Brent Christian wrote:
Hi, Naoto

"Spanish (Latin America)" already works the same as it did before
the
change - it maps to "es_XL".  Because "es-419" has more than 2
characters following the '-', no change is made and
getMacOSXLocale(LC_MESSAGES) returns "es-419".  I added a mapping
for
"es-419" -> "es_XL" in locale_str.h.

System property values are (still) set to:
  user.language: es
  user.country: XL
  user.country.format: (2-char country code for the selected
Region)

Thanks,
-Brent

On 21/06/16 14:48, Naoto Sato wrote:
Hi Brent,

I looked at the system preference setting panel and noticed
there is
"Spanish (Latin America)", which I think uses UN M.49 code
("es-419") as
the designate region. Can you make changes to deal with it?
(sorry I
should have noticed this earlier, and it's unfortunate not to be
able to
upcall Locale.forLanguageTag()!)

Naoto

On 6/20/16 4:38 PM, Brent Christian wrote:
Hi,

I have an updated webrev at:
http://cr.openjdk.java.net/~bchristi/7131356/webrev.03/

The comments have been updated as Gerard suggested.

The code to process the languageString now accounts for
3-character
language codes, and 4-char script designations (line 86).

More extensive testing of languages with multiple scripts/regions
revealed that more changes were needed to maintain current
behavior.
Without knowing the internal workings of how
JRSCopyCanonicalLanguageForPrimaryLanguage adjusts the language
ID, I
believe the best options are to add a few more mappings as
needed to
locale_aliases (locale_str.h), and to add a special case for
Portuguese
(line 104).

OS X has 3 language options for Portuguese:
Portugues (Portugal)
Portugues (Brasil)
Portugues (Brasileiro)

CFLocaleCopyPreferredLanguages() gives the expected language code
for
Portugues (Brasileiro) ("pt-BR"), but "Portugues (Brasil)"
doesn't
include a region code (it's simply, "pt"), so gets mapped to the
default
country, Portugal.  To maintain current behavior, I added a
special
case
to map "pt" to "pt_BR" when the Region system preference is
set to
Brazil.


This change introduces one notable behavior change, which I
argue is
actually a fix.  The bug report and test case do not list the
"Spanish
(Mexico)" language, but it is present on MaxOS 10.9
(presumably it
was
added to the OS since the original investigation).  The old code
mapped
this to "es_XL", XL being one of the "user-assigned" ISO 3166
country
codes.  The new code maps to "es_MX", MX being the country code
for
Mexico.


I've tested pretty thoroughly on MacOS 10.9; I'm pretty sure I
tried
every language that includes multiple scripts/locales.  I also
added a
couple updated tests to the bug report.

General testing has looked good so far.

Thanks,
-Brent

Reply via email to