Re: [8] Review request for JEP 127: Improve Locale Data Packaging and Adopt Unicode CLDR Data

Masayoshi Okutsu Tue, 14 Aug 2012 19:04:39 -0700


On 8/15/2012 1:52 AM, Steven R. Loomis wrote:

On Tue, Aug 14, 2012 at 2:37 AM, Masayoshi Okutsu wrote:


    On 8/14/2012 2:25 PM, Steven R. Loomis wrote:

        Naoto,
          okay, thought I was done for the night, but just two more
        things..

        - again on the "talk to us" category.. Sun already wrote one LDML
        converter, and contributed to another. They're part of the
        CLDR toolset and
        work with OOo and Solaris data.

        - also, it appears that the new converter doesn't handle
        aliases at all, or
        parentLocales. You're guaranteed to get the wrong answer.

        - Some of the processing (such as for Norwegian) and in other
        places seems
        to be very .. hardcoded and fragile.


    These are limitations of the existing parser. I've briefly checked
    the output, but I will need to work on the parser more.
    Please note that we use the existing JRE classes (runtime) for
    CLDR support, not ICU4J. My understanding is that CLDR is after
    all the data part of ICU. A lot of adjustments need to be made to
    use the JRE classes.

No, that is not correct. First, CLDR is consumed by a number of other packages besides ICU, including most recently TwitterCLDR. ICU is used in the development of CLDR. You could take the opportunity to influence CLDR to benefit the JRE by providing input into the CLDR process.

That's not my point. As you know, IBM took the JDK 1.3 source code as the basis for ICU4J. After that IBM made some incompatible changes (from JDK), including deprecating functionality. Then, CLDR data was adjusted with the ICU4J changes.

Also, I was not referring to using the ICU data generator (in org.unicode.cldr.icu) but the parser and utility (org.unicode.cldr.util, particularly CLDRFile).


I wasn't, either.

        - Are you aware of the fact that CLDR 22 is nearly released?


    Yes.


          Has there been
        any testing with the interim data, or any plans to do so?


    Currently we have no plan to use 22 in JDK 8. There are still tons
    of work to finish for JDK 8, including fixing ancient bugs.
It's ironic and unfortunate timing, to independently pull in 21 at this point. The data input in 21 was from the 2.0 release (2011-May-25), which by 2013 will be two years old.

This kind of thing will happen whenever external specs, data, or whatever are incorporated into another product. CLDR in JDK isn't special.

        I think the summary again is, talk to us.  Where "us" is the
        CLDR technical
        committee.


    Thanks for the suggestion, but do you mean it's risky to create
    something from the spec and its implementation (data)?
It's not an unacceptable risk, but it may be an unnecessary one to work in isolation. The parser does not match the spec in a number of areas. As I noted, I myself have been a bit absent from these discussions, both physically and in catching up on the i18n-dev mail digests. But I hope that more conversation will be mutually beneficial.

CLDR is written in XML. If its spec is well defined and stable, what's the problem (risk) in writing an XML parser to convert the XML files to another format?


Thanks,
Masayoshi
