On 9/22/25 11:20 a.m., Philip Jägenstedt wrote:
On Thu, Sep 18, 2025 at 3:29 PM Philip Jägenstedt <[email protected]> wrote:

    On Thu, Sep 18, 2025 at 2:51 PM Mike Taylor
    <[email protected]> wrote:

        On 9/18/25 5:54 a.m., Philip Jägenstedt wrote:

        Some additional context. Upgrading ICU can break sites and it
        cannot be done with a flag because of the size of the library
        (two copies would be needed). To mitigate the risk, we'd like
        to use the Blink launch process going forward. Where we can
        identify a risk ahead of time, we can add a targeted flag for
        that, as we've done for Italian number formatting here.

        On Thu, Sep 18, 2025 at 11:40 AM Chromestatus
        <[email protected]> wrote:

            *Contact emails*
            [email protected], [email protected]

            *Explainer*
            None

            *Specification*
            https://tc39.es/ecma402

            *Design docs*

            https://unicode-org.github.io/icu/download/77.html
            https://cldr.unicode.org/downloads/cldr-46
            https://www.unicode.org/versions/Unicode16.0.0

            *Summary*
            ICU is not a feature itself, but the third-party library
            we use for general Unicode support. We are using the
            Blink launch process because there is web compat risk and
            security considerations. The upgrade is from ICU 74.2 to
            ICU 77.1, the current latest release. ICU 77 contains
            CLDR 46 and other changes to support Unicode 16. The
            web-exposed changes are mainly the Intl and RegExp APIs,
            IDNA rules for URLs, and text segmentation. Intl and
            RegExp (V8): Lots of small changes. The change of Italian
            number formatting is the riskiest and has a dedicated
            flag, see compat risk section. IDNA: Generally more
            things are allowed, and this upgrade improves our overall
            test results in WPT. Text segmentation: The most
            interesting change is better Japanese line breaking when
            using `word-break: auto-phrase`, related to
            https://chromestatus.com/feature/5133892532568064. All
            test changes are explained in
            
            https://docs.google.com/document/d/1lrfJJmWvLXYPYSYlxE3mXTgDZI9U1bw2FrJYrDorgqE/edit?usp=sharing

        This doc mentions changes to "added comma in en-GB in a date
        format" - where's the best place to see what that change looks
        like? This kind of change sounds pretty similar to
        https://issues.chromium.org/issues/40256057 (or
        go/omg-1414292-pm <http://go/omg-1414292-pm> if you can read
        it (apologies to non-googlers)), and is the type of thing that
        easily breaks regular expressions.


    It shows up in two places in test changes in
    https://chromium-review.googlesource.com/c/chromium/src/+/6578333.

    The first is base/i18n/time_formatting_unittest.cc, with these
    changes:

    "Saturday 30 April 2011 at 15:42:07" → "Saturday, 30 April 2011 at
    15:42:07"
    "Saturday 30 April 2011" → "Saturday, 30 April 2011"

    The other
    is third_party/blink/web_tests/webexposed/intl-date-time-format-expected.txt
    with this change:

    "Wednesday 14 June 2023 at 14:50:00 British Summer Time" →
    "Wednesday, 14 June 2023 at 14:50:00 British Summer Time"

    I agree that this carries some risk. On the "probably fine" part
    of the ledger we have:

      * It aligns more with en-US where there is a comma for the
        "full" format, as well as some of the shorter ones
      * Firefox shipped it and didn't get any regression reports about
        this. I can't spot anything in my own search
        
        <https://bugzilla.mozilla.org/buglist.cgi?query_format=specific&order=relevance+desc&bug_status=__all__&product=&content=en-gb+comma&comments=0>
        either.

    https://issues.chromium.org/issues/40256057 is precisely the
    reason why nobody has updated ICU for a long time and why it's
    important to have enough checks in place to make such breakage
    much less likely.

    Putting a change like this behind a flag is a bit fraught because
    we're writing some new code to emulate what older versions of ICU
    did, and can't be 100% confident that it's correct for all
    combinations. I will dig up the CLDR change behind this and see if
    it's controllable by ICU inputs, as was luckily the case for the
    Italian number formatting change.


The CLDR change for this is https://github.com/unicode-org/cldr/pull/3879. I reached out to my CLDR expert colleagues and learned that the origin of that change is feedback from the CLDR Survey Tool, where linguists propose and review changes once per year. The comment was:

    The way I believe we used to render these strings is like this:
    For en-GB, there should be no comma in any date string UNLESS it
    consists of 4 date fields or more, and then the comma would go
    after the day of the week, so:
    "29 June 2023", "Thu 29 June", "Thu 29/06" and "Thursday 29 June",
    but "Thursday, 29 June 2023" and "Thursday, 29 June 2023 AD", for
    example.


To see which locales are impacted I tested more English locales:
https://chromium-review.googlesource.com/c/chromium/src/+/6973556

Comparing the results before and after the ICU upgrade, the locales that got the extra comma are en-AU, en-GB, and en-IN. (The CLDR PR touched them all, as well as en-CA for another change, see below.)
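
For anyone who wants to check a particular browser or Node build, a comparison along these lines can be sketched with Intl.DateTimeFormat. This is only an illustration, not the test in the linked CL: the helper name fullDatesByLocale and the sample date are assumptions, and whether a comma appears for en-AU, en-GB, and en-IN depends on the ICU/CLDR version the runtime ships.

```typescript
// Hypothetical probe: format one date with dateStyle "full" in several
// English locales and report whether a comma follows the weekday.
interface Row {
  tag: string;
  text: string;
  commaAfterWeekday: boolean;
}

function fullDatesByLocale(tags: string[], date: Date): Row[] {
  const rows: Row[] = [];
  for (let i = 0; i < tags.length; i++) {
    // The cast keeps this compiling with older TypeScript lib settings
    // that do not declare dateStyle yet.
    const options = { dateStyle: "full", timeZone: "UTC" } as Intl.DateTimeFormatOptions;
    const text = new Intl.DateTimeFormat(tags[i], options).format(date);
    // English weekday names are ASCII letters, so a simple class is enough.
    rows.push({ tag: tags[i], text: text, commaAfterWeekday: /^[A-Za-z]+,/.test(text) });
  }
  return rows;
}

// Arbitrary sample date: Wednesday 14 June 2023, 13:50 UTC.
const sampleDate = new Date(Date.UTC(2023, 5, 14, 13, 50, 0));
const probed = fullDatesByLocale(
  ["en-AU", "en-CA", "en-GB", "en-IN", "en-US", "en-ZA"],
  sampleDate
);
for (let i = 0; i < probed.length; i++) {
  console.log(probed[i].tag, JSON.stringify(probed[i].text), probed[i].commaAfterWeekday);
}
```

Running the same script on builds with the old and the new ICU and diffing the output shows which locales changed.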

It should be possible to undo this change behind a flag, by removing any commas if the date style is "full" for en-AU, en-GB, and en-IN.
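
As a rough illustration of that idea (not the actual Chromium code, which would live in C++ behind a runtime flag), the post-processing could look something like the sketch below. The function name undoFullDateComma and the exact-match locale check are assumptions made for the example; a real implementation would need to match the resolved locale, including any extension subtags.

```typescript
// Hypothetical sketch of the "undo" described above: strip commas from
// "full" date strings for the three locales that gained one.
const COMMA_LOCALES: string[] = ["en-AU", "en-GB", "en-IN"];

function undoFullDateComma(
  locale: string,
  dateStyle: string | undefined,
  formatted: string
): string {
  // Only the "full" date style in the affected locales is touched.
  if (dateStyle !== "full" || COMMA_LOCALES.indexOf(locale) === -1) {
    return formatted;
  }
  // "Removing any commas", as suggested in the message.
  return formatted.replace(/,/g, "");
}

// Example strings taken from the test changes quoted earlier in the thread.
console.log(undoFullDateComma("en-GB", "full", "Saturday, 30 April 2011 at 15:42:07"));
console.log(undoFullDateComma("en-US", "full", "Saturday, April 30, 2011"));
```

Other locales and other date styles pass through unchanged, which keeps the emulation as narrow as the change it reverts.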

This also revealed a few more minor changes to English locales:

  * en-CA changes "Pacific Daylight Saving Time" to "Pacific Daylight
    Time", matching en-US
  * en-AU removes a period in narrow duration format, from "2h 35s."
    to "2h 35s", matching all other en-* variants tested
  * en-ZA changes the thousands grouping and decimal separator, from
    "12,345.00" to "12 345,00". It's the only tested English variant
    with this format, but many European variants like en-FR and en-UA
    have the same style.

None of these seem risky enough to warrant putting them behind a flag.
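
For site authors reading along: the code most likely to notice a change like the en-ZA one is code that hard-codes "," and "." when parsing formatted numbers. A minimal sketch of asking the runtime for the separators instead, assuming Intl.NumberFormat.prototype.formatToParts is available; the helper names separators and parseFormatted are made up for this example:

```typescript
// Hypothetical helpers: discover the grouping and decimal separators a
// locale uses at runtime instead of hard-coding them.
function separators(locale: string): { group: string; decimal: string } {
  const parts = new Intl.NumberFormat(locale, {
    minimumFractionDigits: 2,
    useGrouping: true,
  }).formatToParts(12345);
  let group = "";
  let decimal = "";
  for (let i = 0; i < parts.length; i++) {
    if (parts[i].type === "group") group = parts[i].value;
    if (parts[i].type === "decimal") decimal = parts[i].value;
  }
  return { group: group, decimal: decimal };
}

// Parse a string produced by Intl.NumberFormat for the same locale on the
// same runtime. Handles plain decimal output only (no currency, no percent).
function parseFormatted(text: string, locale: string): number {
  const sep = separators(locale);
  let plain = text;
  if (sep.group !== "") plain = plain.split(sep.group).join("");
  if (sep.decimal !== "") plain = plain.replace(sep.decimal, ".");
  return Number(plain);
}

const formattedZa = new Intl.NumberFormat("en-ZA", { minimumFractionDigits: 2 }).format(12345.5);
// The printed string depends on the ICU version; the parsed value should not.
console.log(JSON.stringify(formattedZa), parseFormatted(formattedZa, "en-ZA"));
```

Because the separators come from the same runtime that did the formatting, the round trip is meant to survive an ICU upgrade either way.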

While testing things I also noticed that Safari 18.6 on macOS now uses ICU 76 or later. Everything except the en-CA PDT change was confirmed already shipped in Safari:

  * Italian number formatting
  * Added comma for en-AU, en-GB, and en-IN
  * en-AU duration format change
  * en-ZA number formatting changes

Shipping last reduces the risk quite a lot, but I'll still add a flag for the en-* comma change just to be safe. The other changes don't seem worth the flag, but I can try if requested.

Thanks Philip for describing the scope of the changes!

LGTM1 with this additional flag for en-*, as a precautionary measure.
