On 8/13/20 7:05 PM, Frank Tang (譚永鋒) wrote:
On Thu, 13 Aug 2020 at 02:19, Emilio Cobos Álvarez <[email protected]
<mailto:[email protected]>> wrote:
On 8/13/20 12:17 AM, Frank Tang wrote:
[Note: Resent due to message header problem. Sorry]
Contact emails
[email protected] <mailto:[email protected]>, [email protected]
<mailto:[email protected]>
Explainer
https://github.com/tc39/proposal-intl-segmenter
<https://github.com/tc39/proposal-intl-segmenter>
Specification
https://tc39.github.io/proposal-intl-segmenter/
<https://tc39.github.io/proposal-intl-segmenter/>
Design docs
https://docs.google.com/document/d/1xugLpLmgRFnNXK8ztariTAbD2IXueDw1T3VNuuZCz8k/edit#heading=h.xgjl2srtytjt
<https://docs.google.com/document/d/1xugLpLmgRFnNXK8ztariTAbD2IXueDw1T3VNuuZCz8k/edit#heading=h.xgjl2srtytjt>
https://docs.google.com/presentation/d/1X2zBU3bZ4ergVMWfubCsdnHFzeaDgqiTRJVgvNGjQBs/edit#slide=id.p
<https://docs.google.com/presentation/d/1X2zBU3bZ4ergVMWfubCsdnHFzeaDgqiTRJVgvNGjQBs/edit#slide=id.p>
TAG review
review by ECMA402
Summary
Intl.Segmenter implements methods for finding the location of
boundaries in text, including grapheme, word and sentence
boundary analysis.
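
For illustration only, here is a minimal sketch of the three granularities,
assuming the API shape in the Stage 3 proposal (a `segment()` method that
returns an iterable of records with `segment`, `index` and `isWordLike`
fields). The sample strings and variable names are made up for this example.

```javascript
// Grapheme granularity: "e" followed by U+0301 (combining acute accent)
// is one user-perceived character, even though it is two code units.
const graphemeSegmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
const graphemes = Array.from(
  graphemeSegmenter.segment("e\u0301a"),
  (s) => s.segment
);

// Word granularity: isWordLike filters out punctuation and whitespace.
const wordSegmenter = new Intl.Segmenter("en", { granularity: "word" });
const words = Array.from(wordSegmenter.segment("Hello, world!"))
  .filter((s) => s.isWordLike)
  .map((s) => s.segment);

// Sentence granularity: each segment keeps its trailing whitespace.
const sentenceSegmenter = new Intl.Segmenter("en", { granularity: "sentence" });
const sentences = Array.from(
  sentenceSegmenter.segment("Hi there. How are you?"),
  (s) => s.segment
);

console.log(graphemes.length, words, sentences);
```

Exact boundaries depend on the ICU data bundled with the engine, so results
for other inputs and locales may differ.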
Motivation
Chrome currently ships Intl.v8BreakIterator, a non-standard API with
similar functionality. According to
https://www.chromestatus.com/metrics/feature/timeline/popularity/556
<https://www.chromestatus.com/metrics/feature/timeline/popularity/556>,
in February 2020 0.74% of web pages used it. Intl.Segmenter
is the web standard intended to replace it.
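
As a rough sketch of what moving off the non-standard API could look like,
the helper below prefers Intl.Segmenter and falls back to
Intl.v8BreakIterator only when the standard API is missing. The name
`wordSegmentStarts` is hypothetical, and the fallback assumes the
`adoptText()` / `first()` / `next()` methods that V8 exposes on the break
iterator, with `next()` returning -1 at the end of the text.

```javascript
// Returns the start offsets of the word-granularity segments in `text`.
// `wordSegmentStarts` is a hypothetical helper, not part of either API.
function wordSegmentStarts(text, locale = "en") {
  if (typeof Intl.Segmenter === "function") {
    // Standard path: each segment record carries its start index.
    const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
    return Array.from(segmenter.segment(text), (s) => s.index);
  }
  if (typeof Intl.v8BreakIterator === "function") {
    // Non-standard fallback (assumed V8-only): walk the boundaries and
    // record the start of each span between consecutive boundaries.
    const iterator = new Intl.v8BreakIterator(locale, { type: "word" });
    iterator.adoptText(text);
    const starts = [];
    let start = iterator.first();
    for (let end = iterator.next(); end !== -1; end = iterator.next()) {
      starts.push(start);
      start = end;
    }
    return starts;
  }
  throw new Error("No text segmentation API available");
}

console.log(wordSegmentStarts("Hello world"));
```

Feature detection keeps pages working in engines that ship only one of the
two APIs; once Intl.Segmenter is widely available the fallback branch can
simply be dropped.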
Risks
Interoperability and Compatibility
The specification advanced to Stage 3 at the July 2020 TC39 meeting
with support from ECMA402.
/Gecko/: In development
(https://bugzilla.mozilla.org/show_bug.cgi?id=1423593
<https://bugzilla.mozilla.org/show_bug.cgi?id=1423593>)
FWIW, "in development" seems a bit of a stretch since there hasn't
been activity in the bug for a while.
The main reason is that there was a long discussion about the approach
in the spec, and the spec was moved from Stage 3 back to Stage 2 two
years ago so the new champion could revise it. It finally reached a
better shape and moved to Stage 3 at the TC39 July meeting after getting
support from Mozilla folks during the monthly ECMA402 meeting.
Sure, to be clear, I wasn't trying to push back. Just wanted to point
out that it doesn't seem to be worked on right now, so "in development"
doesn't seem quite accurate. "Positive" seems like a more accurate
description per this document
<https://docs.google.com/document/d/1xkHRXnFS8GDqZi7E0SSbR3a7CZsGScdxPUWBsNgo-oo/edit#>?
The patch is three years old and there was a bit of a concern due
to the binary size growing quite a bit
<https://bugzilla.mozilla.org/show_bug.cgi?id=1423593#c9>.
(Not an expert on this, but Gecko's layout engine doesn't use ICU
for line-breaking,
I know, I hand-wrote that between 1998 and 2002 when I was Mozilla's
i18n module owner and Netscape's i18n client manager. As far as I know,
that part of the code has not changed in the last 20+ years (aside from
the work I did with some Thai folks in late 2003 to deal with Thai line
breaking). The current version of the Intl.Segmenter spec took out
line break support two years ago, so that is irrelevant anyway.
That's an awesome bit of trivia, thanks :)
FWIW, it seems that for a bunch of more complex languages (Thai
included) nowadays we rely on platform-native APIs instead (see bug
389520 <https://bugzilla.mozilla.org/show_bug.cgi?id=389520>, bug 336959
<https://bugzilla.mozilla.org/show_bug.cgi?id=336959>, bug 390048
<https://bugzilla.mozilla.org/show_bug.cgi?id=390048>, etc).
-- Emilio
IIRC, so a lot of the ICU data that would be required for this has
to be imported).
I reduced the ICU break rule table size by ~50% in
https://github.com/unicode-org/icu/pull/1100
<https://github.com/unicode-org/icu/pull/1100>, so the break iterator
data in ICU 68, scheduled for release in late Oct 2020, will be ~230K
smaller than in ICU 67.
There may be alternative implementation strategies or what not,
but it doesn't seem to be actively worked on.
/WebKit/: No signal
/Web developers/: No signals
Ergonomics
An engineer from Apple believes we should not add line break support
to Intl.Segmenter because developers may abuse the API and
perform text layout themselves instead of depending on CSS.
The line break feature was therefore removed from the specification
in its current shape.
Will this feature be supported on all six Blink platforms
(Windows, Mac, Linux, Chrome OS, Android, and Android
WebView)?
Yes
Is this feature fully tested by web-platform-tests
<https://chromium.googlesource.com/chromium/src/+/master/docs/testing/web_platform_tests.md>?
Yes
https://github.com/tc39/test262/tree/master/test/intl402/Segmenter
<https://github.com/tc39/test262/tree/master/test/intl402/Segmenter>
Link to entry on the Chrome Platform Status
https://www.chromestatus.com/feature/6099397733515264
<https://www.chromestatus.com/feature/6099397733515264>
This intent message was generated by Chrome Platform Status
<https://www.chromestatus.com/>.
--
You received this message because you are subscribed to the
Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to [email protected]
<mailto:[email protected]>.
To view this discussion on the web visit
https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAOcELL-m7DZ5OAwZj9FqX9VKZKWYd_Qf0YeaXCs3YAEbcnPsKA%40mail.gmail.com
<https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAOcELL-m7DZ5OAwZj9FqX9VKZKWYd_Qf0YeaXCs3YAEbcnPsKA%40mail.gmail.com?utm_medium=email&utm_source=footer>.
--
Frank Yung-Fong Tang
譚永鋒 / 🌭🍊
Sr. Software Engineer
--
--
v8-dev mailing list
[email protected]
http://groups.google.com/group/v8-dev