Speaking as an author, I am accustomed to using special characters freely 
in my regex character classes, so hearing that the HTML spec is making a 
backwards-incompatible change that would prevent me from doing so is 
surprising, and I would probably find it a little annoying. It also puts 
the HTML spec out of sync with every other major regex implementation that 
I know of, meaning that new authors would have a hard time checking their 
patterns for validity on any of the common regex visualization/testing 
sites. 

Additionally, what we're considering as "not real breakage" here is still a 
degraded experience for users, right? It also means extra work for 
developers whose pages previously had perfectly acceptable UX (warning 
users about invalid values prior to submission) and now have pages where 
that feature no longer functions. 

Going off of the discussion on backwards compatibility in the original TC39 
proposal, 
https://github.com/tc39/proposal-regexp-v-flag#is-the-new-syntax-backwards-compatible-do-we-need-another-regular-expression-flag
 
it feels like a lot of the considerations in that discussion boiled down to 
"flags are an accessible and convenient way to opt in". But in this case 
the opposite is true—flags are completely unexposed by the HTML spec, so 
there's no easy and accessible way to opt in to the new behavior. It feels 
like some of the other options they considered (for example, a way of 
specifying flags "in pattern" like some engines allow for, or a prefix like 
\U[ to allow for set operations) would make much more sense when considered 
in the HTML usage context where flags aren't readily available to authors.

I guess if I had to summarize my concerns, it's that by making this change 
for all users with no opt in, we're adopting the letter of the TC39 
proposal but not the spirit, which leaves the HTML spec out of step with 
user expectations. 

However, I do understand that the actual potential for breakage here is 
pretty small and that it's the price we pay for a more featureful and 
expressive regex language. Have the HTML spec editors considered any small 
changes that could improve backwards compatibility and usability in this 
area, like falling back to parsing regexes as /u if /v parsing fails? True, 
that would encode legacy behavior in the platform where it's arguably not 
necessary, but I feel like the improvement in developer ergonomics is worth 
the tradeoff.

On Wednesday, April 19, 2023 at 3:28:13 PM UTC-5 rby...@chromium.org wrote:

On Wed, Apr 19, 2023 at 3:41 PM Philip Jägenstedt <foo...@chromium.org> 
wrote:

I wonder if we can get enough confidence with less work than investigating 
40 randomly chosen sites from UseCounter hits.

This is a population proportion problem, and 
https://sample-size.net/confidence-interval-proportion/ is a useful tool. 
If you check 40 cases and find no breakage (N=40, x=0) that gives us 95% 
confidence that breakage is less than 7.2% of samples in this data set. 
Whether it's useful to check that many depends on the value of the use 
counter.

Is https://chromestatus.com/metrics/feature/timeline/popularity/4463 the 
right use counter, and has it reached stable yet? Why is it marked as 
obsolete?

For purposes of illustration, let's use 0.04% from earlier in the thread 
and say we want to be (95%) confident that real breakage is less than 
0.01%. Then we just need to get below 25% in the linked tool, and checking 
11 samples and finding nothing is enough to do this.
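
For anyone who wants to reproduce these numbers without the linked tool, 
here is a rough sketch. It assumes the tool reports the exact one-sided 
binomial bound for the zero-hits case, 1 - (1 - confidence)^(1/n); that 
formula does reproduce the 7.2% and 11-sample figures above. The function 
names are made up for illustration.

```javascript
// Upper confidence bound on the true proportion when 0 of n samples break.
// Assumption: exact one-sided binomial bound, 1 - (1 - confidence)^(1/n).
function upperBoundZeroHits(n, confidence = 0.95) {
  return 1 - Math.pow(1 - confidence, 1 / n);
}

// Smallest n such that finding 0 breakages in n samples pushes the bound
// below targetProportion (a fraction of the sampled population).
function samplesNeeded(targetProportion, confidence = 0.95) {
  return Math.ceil(Math.log(1 - confidence) / Math.log(1 - targetProportion));
}

// 40 clean samples: bound of roughly 7.2% of UseCounter hits.
console.log((upperBoundZeroHits(40) * 100).toFixed(1) + "%");

// Target from the example: 0.01% of page loads out of a 0.04% use counter,
// i.e. 25% of the UseCounter hits.
console.log(samplesNeeded(0.0001 / 0.0004));
```

Multiplying the bound by the use counter value converts it to page loads: 
with 11 clean samples the bound is about 23.8% of hits, and 23.8% of 0.04% 
is just under 0.01%.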


I like your more sophisticated math, but it's <0.001% that we want I'm 
afraid, not 0.01%. So, if I'm following your instructions right, that's ~42 
samples to have 95% confidence :-)

Of course 0.001% is just a rough guideline that has often proven to be too 
high or too low. So this is all a judgement call anyway.

On Wed, Apr 19, 2023 at 5:43 PM Mathias Bynens <mt...@google.com> wrote:

Thanks for the guidance, Rick. I’ve prepared a CL moving the flag to 
status=experimental 
<https://chromium-review.googlesource.com/c/chromium/src/+/4447958> and 
I can commit to investigating 40 unique UseCounter hits and summarizing my 
findings. Fingers crossed the trend of “no actual breakage detected” 
continues. I’ll keep you posted.

On Wed, Apr 19, 2023 at 5:26 PM Rick Byers <rby...@chromium.org> wrote:

Thanks for doing a thorough compat analysis of this Mathias. I can totally 
see this being one where all the examples we can find don't seem to cause 
breakage in practice. I know it's a lot, but if we looked at 40 random 
examples and found none of them to break, that would suggest an upper bound 
of <0.001% of pages impacted (probably much lower) and I'd be OK giving 
this a shot with a finch killswitch ready in case of reports of serious 
breakage. Does that sound reasonable to you?

Also feel free to set your flag 
<https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/platform/runtime_enabled_features.json5;l=1912?q=HTMLPatternRegExpUnicodeSets%20file:.json5&ss=chromium>
 
to status=experimental, that'll get us some additional usage coverage (from 
the small population that runs with 
--enable-experimental-web-platform-features) and also signal that this is 
close to becoming shipping behavior.

Rick

On Mon, Apr 17, 2023 at 7:03 AM 'Mathias Bynens' via blink-dev <
blin...@chromium.org> wrote:

So far, none of the UseCounter hits I investigated constitute any actual 
breakage. The vast majority of hits seem to be login forms backed by 
server-side validation. I’ll keep looking though.

In the meantime, this feature is now 
<https://chromium-review.googlesource.com/c/chromium/src/+/4414859> 
available behind the `--enable-blink-features=HTMLPatternRegExpUnicodeSets` 
flag (disabled by default).

On Wednesday, April 5, 2023 at 5:53:10 PM UTC+2 Mathias Bynens wrote:

On Wed, Apr 5, 2023 at 5:23 PM Alex Russell <sligh...@chromium.org> wrote:

I don't understand why TAG review is not applicable for this intent.


Fair enough. I’ve filed a TAG review request here: 
https://github.com/w3ctag/design-reviews/issues/832 I’ll update the 
ChromeStatus entry to refer to it.

On Tuesday, April 4, 2023 at 5:21:16 AM UTC-7 mt...@google.com wrote:

Thanks to the UseCounter + UKM + M112 hitting Stable, more results are 
starting to come in. I’ll be collecting public examples of potential 
incompatibilities here: 
https://bugs.chromium.org/p/chromium/issues/detail?id=1412729#c11 So far 0 
out of the 2 examples cause any actual breakage — fingers crossed that 
trend continues.

On Mon, Apr 3, 2023 at 10:26 AM Philip Jägenstedt <foo...@chromium.org> 
wrote:

I took a look at https://github.com/whatwg/html/pull/7908 and it looks like 
there's agreement to merge it, but it's waiting on this intent to be 
approved. Normally we block in the other direction, but that's fine, as 
long as the spec change is merged.

Looks like there's broad support for this change, and it's just a question 
of the site compat risk. ~0.04% as an upper bound is quite high. Can we 
wait until the use counter is in stable and look at a random set of sites 
hitting the use counter to determine what the real-world breakage looks 
like?

On Fri, Mar 31, 2023 at 5:07 PM 'Mathias Bynens' via blink-dev <
blin...@chromium.org> wrote:

On Fri, Mar 31, 2023 at 4:35 PM Mike Taylor <mike...@chromium.org> wrote:

Hey Mathias,
On 3/31/23 5:56 AM, Mathias Bynens wrote:

Contact emails

mat...@chromium.org

Specification 

https://github.com/whatwg/html/pull/7908

Summary 

The <input pattern> attribute allows developers to specify a regular 
expression pattern against which the input’s values are checked for 
validity.

<label>
  Part number:
  <input pattern="[0-9][A-Z]{3}" name="part"
         title="A part number is a digit followed by three uppercase letters.">
</label>

When the pattern attribute was first implemented, these regular expressions 
were compiled without any RegExp flags. In 2014, the HTML Standard changed 
this by implicitly enabling the u flag for the pattern attribute, enabling 
better Unicode support (including support for Unicode character properties 
like \p{Letter}). This change shipped in Chrome 53. 
<https://chromestatus.com/feature/4753420745441280>

Now, we’re taking this to the next level by enabling the new RegExp v flag 
<https://v8.dev/features/regexp-v-flag> instead of u, enabling the use of 
set notation, string literal syntax, and Unicode properties of strings.

(Context: The RegExp v flag is a JavaScript language feature which 
previously went through the Blink Intents process and shipped in Chrome 112 
<https://chromestatus.com/feature/5144156542861312>. This new ChromeStatus 
entry is specifically about integrating it with the HTML pattern attribute.)

Blink component 

Blink>Forms 
<https://bugs.chromium.org/p/chromium/issues/list?q=component:Blink%3EForms>

Search tags 

unicode <https://chromestatus.com/features#tags:unicode>, regexp 
<https://chromestatus.com/features#tags:regexp>, pattern 
<https://chromestatus.com/features#tags:pattern>, validation 
<https://chromestatus.com/features#tags:validation>

TAG review 
TAG review status 

Not applicable

Risks 
Interoperability and Compatibility 

The spec patch at https://github.com/whatwg/html/pull/7908 lists the 
potentially breaking changes. Some patterns that previously compiled now 
throw an early error with the v flag: specifically, those with a character 
class that includes either an unescaped special character or a double 
punctuator.

We expect such patterns to be rare. To validate this assumption we’ve added 
a UseCounter called 
HTMLPatternRegExpUnicodeSetIncompatibilitiesWithUnicodeMode 
<https://chromestatus.com/metrics/feature/popularity#HTMLPatternRegExpUnicodeSetIncompatibilitiesWithUnicodeMode>
 
in M112, which tracks patterns in any JavaScript u RegExps generated via 
the HTML pattern attribute that would throw if they were used with the v 
flag. 

Importantly, note that any throwing pattern gracefully degrades: it simply 
behaves as if the pattern attribute wasn’t present, resulting in 
inputElement.validity.valid === true for any input value. Consequently, the 
only compatibility risk is that some value/pattern combinations that would 
previously result in inputElement.validity.valid being false now result in 
it being true. Thus, for every UseCounter hit, it could still be that there 
is no actual breakage; the UseCounter just gives us the upper bound. The 
currently available data from Beta suggests the UseCounter hits for 0.0393% 
of Chrome page loads.

I'm somewhat curious to see how much that UseCounter will grow (if at all) 
when 112 goes to stable next week...

Me too, and FWIW I'd understand if you and the other API owners prefer to 
wait until there’s some data for Stable before responding to this Intent.

Do you have any concerns about certain inputs being sent to a server that 
might not have any backend validation, that would previously be prevented 
by the u-vintage validation?

That’s indeed the only scenario in which there would be breakage. So far we 
haven’t heard of such cases in the wild. (Arguably, such web pages are 
already broken, since DevTools could easily be used to remove the `pattern` 
attribute, or requests could be made with tools like `curl`.) FWIW, there 
was a similar discussion in this old blink-dev thread: 
https://groups.google.com/a/chromium.org/g/blink-dev/c/XUNMtri0tI4/m/mjPkwXKNAQAJ

I forgot to mention that we explicitly added a console warning in M112 for 
any `pattern` attribute values that would be affected by this change, to 
help developers prepare for the potential change. One developer reported 
seeing the warning and adjusting their `pattern` attribute values 
accordingly, but it’s unclear whether inaction would have really broken 
their web page: 
https://bugs.chromium.org/p/chromium/issues/detail?id=1412729#c7
 


Gecko: Positive (Mozilla standards position request 
<https://github.com/mozilla/standards-positions/issues/745>, implementation 
tracking issue <https://bugzilla.mozilla.org/show_bug.cgi?id=pattern-v>)

WebKit: Positive (WebKit standards position request 
<https://github.com/WebKit/standards-positions/issues/132>, implementation 
tracking issue <https://bugs.webkit.org/show_bug.cgi?id=pattern-v>)

Web developers: No signals

Other signals:

Debuggability 

The pattern attribute is already well-supported in DevTools and other 
tooling; no changes are necessary.

Will this feature be supported on all six Blink platforms (Windows, Mac, 
Linux, Chrome OS, Android, and Android WebView)? 

Yes

Is this feature fully tested by web-platform-tests 
<https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_platform_tests.md>
? 

Pull Request: https://github.com/web-platform-tests/wpt/pull/38547

Flag name 

N/A

Requires code in //chrome? 

False

Tracking bug 

https://bugs.chromium.org/p/chromium/issues/detail?id=1412729

Sample links 

https://mathiasbynens.be/demo/pattern-u-vs-v

Estimated milestones 

M114

Link to entry on the Chrome Platform Status 

https://chromestatus.com/feature/5149507107422208

-- 
You received this message because you are subscribed to the Google Groups 
"blink-dev" group.

To unsubscribe from this group and stop receiving emails from it, send an 
email to blink-dev+...@chromium.org.


To view this discussion on the web visit 
https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CADizRgaAq4FwzJbUqLQVo%2BQdd_V0PT7rBr510OGe8fenHA%3D3HQ%40mail.gmail.com.
