Does there exist any string where an old browser using old rules would decide that a <module> is closed at one place, but a new browser following the rules you propose would decide that the <module> is closed at a different place?
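
To make the question concrete, here is a minimal sketch of the termination rule as recapped in the quoted message below, applied to a `<module>` body. The helper name `findRawTextClose` and the sample strings are hypothetical, and the sketch deliberately ignores the escaped and double-escaped modes; it only illustrates how a script-style scan can close the element inside a string literal, not what any particular browser does.

```javascript
// Hypothetical helper: index at which a script-style scan would terminate
// the element, or -1 if no terminator is found.
// Rule (from the quoted recap): case-insensitive "</tagName" followed by
// one character from [ \t\r\n\f/>].
function findRawTextClose(source, tagName) {
  const terminator = new RegExp("</" + tagName + "[ \\t\\r\\n\\f/>]", "i");
  const match = terminator.exec(source);
  return match === null ? -1 : match.index;
}

// A module body whose string literal contains the closing sequence:
// the scan stops inside the literal, before the intended end tag.
const body = 'var s = "</module>"; done();</module>';
const naive = findRawTextClose(body, "module");

// The `"</mod" + "ule>"` workaround mentioned in the thread:
// the scan now only matches the real end tag.
const split = 'var s = "</mod" + "ule>"; done();</module>';
const safe = findRawTextClose(split, "module");

console.log(naive, safe);
```

Running the same inputs through a second function that encodes a proposed rule, and comparing the two indices, would be one way to search for a string of the kind asked about above.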

On Fri, Jun 13, 2014 at 9:15 AM, Domenic Denicola <[email protected]> wrote:

> Thanks Scott; much appreciated.
>
> IMO it would be a good universe where `<module>` had the following things
> `<script>` has:
>
> - Does not require escaping < > & ' " in any contexts.
> - Terminates when seeing `</module` + extra chars. (Possibly we could do
>   this only when it would otherwise be a parsing error, to avoid `"</mod" +
>   "ule>"` grossness? But that would require some intertwingling of the HTML
>   and ES parsers, which I can imagine implementers disliking.)
>
> But it removes the following things `<script>` has:
>
> - `<!--` escaped data mode and double-escaped mode
> - \r, \r\n, \0 special-casing
> - The two new single-line comment forms (maybe; I know these work in Node
>   though, so maybe just leave them in as part of the ES6 spec).
>
> Although I know some people think making `<script>` and `<module>` have
> different rules would be confusing for authors, IMO this would be a nice
> authoring experience.
> ________________________________________
> From: [email protected] <[email protected]> on behalf of C. Scott
> Ananian <[email protected]>
> Sent: Friday, June 13, 2014 12:06
> To: Domenic Denicola
> Cc: Mark S. Miller; es-discuss; Ben Newman
> Subject: Re: 5 June 2014 TC39 Meeting Notes
>
> On Thu, Jun 12, 2014 at 11:11 AM, Domenic Denicola
> <[email protected]> wrote:
> > I guess part of it is clarifying which part of "<script>'s insane parsing
> > rules" we're talking about. From what I'm aware of there are quite a lot of
> > different insanities; but I am fuzzy on the details. Does anyone know which
> > rules are inherently necessary, and which are historical accidents or
> > constraints?
>
> I'll recap the rules for "script data state" from
>
> http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#script-data-state
>
> As a general rule, `\r` and `\r\n` are converted to `\n`, and `\0` is
> not allowed.
> The case-insensitive sequence `</script` followed by a character in
> `[ \t\r\n\f/>]` terminates the script data section.
> (These constraints would be present for HTML-embedding.)
>
> In addition, the exact character sequence `<!--` switches to "escaped
> data" parsing. This is a bit hairy, and you can even end up in
> "double escaped" modes. See
> http://stackoverflow.com/questions/23727025/script-double-escaped-state
> for an example. Presumably these are the "insane parsing rules" under
> discussion. You are encouraged to try to follow the logic in the
> WHATWG spec yourself. ;)
>
> In addition, [Web EcmaScript](http://javascript.spec.whatwg.org/)
> introduces two new single line comment forms: `<!--` must be treated
> as if it were `//`, and `-->` (with some crazy start-of-line
> restrictions) is also treated as a single line comment.
>
> To some degree the line between the HTML parser and Web EcmaScript is
> movable; currently the HTML parser recognizes the `<!--` etc tokens
> but pushes them into the data section of the script tag anyway; one
> could just as easily imagine the HTML parser doing all the work and
> stripping the "new comment forms" from the token stream.
> --scott
>

--
Cheers,
--MarkM
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss

