Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On 2015-05-07 15:59, Boris Zbarsky wrote:
> On 5/7/15 7:16 AM, Rune Lillesveen wrote:
>> This adds an implementation complexity to type selector matching. What's the rationale for matching the selector case-sensitively in the svg case?
>
> The idea is to allow the selector match to be done case-sensitively in all cases so it can be done as equality comparison on interned string representations instead of needing expensive case-insensitive matching on hot paths in the style system.

(Note! This is veering a little off topic.)

One way to cheapen the computational cost is to have partial case-insensitive matching.

    If (character >= $0041) And (character <= $005A)
      character = (character | $0020)
    EndIf

Basically, if the character is 'A' to 'Z' then the 6th bit ($0020) is set, thereby turning 'A' to 'Z' into 'a' to 'z'. This works for 7-bit ASCII, Latin-1 and Unicode (UTF-8, for example). No need for table lookups; it can all be done in the CPU registers.

Other commonly used characters like '0' to '9' or '_' have no lower/upper case. And more language-specific characters are not ideal for such use anyway (people of mixed nationalities would have issues typing those characters).

So there is no need to do full case-insensitive matching. Just do a partial lowercase normalization of 'A' to 'Z' and then do a simple binary comparison. In optimized C or ASM this should perform really well compared to calling a Unicode function to normalize and lowercase the text.

This would mean restricting to 'A' to 'Z', 'a' to 'z', '0' to '9', and '_', but all tags/elements/properties/whatever that I can recall seeing only ever use those characters. I certainly won't complain if I can't use the letter 'å' in the code; then again, I never use weird characters in code in the first place.

How does it look in the wild? If only A to Z is used in xx% of cases then restricting to that character range would allow very quick lowercasing and thus allow use of fast binary matching.

-- Roger Hågensen, Freelancer, http://skuldwyrm.no/
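
A minimal sketch of the partial lowercasing described in the message above, written in JavaScript for illustration. The function names `asciiLowerCodeUnit` and `equalsIgnoringAsciiCase` are hypothetical and not taken from any browser engine. Only 'A' to 'Z' are folded; every other code unit, including 'Å', is compared as-is.

```javascript
// Hypothetical helper names, for illustration only.

// Fold 'A'..'Z' (0x41..0x5A) to 'a'..'z' by setting bit 0x20.
// Every other code unit is returned unchanged.
function asciiLowerCodeUnit(c) {
  return (c >= 0x41 && c <= 0x5a) ? (c | 0x20) : c;
}

// Compare two strings after ASCII-only lowercasing.
// Note that this still walks both strings once, character by character.
function equalsIgnoringAsciiCase(a, b) {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    const ca = asciiLowerCodeUnit(a.charCodeAt(i));
    const cb = asciiLowerCodeUnit(b.charCodeAt(i));
    if (ca !== cb) return false;
  }
  return true;
}
```

For example, `equalsIgnoringAsciiCase("foreignObject", "FOREIGNobject")` is true, while `equalsIgnoringAsciiCase("å", "Å")` is false because non-ASCII letters are never folded.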
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Fri, May 8, 2015 at 7:09 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
> I don't think ascii case-insensitivity is a mistake here.

(ASCII) case-insensitivity is a mistake. JavaScript doesn't have it, and wherever we do have it, it's sticking out like a sore thumb with a dozen subtleties attached.

-- https://annevankesteren.nl/
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Thu, May 7, 2015 at 10:42 PM, Anne van Kesteren ann...@annevk.nl wrote:
> On Thu, May 7, 2015 at 11:23 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
>> Well, beyond the existing conflicts of style, script, and a. (font too, but that's dropped from SVG2, so who cares.)
>
> textArea is out too?

textArea was never in - it was an SVG Tiny element, and thus never appeared in any web-compatible version of SVG.

> With respect to case-insensitive matching, I don't really understand why we simultaneously want to make these rather trivial changes to SVG while at the same time move to constructors for creating elements, which is even stricter than createElementNS() (literal has to be correctly spelled or you get an exception). If we want to move to a world where people write new SVGRectElement why would we even bother making document.createElement("RECT") work?

It's not about that, it's about making, say, lineargradient {...} work, because you can case selectors any which way for HTML, and it's confusing that you can't for SVG.

(And I'm still not convinced on constructors, anyway - they're so verbose! And we would have to fix so many of them!)

> Same for CSS, the majority of CSS already uses type selectors where the case matches up with the HTML. Is complicating it really worth it? We should bring the languages closer, but we shouldn't put the mistakes we made with HTML into SVG.

I don't think ascii case-insensitivity is a mistake here.

~TJ
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Fri, May 8, 2015 at 10:13 AM, Anne van Kesteren ann...@annevk.nl wrote:
> On Fri, May 8, 2015 at 7:09 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
>> I don't think ascii case-insensitivity is a mistake here.
>
> (ASCII) case-insensitivity is a mistake. JavaScript doesn't have it, and wherever we do have it, it's sticking out like a sore thumb with a dozen subtleties attached.

It's all over CSS (every single language-defined keyword is CI), and all over a bunch of HTML. Whether you think it's a mistake or not, it's *common*, and having some tags use it while others don't is confusing.

~TJ
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On 5/8/15 11:56 AM, Roger Hågensen wrote:
> One way to cheapen the computational cost is to have partial case-insensitive matching.

If you're walking the string at all, you have already lost in terms of performance for this stuff.

> If (character >= $0041) And (character <= $005A)
>   character = (character | $0020)
> EndIf

Yes, this is basically the algorithm that would be considered "expensive case-insensitive matching" in this context.

> In optimized C or ASM this should perform really well compared to calling a Unicode function to normalize and lowercase the text.

No one is even remotely considering anything Unicode here.

-Boris
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Fri, May 8, 2015 at 9:09 AM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 5/8/15 11:56 AM, Roger Hågensen wrote:
>> One way to cheapen the computational cost is to have partial case-insensitive matching.
>
> If you're walking the string at all, you have already lost in terms of performance for this stuff.

Looks like the patch that recently landed *did* exactly this - it would do a string-walk when matching tags/attributes against SVG elements. It was indeed slower, but it was just for SVG, and wasn't a huge hit.

But then Elliott realized that we keep around a second pointer for the uppercased tagname anyway (some of the tagname accessors return uppercase for HTML), so he just rewrote it to store the proper casing there instead, for SVG elements. ^_^ Now there's no extra memory or runtime cost.

~TJ
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Thu, May 7, 2015 at 2:11 PM, Elliott Sprehn espr...@chromium.org wrote:
> On Thu, May 7, 2015 at 2:09 PM, Boris Zbarsky bzbar...@mit.edu wrote:
>> On 5/7/15 5:07 PM, Tab Atkins Jr. wrote:
>>> I believe the SVGWG is fine with a parsing-based approach, exactly like what HTML does. An SVG element created with mixed casing, or imported from an XML document, might not match a lowercase tagname selector, but SVG written in HTML will.
>>
>> Hmm. The main problem here is for scripts that create SVG elements in an HTML document, since those have to use createElementNS and pass the mixed-case names (e.g. for foreignObject).
>
> One idea could be to make createElement() return SVG elements for svg tag names embedded in HTML. Neither spec is ever going to have a conflicting tag name.

Well, beyond the existing conflicts of style, script, and a. (font too, but that's dropped from SVG2, so who cares.)

But the SVGWG already resolved to allow HTML elements inside of SVG, so they don't have to, say, add an svg:video element. (I'm on the hook to define the SVG layout model in terms of the CSS box model, which is trivial, but I haven't had the time to do it yet.)

But yeah, the SVGWG resolved to never add new conflicts, and I assume HTML is similarly okay with not adding tagnames that SVG has already claimed.

~TJ
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Thu, May 7, 2015 at 2:13 PM, Roger Hågensen rh_wha...@skuldwyrm.no wrote:
> On 2015-05-07 13:16, Rune Lillesveen wrote:
>> Currently, the HTML spec says that type selectors match case-sensitively for non-html elements like svg elements in html documents ...
>>
>> This adds an implementation complexity to type selector matching. What's the rationale for matching the selector case-sensitively in the svg case?
>
> Isn't SVG based on XML? Which means SVG is probably case sensitive!

My questions are about SVG elements parsed by the HTML parser to create an HTML document. SVG tag names are parsed case-insensitively in HTML documents, yet normalized to their camelCased form.

-- Rune Lillesveen
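
The normalization described above can be sketched roughly as a lookup from the ASCII-lowercased tag name to its camelCased form. This is only an illustration: `SVG_TAG_NAME_ADJUSTMENTS` and `adjustSvgTagName` are hypothetical names, and the table below lists just a handful of SVG element names rather than the full set a parser would need.

```javascript
// Hypothetical and deliberately incomplete table:
// ASCII-lowercased tag name -> camelCased SVG element name.
const SVG_TAG_NAME_ADJUSTMENTS = new Map([
  ["foreignobject", "foreignObject"],
  ["lineargradient", "linearGradient"],
  ["radialgradient", "radialGradient"],
  ["clippath", "clipPath"],
  ["textpath", "textPath"],
]);

// Lowercase 'A'..'Z' only, then restore the camelCased form if one is known.
// Names without a table entry (e.g. "rect") simply stay lowercased.
function adjustSvgTagName(rawName) {
  const lowered = rawName.replace(/[A-Z]/g, (ch) => ch.toLowerCase());
  return SVG_TAG_NAME_ADJUSTMENTS.has(lowered)
    ? SVG_TAG_NAME_ADJUSTMENTS.get(lowered)
    : lowered;
}
```

Under these assumptions, markup written as FOREIGNobject inside svg ends up with the local name foreignObject, which is why a case-sensitive selector foreignObject matches it and foreignobject does not.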
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Thu, May 7, 2015 at 11:23 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
> Well, beyond the existing conflicts of style, script, and a. (font too, but that's dropped from SVG2, so who cares.)

textArea is out too?

With respect to case-insensitive matching, I don't really understand why we simultaneously want to make these rather trivial changes to SVG while at the same time move to constructors for creating elements, which is even stricter than createElementNS() (literal has to be correctly spelled or you get an exception). If we want to move to a world where people write new SVGRectElement why would we even bother making document.createElement("RECT") work?

Same for CSS, the majority of CSS already uses type selectors where the case matches up with the HTML. Is complicating it really worth it? We should bring the languages closer, but we shouldn't put the mistakes we made with HTML into SVG.

-- https://annevankesteren.nl/
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On 5/7/15 7:16 AM, Rune Lillesveen wrote:
> This adds an implementation complexity to type selector matching. What's the rationale for matching the selector case-sensitively in the svg case?

The idea is to allow the selector match to be done case-sensitively in all cases so it can be done as equality comparison on interned string representations instead of needing expensive case-insensitive matching on hot paths in the style system.

> Should we change the spec in this regard?

To what, exactly? What is your proposed behavior here?

-Boris
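
To illustrate what "equality comparison on interned string representations" buys, here is a rough sketch. JavaScript compares strings by value, so the sketch uses one frozen object per distinct string as a stand-in for an interned pointer; `AtomTable` and its `intern` method are hypothetical names, and real engines do this in native code.

```javascript
// Hypothetical atom table: each distinct string maps to exactly one object,
// so two names are equal if and only if their atoms are the same object.
class AtomTable {
  constructor() {
    this.atoms = new Map();
  }

  intern(text) {
    let atom = this.atoms.get(text);
    if (atom === undefined) {
      atom = Object.freeze({ text });
      this.atoms.set(text, atom);
    }
    return atom;
  }
}

const table = new AtomTable();

// Interning happens once, when the stylesheet and the markup are parsed.
const selectorName = table.intern("foreignObject");
const elementName = table.intern("foreignObject");
const loweredName = table.intern("foreignobject");

// Matching is then a single identity check, with no walk over characters.
const matchesSameCase = selectorName === elementName;
const matchesOtherCase = selectorName === loweredName;
```

Here `matchesSameCase` is true and `matchesOtherCase` is false: the identity check is cheap precisely because it is case-sensitive, which is the trade-off under discussion.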
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Thu, May 7, 2015 at 8:26 AM, Boris Zbarsky bzbar...@mit.edu wrote:
> Note that at least for textArea this matters, in that you could suddenly have selectors that are not meant to match it start matching it.

That's not part of SVG1.1 or SVG2; it's not supported on most (all?) major browsers anyway, so that's not a big deal.

>> You mean case-sensitively in the implementation? Type selectors are case-insensitive for html elements.
>
>   <style>
>     * { color: green }
>     foo { color: red }
>   </style>
>   <script>
>     var e = document.createElementNS("http://www.w3.org/1999/xhtml", "Foo");
>     e.textContent = "Is this red?";
>     document.documentElement.appendChild(e);
>   </script>
>
> If the matching were actually case-insensitive, the text would be red. It's not.
>
>> The WebKit implementation represents each type selector with two strings, one lowered and one with original case, when the type selector is not lower-cased in the source. What does Gecko do?
>
> Exactly the same thing.
>
>> To always match type selectors case-insensitively in html documents.
>
> I don't think that's acceptable to at least Gecko from a performance standpoint. Not if it means we have to end up with red text in my testcase above.

I believe the SVGWG is fine with a parsing-based approach, exactly like what HTML does. An SVG element created with mixed casing, or imported from an XML document, might not match a lowercase tagname selector, but SVG written in HTML will.

~TJ
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On 5/7/15 5:07 PM, Tab Atkins Jr. wrote:
> I believe the SVGWG is fine with a parsing-based approach, exactly like what HTML does. An SVG element created with mixed casing, or imported from an XML document, might not match a lowercase tagname selector, but SVG written in HTML will.

Hmm. The main problem here is for scripts that create SVG elements in an HTML document, since those have to use createElementNS and pass the mixed-case names (e.g. for foreignObject).

-Boris
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Thu, May 7, 2015 at 2:09 PM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 5/7/15 5:07 PM, Tab Atkins Jr. wrote:
>> I believe the SVGWG is fine with a parsing-based approach, exactly like what HTML does. An SVG element created with mixed casing, or imported from an XML document, might not match a lowercase tagname selector, but SVG written in HTML will.
>
> Hmm. The main problem here is for scripts that create SVG elements in an HTML document, since those have to use createElementNS and pass the mixed-case names (e.g. for foreignObject).

One idea could be to make createElement() return SVG elements for svg tag names embedded in HTML. Neither spec is ever going to have a conflicting tag name.

- E
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Thu, May 7, 2015 at 4:16 AM, Rune Lillesveen r...@opera.com wrote:
> Currently, the HTML spec says that type selectors match case-sensitively for non-html elements like svg elements in html documents [1]. So according to the spec, and the implementation in Gecko, the rules below match according to the prose:
>
>   <!DOCTYPE html>
>   <style>
>     foreignObject { color: red }
>     foreignobject { color: green }
>   </style>
>   <foreignOBJECT>Matches both rules. Case-insensitive match. Is green.</foreignOBJECT>
>   <svg>
>     <FOREIGNobject>Matches the first rule because the parser normalizes to the camelCased form as per spec. Is red.</FOREIGNobject>
>   </svg>
>
> This adds an implementation complexity to type selector matching. What's the rationale for matching the selector case-sensitively in the svg case?
>
> I'm sorry if I've missed previous discussions. I did see a few really long mails and threads about svg in html on this list, but wasn't able to find the resolution for this in reasonable time.
>
> Should we change the spec in this regard?
>
> I did have a correct implementation for Blink, but was asked in the review [2] to match insensitively for non-html elements in html documents due to the complexity.
>
> [1] https://html.spec.whatwg.org/multipage/scripting.html#selectors
> [2] https://codereview.chromium.org/1099963003

The SVGWG just resolved to allow SVG elements and attributes to be matched case-insensitively, like HTML. We also resolved to try and find a method to make parsing SVG case-insensitive in general. For the latter, we're specifically interested in doing so when SVG is embedded in HTML, and also having some way to indicate that a standalone SVG should parse using similar rules (either a modification of the HTML parser, or Anne's XML5 work, or something similar). When written in the XML syntax, it's fine for it to continue being case-sensitive as it is today.

I presume this means that we'd start returning lowercased tagnames for elements in SVG-in-HTML. Do we think that's a compat risk? If so, perhaps we can do something smaller, like having the localName/nodeName/tagName attributes be getters that camelcase those tagnames, but internally they're lowercased by the parser. (It looks like localName and tagName already differ in casing in HTML, so either we store two strings, or we do the casing transformation in a getter already.)

~TJ
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On 5/7/15 10:53 AM, Rune Lillesveen wrote:
> So there's no author-rationale here?

Well... that depends.

The way things used to work before SVG-in-HTML existed is that selector matching was case-sensitive in SVG and appeared case-insensitive in HTML. I say "appeared" because it wasn't, actually: if you created an HTML element with local name "Foo" (e.g. via createElementNS or importNode from an XHTML document) then the selector "foo" would not match it in at least some implementations.

When we were trying to design how this should work in a document with mixed HTML and SVG, we aimed to preserve the existing behaviors of HTML and SVG while also not forcing UAs into doing any actual case-insensitive matching at selector-matching time, which allows much faster compares on pre-interned strings.

The "depends" part is whether you consider "aimed to preserve the existing behaviors of HTML and SVG" as author rationale or not.

Note that at least for textArea this matters, in that you could suddenly have selectors that are not meant to match it start matching it.

> You mean case-sensitively in the implementation? Type selectors are case-insensitive for html elements.

  <style>
    * { color: green }
    foo { color: red }
  </style>
  <script>
    var e = document.createElementNS("http://www.w3.org/1999/xhtml", "Foo");
    e.textContent = "Is this red?";
    document.documentElement.appendChild(e);
  </script>

If the matching were actually case-insensitive, the text would be red. It's not.

> The WebKit implementation represents each type selector with two strings, one lowered and one with original case, when the type selector is not lower-cased in the source. What does Gecko do?

Exactly the same thing.

> To always match type selectors case-insensitively in html documents.

I don't think that's acceptable to at least Gecko from a performance standpoint. Not if it means we have to end up with red text in my testcase above.

-Boris
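
One way to read the two-string scheme described above is sketched below. It is an assumption-laden illustration, not the actual WebKit or Gecko code: `makeTypeSelector` and `typeSelectorMatches` are hypothetical names, elements are plain objects rather than DOM nodes, and a real engine would compare interned pointers instead of JavaScript strings.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const SVG_NS = "http://www.w3.org/2000/svg";

// Keep both spellings of the selector, computed once at stylesheet parse time.
function makeTypeSelector(sourceName) {
  return {
    original: sourceName,
    lowered: sourceName.replace(/[A-Z]/g, (ch) => ch.toLowerCase()),
  };
}

// element is a plain stand-in object: { namespaceURI, localName }.
// HTML elements in HTML documents are compared against the lowered spelling;
// everything else is compared against the spelling used in the source.
// Either way, the comparison itself is an exact, case-sensitive equality.
function typeSelectorMatches(selector, element, inHtmlDocument) {
  const useLowered = inHtmlDocument && element.namespaceURI === HTML_NS;
  const expected = useLowered ? selector.lowered : selector.original;
  return expected === element.localName;
}
```

With this sketch, the selector foo does not match an HTML-namespace element whose local name is Foo (the text in the testcase stays green), and the selector foreignobject does not match an SVG element whose local name is foreignObject.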
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On Thu, May 7, 2015 at 3:59 PM, Boris Zbarsky bzbar...@mit.edu wrote:
> On 5/7/15 7:16 AM, Rune Lillesveen wrote:
>> This adds an implementation complexity to type selector matching. What's the rationale for matching the selector case-sensitively in the svg case?
>
> The idea is to allow the selector match to be done case-sensitively in all cases so it can be done as equality comparison on interned string representations instead of needing expensive case-insensitive matching on hot paths in the style system.

So there's no author-rationale here?

You mean case-sensitively in the implementation? Type selectors are case-insensitive for html elements.

The WebKit implementation represents each type selector with two strings, one lowered and one with original case, when the type selector is not lower-cased in the source. What does Gecko do?

>> Should we change the spec in this regard?
>
> To what, exactly? What is your proposed behavior here?

To always match type selectors case-insensitively in html documents.

-- Rune Lillesveen
Re: [whatwg] Case-sensitivity of CSS type selectors in HTML
On 2015-05-07 13:16, Rune Lillesveen wrote:
> Currently, the HTML spec says that type selectors match case-sensitively for non-html elements like svg elements in html documents ...
>
> This adds an implementation complexity to type selector matching. What's the rationale for matching the selector case-sensitively in the svg case?

Isn't SVG based on XML? Which means SVG is probably case sensitive!

I found some info here, not sure if that'll help clarify anything.
http://www.w3.org/TR/SVG/styling.html#CaseSensitivity

PS! This is why I always make it a rule to type lowercase for anything that will possibly be machine read (file names, properties/attributes); it also compresses better (lowercase letters are more frequent).

-- Roger Hågensen, Freelancer, http://skuldwyrm.no/