Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-08 Thread Roger Hågensen

On 2015-05-07 15:59, Boris Zbarsky wrote:

On 5/7/15 7:16 AM, Rune Lillesveen wrote:

This adds an implementation complexity to type selector matching.
What's the rationale for matching the selector case-sensitively in the
svg case?


The idea is to allow the selector match to be done case-sensitively in
all cases so it can be done as equality comparison on interned string
representations instead of needing expensive case-insensitive matching
on hot paths in the style system.


(Note! This is veering a little off topic.)


One way to cheapen the computational cost is to have partial case 
insensitive matching.


If (character >= $0041) And (character <= $005A)
character = (character | $0020)
EndIf


Basically, if the character is 'A' to 'Z' then the 6th bit ($0020) is
set, turning 'A' to 'Z' into 'a' to 'z'. This works for 7-bit ASCII,
Latin-1, and Unicode encodings such as UTF-8. No need for table
lookups; it can all be done in the CPU registers.


Other commonly used characters like '0' to '9' or '_' or similar have no
lower/upper case. And more language-specific characters are not ideal for
such use anyway (people of mixed nationalities would have issues typing
those characters).


So there is no need to do full case-insensitive matching. Just do a
partial lowercase normalization of 'A' to 'Z' and then do a
simple binary comparison.
In optimized C or ASM this should perform really well compared to
calling a Unicode function to normalize and lowercase the text.
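
A minimal JavaScript sketch of this partial lowercasing follows. The function names asciiLowerCase and asciiCaseInsensitiveEquals are made up for illustration, and the code works on UTF-16 code units rather than raw bytes; it is not taken from any browser engine.

```javascript
// Lowercase only ASCII 'A'..'Z' by setting bit 0x20. Every other
// code unit is passed through untouched, so no lookup table is needed.
function asciiLowerCase(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    let unit = text.charCodeAt(i);
    if (unit >= 0x41 && unit <= 0x5a) {
      unit |= 0x20;
    }
    out += String.fromCharCode(unit);
  }
  return out;
}

// Partial case-insensitive match: normalize both sides, then compare exactly.
function asciiCaseInsensitiveEquals(a, b) {
  return asciiLowerCase(a) === asciiLowerCase(b);
}

console.log(asciiLowerCase("foreignObject")); // foreignobject
console.log(asciiCaseInsensitiveEquals("RECT", "rect")); // true
// U+00C5 and U+00E5 (the letter mentioned above) are not folded together.
console.log(asciiCaseInsensitiveEquals("\u00C5", "\u00E5")); // false
```

Note that this still reads every character of both strings on each comparison.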


This would mean restricting to 'A' to 'Z', 'a' to 'z', '0' to '9', and
'_' but all tags/elements/properties/whatever that I can recall seeing
only ever use those characters.
I certainly won't complain if I can't use the letter 'å' in the code,
then again I never use weird characters in code in the first place.


How does it look in the wild? If only A to Z is used in xx% of cases 
then restricting to that character range would allow very quick 
lowercasing and thus allow use of fast binary matching.



--
Roger Hågensen, Freelancer, http://skuldwyrm.no/




Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-08 Thread Anne van Kesteren
On Fri, May 8, 2015 at 7:09 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 I don't think ascii case-insensitivity is a mistake here.

(ASCII) case-insensitivity is a mistake. JavaScript doesn't have it
and wherever we do have it it's sticking out as a sore thumb with a
dozen subtleties attached.


-- 
https://annevankesteren.nl/


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-08 Thread Tab Atkins Jr.
On Thu, May 7, 2015 at 10:42 PM, Anne van Kesteren ann...@annevk.nl wrote:
 On Thu, May 7, 2015 at 11:23 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 Well, beyond the existing conflicts of style, script, and a.
 (font too, but that's dropped from SVG2, so who cares.)

 textArea is out too?

textArea was never in - it was an SVG Tiny element, and thus never
appeared in any web-compatible version of SVG.

 With respect to case-insensitive matching, I don't really understand
 why we simultaneously want to make these rather trivial changes to SVG
 while at the same time move to constructors for creating elements,
 which is even stricter than createElementNS() (literal has to be
 correctly spelled or you get an exception). If we want to move to a
 world where people write

   new SVGRectElement

 why would we even bother making

   document.createElement("RECT")

 work?

It's not about that, it's about making, say, lineargradient {...}
work, because you can case selectors any which way for HTML, and it's
confusing that you can't for SVG.

(And I'm still not convinced on constructors, anyway - they're so
verbose! And we would have to fix so many of them!)

 Same for CSS, the majority of CSS already uses type selectors where
 the case matches up with the HTML. Is complicating it really worth it?
 We should bring the languages closer, but we shouldn't put the
 mistakes we made with HTML into SVG.

I don't think ascii case-insensitivity is a mistake here.

~TJ


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-08 Thread Tab Atkins Jr.
On Fri, May 8, 2015 at 10:13 AM, Anne van Kesteren ann...@annevk.nl wrote:
 On Fri, May 8, 2015 at 7:09 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 I don't think ascii case-insensitivity is a mistake here.

 (ASCII) case-insensitivity is a mistake. JavaScript doesn't have it
 and wherever we do have it it's sticking out as a sore thumb with a
 dozen subtleties attached.

It's all over CSS (every single language-defined keyword is CI), and
all over a bunch of HTML. Whether you think it's a mistake or not,
it's *common*, and having some tags use it while others don't is
confusing.

~TJ


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-08 Thread Boris Zbarsky

On 5/8/15 11:56 AM, Roger Hågensen wrote:

One way to cheapen the computational cost is to have partial case
insensitive matching.


If you're walking the string at all, you have already lost in terms of 
performance for this stuff.



If (character >= $0041) And (character <= $005A)
 character = (character | $0020)
EndIf


Yes, this is basically the algorithm that would be considered expensive 
case-insensitive matching in this context.



In optimized C or ASM this should perform really well compared to
calling a Unicode function to normalize and lower case the text.


No one is even remotely considering anything Unicode here.

-Boris


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-08 Thread Tab Atkins Jr.
On Fri, May 8, 2015 at 9:09 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/8/15 11:56 AM, Roger Hågensen wrote:
 One way to cheapen the computational cost is to have partial case
 insensitive matching.

 If you're walking the string at all, you have already lost in terms of
 performance for this stuff.

Looks like the patch that recently landed *did* exactly this - it
would do a string-walk when matching tags/attributes against SVG
elements. It was indeed slower, but it was just for SVG, and wasn't a
huge hit.

But then Elliott realized that we keep around a second pointer for the
uppercased tagname anyway (some of the tagname accessors return
uppercase for HTML), so he just rewrote it to store the proper casing
there instead, for SVG elements. ^_^  Now it's no extra memory or
runtime cost.

~TJ


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Tab Atkins Jr.
On Thu, May 7, 2015 at 2:11 PM, Elliott Sprehn espr...@chromium.org wrote:
 On Thu, May 7, 2015 at 2:09 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/7/15 5:07 PM, Tab Atkins Jr. wrote:
 I believe the SVGWG is fine with a parsing-based approach, exactly
 like what HTML does.  An SVG element created with mixed casing, or
 imported from an XML document, might not match a lowercase tagname
 selector, but SVG written in HTML will.

 Hmm.  The main problem here is for scripts that create SVG elements in an
 HTML document, since those have to use createElementNS and pass the
 mixed-case names (e.g. for foreignObject).

 One idea could be to make createElement() return SVG elements for svg tag
 names embedded in HTML.

 Neither spec is ever going to have a conflicting tag name.

Well, beyond the existing conflicts of style, script, and a.
(font too, but that's dropped from SVG2, so who cares.)  But the
SVGWG already resolved to allow HTML elements inside of SVG, so they
don't have to, say, add an svg:video element.  (I'm on the hook to
define the SVG layout model in terms of the CSS box model, which is
trivial, but I haven't had the time to do it yet.)

But yeah, the SVGWG resolved to never add new conflicts, and I assume
HTML is similarly okay with not adding tagnames that SVG has already
claimed.

~TJ


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Rune Lillesveen
On Thu, May 7, 2015 at 2:13 PM, Roger Hågensen rh_wha...@skuldwyrm.no wrote:
 On 2015-05-07 13:16, Rune Lillesveen wrote:

 Currently, the HTML spec says that type selectors match case
 sensitively for non-html elements like svg elements in html documents
 ...

 This adds an implementation complexity to type selector matching.
 What's the rationale for matching the selector case-sensitively in the
 svg case?

 Isn't SVG based on XML? Which means SVG is probably case sensitive!

My questions are about SVG elements parsed by the HTML parser to
create an HTML document. SVG tag names are parsed case-insensitively in
HTML documents, yet normalized to their camelCased form.
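
A rough sketch of that normalization, assuming a lookup table keyed by the lowercased name. Only a handful of entries from the HTML parser's adjustment table are shown, and adjustSvgTagName is a hypothetical name used for illustration.

```javascript
// Subset of the parser's SVG tag-name adjustments (lowercase -> camelCase).
const SVG_TAG_ADJUSTMENTS = new Map([
  ["clippath", "clipPath"],
  ["foreignobject", "foreignObject"],
  ["lineargradient", "linearGradient"],
  ["radialgradient", "radialGradient"],
  ["textpath", "textPath"],
]);

// The tag name is first ASCII-lowercased, so any source casing is accepted,
// and then mapped to its camelCased spelling if the table has an entry.
function adjustSvgTagName(rawTagName) {
  const lowered = rawTagName.replace(/[A-Z]/g, (c) => c.toLowerCase());
  return SVG_TAG_ADJUSTMENTS.has(lowered)
    ? SVG_TAG_ADJUSTMENTS.get(lowered)
    : lowered;
}

console.log(adjustSvgTagName("FOREIGNobject")); // foreignObject
console.log(adjustSvgTagName("RECT")); // rect
```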

-- 
Rune Lillesveen


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Anne van Kesteren
On Thu, May 7, 2015 at 11:23 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 Well, beyond the existing conflicts of style, script, and a.
 (font too, but that's dropped from SVG2, so who cares.)

textArea is out too?

With respect to case-insensitive matching, I don't really understand
why we simultaneously want to make these rather trivial changes to SVG
while at the same time move to constructors for creating elements,
which is even stricter than createElementNS() (literal has to be
correctly spelled or you get an exception). If we want to move to a
world where people write

  new SVGRectElement

why would we even bother making

  document.createElement("RECT")

work?

Same for CSS, the majority of CSS already uses type selectors where
the case matches up with the HTML. Is complicating it really worth it?
We should bring the languages closer, but we shouldn't put the
mistakes we made with HTML into SVG.


-- 
https://annevankesteren.nl/


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Boris Zbarsky

On 5/7/15 7:16 AM, Rune Lillesveen wrote:

This adds an implementation complexity to type selector matching.
What's the rationale for matching the selector case-sensitively in the
svg case?


The idea is to allow the selector match to be done case-sensitively in 
all cases so it can be done as equality comparison on interned string 
representations instead of needing expensive case-insensitive matching 
on hot paths in the style system.
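
To illustrate what comparing interned strings means, here is a rough JavaScript sketch. The atomTable and intern names are hypothetical, and real engines use their own atom types; the point is only that once names are interned, a match is a single identity check.

```javascript
// Hypothetical atom table: each distinct string maps to exactly one object.
const atomTable = new Map();

function intern(text) {
  let atom = atomTable.get(text);
  if (atom === undefined) {
    atom = Object.freeze({ text });
    atomTable.set(text, atom);
  }
  return atom;
}

// Interning happens once, at parse time, not while matching.
const selectorName = intern("foreignObject");
const elementName = intern("foreignObject");
const lowercasedName = intern("foreignobject");

// On the hot path a match is one identity comparison; no characters are read.
console.log(selectorName === elementName); // true
console.log(selectorName === lowercasedName); // false
```

A case-insensitive match cannot use this identity check directly, because two spellings of the same name intern to different atoms.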



Should we change the spec in this regard?


To what, exactly?  What is your proposed behavior here?

-Boris


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Tab Atkins Jr.
On Thu, May 7, 2015 at 8:26 AM, Boris Zbarsky bzbar...@mit.edu wrote:
 Note that at least for textArea this matters, in that you could suddenly
 have selectors that are not meant to match it start matching it.

That's not part of SVG1.1 or SVG2; it's not supported on most (all?)
major browsers anyway, so that's not a big deal.

 You mean case-sensitively in the implementation? Type selectors are
 case-insensitive for html elements.


 <style>
   * { color: green }
   foo { color: red }
 </style>
 <script>
   var e = document.createElementNS("http://www.w3.org/1999/xhtml", "Foo");
   e.textContent = "Is this red?";
   document.documentElement.appendChild(e);
 </script>

 If the matching were actually case-insensitive, the text would be red. It's
 not.

 The WebKit implementation represents each type selector with two
 strings, one lowered and one with original case, when the type
 selector is not lower-cased in the source. What does Gecko do?


 Exactly the same thing.

 To always match type selectors case-insensitively in html documents.


 I don't think that's acceptable to at least Gecko from a performance
 standpoint.  Not if it means we have to end up with red text in my testcase
 above.

I believe the SVGWG is fine with a parsing-based approach, exactly
like what HTML does.  An SVG element created with mixed casing, or
imported from an XML document, might not match a lowercase tagname
selector, but SVG written in HTML will.

~TJ


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Boris Zbarsky

On 5/7/15 5:07 PM, Tab Atkins Jr. wrote:

I believe the SVGWG is fine with a parsing-based approach, exactly
like what HTML does.  An SVG element created with mixed casing, or
imported from an XML document, might not match a lowercase tagname
selector, but SVG written in HTML will.


Hmm.  The main problem here is for scripts that create SVG elements in 
an HTML document, since those have to use createElementNS and pass the 
mixed-case names (e.g. for foreignObject).


-Boris



Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Elliott Sprehn
On Thu, May 7, 2015 at 2:09 PM, Boris Zbarsky bzbar...@mit.edu wrote:

 On 5/7/15 5:07 PM, Tab Atkins Jr. wrote:

 I believe the SVGWG is fine with a parsing-based approach, exactly
 like what HTML does.  An SVG element created with mixed casing, or
 imported from an XML document, might not match a lowercase tagname
 selector, but SVG written in HTML will.


 Hmm.  The main problem here is for scripts that create SVG elements in an
 HTML document, since those have to use createElementNS and pass the
 mixed-case names (e.g. for foreignObject).


One idea could be to make createElement() return SVG elements for svg tag
names embedded in HTML.

Neither spec is ever going to have a conflicting tag name.

- E


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Tab Atkins Jr.
On Thu, May 7, 2015 at 4:16 AM, Rune Lillesveen r...@opera.com wrote:
 Currently, the HTML spec says that type selectors match case
 sensitively for non-html elements like svg elements in html documents
 [1]. So according to the spec, and the implementation in Gecko, the
 rules below match according to the prose:

 <!DOCTYPE html>
 <style>
 foreignObject { color: red }
 foreignobject { color: green }
 </style>
 <foreignOBJECT>Matches both rules. Case-insensitive match. Is
 green.</foreignOBJECT>
 <svg>
 <FOREIGNobject>Matches the first rule because the parser
 normalizes to the camelCased form as per spec. Is red.</FOREIGNobject>
 </svg>

 This adds an implementation complexity to type selector matching.
 What's the rationale for matching the selector case-sensitively in the
 svg case?

 I'm sorry if I've missed previous discussions. I did see a few really
 long mails and threads about svg in html on this list, but wasn't
 able to find the resolution for this in reasonable time.

 Should we change the spec in this regard?

 I did have a correct implementation for Blink, but was asked in the
 review [2] to match insensitively for non-html elements in html
 documents due to the complexity.


 [1] https://html.spec.whatwg.org/multipage/scripting.html#selectors
 [2] https://codereview.chromium.org/1099963003

The SVGWG just resolved to allow SVG elements and attributes to be
matched case-insensitively, like HTML.  We also resolved to try and
find a method to make parsing SVG case-insensitive in general.

For the latter, we're specifically interested in doing so when SVG is
embedded in HTML, and also having some way to indicate that a
standalone SVG should parse using similar rules (either a modification
of the HTML parser, or Anne's XML5 work, or something similar).  When
written in the XML syntax, it's fine for it to continue being
case-sensitive as it is today.

I presume this means that we'd start returning lowercased tagnames for
elements in SVG-in-HTML.  Do we think that's a compat risk?  If so,
perhaps we can do something smaller, like having the
localName/nodeName/tagname attributes be getters that camelcase those
tagnames, but internally they're lowercased by the parser.  (It looks
like localName and tagname already differ in casing in HTML, so either
we store two strings, or we do the casing transformation in a getter
already.)

~TJ


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Boris Zbarsky

On 5/7/15 10:53 AM, Rune Lillesveen wrote:

So there's no author-rationale here?


Well... that depends.

The way things used to work before SVG-in-HTML existed is that selector
matching was case-sensitive in SVG and appeared case-insensitive in
HTML.  I say "appeared" because it wasn't, actually: if you created an
HTML element with local name "Foo" (e.g. via createElementNS or
importNode from an XHTML document) then the selector "foo" would not
match it in at least some implementations.


When we were trying to design how this should work in a document with 
mixed HTML and SVG, we aimed to preserve the existing behaviors of HTML 
and SVG while also not forcing UAs into doing any actual 
case-insensitive matching at selector-matching time, which allows much 
faster compares on pre-interned strings.


The "depends" part is whether you consider "aimed to preserve the
existing behaviors of HTML and SVG" as author rationale or not.


Note that at least for textArea this matters, in that you could 
suddenly have selectors that are not meant to match it start matching it.





You mean case-sensitively in the implementation? Type selectors are
case-insensitive for html elements.


<style>
  * { color: green }
  foo { color: red }
</style>
<script>
  var e = document.createElementNS("http://www.w3.org/1999/xhtml", "Foo");
  e.textContent = "Is this red?";
  document.documentElement.appendChild(e);
</script>

If the matching were actually case-insensitive, the text would be red. 
It's not.



The WebKit implementation represents each type selector with two
strings, one lowered and one with original case, when the type
selector is not lower-cased in the source. What does Gecko do?


Exactly the same thing.
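
The two-string approach can be sketched roughly as follows. This is an illustrative model only: plain objects stand in for DOM elements, makeTypeSelector and matchesType are hypothetical names, and a real engine would compare interned atoms rather than JavaScript strings.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const SVG_NS = "http://www.w3.org/2000/svg";

// Hypothetical selector record: the name as written plus an
// ASCII-lowercased copy, both computed once when the stylesheet is parsed.
function makeTypeSelector(name) {
  const lowered = name.replace(/[A-Z]/g, (c) => c.toLowerCase());
  return { original: name, lowered };
}

// HTML-namespace elements in an HTML document are compared against the
// lowered copy; everything else is compared against the original spelling.
// Either way the comparison itself is exact.
function matchesType(selector, element, inHtmlDocument) {
  const useLowered = inHtmlDocument && element.namespaceURI === HTML_NS;
  const expected = useLowered ? selector.lowered : selector.original;
  return element.localName === expected;
}

// Stand-ins for a parser-created element, a createElementNS-created
// element with a mixed-case local name, and a parsed SVG element.
const parsedHtml = { namespaceURI: HTML_NS, localName: "foo" };
const scriptedHtml = { namespaceURI: HTML_NS, localName: "Foo" };
const parsedSvg = { namespaceURI: SVG_NS, localName: "foreignObject" };

console.log(matchesType(makeTypeSelector("FOO"), parsedHtml, true)); // true
console.log(matchesType(makeTypeSelector("foo"), scriptedHtml, true)); // false
console.log(matchesType(makeTypeSelector("foreignobject"), parsedSvg, true)); // false
console.log(matchesType(makeTypeSelector("foreignObject"), parsedSvg, true)); // true
```

In this model the "Foo" element from the testcase does not match the foo selector, which is why its text does not turn red.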


To always match type selectors case-insensitively in html documents.


I don't think that's acceptable to at least Gecko from a performance 
standpoint.  Not if it means we have to end up with red text in my 
testcase above.


-Boris



Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Rune Lillesveen
On Thu, May 7, 2015 at 3:59 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 5/7/15 7:16 AM, Rune Lillesveen wrote:

 This adds an implementation complexity to type selector matching.
 What's the rationale for matching the selector case-sensitively in the
 svg case?

 The idea is to allow the selector match to be done case-sensitively in all
 cases so it can be done as equality comparison on interned string
 representations instead of needing expensive case-insensitive matching on
 hot paths in the style system.

So there's no author-rationale here?

You mean case-sensitively in the implementation? Type selectors are
case-insensitive for html elements.

The WebKit implementation represents each type selector with two
strings, one lowered and one with original case, when the type
selector is not lower-cased in the source. What does Gecko do?

 Should we change the spec in this regard?

 To what, exactly?  What is your proposed behavior here?

To always match type selectors case-insensitively in html documents.

-- 
Rune Lillesveen


Re: [whatwg] Case-sensitivity of CSS type selectors in HTML

2015-05-07 Thread Roger Hågensen

On 2015-05-07 13:16, Rune Lillesveen wrote:

Currently, the HTML spec says that type selectors match case
sensitively for non-html elements like svg elements in html documents
...

This adds an implementation complexity to type selector matching.
What's the rationale for matching the selector case-sensitively in the
svg case?


Isn't SVG based on XML? Which means SVG is probably case sensitive!

I found some info here, not sure if that'll help clarify anything.
http://www.w3.org/TR/SVG/styling.html#CaseSensitivity



PS!
This is why I always make it a rule to type lowercase for anything that 
will possibly be machine read (file names, properties/attributes), it 
also compresses better (lower case letters are more frequent).




--
Roger Hågensen, Freelancer, http://skuldwyrm.no/