Hi,
I have reviewed all of the recent discussion and spent some time
analysing JQuery, and I have compiled this rough specification detailing
how I think find, findAll and matches can work. The following details
the rationale for each of the design decisions made.
The new methods should be available on documents, document fragments and
elements, just like querySelector. The easiest approach is to put these
on the same NodeSelector interface as the existing methods.
This email is long and detailed. For those of you who just want the
conclusion, skip to the proposed IDL and summary.
---
*Table of Contents*:
1. Methods and Return Types
2. Document Methods
3. Document Fragment Methods
4. Element Methods
5. Match Testing
6. Proposed IDL
7. Summary of Proposed Spec Changes
8. Proposed Rules for Prepending :scope
---
1. *Methods and Return Types*
Throughout the discussion, there seems to be the assumption that we
should have both find() and findAll() methods, which return a single
matching element and a collection of all matches, respectively. One
issue to decide, based on experience with and usage of
querySelector() and querySelectorAll(), is whether it is worth introducing
the same distinction for the new methods, or whether it would be better to
go with a single method that returns a collection. That is, is it really
useful in practice to have a method that only returns the first match?
For the purposes of the rest of this email, however, I'll stick with the
assumption that we'll introduce both find() and findAll().
There is also the open issue of what type of collection should be
returned by findAll(), whether it be an Array or special kind of
NodeList with an Array-like interface. I have not addressed this issue
in this proposal.
---
2. *Document Methods*
The document.findAll() method is supposed to be designed to more closely
align with the behaviour of JQuery's global $() method, which is defined
as an alias for the jQuery() method:
jQuery( selector [, context] )
(Note: The other overloaded JQuery methods are not relevant.)
http://api.jquery.com/jQuery/
All script examples relate to the following sample document:
<!DOCTYPE html>
<body>
<p id="1"></p>
<div>
<p id="2"></p>
</div>
</body>
JQuery results:
(Note: All results are returned as instances of jQuery objects, which
are indexable like an array)
$("html") // returns [html]
$(">body") // returns []
$("+div") // returns []
$(">body", document.documentElement) // returns [body]
$(">p", $("body")) // returns [p#1]
$("p", $("div")) // returns [p#2]
$("+div", $("#1")) // returns [div]
$(">p, >div", $("body")) // returns [p#1, div]
$("body", []) // returns []
$("body", null) // returns [body]
$("body", undefined) // returns [body]
$("") // returns []
document.findAll() should support the same parameters as $() and return
an equivalent result collection in the majority of cases. Likewise,
document.find() should work the same way, but return only the first
match. Where findAll() returns an empty collection, find() would return
null.
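As a rough illustration of that relationship, find() can be sketched in terms of findAll(). The name findFirst and the idea of passing the findAll implementation in as a function are assumptions made only to keep the sketch runnable outside a browser; they are not part of the proposal.

```javascript
// Hypothetical helper: derives find()-style behaviour from any
// findAll()-style function that returns an array-like collection.
function findFirst(findAll, selectors, ref) {
  const matches = findAll(selectors, ref);
  // An empty collection from findAll() corresponds to null from find().
  return matches.length > 0 ? matches[0] : null;
}

// Stand-in for document.findAll(), returning labels instead of real nodes.
const stubFindAll = (selectors) => (selectors === "p" ? ["p#1", "p#2"] : []);

const firstP = findFirst(stubFindAll, "p");     // first match in the collection
const noMatch = findFirst(stubFindAll, "span"); // null, since nothing matched
```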
From the above, it's clear that in the new API, there are cases where
:scope should be implied and cases where it should not. In cases where
:scope is implied and there's no explicit combinator, a descendant
combinator needs to be implied too.
$("html") // returns [html]
document.findAll("html")
:scope cannot be implied here because html is the root element, so it
wouldn't match if the selector was interpreted as ":scope html".
$(">body") // returns []
document.findAll(">body")
$("+div") // returns []
document.findAll("+div")
:scope needs to be implied to make these syntactically valid selectors,
making them equivalent to ":scope>body" and ":scope+div", respectively.
But :scope cannot match the root element here because otherwise the
first would return the body element as a match.
$(">body", document.documentElement) // returns [body]
document.findAll(">body", document.documentElement)
Like the previous case, :scope needs to be implied. This time, however,
it needs to match the specified element node. It also means that the
second parameter must be able to accept an Element node.
$(">p", $("body")) // returns [p#1]
document.findAll(">p", $("body"))
document.findAll(">p", document.findAll("body"))
This is the same as the previous case, except a collection of elements
is passed instead of a single Element node. This is equivalent to
":scope>p", where :scope matches the elements in the collection.
It should work regardless of the type of collection. In the first case,
$() returns a numerically indexed JQuery object; in the second, findAll()
returns a yet-to-be-defined Array-like structure.
$("p", $("div")) // returns [p#2]
document.findAll("p", document.findAll("div"))
In this case, despite not beginning with a combinator, the presence of a
reference node indicates that :scope with a descendant combinator should
be implied, equivalent to ":scope p".
$("+div", $("#1")) // returns [div]
document.findAll("+div", document.find("#1"))
:scope is implied, equivalent to ":scope+div", where :scope matches the
specified element.
$(">p, >div", $("body")) // returns [p#1, div]
document.findAll(">p, >div", $("body"))
:scope needs to be implied before each individual selector in the list,
equivalent to ":scope>p, :scope>div".
$("body", []) // returns []
document.findAll("body", [])
:scope should be implied, but since the collection is empty, it matches
nothing.
$("body", null) // returns [body]
document.findAll("body", null)
$("body", undefined) // returns [body]
document.findAll("body", undefined)
:scope should not be implied. But note that, according to the current
algorithm to determine contextual reference nodes in the spec, if :scope
were explicitly included, then it would not match anything.
$("") // returns []
document.findAll("")
Unlike JQuery, querySelectorAll() throws a SYNTAX_ERR exception in this
case. I'm not sure if it's better for findAll() to throw an exception,
or simply return an empty collection. But note that if we decide not to
throw for this, then the find() method would return null.
Based on these results, :scope should not be implied in all cases.
Rather, it should be implied under the following conditions:
1. When the given selector begins with a combinator other than the
descendant combinator (space), or
2. Whenever a reference element, or a collection of zero or more nodes,
is passed and there is no explicit :scope.
This is most easily achieved by supporting the refElement and refNodes
parameters in the same way as querySelector() and specifying the
conditions under which :scope is implied. However, the algorithm to
determine contextual reference nodes needs to be modified so that the
documentElement does not match :scope, when no other nodes are supplied.
We should also consider whether the definition of :scope in Selectors 4
should be changed, which currently states that it matches the same as
:root where no other reference elements are specified.
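The modified algorithm for determining contextual reference nodes could look roughly like the sketch below. The function name contextualReferenceNodes is hypothetical, nodes are duck-typed by their nodeType so the sketch runs without a DOM, and filtering a collection down to element nodes is an assumption about what the existing algorithm does. The Element branch anticipates the behaviour discussed under Element Methods.

```javascript
const ELEMENT_NODE = 1;

// Hypothetical helper: returns the nodes that :scope may match when
// find()/findAll() is called on contextNode with an optional ref argument.
function contextualReferenceNodes(contextNode, ref) {
  // Element methods: :scope always matches the element itself; ref is ignored.
  if (contextNode.nodeType === ELEMENT_NODE) return [contextNode];
  // Document or fragment with no ref (null or undefined): nothing matches
  // :scope, and in particular the root element does not.
  if (ref == null) return [];
  // A single reference element.
  if (ref.nodeType === ELEMENT_NODE) return [ref];
  // A collection (Array, NodeList, jQuery-like object): keep element nodes only.
  return Array.from(ref).filter((n) => n && n.nodeType === ELEMENT_NODE);
}
```

Note that an empty collection and a missing argument both yield an empty list here. Whether :scope is implied at all is a separate decision, which is why the "was anything passed" question has to be tracked independently of this result.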
---
3. *Document Fragment Methods*
It seems that JQuery does not fully support querying document fragments
and so I can't analyse it fully to make this API behave similarly.
var f = document.createDocumentFragment();
f.appendChild(document.body);
$(f).find("div"); // returns []
The closest I could get was by first running the .contents() method,
which just returns a collection of children, and searching that.
However, this is more directly comparable with the proposed NodeArray
interface, rather than with searching document fragments directly.
$(f).contents().find("p") // returns [p#1, p#2]
This is the same result as the querySelectorAll() method returns.
f.querySelectorAll("p") // returns [p#1, p#2]
I believe the sensible approach here would be to make the .find methods
on fragments behave the same way they do on document.
---
4. *Element Methods*
When invoked on an element, the contextual reference element (that is,
the one that matches :scope) is set to the element itself and :scope
should be explicitly included or implicitly prepended to each selector.
JQuery results:
$("div").find("p") // returns [p#2]
$("body").find(">p") // returns [p#1]
$("body").find("p", $("div")) // returns [p#1, p#2]
$("body").find(">p", $("div")) // returns [p#1]
$("p").find("+div") // returns [div]
$("body").find(">p, >div") // returns [p#1, div]
$("body").find("") // returns []
Element.findAll() should behave similarly to the jQuery.find() method
and return equivalent results in the majority of cases.
var body = document.body;
var p = document.find("#1");
var div = document.find("div");
//$("div").find("p") // returns [p#2]
div.findAll("p")
//$("body").find(">p") // returns [p#1]
body.findAll(">p")
These imply :scope and behave as expected, where :scope matches the
context node.
$("body").find("p", $("div")) // returns [p#1, p#2]
body.findAll("p", div)
$("body").find(">p", $("div")) // returns [p#1]
body.findAll(">p", div)
:scope is still implied, but unlike the document.findAll() method, any
additional parameters, including specified reference elements need to be
ignored because :scope should still match the context node.
$("p").find("+div") // returns [div]
p.find("+div")
Again, :scope is implied, but the result may be a sibling of the context
node, not just a descendant.
$("body").find(">p, >div") // returns [p#1, div]
body.findAll(">p, >div")
:scope is implied for each selector in the list.
$("body").find("") // returns []
body.findAll(""); // SYNTAX_ERR or return []?
body.find(""); // SYNTAX_ERR or return null?
The issue of whether to throw a SYNTAX_ERR or return [] (or null) also
applies to this case.
Finally, although not supported by JQuery, the reference combinator
needs to be considered:
label.find("/for/ input")
I think it makes the most sense for this case to match anywhere in the
whole document, and imply :scope which matches the given context node.
From the above, it's clear that in the new API, :scope should always be
implied when there is no explicit :scope and :scope should always match
the element on which the methods are invoked, regardless of any
additional parameters.
---
5. *Match Testing*
There's been debate concerning whether we should just rename
matchesSelector() to matches(), or introduce a new matches() method that
is distinct from matchesSelector().
In the general case,
elm.matches(":scope *", ref);
answers the question: is this element related to (e.g. descendant,
sibling, child or referenced-by) one or more given elements?
e.g.
input.matches(":scope /for/ *", label)
Returns true if the label references the input element. It's basically
the inverse of either:
document.find("/for/ input", label)
label.find("/for/ input")
The reason given for introducing a new distinct method is to imply
:scope, which matches a specified reference element, and which was
claimed to be useful for JQuery's proxybind() method.
I was, however, unable to confirm the existence of such a method as the
only Google results for it seemed to be in recent threads on
public-webapps; searching the JQuery forums and bug tracker returned no
results for "proxybind", and there were no occurrences found anywhere in
the JQuery source code on github. The closest alternative I found was
the .on() method, added in JQuery 1.7, which accepts an event type,
selector and handler.
<!DOCTYPE html>
<script src="http://code.jquery.com/jquery-1.7.1.js"></script>
<div>
<button>A</button>
<button>B</button>
<span><button>C</button></span>
</div>
<script>
function handler(evt) {
alert("Clicked " + evt.target.textContent);
}
$("div").on("click", "div>button", handler)
</script>
This example attaches the click event and listens only for clicks on the
children of the div element itself. So the handler is called for
buttons A and B, but not for C.
This one explicitly uses "div>button" in the selector because :scope is
not supported and it did not work with an implied-:scope-like
alternative ">button".
JQuery's .is() method is equivalent to Element.matches(), with the
interface defined as:
.is( selector )
JQuery examples:
var body = $("body");
var p = $("#1");
var div = $("div");
body.is("body"); // returns true
div.is("body div"); // returns true
But unlike the jQuery() method, there is no supported context parameter.
p.is(">p", body); // returns false (context parameter not supported)
div.is("+div", p); // returns false
Comparing that with matchesSelector() as currently defined:
var body = document.body;
var p = document.find("#1");
var div = document.find("div");
body.matchesSelector("body"); // returns true
div.matchesSelector("body div"); // returns true
p.matchesSelector(">p", body); // throws SYNTAX_ERR
p.matchesSelector(":scope>p", body); // returns true
div.matchesSelector("+div", p); // throws SYNTAX_ERR
div.matchesSelector(":scope+div", p); // returns true
While the :scope functionality doesn't yet exist natively in JQuery, it
is possible to emulate it:
var p = $("#1");
!!$("body").find(">p").filter(p).length
This returns true if p is a child of body, or false otherwise. Using
implied :scope, this could be handled simply by:
p.matches(">p", body);
We could certainly do that to handle this case better, but I'm not
convinced we need a new method distinct from the existing
matchesSelector() method, for three reasons:
1. The most common case without reference nodes is handled by the
methods as implemented today.
2. The shorter method name should be used for the most common case.
3. Implying :scope really only makes sense where explicit reference
elements are provided. Otherwise, :scope would only match the
context node itself and would result in false always being returned
for the common case, which is highly undesirable.
e.g.
elm.matches(".foo"); // Should not imply :scope
In this case, if we did have two distinct methods, and the new method
only implied :scope under certain conditions, then .matches() and
.matchesSelector() would be nearly identical. They would only differ
when reference nodes are provided, and then only with respect to the
implied or explicit :scope.
i.e.
Assume matches() implies :scope under certain conditions, and
matchesSelector() doesn't.
elm.matches(".foo") // No implied :scope, equivalent to
elm.matchesSelector(".foo");
elm.matches(":scope .foo", ref) // Explicit :scope, equivalent to
elm.matchesSelector(":scope .foo", ref);
elm.matches(".foo", ref) // implies ":scope .foo", not equivalent to
elm.matchesSelector(".foo", ref); // No implied :scope.
The existence of two nearly identical methods, which differ in only one
small case would likely be confusing for authors and not provide any
real benefit. (This is an issue we can't avoid with querySelector vs.
find, though.)
I therefore believe we should simply rename matchesSelector() to
matches() and introduce the desired implied-:scope functionality in a
way that supports the common case, as well as the reference node case.
Given that the implied :scope behaviour needs to be made available in
the .find() methods, it would be possible to make it available for
matches() too. So the most reasonable approach here is to imply :scope
according to the same rules described above for document.findAll() (i.e.
starts with a combinator or ref nodes were passed and no explicit :scope).
---
6. *Proposed IDL*
interface NodeSelector {
Element find(DOMString selectors, optional Element refElement);
Element find(DOMString selectors, sequence<Node>? refNodes);
??? findAll(DOMString selectors, optional Element refElement);
??? findAll(DOMString selectors, sequence<Node>? refNodes);
};
Document implements NodeSelector;
DocumentFragment implements NodeSelector;
Element implements NodeSelector;
This extends the same interface that the existing querySelector
methods use, which will make the methods available on elements,
documents and fragments.
Open Issues:
1. The return type for findAll is yet to be decided. It may be the
proposed NodeArray, a regular Array or something else.
2. These new methods for Element may be split out to a separate
interface that omits the refElement and refNodes parameters.
3. Do we need both find() and findAll(), or should we only have a
single new method that returns a collection?
Additionally, matchesSelector() will simply be renamed to matches().
---
7. *Summary of Proposed Spec Changes*
For Document and DocumentFragment, the refElement and refNodes
parameters are handled according to the existing algorithm to determine
contextual reference nodes currently in the specification. This
algorithm will be modified so that when the context node is document,
:scope does not match the root element.
For Element, the refElement and refNodes parameters are effectively
ignored and the algorithm to determine contextual reference nodes will
be modified to always return the element itself for these methods.
Open Issue: Should this change affect Element.querySelector() too, or
leave it as currently specified?
For all interfaces, these new find(), findAll() methods, and the renamed
matches() method must have automatic :scope prepending, subject to the
rules for prepending :scope, outlined below.
The findAll() method must return all matching elements from anywhere in
the document. The find() method must return the first matching element
in document order.
Open Issue: Should findAll("") and find("") throw SYNTAX_ERR or return
empty collection and null, respectively?
Note:
* Element.findAll(":matches(:scope, *)"); will match all elements in
the document, equivalent to document.querySelectorAll("*");
* Similarly, Element.findAll(":not(:scope)"); will match all elements
excluding the :scope element.
Notes for Implementers:
For Element methods, in cases where the selectors:
1. Only include descendant, sibling or child combinators, and
2. Do not include explicit :scope inside a functional pseudo-class
i.e. :not(:scope), :matches(:scope, *)
Implementations may optimise to only search descendants and/or
siblings, rather than the whole document.
Otherwise, it's possible for the selector to match any element in
the entire document, possibly including the :scope element itself.
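The two conditions in the notes above could be checked with something like the following sketch. The names canRestrictSearch and hasNestedScope are hypothetical, and the check works on the raw selector text rather than a parsed selector, so it is deliberately conservative: a slash or "(:scope" inside a quoted attribute value makes it fall back to the whole-document search.

```javascript
// Returns true if :scope appears inside any parenthesised argument,
// e.g. ":not(:scope)" or ":matches(:not(.a), :scope)".
function hasNestedScope(selectorList) {
  let depth = 0;
  for (let i = 0; i < selectorList.length; i++) {
    const ch = selectorList[i];
    if (ch === "(") depth++;
    else if (ch === ")") depth--;
    else if (depth > 0 && /^:scope(?![\w-])/i.test(selectorList.slice(i))) return true;
  }
  return false;
}

// Returns true when a search from an element could be limited to its
// descendants and/or siblings instead of the whole document.
function canRestrictSearch(selectorList) {
  // Condition 1: no reference combinator such as /for/.
  const hasReferenceCombinator = /\/[\w-]+\//.test(selectorList);
  // Condition 2: no explicit :scope inside a functional pseudo-class.
  return !hasReferenceCombinator && !hasNestedScope(selectorList);
}
```

For example, canRestrictSearch(">p, >div") is true, while canRestrictSearch("/for/ input") and canRestrictSearch(":not(:scope)") are both false.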
---
8. *Proposed Rules for Prepending :scope*
Given a selector list as input to the method, trim whitespace and then
for each complex selector, run the first step that applies:
(Note: if the selector list is "", then there are 0 complex selectors in
the list and the following doesn't run)
| 1. If the complex selector begins with any combinator other
| than the descendant combinator (>, +, ~ or /attr/), then
| prepend :scope immediately before the combinator.
|
| 2. Otherwise, if there are no contextual reference nodes, do not
| prepend :scope.
|
| 3. Otherwise, if any compound selector includes a functional
| pseudo-class that accepts a selector as its parameter, and which
| contains the :scope pseudo-class anywhere within it, then do not
| prepend :scope.
| e.g. ":matches(:scope)", ":not(:scope)"
|
| 4. Otherwise, if the complex selector includes :scope within any
| compound or simple selector, then do not prepend :scope.
| e.g. ":scope", "div:scope.foo", "div :scope p"
|
| 5. Otherwise, prepend :scope and a descendant combinator.
Finally, return the modified list of complex selectors.
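The rules above can be sketched as a string transformation. This is only an approximation under stated assumptions: prependScope and splitSelectorList are hypothetical names, the splitter is not a full Selectors parser, and hasRefNodes is taken to be true whenever a reference element or a collection (possibly empty) was passed, following condition 2 in section 2. Steps 3 and 4 are merged into a single check because both end in "do not prepend".

```javascript
// Split a selector list on top-level commas, ignoring commas inside
// (), [] and quoted strings.
function splitSelectorList(list) {
  const parts = [];
  let depth = 0, quote = null, current = "";
  for (let i = 0; i < list.length; i++) {
    const ch = list[i];
    if (quote) {
      // Keep escaped characters inside strings together.
      if (ch === "\\") { current += ch + (list[++i] || ""); continue; }
      if (ch === quote) quote = null;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === "(" || ch === "[") {
      depth++;
    } else if (ch === ")" || ch === "]") {
      depth--;
    } else if (ch === "," && depth === 0) {
      parts.push(current.trim());
      current = "";
      continue;
    }
    current += ch;
  }
  if (current.trim() !== "") parts.push(current.trim());
  return parts;
}

// Apply the prepending rules to each complex selector in the list.
function prependScope(selectorList, hasRefNodes) {
  return splitSelectorList(selectorList.trim()).map((sel) => {
    // Step 1: leading >, +, ~ or /attr/ combinator.
    if (/^(?:[>+~]|\/[^\/]*\/)/.test(sel)) return ":scope" + sel;
    // Step 2: no contextual reference nodes.
    if (!hasRefNodes) return sel;
    // Steps 3 and 4: explicit :scope anywhere, including inside :not() etc.
    // (Simplification: a ":scope" inside a quoted string also counts.)
    if (/:scope(?![\w-])/i.test(sel)) return sel;
    // Step 5: prepend :scope and a descendant combinator.
    return ":scope " + sel;
  });
}
```

With this sketch, prependScope(">p, >div", true) gives [":scope>p", ":scope>div"], prependScope("html", false) gives ["html"], and prependScope("", true) gives an empty list, matching the note that the steps do not run for "".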
--
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/