[Prototype-core] Proposed rewrite of $$/Selector

Andrew Dupont Fri, 19 Jan 2007 16:58:40 -0800

I posted about this on Basecamp, but now that we've started to use this
list I'm going to post it here for public consumption.


We've talked about optimizing $$ in the past -- it's one of my personal
goals for 1.5.1.  So I took great interest in Jack Slocum's new
DomQuery extension for YUI
(http://www.jackslocum.com/blog/2007/01/11/domquery-css-selector-basic-xpath-implementation-with-benchmarks/).
Jack is a brilliant JavaScripter and has managed to write a really,
really fast CSS selector engine here.

I took a look at his code -- it's quite clever, but also verbose and
inelegant in places.  He handles a lot of specific CSS token
combinations by hand, which results in really fast querying but also
*lots* of lines of code.  I resolved to write a version that was more
Prototypish.

In the middle of doing so, I thought back to my earlier attempt with
XPath
(http://www.andrewdupont.net/2006/07/10/more-than-you-ever-wanted-to-know-about-and-xpath/),
and realized that I could add on an XPath approach with just a bit more
code.  So I did.

You can view the code at
(http://andrewdupont.net/test/double-dollar/selector.new.js). I've set
up a test page (expertly made by the jQuery team) that compares the
speed of the current $$ and my experimental $$:
(http://andrewdupont.net/test/double-dollar/).

I'm still trying to make it better, but as I see it this new $$ solves
several problems with the current $$:

(1) The current $$ does not filter out duplicates.
You can see this on the test page: "div div" and "div div div" both
return far more results than they should because certain nodes are
added to the collection more than once.  Calling "uniq" on the array
before it's returned is *far* too costly, so I used Jack's inspired
method here: it enumerates the collection and sets a property on each
node, so that if the function finds a node that already has that
property, it knows it has seen that node before.  (It then sets that
property back to "undefined" before it's done.)
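
As a rough illustration of that flagging trick (not the code in
selector.new.js), here is a minimal sketch.  The function name "unique"
and the "_counted" property are made up for this example, and plain
objects stand in for DOM nodes so it can run anywhere:

```javascript
// Hypothetical helper: `unique` and `_counted` are illustrative names,
// not identifiers taken from selector.new.js or DomQuery.
function unique(nodes) {
  var results = [];
  var i, node;

  // First pass: keep a node only if it has not been flagged yet,
  // and flag it so later occurrences are skipped.
  for (i = 0; i < nodes.length; i++) {
    node = nodes[i];
    if (!node._counted) {
      node._counted = true;
      results.push(node);
    }
  }

  // Second pass: reset the flag so the next query starts clean.
  for (i = 0; i < results.length; i++) {
    results[i]._counted = undefined;
  }

  return results;
}

// Usage: the same object appearing several times is returned once,
// in order of first appearance.
var a = { id: 'a' }, b = { id: 'b' };
var deduped = unique([a, b, a, b, a]);
```

Each node is touched at most twice, so the cost grows linearly with the
size of the collection, whereas a uniq that compares every node against
every other node grows quadratically.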

(2) The current $$ is not very modular or extensible.
With this new implementation, I can add new tokens very easily -- for
example, all the operators for attribute matching (=, $=, ^=, *=, |=,
~=) -- because they live in a hash with the operator as the key and a
comparator as the value.  Similarly, adding new selectors like child
(>) and adjacency (+), or even pseudoclasses (:nth-child(even)) could
be done by adding a regex, an XPath translation, and a
string-of-JS-code translation.
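
To make the operator hash concrete, here is a minimal sketch of what
such a lookup table could look like.  It is an assumption-laden
illustration rather than the actual contents of selector.new.js: the
names "operators" and "matchesAttribute" are invented for this example,
and edge cases such as empty values are not handled specially.

```javascript
// Hypothetical operator table: key is the CSS attribute operator,
// value is a comparator taking (attribute value, value from selector).
var operators = {
  '=':  function(nv, v) { return nv === v; },
  // starts with
  '^=': function(nv, v) { return nv.indexOf(v) === 0; },
  // ends with
  '$=': function(nv, v) {
    return nv.length >= v.length &&
      nv.lastIndexOf(v) === nv.length - v.length;
  },
  // contains substring
  '*=': function(nv, v) { return nv.indexOf(v) !== -1; },
  // contains v as one whitespace-separated word
  '~=': function(nv, v) {
    return (' ' + nv + ' ').indexOf(' ' + v + ' ') !== -1;
  },
  // equals v, or starts with v followed by a hyphen
  '|=': function(nv, v) {
    return nv === v || nv.indexOf(v + '-') === 0;
  }
};

// Hypothetical dispatcher: look the operator up and apply it.
// Unknown operators and missing attributes never match.
function matchesAttribute(nodeValue, operator, value) {
  var compare = operators[operator];
  return !!compare && nodeValue != null &&
    compare(String(nodeValue), value);
}

// Usage: supporting a new operator means adding one entry to the hash.
var isEnglish = matchesAttribute('en-US', '|=', 'en');
```

Because the matching code only ever looks up operators[operator], it
does not need to change when a new operator is added.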

(3) The current $$ is SLOW.
XPath clearly solves this problem (try the tests in Firefox and see for
yourself), but not for Safari (version <2.0) or MSIE (version
*anything*).  So even the "slow lane" needs to be faster here.  Jack
claims his implementation is the fastest on earth, and though I've
taken his general approach I have not yet realized his gains.  Still,
on costly selectors the new approach is almost twice as fast as the
current $$ (even leaving out XPath), and that's with more functionality
and fewer bugs (i.e., duplicated nodes).


I'd love to hear some feedback.  I've been looking at this code for way
too long now, so a fresh pair of eyes may point out something obvious
that I've missed.  Also, I'd love it if someone were to modify this
code to add new stuff so that $$ can accommodate a wider range of
selectors.

Cheers,
Andrew

