> Let's talk real numbers here.
>
> http://jsperf.com/xpath-vs-dom
This test is invalid. There are almost no valid examples where
operations per second account for any sort of performance associated
with web technologies. The only exception I can think of is testing
certain conditions in a loop. I have already mentioned, several times,
that there is a performance hit for accessing a foreign API. That said,
a regular expression operation would also fare poorly in your test;
should this suggest that regular expressions are disgustingly slow? No. It is
my guess that a regular expression is the fastest way to
access anything in the form of a large string, and if this is true then
your test fails to serve its sole purpose.
As I mentioned in my prior email, the only valid way to test
this sort of operation is against a millisecond clock. This means that
if you want to see a benchmark then you have to do some actual work,
as opposed to merely feeding numbers into some extraneous application.
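As a rough illustration of timing against a millisecond clock, here is a minimal sketch. It is not the clock from Pretty Diff; the helper name `timeVariant`, the sample string, and the `indexOf` comparison variant are all assumptions made up for this example.

```javascript
// Hypothetical helper: runs one test variant `iterations` times and
// measures the elapsed wall-clock time with the millisecond clock.
function timeVariant(label, fn, iterations) {
    var start = Date.now(),
        i,
        elapsed;
    for (i = 0; i < iterations; i += 1) {
        fn();
    }
    elapsed = Date.now() - start;
    return {
        label: label,
        iterations: iterations,
        totalMs: elapsed,
        msPerIteration: elapsed / iterations
    };
}

// Assumed sample data: a large string with the target at the very end.
var haystack = new Array(100001).join("<p>filler</p>") + "myUniqueIdentifier";

// Each variant is wrapped separately so the timings do not interfere.
var regexResult = timeVariant("regex", function () {
    return /myUniqueIdentifier/.test(haystack);
}, 100);
var indexOfResult = timeVariant("indexOf", function () {
    return haystack.indexOf("myUniqueIdentifier") !== -1;
}, 100);

console.log(regexResult);
console.log(indexOfResult);
```

If the two totals are indistinguishable, the sample string or the iteration count can be increased until they are not.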
A valid test is stated in the form of a question. This identifies your
purpose. Then you must predict a conclusion, which is called a
hypothesis. A hypothesis identifies your bias before any testing
begins. Then you must describe each test variant and exactly how the
testing is conducted as an instruction set so that the test is easily
reproducible. If your test does not contain each of these items, uses
methods irrelevant to your purpose, or is not reproducible by a separate
party then your test has absolutely no merit.
In this case your test calculates operations per second from a
JavaScript interpreter without any indication of execution time per
iteration and it is also missing a question/hypothesis. If your results
are entirely indistinguishable then you will need to use a larger data
sample until they become distinguishable. I predict that by the time
the data sample becomes that large, any jQuery selector will time out
entirely, and possibly the DOM methods as well, while an XPath
operation will become perceptibly slower than a regular expression
operation. I have never executed XPath expressions from JavaScript, so
it is quite likely that I am severely wrong. It would make for a good
test.
You did not seem to know how to use regular expressions to access the
DOM. Regular expressions can only be used against string type data.
The DOM can be read from and written to as a string thanks to
the innerHTML property. Consider this example:
    var domRoot = document.documentElement,
        domString = domRoot.innerHTML,
        test = /myUniqueIdentifier/.test(domString);
    // perform a change to domString.
    domRoot.innerHTML = domString;
If you need help writing a timing clock you can take the one I wrote in
the Pretty Diff tool, but it must wrap each test operation closely and
uniquely in order to prevent cross interference. If you would rather I
just produce this test then you will need to wait a week until I have
the necessary time available.
Thanks,
Austin Cheney, CISSP
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf
Of Scott Sauyet
Sent: Friday, September 09, 2011 7:45 PM
To: The JSMentors JavaScript Discussion Group
Subject: [JSMentors] Re: spaces in attribute values
Austin Cheney wrote:
> Scott Sauyet wrote:
>> You brought up XPath as an alternative to JQuery's
>> selectors API. JQuery (and most competing selector engines) claim to
>> support most if not all of CSS3.
>
> I have compared XPath to CSS selectors.
Perhaps I misunderstood.
> jQuery is not compiled byte code, and so it is too slow to be of worthy
> consideration.
Neither is JSON2.js, code from addthis.com, or third-party analytics
tools. Nor is your own code.
What speed is necessary to make this worth consideration? If a
selector takes 28 microseconds and another only 21, is the latter one
the only one worth consideration?
> You can
> easily verify the difference yourself. Just compare a complex operation
> navigating the DOM from one location to a different, and not necessarily
> descendent, location versus using DOM methods while separately timing
> each operation against a millisecond clock. The timing difference is
> staggering, especially for results in a loop and queries against an
> extremely large DOM. [ ... ]
You seem to be trying to make the case that there is no place in JS
for CSS selectors. But you're discussing situations (loops against
large DOMs) where running queries against the document already doesn't
really make sense. If CSS selectors don't work for you, don't use
them. The popularity of selector-based query engines demonstrates that
they are useful to many others.
>> Do you intend to make a contrast with XPath-style selection? I simply
>> don't see any particular reusability advantages in targeting the
>> preceding header to the button being clicked with "../../h3"
>
> This goes to the nature of modules. In a modular environment fragments
> are open to reuse in an isolated capacity compared to a wider context.
I'm sorry, I'm having trouble again parsing that.
> Furthermore, if I were unsure if I should expect a h3 element or h4 I
> could use a union: ../../(h3|h4)[0]
"I'm sorry, your tool is broken. It's not matching my markup. Why
doesn't it target my H5? Why doesn't it match my STRONG element?"
> If the ascendance has changed I could use this:
> ancestor::div//(h3|h4)[0]
And if that takes you up to a div far removed from the relevant
content, because the relevant container is no longer a DIV but a
FIELDSET?
>> Actually, I think having multiple classes enhances reusability.
>
> No, because instead of being bound to some context there is a binding to
> the reuse of a name, or naming convention, in addition to the mapped
> context. This overlap is in conflict with the separation of structure
> and behavior principle. The reuse as addressed by your example is
> reuse within a single document and does not illustrate reuse outside of
> a document instance.
It most certainly illustrates exactly the sort of reuse I'm used to
getting from subsystems, which is precisely that I can use it in
multiple environments with only a minimum of configuration. How would
combining the `debug` and `new` classes into, say, `debugNew` improve
portability? I'd say it would significantly degrade portability of
both the `addNew` and the `showAll` functionality, as it intertwines
them.
>> Obviously if we can target specific elements easily, that's the best
>> approach. But that's not always feasible. In the dynamic environment
>> I'm in, it's rarely feasible.
>
> When absolutely every other approach has failed there are still regular
> expressions, which is still arguably a faster mode of targeting than
> CSS selectors and certainly faster if the CSS selector is being
> interpreted by interpreted code.
How would you use regexes to target nodes in the document? What
strings would you test against a regex?
>> But perhaps there's some misapprehension here. I am part of a large
>> team working on a very large web application with many dynamic parts.
>> One team member might be involved in tweaking the look by updating the
>> markup or the CSS while others are adding functionality based on
>> certain class names, and occasionally ids. Although there was a
>> significant up-front design effort, the design has been changing as
>> additional requirements are noted. We certainly don't have the luxury
>> of locking down all the markup ahead of time. Perhaps you are able to
>> do so.
>
> This is why module patterns are becoming increasingly popular. It
> is also evidence that there is benefit in separating the application
> layer from the UI layer.
Of course, but with no hooks, how does your application layer interact
with the existing UI? Or is your entire UI generated from the
application layer?
>> So because you have no need for it in your project it can't make sense
>> in any environment?
>
> From a business perspective, yes, correct. [ ... ]
I think you misunderstood; either that or there is a very large ego in
play. :-) Let me rephrase: So because Austin Cheney has no need for
it in his project, no one could possibly need it?
>> With the right abstractions on top of the former, it could be quite
>> clean. It would probably not outperform plain DOM methods and
>> doesn't, to my mind, gain very much. It certainly could work. But I
>> rarely have had this need either. If I want nodes to be linked, I
>> usually try to do this with related ids rather than assuming a
>> particular DOM relationship.
>
> Not necessarily. The DOM is slow for two reasons:
> 1) It is an API outside of the JavaScript interpreter
> 2) There is only one document object, so while there can be many
> simultaneous read operations there can only be a single write operation
> to a given node at a given time.
What sort of simultaneity do you expect in a single-threaded
environment?
> The first of those reasons is universal to any API that JavaScript must
> access whether it be the regular expression engine, XPath, or even eval.
> There is a performance hit for requesting access to a separate code
> interpreter. Let's consider the two prior examples:
>
> ../../h3[0]
> x.parentNode.parentNode.getElementsByTagName("h3")[0]
>
> The primary difference is that the XPath expression is a single
> operation. The DOM example is three operations. This means in the
> prior instruction set I have to access a foreign API once for XPath, but
> three times for the DOM query. The evaluation of the instructions is
> not slow, but the lookup and access for each instruction comes at a
> cost. This said, it could be argued that the execution of an XPath
> expression should be roughly equivalent to a regular expression
> operation, which is faster than DOM methods and certainly faster than a
> jQuery index.
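For anyone who wants to try the two lookups quoted above, here is a minimal sketch. It assumes a browser that provides `document.evaluate`; the function names `headingViaDom` and `headingViaXPath` are hypothetical. Two caveats: XPath positions are 1-based, so the first match is written `[1]` rather than `[0]`, and `../../h3` selects only `h3` children of the grandparent, whereas `getElementsByTagName` searches all of its descendants.

```javascript
// Numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE, spelled out so the
// functions can be defined even where XPathResult is not a global.
var FIRST_ORDERED_NODE_TYPE = 9;

// DOM version: three separate accesses starting from the node x.
function headingViaDom(x) {
    return x.parentNode.parentNode.getElementsByTagName("h3")[0] || null;
}

// XPath version: a single call, evaluated relative to the context node x.
// `doc` is the owning document (for example, window.document).
function headingViaXPath(doc, x) {
    var result = doc.evaluate("../../h3[1]", x, null,
        FIRST_ORDERED_NODE_TYPE, null);
    return result.singleNodeValue;
}

// Usage in a browser, inside a click handler for a hypothetical button:
// button.onclick = function () {
//     var heading = headingViaXPath(document, this);
// };
```

Passing the document in as a parameter is only a convenience for this sketch; it keeps the functions free of globals so each can be timed in isolation.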
Let's talk real numbers here.
http://jsperf.com/xpath-vs-dom
There are three tests on there. On a recent version of Firefox, we
could perform the jQuery version with its three separate calls 35K
times per second, whereas we could do the XPath version 49K times per
second. In a recent Chrome, the difference is larger: jQuery, 67K,
XPath, 114K. So if raw speed is what we want for this, then XPath
seems a better solution. Except... the pure DOM version runs 1519K
times per second in Firefox and 1560K times per second in Chrome. If
raw speed is what you want, the DOM API is far superior. Moreover,
the XPath code, at least the stuff from MDN, doesn't even work in IE8
or the Android version I tested. (If you have better code, please
make a new version of that test.)
And still one more point, your example was around a click-handler for
a button. Presumably selecting the nearby header will only happen
once during that event. How important is it that it runs in 1/49th of
a millisecond versus 1/35th of one? I just don't see the point of
this sort of optimization in that kind of client-side code.
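To put those rates in per-call terms, the time per call is just the reciprocal of the rate. A small sketch using the Firefox figures reported above (the helper name `microsecondsPerOp` is made up for illustration):

```javascript
// Converts a rate in operations per second to microseconds per operation.
function microsecondsPerOp(opsPerSecond) {
    return 1e6 / opsPerSecond;
}

// Firefox rates as reported in this message.
var firefoxOpsPerSecond = { jQuery: 35000, XPath: 49000, DOM: 1519000 };

Object.keys(firefoxOpsPerSecond).forEach(function (name) {
    var micros = microsecondsPerOp(firefoxOpsPerSecond[name]);
    console.log(name + ": " + micros.toFixed(2) + " microseconds per call");
});
// jQuery: 28.57 microseconds per call
// XPath: 20.41 microseconds per call
// DOM: 0.66 microseconds per call
```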
If you prefer XPath as your selector format, by all means use it. But
it doesn't seem to me that you've demonstrated any particular reason
for others to choose it.
-- Scott
--
To view archived discussions from the original JSMentors Mailman list:
http://www.mail-archive.com/[email protected]/
To search via a non-Google archive, visit here:
http://www.mail-archive.com/[email protected]/
To unsubscribe from this group, send email to
[email protected]