On Feb 13, 2010, at 3:18 AM, Ian Hickson wrote:
On Fri, Jan 22, 2010 at 5:11 AM, Anton Muhin <[email protected]> wrote:
Good day.
Currently the DOM Level 3 Core spec is somewhat inconsistent about
whether invocations of getElementsByTagName and the like must return a
new NodeList or may return a cached one. For Document, a new list is
mandated for both getElementsByTagName and getElementsByTagNameNS, but
for Element it is only worded that way for getElementsByTagNameNS, not
for getElementsByTagName. Maciej also noticed a difference between
getElementsByTagName and the other getElementsBy queries (see
http://www.w3.org/Bugs/Public/show_bug.cgi?id=8792). And the word "new"
is missing from the ECMAScript bindings spec:
http://www.w3.org/TR/DOM-Level-3-Core/ecma-script-binding.html
Is it possible to allow caching for those cases? Firefox caches those
node lists for a long time (Maciej found the related bug
https://bugzilla.mozilla.org/show_bug.cgi?id=140758). IE8 caches as
well. Opera, Safari and Chrome do not.
I'm concerned about the GC-sensitivity of such behaviour (we might end
up snookering ourselves into a situation where specific GC behaviour
actually matters for compatibility).
It's not GC that matters but the degree of caching (e.g. whether cache
items are ever cleared for reasons other than GC). It's true that this
is theoretically a hazard, but the only observable effect would be
whether custom properties set on one NodeList appear on one retrieved
later. Since it's very uncommon (and indeed unlikely) for authors to
set custom properties on NodeLists, I think this benefit is purely
theoretical, not real.
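To make the observable difference concrete, here is a minimal sketch. It does not use a real DOM: makeFakeElement and its cacheLists option are hypothetical stand-ins, invented only to contrast an implementation that hands back a cached list with one that builds a new list on every call.

```javascript
// Hypothetical stand-in for a DOM node; this is not a real browser object.
function makeFakeElement({ cacheLists }) {
  const cache = new Map(); // tag name -> list object returned earlier
  return {
    getElementsByTagName(tag) {
      if (cacheLists && cache.has(tag)) return cache.get(tag);
      const list = { length: 0 }; // placeholder for a live NodeList
      if (cacheLists) cache.set(tag, list);
      return list;
    },
  };
}

// With caching, a custom property set on one result shows up on the next.
const caching = makeFakeElement({ cacheLists: true });
caching.getElementsByTagName("div").marker = "seen";
const cachingResult = caching.getElementsByTagName("div").marker;

// Without caching, each call returns a new object and the property is gone.
const fresh = makeFakeElement({ cacheLists: false });
fresh.getElementsByTagName("div").marker = "seen";
const freshResult = fresh.getElementsByTagName("div").marker;

console.log(cachingResult); // "seen"
console.log(freshResult);   // undefined
```

In a caching implementation where the cached list can also be dropped (by GC or otherwise), the first result could flip to undefined at an unpredictable point, which is the hazard under discussion.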
How about the following compromise: these methods return a new object
each time, except if they are called with the same argument as the
previous invocation of the method? i.e. cache the last returned object
only. Would that be acceptable? It gives you a performance win in the
case where the author spins a loop using the same call over and over,
and is completely predictable.
It's only predictable if that last object is kept alive, even if it
were otherwise a candidate for garbage collection. Are you suggesting
that we do that? I assume so, because that's the only way it would be
"completely predictable". If so, then I would object, because it could
lead to a large long-term memory cost (fully traversing a large
NodeList in a loop would leave you paying the cost of that memory
until you leave the page or the author fetches a different NodeList).
Imagine the last NodeList you accessed was the result of
getElementsByTagName("*") and the author fully traversed it. Now
you've likely pinned memory proportional to the size of the DOM.
Even without the memory issue, I would not favor this design, because
it makes performance fall off a cliff if you use more than one
NodeList. Changing your loop from fetching one NodeList to two could
suddenly make it 50x slower. We do not like coding performance hazards
like this into our implementation.
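As a rough illustration of that cliff, the sketch below models a cache that remembers only the most recent call. makeLastCallCache and buildList are hypothetical names, and the miss counter only stands in for the cost of rebuilding a list; the 50x figure above is not something this code measures.

```javascript
// Hypothetical single-entry cache: remembers only the last argument/result.
function makeLastCallCache(compute) {
  let lastArg;
  let lastResult; // held strongly, so it stays alive until it is replaced
  let hasEntry = false;
  let misses = 0;
  return {
    lookup(arg) {
      if (hasEntry && arg === lastArg) return lastResult;
      misses += 1; // stands in for the cost of building a new list
      lastArg = arg;
      lastResult = compute(arg);
      hasEntry = true;
      return lastResult;
    },
    get misses() {
      return misses;
    },
  };
}

const buildList = (tag) => ({ tag }); // placeholder for building a NodeList

// A loop that keeps asking for the same list misses once.
const oneList = makeLastCallCache(buildList);
for (let i = 0; i < 100; i++) oneList.lookup("div");

// A loop that alternates between two lists misses on every single call.
const twoLists = makeLastCallCache(buildList);
for (let i = 0; i < 100; i++) {
  twoLists.lookup("div");
  twoLists.lookup("span");
}

console.log(oneList.misses);  // 1
console.log(twoLists.misses); // 200
```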
Alternatively, if we need to cache more than that, how about blowing
away the cache with each spin of the event loop, so that anything in a
tight loop is cached (and _not_ subject to GC — this could be a
problem if the script calls one of these methods with 10000 different
arguments and sets properties on each one), but not beyond one task?
(i.e. don't share objects in calls across setTimeout)
Pinning a potentially unbounded number of NodeLists in memory would
definitely be unacceptable from both speed and memory perspectives,
especially on mobile devices.
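A sketch of the per-task proposal shows where the unbounded pinning comes from. makePerTaskCache and endTask are hypothetical; in a real engine the clearing would be tied to the event loop rather than called by hand.

```javascript
// Hypothetical per-task cache: entries are held strongly until endTask().
function makePerTaskCache(compute) {
  const entries = new Map(); // strong references, not collectable mid-task
  return {
    lookup(arg) {
      if (!entries.has(arg)) entries.set(arg, compute(arg));
      return entries.get(arg);
    },
    endTask() {
      entries.clear(); // assumed to run at each spin of the event loop
    },
    get pinned() {
      return entries.size;
    },
  };
}

const perTask = makePerTaskCache((tag) => ({ tag }));

// Within one task, repeated calls return the same object.
const first = perTask.lookup("div");
const sameWithinTask = perTask.lookup("div") === first;

// 10000 different arguments in the same task pin 10000 more lists.
for (let i = 0; i < 10000; i++) perTask.lookup("tag" + i);
const pinnedDuringTask = perTask.pinned;

// After the task ends, the same call yields a different object.
perTask.endTask();
const sameAcrossTasks = perTask.lookup("div") === first;

console.log(sameWithinTask);   // true
console.log(pinnedDuringTask); // 10001
console.log(sameAcrossTasks);  // false
```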
I note that if all you care about is ensuring that behavior is
deterministic, the simplest solution would be to make NodeList objects
disallow setting of custom properties. Then there is no way to observe
the side effects of GC behavior. This would be simpler to implement
than either of your proposed rules, and would not create speed or
memory hazards. I do not know if we could justify such a change as a
mere erratum to DOM Level 3 Core, but the same goes for both your
proposed policies.
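For illustration only, plain JavaScript objects can be made to reject new properties with Object.preventExtensions. A browser would presumably enforce this in its DOM bindings rather than with this call, so the snippet shows the script-visible effect, not an implementation; tryToTag is a hypothetical helper.

```javascript
// Hypothetical helper: attempts to set a custom property on a list.
function tryToTag(list) {
  "use strict"; // in strict mode a rejected property write throws
  try {
    list.marker = "seen";
    return "property set";
  } catch (err) {
    return err instanceof TypeError ? "TypeError" : "other error";
  }
}

const ordinaryList = { length: 0 }; // placeholder for today's NodeList
const lockedList = Object.preventExtensions({ length: 0 });

console.log(tryToTag(ordinaryList)); // "property set"
console.log(tryToTag(lockedList));   // "TypeError"
console.log("marker" in lockedList); // false
```

With no way to attach state to the list, a script cannot tell a cached list from a newly created one through custom properties, regardless of when the cache is cleared. In non-strict code the rejected write is silently ignored instead of throwing.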
For now, for the objects in HTML5, I've gone with the first of these
suggested compromises.
I don't think we'd be willing to implement that in WebKit. We're more
likely to copy existing Firefox and IE behavior.
Regards,
Maciej