Re: Allow to return same NodeList object for queries like getElementsByTagName, getElementsByClassName and getElementsByName

Maciej Stachowiak Sat, 13 Feb 2010 16:36:36 -0800


On Feb 13, 2010, at 3:18 AM, Ian Hickson wrote:

On Fri, Jan 22, 2010 at 5:11 AM, Anton Muhin <[email protected]> wrote:

Good day.

Currently the DOM Core 3 spec is somewhat inconsistent regarding whether
invocations of getElementsByTagName and the like must return a new
NodeList or could cache this list.  For Document it's mandated for
both getElementsByTagName and getElementsByTagNameNS, but for Element
it's only worded for getElementsByTagNameNS, not for
getElementsByTagName.  Maciej also noticed a difference between
getElementsByTagName and other getElementsBy queries (see
http://www.w3.org/Bugs/Public/show_bug.cgi?id=8792).  And the word "new"
is missing from the ECMAScript bindings spec:
http://www.w3.org/TR/DOM-Level-3-Core/ecma-script-binding.html

Is it possible to allow caching for those cases? Firefox has cached those
node lists for a long time (Maciej found the related bug
https://bugzilla.mozilla.org/show_bug.cgi?id=140758).  IE8 caches as
well.  Opera, Safari and Chrome do not.


I'm concerned about the GC-sensitivity of such behaviour (we might end
up snookering ourselves in a situation where specific GC behaviour
actually matters for compatibility).

It's not GC that matters but the degree of caching (e.g. whether cache items are ever cleared for reasons other than GC). It's true that this is theoretically a hazard, but the only observable effect would be whether custom properties set on one NodeList appear on one retrieved later. Since it's very uncommon (and indeed unlikely) for authors to set custom properties on NodeLists, I think this benefit is purely theoretical, not real.
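
To make the observable effect concrete, here is a minimal TypeScript sketch. It does not touch a real DOM: `FakeNodeList`, `getFresh`, `getCached` and `expandoSurvives` are hypothetical names that only model the two behaviours discussed above, assuming a cache keyed by the query argument.

```typescript
// Hypothetical model of a NodeList: a plain object that accepts custom properties.
type FakeNodeList = { tagName: string; [custom: string]: unknown };

// Non-caching behaviour: every call builds a fresh list object.
function getFresh(tagName: string): FakeNodeList {
  return { tagName };
}

// Caching behaviour: repeated calls with the same argument share one object.
const cache = new Map<string, FakeNodeList>();
function getCached(tagName: string): FakeNodeList {
  let list = cache.get(tagName);
  if (list === undefined) {
    list = { tagName };
    cache.set(tagName, list);
  }
  return list;
}

// Reports whether a custom property set on one result is visible on a
// later result for the same query.
function expandoSurvives(query: (tagName: string) => FakeNodeList): boolean {
  query("div")["marker"] = "set by author";
  return query("div")["marker"] === "set by author";
}
```

In this model `expandoSurvives(getFresh)` is false and `expandoSurvives(getCached)` is true. If an entry were dropped from `cache` between the two calls, for GC or any other reason, the second result would flip, and that flip is the only difference a script could detect.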

How about the following compromise: these methods return a new object
each time, except if they are called with the same argument as the
previous invocation of the method? i.e. cache the last returned object
only. Would that be acceptable? It gives you a performance win in the
case where the author spins a loop using the same call over and over,
and is completely predictable.

It's only predictable if that last object is kept alive, even if it were otherwise a candidate for garbage collection. Are you suggesting to do that? I assume so, because that's the only way it would be "completely predictable". If so, then I would object, because it could lead to a large long-term memory cost (fully traversing a large NodeList in a loop would leave you paying the cost of that memory until you leave the page or the author fetches a different NodeList). Imagine the last NodeList you accessed was the result of getElementsByTagName("*") and the author fully traversed it. Now you've likely pinned memory proportional to the size of the DOM.

Even without the memory issue, I would not favor this design, because it makes performance fall off a cliff if you use more than one NodeList. Changing your loop from fetching one NodeList to two could suddenly make it 50x slower. We do not like coding performance hazards like this into our implementation.
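
The cliff can be illustrated with a small TypeScript sketch of a "last call only" cache. This is a hypothetical model, not any browser's implementation: `LastCallCache` and `countBuilds` are made-up names, a string array stands in for a NodeList, and the sketch only counts how often a list has to be rebuilt rather than measuring time.

```typescript
// Hypothetical model: remembers only the most recent query and its result.
class LastCallCache {
  builds = 0;
  private lastArg: string | undefined = undefined;
  private lastList: string[] | undefined = undefined;

  // Stand-in for getElementsByTagName; the array is a placeholder for a NodeList.
  query(tagName: string): string[] {
    if (this.lastList !== undefined && this.lastArg === tagName) {
      return this.lastList; // hit: same argument as the previous call
    }
    this.builds++; // miss: a new list object has to be created
    this.lastArg = tagName;
    this.lastList = [tagName];
    return this.lastList;
  }
}

// Runs `iterations` loop passes, each issuing one query per tag name,
// and returns how many lists were built in total.
function countBuilds(tagNames: string[], iterations: number): number {
  const doc = new LastCallCache();
  for (let i = 0; i < iterations; i++) {
    for (const name of tagNames) {
      doc.query(name);
    }
  }
  return doc.builds;
}
```

With this model, `countBuilds(["div"], 1000)` builds a single list, while `countBuilds(["div", "span"], 1000)` builds 2000, because each query evicts the entry the next one needs.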

Alternatively, if we need to cache more than that, how about blowing
away the cache with each spin of the event loop, so that anything in a
tight loop is cached (and _not_ subject to GC — this could be a
problem if the script calls one of these methods with 10000 different
arguments and sets properties on each one), but not beyond one task?
(i.e. don't share objects in calls across setTimeout)

Pinning a potentially unbounded number of NodeLists in memory would definitely be unacceptable from both speed and memory perspectives. Especially on mobile devices.
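
A rough TypeScript sketch of the per-task cache shows where the unbounded pinning comes from. Everything here is hypothetical: `PerTaskCache`, `endTask` and the `"name-"` arguments are illustrative only, and a string array again stands in for a NodeList.

```typescript
// Hypothetical model of a cache that is emptied once per event-loop task.
class PerTaskCache {
  private lists = new Map<string, string[]>();

  // Stand-in for a getElementsBy* call; the array is a placeholder for a NodeList.
  query(arg: string): string[] {
    let list = this.lists.get(arg);
    if (list === undefined) {
      list = [arg];
      this.lists.set(arg, list); // strongly held, so it cannot be collected mid-task
    }
    return list;
  }

  // Number of lists currently kept alive by the cache.
  get pinned(): number {
    return this.lists.size;
  }

  // Called by the (hypothetical) event loop when the current task finishes.
  endTask(): void {
    this.lists.clear();
  }
}

const taskCache = new PerTaskCache();
for (let i = 0; i < 10000; i++) {
  taskCache.query("name-" + i); // 10000 different arguments within one task
}
const pinnedDuringTask = taskCache.pinned;
taskCache.endTask();
const pinnedAfterTask = taskCache.pinned;
```

In this model all 10000 lists stay pinned until `endTask` runs, so the amount of retained memory is bounded only by how many distinct queries a script issues within one task.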

I note that if all you care about is ensuring that behavior is deterministic, the simplest solution would be to make NodeList objects disallow setting of custom properties. Then there is no way to observe the side effects of GC behavior. This would be simpler to implement than either of your proposed rules, and would not create speed or memory hazards. I do not know if we could justify such a change as a mere erratum to DOM Level 3 Core, but the same goes for both your proposed policies.
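
As a sketch of what "disallow setting of custom properties" could look like at the script level, the TypeScript below uses the standard `Object.preventExtensions` on a plain object standing in for a NodeList. This is only an analogy, assuming a non-extensible object is an acceptable model; `makeSealedList` and `canSetExpando` are hypothetical names, and how a browser would enforce this on host objects is not specified here.

```typescript
// Hypothetical model of a NodeList that rejects custom properties.
function makeSealedList(tagName: string): Record<string, unknown> {
  const list: Record<string, unknown> = { tagName };
  return Object.preventExtensions(list);
}

// Tries to attach a custom property and reports whether it stuck.
function canSetExpando(list: Record<string, unknown>): boolean {
  try {
    // Throws a TypeError in strict mode; silently ignored in sloppy mode.
    list["marker"] = true;
  } catch {
    // Assignment was rejected; fall through to the check below.
  }
  return "marker" in list;
}
```

Here `canSetExpando(makeSealedList("div"))` is false, while the same call on an ordinary object is true. With no way to attach a marker, a script cannot use custom properties to tell a cached list from a freshly created one.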

For now for the objects in HTML5 I've gone with the first of these
suggested compromises.

I don't think we'd be willing to implement that in WebKit. We're more likely to copy existing Firefox and IE behavior.


Regards,
Maciej
