Yeah, you are right. NodeIterators, and presumably Ranges, suffer from
the observer problem. I.e. they want to be notified about mutations to
the DOM, but only as long as the NodeIterator/Range stays alive.

My understanding is that this is one of the more common scenarios
where the need for weak references comes up: you want to register
something as an observer, but don't want the notification mechanism to
hold a strong reference to the observer.

Fortunately though, neither NodeIterators nor Ranges expose this in
their public API. I.e. there is no way to use them to detect when GC
happens.
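
To make the weak-observer registration described above concrete, here is
a rough sketch. It assumes a runtime that provides `WeakRef` (which was
only standardized later, in ES2021). `WeakObserverList` and
`FakeIterator` are hypothetical names made up for illustration; they are
not part of jsdom or the DOM.

```javascript
// Hypothetical registry: holds its observers only weakly, so registering
// an observer does not keep it alive.
class WeakObserverList {
  constructor() {
    this.refs = new Set(); // Set of WeakRef objects
  }

  // Register an observer without holding a strong reference to it.
  add(observer) {
    this.refs.add(new WeakRef(observer));
  }

  // Call observer[method](...args) on every observer that is still alive,
  // and drop the entries whose observer has been collected.
  notify(method, ...args) {
    let notified = 0;
    for (const ref of this.refs) {
      const observer = ref.deref();
      if (observer === undefined) {
        this.refs.delete(ref); // observer was garbage collected
      } else {
        observer[method](...args);
        notified++;
      }
    }
    return notified;
  }
}

// Stand-in for a NodeIterator; not the real DOM interface.
class FakeIterator {
  constructor() {
    this.removedNodes = [];
  }
  preRemove(node) {
    this.removedNodes.push(node);
  }
}

const iterators = new WeakObserverList();
const it = new FakeIterator();
iterators.add(it);

// Simulate a node removal: every live iterator gets its pre-removing hook.
const notifiedCount = iterators.notify("preRemove", "A");
console.log(notifiedCount, it.removedNodes);
// Once nothing else references `it`, the GC may collect it, and a later
// notify() call drops the stale entry instead of updating it forever.
```

Note that the registry itself never reports whether an entry was
collected, so code outside it still cannot use it to detect when GC
happens.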

/ Jonas

On Wed, Feb 17, 2016 at 5:23 AM, Joris van der Wel
<[email protected]> wrote:
> Here is an example of using a NodeIterator:
>
>
> ```
> const jsdom = require("jsdom");
> const document = jsdom.jsdom(`<a></a><b></b><c></c>`);
>
> let it = document.createNodeIterator(document.body);
> console.log(it.nextNode().nodeName); // BODY
> console.log(it.nextNode().nodeName); // A
> console.log(it.nextNode().nodeName); // B
> console.log(it.nextNode().nodeName); // C
> console.log(it.nextNode()); // null
>
> it = document.createNodeIterator(document.body);
> console.log(it.nextNode().nodeName); // BODY
> // This remove operation updates the internal state of the NodeIterator
> document.body.removeChild(document.body.firstChild);
> console.log(it.nextNode().nodeName); // B
> console.log(it.nextNode().nodeName); // C
> console.log(it.nextNode()); // null
> it = null;
> ```
>
> In the case of NodeIterator, there are currently (read: in ES6) two
> possible implementations that comply with the spec (WHATWG DOM):
>
> 1. Keep a history of all changes a Document has gone through, forever.
> 2. Keep a list of all NodeIterators which have been created for a
> Document, forever.
>
> jsdom uses solution #2. This not only leaks memory, but remove
> operations also become slower as more and more NodeIterators are
> created. (However, as Domenic described earlier, we limit this list to
> 10 entries by default.)
>
> The conflict between the DOM spec and ES6 is that we cannot detect
> whether a NodeIterator is still in use by code outside of jsdom:
>
> ```
> it = document.createNodeIterator(document.body);
> console.log(it.nextNode().nodeName); // BODY
> // ... wait an hour ...
> console.log(it.nextNode().nodeName); // A
> it = null; // and only now we can stop updating the NodeIterator state
> ```
>
> (There used to be an it.detach() method for this purpose, but this has
> been removed from the spec.)
>
> Being able to keep a list of NodeIterators weakly would be the only
> solution if we want to avoid leaking resources.
>
> Weak references might also be required for MutationObserver, although
> I've not yet looked at this feature extensively, so I could be wrong.
> Other features which you could implement using a weak reference (like
> in the live collections) could be implemented using ES6 Proxy instead.
>
> XMLHttpRequest, fetch, WebSocket, etc. would even require something
> similar to a phantom reference (like in Java) so that we can close the
> connection when the object is no longer strongly or weakly referenced.
>
> I would also really like to use weak references for more than just
> jsdom; there are some use cases where they can simplify my code.
>
> Gr. Joris
>
>
> On Wed, Feb 17, 2016 at 9:41 AM, Jonas Sicking <[email protected]> wrote:
>>
>> On Tue, Feb 16, 2016 at 11:02 PM, Domenic Denicola <[email protected]> wrote:
>> >> For each NodeIterator object iterator whose root’s node document is 
>> >> node’s node document, run the NodeIterator pre-removing steps given node 
>> >> and iterator.
>> >
>> > Rephrased: every time you remove a Node from a document, you must go 
>> > through all of the document's NodeIterators and run some cleanup steps 
>> > (which have the effect of changing observable properties and behavior of 
>> > the NodeIterator).
>>
>> Could you implement all of this using MutationObservers? I.e. have the
>> NodeIterators observe the relevant nodes using MutationObservers?
>>
>> The only case that I can think of where the DOM could use weak
>> references is for the getElementsByTagName(x) function. This function
>> will either return a new NodeList object, or an existing one. The
>> reason it sometimes returns an existing one is for performance
>> reasons. We saw a lot of code doing:
>>
>> var i;
>> for (i = 0; i < document.getElementsByTagName("div").length; i++) {
>>   var elem = document.getElementsByTagName("div")[i];
>>   doStuffWith(elem);
>> }
>>
>> This generated a ton of NodeList objects, which are expensive to
>> allocate. Hence browsers started caching these objects and returned an
>> existing object "sometimes".
>>
>> The Gecko implementation of "sometimes" uses a hash map keyed on
>> tag name containing weak references to the returned NodeList. This is
>> observable by, for example, doing:
>>
>> document.getElementsByTagName("div").foopy = "foopy";
>> if (document.getElementsByTagName("div").foopy != "foopy") {
>>   // GC ran between the getElementsByTagName calls.
>> }
>>
>> However this exact behavior is not defined by spec. But I believe that
>> all major browsers do do something similar for performance reasons.
>> (This API is as old as it is crummy. And it is no surprise that it is
>> poorly used).
>>
>> But it likely would be possible to write an implementation of
>> "sometimes" which doesn't use weak references, at the cost of higher
>> memory usage.
>>
>> / Jonas
>
>
>
>
> --
> github.com/Joris-van-der-Wel
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
