Yeah, you are right. NodeIterators, and presumably Ranges, suffer from the observer problem. I.e. they want to be notified about mutations to the DOM, but only as long as the NodeIterator/Range stays alive.
My understanding is that this is one of the more common scenarios where the need for weak references comes up: you want to register something as an observer, but don't want the notification mechanism to hold a strong reference to the observer.

Fortunately though, neither NodeIterators nor Ranges expose this in their public API. I.e. there is no way to use them to detect when GC happens.

/ Jonas

On Wed, Feb 17, 2016 at 5:23 AM, Joris van der Wel <[email protected]> wrote:
> Here is an example of using a NodeIterator:
>
> ```
> const jsdom = require("jsdom");
> const document = jsdom.jsdom(`<a></a><b></b><c></c>`);
>
> let it = document.createNodeIterator(document.body);
> console.log(it.nextNode().nodeName); // BODY
> console.log(it.nextNode().nodeName); // A
> console.log(it.nextNode().nodeName); // B
> console.log(it.nextNode().nodeName); // C
> console.log(it.nextNode()); // null
>
> it = document.createNodeIterator(document.body);
> console.log(it.nextNode().nodeName); // BODY
> // This remove operation updates the internal state of the NodeIterator
> document.body.removeChild(document.body.firstChild);
> console.log(it.nextNode().nodeName); // B
> console.log(it.nextNode().nodeName); // C
> console.log(it.nextNode()); // null
> it = null;
> ```
>
> In the case of NodeIterator, there are currently (read: in ES6) two
> spec (DOM whatwg) compliant implementations possible:
>
> 1. Keep a history of all changes a Document has gone through, forever.
> 2. Keep a list of all NodeIterators which have been created for a
>    Document, forever.
>
> jsdom uses solution #2. This not only leaks memory, but remove
> operations become slower as more and more NodeIterators are created
> (however, as Domenic described earlier, we limit this list to 10 entries
> by default).
>
> The conflict between the DOM spec and ES6 is that we cannot detect if
> a NodeIterator is still in use by code outside of jsdom:
>
> ```
> it = document.createNodeIterator(document.body);
> console.log(it.nextNode().nodeName); // BODY
> // ... wait an hour ...
> console.log(it.nextNode().nodeName); // A
> it = null; // and only now we can stop updating the NodeIterator state
> ```
>
> (There used to be an it.detach() method for this purpose, but this has
> been removed from the spec.)
>
> Being able to keep a list of NodeIterators weakly would be the only
> solution if we want to avoid leaking resources.
>
> Weak references might also be required for MutationObserver, although
> I've not yet looked at this feature extensively, so I could be wrong.
> Other features which you could implement using a weak reference (like
> in the live collections) could be implemented using ES6 Proxy instead.
>
> XMLHttpRequest, fetch, WebSocket, etc. would even require something
> similar to a phantom reference (like in Java) so that we can close the
> connection when the object is no longer strongly or weakly referenced.
>
> I would also really like to use weak references not just for jsdom;
> there are some use cases where they can simplify my code.
>
> Gr. Joris
>
>
> On Wed, Feb 17, 2016 at 9:41 AM, Jonas Sicking <[email protected]> wrote:
>>
>> On Tue, Feb 16, 2016 at 11:02 PM, Domenic Denicola <[email protected]> wrote:
>> >> For each NodeIterator object iterator whose root’s node document is
>> >> node’s node document, run the NodeIterator pre-removing steps given node
>> >> and iterator.
>> >
>> > Rephrased: every time you remove a Node from a document, you must go
>> > through all of the document's NodeIterators and run some cleanup steps
>> > (which have the effect of changing observable properties and behavior of
>> > the NodeIterator).
>>
>> Could you implement all of this using MutationObservers? I.e. have the
>> NodeIterators observe the relevant nodes using MutationObservers?
>>
>> The only case that I can think of where the DOM could use weak
>> references is for the getElementsByTagName(x) function. This function
>> will either return a new NodeList object, or an existing one. The
>> reason it sometimes returns an existing one is for performance
>> reasons. We saw a lot of code doing:
>>
>>   var i;
>>   for (i = 0; i < document.getElementsByTagName("div").length; i++) {
>>     var elem = document.getElementsByTagName("div")[i];
>>     doStuffWith(elem);
>>   }
>>
>> This generated a ton of NodeList objects, which are expensive to
>> allocate. Hence browsers started caching these objects and returned an
>> existing object "sometimes".
>>
>> The gecko implementation of "sometimes" uses a hash map keyed on
>> tagname containing weak references to the returned NodeList. This is
>> observable by for example doing:
>>
>>   document.getElementsByTagName("div").foopy = "foopy";
>>   if (document.getElementsByTagName("div").foopy != "foopy") {
>>     // GC ran between the getElementsByTagName calls.
>>   }
>>
>> However this exact behavior is not defined by spec. But I believe that
>> all major browsers do do something similar for performance reasons.
>> (This API is as old as it is crummy. And it is no surprise that it is
>> poorly used).
>>
>> But it likely would be possible to write an implementation of
>> "sometimes" which doesn't use weak references, at the cost of higher
>> memory usage.
>>
>> / Jonas
>
>
> --
> github.com/Joris-van-der-Wel

_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
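
For readers of the archive: the "list of NodeIterators kept weakly" that the thread asks for can be sketched with `WeakRef`, which was standardized later (ES2021) and did not exist when these messages were written. This is only an illustration of the observer pattern under discussion, not jsdom's code. `WeakObserverList`, `FakeIterator` and `preRemove` are hypothetical names invented for the sketch, and `FakeIterator` is a stand-in rather than a real NodeIterator.

```javascript
// Hypothetical sketch: a registry that notifies observers without
// keeping them alive. Requires a runtime with WeakRef (ES2021).
class WeakObserverList {
  constructor() {
    this.refs = new Set(); // holds WeakRef objects, never the observers
  }

  add(observer) {
    this.refs.add(new WeakRef(observer));
  }

  // Call fn on every observer that is still alive, drop entries whose
  // target was collected, and return the number of observers notified.
  notify(fn) {
    let live = 0;
    for (const ref of this.refs) {
      const observer = ref.deref();
      if (observer === undefined) {
        this.refs.delete(ref); // deleting during Set iteration is safe
      } else {
        fn(observer);
        live++;
      }
    }
    return live;
  }
}

// Stand-in for a NodeIterator: it only records the removals it was told about.
class FakeIterator {
  constructor() {
    this.seen = [];
  }
  preRemove(nodeName) {
    this.seen.push(nodeName);
  }
}

const iterators = new WeakObserverList();
let it = new FakeIterator();
iterators.add(it);

// Simulates the "pre-removing steps" being run for a removed <a> node.
const notified = iterators.notify((iter) => iter.preRemove("A"));
console.log(notified, it.seen);
```

Once user code drops its last reference (`it = null`), the entry becomes eligible for pruning on some later `notify` call, but when that happens depends on the garbage collector. That timing is exactly the kind of GC observability Jonas notes NodeIterators and Ranges avoid exposing in their public API, so such a list would have to stay internal to the implementation.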

