Re: Error stack strawman
On 18 February 2016 at 02:36, Gary Guo wrote:

> * isTail will be set when the frame was created by a tail call instead of
> a normal function call. The caller's frame has already been removed, so we
> need some indication of that to help debugging.

This would be fairly difficult for implementations to support. In V8, for example, we currently have no way of reconstructing that information, nor would it be easy or cheap to add. A frame is created by the callee, which does not know how it was called. Funnelling that information through would effectively require a hidden extra argument to _every_ call.

/Andreas
___
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss
Re: Error stack strawman
Mark Knichel has a lot of information regarding stacks and error handling in various browsers here: https://github.com/mknichel/javascript-errors
Re: Error stack strawman
On Wed, Feb 17, 2016 at 5:36 PM, Gary Guo wrote:

> * isTail will be set when the frame was created by a tail call instead of
> a normal function call. The caller's frame has already been removed, so we
> need some indication of that to help debugging.

Nice

> * For span, I put only one line/column pair there as that is the common
> implementation, but I agree that a starting position and an ending one are
> useful.
>
> * For source, a nested frame could be useful, but it is not implemented by
> all implementations, and in fact we need an extra field to distinguish eval
> and new Function.

For eval vs Function (vs GeneratorFunction, vs AsyncFunction, etc), doesn't the name inside the nested frame already deal with that?

> * By reference to function, I mean: should we be able to retrieve the
> function object from the frame?

No, absolutely not. The stack rep should provide only info, not access.

> * I wonder if putting special cases in (), such as (native), will cause any
> problem. No one will have a file called "(native)" in reality, will they?

If you do this, they will ;)
RE: Error stack strawman
* isTail will be set when the frame was created by a tail call instead of a normal function call. The caller's frame has already been removed, so we need some indication of that to help debugging.

* For span, I put only one line/column pair there as that is the common implementation, but I agree that a starting position and an ending one are useful.

* For source, a nested frame could be useful, but it is not implemented by all implementations, and in fact we need an extra field to distinguish eval and new Function.

* By reference to function, I mean: should we be able to retrieve the function object from the frame?

* I wonder if putting special cases in (), such as (native), will cause any problem. No one will have a file called "(native)" in reality, will they?

Gary Guo
Re: Error stack strawman
On Wed, Feb 17, 2016 at 4:19 PM, Gary Guo wrote:

> The strawman looks very old, so I've created a new one.
>
> Repo: https://github.com/nbdd0121/es-error-stack
>
> I've collected a lot of information about the current implementations in
> IE, Edge, Chrome and Firefox, but Safari's is missing. Many thanks if
> someone can collect this information and create a pull request.
>
> I haven't written anything for the API part. As you will see from the
> "concerns" part, there are many edge cases to consider: cross-realm, native,
> global, eval, new Function, anonymous and tail call. All of these need to
> be resolved before we can try to design an object representation of a
> stack frame.
>
> Personally I suggest "(global code)" for global, "(eval code)" for eval,
> "(Function code)" for new Function, and "(anonymous function)" for an
> anonymous function/lambda. For a native call, we can simply replace the
> filename, line and column with "(native)". For a tail call I suggest adding
> "(tail)" somewhere. I also suggest adding "(other realm)" or something
> similar to indicate that a realm boundary is crossed.
>
> For the object representation, I hope for something like
> ```
> {
>   name: 'string', // (global code), etc. for special cases, with parentheses
>   source: 'url', // (native) for native code, with parentheses
>   line: 'integer',
>   column: 'integer',
>   isTail: 'boolean'
> }
> ```

Unless the object representation is primary, we will need to agree on comprehensive escaping rules, and corresponding parsing rules, so that these stack strings can be unambiguously scraped even when file names and function names contain parens, slashes, angle brackets, at-signs, spaces, etc. Therefore, we should focus on the object representation first.

Your object representation above looks like a good start. It is similar to the extended Causeway stack format I mentioned earlier

    stacktrace ::= {calls: [frame*]};
    frame ::= {name: functionName,
               source: source,
               span: [[startLine,startCol?],[endLine,endCol?]?]};
    functionName ::= STRING;
    startLine, startCol, endLine, endCol ::= INTEGER
    source ::= STRING | frame;

with the following differences:

* You added an isTail. This is probably a good thing. I'd like to understand better what you have in mind.

* Rather than have a single "span" property with a nested array of numbers as value, you define separate line and column property names. As long as we represent all that we need unambiguously, I'm indifferent to minor surface syntax differences.

* Causeway's format has room for both start(line,col) and end(line,col). The format must include room for this, and I would hope any future standard would mandate that they be included. Such span information makes a huge usability improvement in reporting diagnostics.

* The extended Causeway "source" field could be either a string, as with yours, or a nested frame. This is necessary to preserve the information currently provided on both FF and Chrome of the nested positions in a single frame, when a call happens at position X in an eval string that was evaled by an eval call at position Y. (That is what the "extended" means. Causeway originally had only strings as the value of its "source" property.)

The proposed[1] API is:

    System.getStack(err) -> stack-representation
    Reflect.stackString(stack-representation) -> stack-string
    System.getStackString(err) -> stack-string

where getStackString is just the obvious composition of getStack and stackString.

> And a null entry indicating a realm crossing. BTW, shall we add a reference
> to the function in the object representation?

What do you mean by "reference to" above?

[1] Hopefully https://github.com/tc39/ecma262/issues/395 will resolve in time that none of these need to be rooted in globals.

> Gary Guo

--
Cheers,
--MarkM
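As a rough illustration of the object-first approach described above, the following sketch renders frames in the extended Causeway shape (`name`, `source`, `span`, where `source` is either a string or a nested frame) into a string. The helper names `spanToString`, `frameToString` and `stackToString`, and the output layout, are assumptions made for this example; they are not part of the proposed API or of any engine.

```javascript
// Hypothetical helpers: the names and the output layout are illustrative only.

// Render a span [[startLine, startCol?], [endLine, endCol?]?] as "L:C-L:C".
function spanToString(span) {
  return span.map((pos) => pos.join(':')).join('-');
}

// Render one frame. `source` is either a string (e.g. a URL) or a nested
// frame, as in the extended Causeway format quoted above.
function frameToString(frame) {
  const where = typeof frame.source === 'string'
    ? frame.source
    : `[${frameToString(frame.source)}]`;
  return `${frame.name} @ ${where}:${spanToString(frame.span)}`;
}

// Render a whole stack representation {calls: [frame*]}, one frame per line.
function stackToString(stack) {
  return stack.calls.map(frameToString).join('\n');
}

const stack = {
  calls: [
    {
      name: 'inner',
      // A call inside an eval string that was itself evaled in app.js.
      source: { name: '(eval code)', source: 'app.js', span: [[10, 4]] },
      span: [[1, 7], [1, 12]],
    },
    { name: '(global code)', source: 'app.js', span: [[20, 1]] },
  ],
};

console.log(stackToString(stack));
```

Note that this string form becomes ambiguous as soon as a name or file name contains ` @ `, brackets or colons, which is the argument made above for treating the object representation as primary and deriving the string from it.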
RE: Error stack strawman
The strawman looks very old, so I've created a new one.

Repo: https://github.com/nbdd0121/es-error-stack

I've collected a lot of information about the current implementations in IE, Edge, Chrome and Firefox, but Safari's is missing. Many thanks if someone can collect this information and create a pull request.

I haven't written anything for the API part. As you will see from the "concerns" part, there are many edge cases to consider: cross-realm, native, global, eval, new Function, anonymous and tail call. All of these need to be resolved before we can try to design an object representation of a stack frame.

Personally I suggest "(global code)" for global, "(eval code)" for eval, "(Function code)" for new Function, and "(anonymous function)" for an anonymous function/lambda. For a native call, we can simply replace the filename, line and column with "(native)". For a tail call I suggest adding "(tail)" somewhere. I also suggest adding "(other realm)" or something similar to indicate that a realm boundary is crossed.

For the object representation, I hope for something like
```
{
  name: 'string', // (global code), etc. for special cases, with parentheses
  source: 'url', // (native) for native code, with parentheses
  line: 'integer',
  column: 'integer',
  isTail: 'boolean'
}
```
And a null entry indicating a realm crossing. BTW, shall we add a reference to the function in the object representation?

Gary Guo
Re: Weak Reference proposal
Yeah, you are right. NodeIterators, and presumably Ranges, suffer from the observer problem: they want to be notified about mutations to the DOM, but only as long as the NodeIterator/Range stays alive.

My understanding is that this is one of the more common scenarios where the need for weak references comes up: you want to register something as an observer, but don't want the notification mechanism to hold a strong reference to the observer.

Fortunately though, neither NodeIterators nor Ranges expose this in their public API, i.e. there is no way to use them to detect when GC happens.

/ Jonas
Re: Weak Reference proposal
On 2/17/16 8:30 AM, Joris van der Wel wrote:

> XMLHttpRequest, fetch, WebSocket, etc. would even require something
> similar to a phantom reference (as in Java) so that we can close the
> connection when the object is no longer strongly or weakly referenced.

None of these allow closing the connection merely because the object is not referenced from elsewhere. Or put another way, the connection must hold a reference to the object.

-Boris
Re: Garbage collection in generators
On 2/17/16 3:59 AM, Benjamin Gruenbaum wrote:

> Garbage collection can and does in fact manage resources in JavaScript
> host environments right now. For example, an XMLHttpRequest /may/ abort the
> underlying HTTP request if the XMLHttpRequest object is not referenced
> anywhere and gets garbage collected.

If an implementation does that, it's clearly buggy. Consider:

    function foo() {
      var xhr = new XMLHttpRequest();
      xhr.addEventListener("load", function() {
        alert(this.responseText);
      });
      xhr.open(stuff);
      xhr.send();
    }
    foo();

The XMLHttpRequest object is not "referenced anywhere" in JS terms between foo() returning and the load event being fired. But the load event really does need to be fired.

Can you point me to the spec text that makes you think that in this situation not firing the load event would be an OK thing to do?

In practice what this means is that the UA needs to keep the object alive and prevent it from being garbage collected until the end of the HTTP response is received.

-Boris

P.S. https://groups.google.com/d/msg/mozilla.dev.tech.js-engine.internals/V__5zqll3zc/hLJiNqd8Xq8J has some comments on this exact issue of resource management via GC.
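One way to picture "the UA needs to keep the object alive" is a host-side table of in-flight requests that holds a strong reference until the response completes. The sketch below is only a model under that assumption: `FakeRequest`, `inFlight` and `complete()` are hypothetical names, not the real XMLHttpRequest machinery, and completion is simulated by an explicit call.

```javascript
// Hypothetical model of a host keeping in-flight requests reachable.
const inFlight = new Set(); // strong references held by the "host"

class FakeRequest {
  constructor() {
    this.listeners = [];
  }
  addEventListener(type, fn) {
    if (type === 'load') this.listeners.push(fn);
  }
  send() {
    inFlight.add(this); // the host now keeps `this` alive
  }
  // Stands in for the network layer finishing the response.
  complete(responseText) {
    this.responseText = responseText;
    inFlight.delete(this); // the host drops its reference
    for (const fn of this.listeners) fn.call(this);
  }
}

const received = [];
function foo() {
  const req = new FakeRequest();
  req.addEventListener('load', function () {
    received.push(this.responseText);
  });
  req.send();
  // `req` goes out of scope here, but `inFlight` still references it.
}
foo();

const pendingBefore = inFlight.size;
for (const req of [...inFlight]) req.complete('hello');
console.log(pendingBefore, inFlight.size, received);
```

In this model the script holds no reference to the request after `foo()` returns, yet the load listener still runs, because the host-side set keeps the object reachable until completion.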
Re: Garbage collection in generators
> C++ RAII and Python refcounting are completely different: they are
> precise, prompt, predictable, and deterministic.

C++ RAII is indeed amazingly deterministic, as are languages with built-in reference counting like Swift. Python refcounting certainly is not, since it performs cycle detection (mark & sweep, as far as I recall, in CPython); had it not performed cycle detection, it would be. PyPy and other implementations have fuller garbage collection systems: https://pypy.readthedocs.org/en/release-2.4.x/garbage_collection.html

It appears that Python GC with cycle detection predates the change in generators that gave them "finalization" (cycle detection was there by at least 2.2, whereas the change in generators came in 2.5).

Python does however have destructors. They are weak (certainly weaker than C++ destructors). Python also has context management through the `with` statement, like C# (`using`) and Java (try-with-resources). Interestingly, it even has async/await-aware disposers (async disposers, added in 3.5).
Re: Garbage collection in generators
Everyone, please keep in mind the following distinctions:

General GC is not prompt or predictable. There is an unspecified and unpredictable delay before anything non-reachable is noticed to be unreachable.

JavaScript GC is not specified to be precise, and so should be assumed conservative. Conservative GC may never notice that any particular unreachable thing is unreachable. The only reliable guarantee is that it will never collect anything which future computation will reach, i.e., it will not cause a spontaneous dangling reference. Beyond this, it provides only unspecified and, at best, probabilistic and partial cleanup.

C++ RAII and Python refcounting are completely different: they are precise, prompt, predictable, and deterministic.

On Wed, Feb 17, 2016 at 12:59 AM, Benjamin Gruenbaum wrote:

> On Wed, Feb 17, 2016 at 10:51 AM, Andreas Rossberg wrote:
>
>> On 17 February 2016 at 09:40, Benjamin Gruenbaum wrote:
>>
>>>> If you starve a generator it's not going to get completed, just like
>>>> other control flow won't.
>>>
>>> I'm not sure starving is what I'd use here - I definitely do see users
>>> do a pattern similar to:
>>>
>>> ```js
>>> function* getResults() {
>>>   try {
>>>     var resource = acquire();
>>>     for (const item of resource) yield process(item);
>>>   } finally {
>>>     release(resource);
>>>   }
>>> }
>>> ```
>>
>> Yes, exactly the kind of pattern I was referring to as "bogus forms of
>> resource management". This is an anti-pattern in ES6. It won't work
>> correctly. We should never have given the illusion that it does.
>
> What is or is not an anti-pattern is debatable. Technically, if you call
> `.return` it will run the finally block and release the resources (although
> if the finally block itself contains `yield`, those will also run).
> Effectively, this will have the same sort of consequences that "acquire()"
> and "release()" had to begin with - so I would not say it makes things
> worse, but I definitely agree that it creates a form of false expectation.
>
> Still - I'm very curious why languages like Python have chosen to call
> `finally` blocks in this case. According to the PEP this was not done in
> hindsight: they debated it and explicitly decided to call `release`. I'll
> see if I can email the people involved and ask about it.
>
>>> garbage collection is a form of automatic resource management.
>>
>> Most GC experts would strongly disagree, if by resource you mean anything
>> else but memory.
>
> Memory is most certainly a resource. Languages that are not GCd, like C++,
> really don't make the distinction we make :)
>
> Garbage collection can and does in fact manage resources in JavaScript
> host environments right now. For example, an XMLHttpRequest *may* abort
> the underlying HTTP request if the XMLHttpRequest object is not referenced
> anywhere and gets garbage collected.

--
Cheers,
--MarkM
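The behaviour debated in the quoted exchange can be observed directly. The following is a minimal sketch with hypothetical `acquire`/`release` stand-ins that only record calls: the `finally` block runs when the consumer calls `.return()` (which `for...of` does on early exit), but nothing runs it when a suspended generator is simply dropped.

```javascript
// Hypothetical resource helpers that only record what happened.
const log = [];
function acquire() { log.push('acquire'); return [1, 2, 3]; }
function release() { log.push('release'); }

function* getResults() {
  const resource = acquire();
  try {
    for (const item of resource) yield item * 2;
  } finally {
    release(resource);
  }
}

// 1. Early exit from for...of calls .return(), so the finally block runs.
for (const value of getResults()) {
  if (value === 2) break;
}
const afterBreak = log.slice();

// 2. An explicit .return() on a suspended generator also runs it.
const explicit = getResults();
explicit.next();
explicit.return();
const afterReturn = log.slice();

// 3. A generator that is started and then abandoned never reaches the
//    finally block; no cleanup is triggered by dropping it.
const abandoned = getResults();
abandoned.next();
const afterAbandon = log.slice();

console.log(afterBreak, afterReturn, afterAbandon);
```

The third case is the one the "false expectation" remark refers to: the resource acquired there is released only if some consumer eventually calls `.return()` or drives the generator to completion.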
Re: Weak Reference proposal
Resending because I the mailing list reject my previous email: Here is an example of using a NodeIterator: ``` const jsdom = require("jsdom"); const document = jsdom.jsdom(``); let it = document.createNodeIterator(document.body); console.log(it.nextNode().nodeName); // BODY console.log(it.nextNode().nodeName); // A console.log(it.nextNode().nodeName); // B console.log(it.nextNode().nodeName); // C console.log(it.nextNode()); // null it = document.createNodeIterator(document.body); console.log(it.nextNode().nodeName); // BODY document.body.removeChild(document.body.firstChild); // This remove operation updates the internal state of the NodeIterator console.log(it.nextNode().nodeName); // B console.log(it.nextNode().nodeName); // C console.log(it.nextNode()); // null it = null; ``` In the case of NodeIterator, there are currently (read: in ES6) two spec (DOM whatwg) compliant implementations possible: 1. Keep a history of all changes a Document has gone through, forever. 2. Keep a list of all NodeIterators which have been created for a Document, forever. jsdom uses solution #2. This not only leaks memory, but remove operations become slower as more and more NodeIterator's are created. (however as domenic described earlier we limit this list to 10 entries by default). The conflict between the DOM spec and ES6 is that we can not detect if a NodeIterator is still in use by code outside of jsdom: ``` it = document.createNodeIterator(document.body); console.log(it.nextNode().nodeName); // BODY // ... wait an hour ... console.log(it.nextNode().nodeName); // A it = null; // and only now we can stop updating the NodeIterator state ``` (There used to be a it.detach() method for this purpose, but this has been removed from the spec.) Being able to keep a list of NodeIterator's weakly would be the only solution if we want to avoid leaking resources. 
Weak references might also be required for MutationObserver, although I've
not yet looked at this feature extensively, so I could be wrong.

Other features which you could implement using a weak reference (like the
live collections) could be implemented using ES6 Proxy instead.

XMLHttpRequest, fetch, WebSocket, etc. would even require something similar
to a phantom reference (as in Java) so that we can close the connection when
the object is no longer strongly or weakly referenced.

I would also really like to use weak references not just for jsdom; there
are some use cases where they can simplify my code.

Gr. Joris

On Wed, Feb 17, 2016 at 9:41 AM, Jonas Sicking wrote:
> On Tue, Feb 16, 2016 at 11:02 PM, Domenic Denicola wrote:
>>> For each NodeIterator object iterator whose root’s node document is node’s
>>> node document, run the NodeIterator pre-removing steps given node and
>>> iterator.
>>
>> Rephrased: every time you remove a Node from a document, you must go through
>> all of the document's NodeIterators and run some cleanup steps (which have
>> the effect of changing observable properties and behavior of the
>> NodeIterator).
>
> Could you implement all of this using MutationObservers? I.e. have the
> NodeIterators observe the relevant nodes using MutationObservers?
>
> The only case that I can think of where the DOM could use weak
> references is for the getElementsByTagName(x) function. This function
> will either return a new NodeList object, or an existing one. The
> reason it sometimes returns an existing one is for performance
> reasons. We saw a lot of code doing:
>
> var i;
> for (i = 0; i < document.getElementsByTagName("div").length; i++) {
>   var elem = document.getElementsByTagName("div")[i];
>   doStuffWith(elem);
> }
>
> This generated a ton of NodeList objects, which are expensive to
> allocate. Hence browsers started caching these objects and returned an
> existing object "sometimes".
>
> The gecko implementation of "sometimes" uses a hash map keyed on
> tagname containing weak references to the returned NodeList. This is
> observable by for example doing:
>
> document.getElementsByTagName("div").foopy = "foopy";
> if (document.getElementsByTagName("div").foopy != "foopy") {
>   // GC ran between the getElementsByTagName calls.
> }
>
> However this exact behavior is not defined by spec. But I believe that
> all major browsers do do something similar for performance reasons.
> (This API is as old as it is crummy. And it is no surprise that it is
> poorly used).
>
> But it likely would be possible to write an implementation of
> "sometimes" which doesn't use weak references, at the cost of higher
> memory usage.
>
> / Jonas

--
github.com/Joris-van-der-Wel
Re: Garbage collection in generators
On Wed, Feb 17, 2016 at 10:51 AM, Andreas Rossberg wrote:

> On 17 February 2016 at 09:40, Benjamin Gruenbaum wrote:
>
>>> If you starve a generator it's not going to get completed, just like
>>> other control flow won't.
>>
>> I'm not sure starving is what I'd use here - I definitely do see users do
>> a pattern similar to:
>>
>> ```js
>> function* getResults() {
>>   try {
>>     var resource = acquire();
>>     for (const item of resource) yield process(item);
>>   } finally {
>>     release(resource);
>>   }
>> }
>> ```
>
> Yes, exactly the kind of pattern I was referring to as "bogus forms of
> resource management". This is an anti-pattern in ES6. It won't work
> correctly. We should never have given the illusion that it does.

What is or is not an anti-pattern is debatable. Technically, if you call
`.return` it will run the finally block and release the resources (although
if the finally block itself contains `yield` expressions, those will also
run). Effectively, this will have the same sort of consequences that
"acquire()" and "release()" had to begin with - so I would not say it makes
things worse, but I definitely agree that it creates a form of false
expectation.

Still - I'm very curious why languages like Python have chosen to call
`finally` blocks in this case. This was not an afterthought: according to
the PEP, they debated it and explicitly decided to call `release`. I'll see
if I can email the people involved and ask about it.

>> garbage collection is a form of automatic resource management.
>
> Most GC experts would strongly disagree, if by resource you mean anything
> else but memory.

Memory is most certainly a resource. Languages that are not GCd like C++
really don't make the distinction we make :)

Garbage collection can and does in fact manage resources in JavaScript host
environments right now. For example, an XMLHttpRequest *may* abort the
underlying HTTP request if the XMLHttpRequest object is not referenced
anywhere and gets garbage collected.
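To make the parenthetical about `yield` inside `finally` concrete, here is a small sketch. The generator name and the `log` array are made up for illustration; the behaviour shown is that of standard ES6 generators.

```javascript
const log = [];

function* withYieldInFinally() {
  try {
    yield "work";
  } finally {
    yield "still cleaning up"; // a yield here suspends the cleanup itself
    log.push("released");      // stands in for release(resource)
  }
}

const gen = withYieldInFinally();
const first = gen.next();           // suspended inside the try block
const second = gen.return("done");  // enters finally, stops at its yield
const releasedBeforeResume = log.includes("released");
const third = gen.next();           // finishes finally, then completes the return
```

Here `second` is `{ value: "still cleaning up", done: false }`, so a single call to `.return` is not enough to release anything. Only the following `next()` lets the `finally` block finish, after which `third` is `{ value: "done", done: true }`.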
Re: Garbage collection in generators
On 17 February 2016 at 09:40, Benjamin Gruenbaum wrote:

>> If you starve a generator it's not going to get completed, just like other
>> control flow won't.
>
> I'm not sure starving is what I'd use here - I definitely do see users do
> a pattern similar to:
>
> ```js
> function* getResults() {
>   try {
>     var resource = acquire();
>     for (const item of resource) yield process(item);
>   } finally {
>     release(resource);
>   }
> }
> ```

Yes, exactly the kind of pattern I was referring to as "bogus forms of
resource management". This is an anti-pattern in ES6. It won't work
correctly. We should never have given the illusion that it does.

> garbage collection is a form of automatic resource management.

Most GC experts would strongly disagree, if by resource you mean anything
else but memory.

/Andreas
Re: Weak Reference proposal
On Tue, Feb 16, 2016 at 11:02 PM, Domenic Denicola wrote:
>> For each NodeIterator object iterator whose root’s node document is node’s
>> node document, run the NodeIterator pre-removing steps given node and
>> iterator.
>
> Rephrased: every time you remove a Node from a document, you must go through
> all of the document's NodeIterators and run some cleanup steps (which have
> the effect of changing observable properties and behavior of the
> NodeIterator).

Could you implement all of this using MutationObservers? I.e. have the
NodeIterators observe the relevant nodes using MutationObservers?

The only case that I can think of where the DOM could use weak
references is for the getElementsByTagName(x) function. This function
will either return a new NodeList object, or an existing one. The
reason it sometimes returns an existing one is for performance
reasons. We saw a lot of code doing:

    var i;
    for (i = 0; i < document.getElementsByTagName("div").length; i++) {
      var elem = document.getElementsByTagName("div")[i];
      doStuffWith(elem);
    }

This generated a ton of NodeList objects, which are expensive to
allocate. Hence browsers started caching these objects and returned an
existing object "sometimes".

The gecko implementation of "sometimes" uses a hash map keyed on
tagname containing weak references to the returned NodeList. This is
observable by for example doing:

    document.getElementsByTagName("div").foopy = "foopy";
    if (document.getElementsByTagName("div").foopy != "foopy") {
      // GC ran between the getElementsByTagName calls.
    }

However this exact behavior is not defined by spec. But I believe that
all major browsers do do something similar for performance reasons.
(This API is as old as it is crummy. And it is no surprise that it is
poorly used).

But it likely would be possible to write an implementation of
"sometimes" which doesn't use weak references, at the cost of higher
memory usage.
/ Jonas
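For illustration only, a cache of the kind described (a map keyed on tag name holding weak references to the last returned list) might be sketched as follows. This assumes a `WeakRef` API like the one later standardised in ES2021; `TagCache` and `createList` are hypothetical names, and this is not Gecko's actual implementation.

```javascript
// Hypothetical cache: tag name -> weak reference to the last list returned.
class TagCache {
  constructor(createList) {
    this.createList = createList; // builds a fresh list object for a tag name
    this.cache = new Map();
  }

  get(tagName) {
    const ref = this.cache.get(tagName);
    const cached = ref && ref.deref();
    if (cached !== undefined) {
      return cached; // the previous list is still alive: reuse it
    }
    // Either never requested, or the previous list was collected.
    const list = this.createList(tagName);
    this.cache.set(tagName, new WeakRef(list));
    return list;
  }
}
```

As long as the caller keeps the returned list reachable, repeated lookups yield the same object, so an expando such as `foopy` survives. Once the list becomes unreachable, a later lookup may or may not produce a fresh object, which is the "sometimes" behaviour described above.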
Re: Garbage collection in generators
On Wed, Feb 17, 2016 at 10:28 AM, Andreas Rossberg wrote:

> The spec does not talk about GC, but in typical implementations you should
> expect yes.

Yes, important point, since some ECMAScript implementations don't even have
GC and are just run to completion. The spec doesn't require cleanup.

>> - If it is - does it run the `finally` blocks?
>
> No, definitely not. Try-finally has nothing to do with GC, it's just
> control flow.

I'm not sure that "has nothing to do" is how I'd put it. try/finally is
commonly used for resource management, and garbage collection is a form of
automatic resource management. There is also the assumption that finally is
always run. Two other languages have opted into the "run finally" behavior,
so I wouldn't call it crazy, although I tend to agree with the conclusion.

> If you starve a generator it's not going to get completed, just like other
> control flow won't.

I'm not sure starving is what I'd use here - I definitely do see users do a
pattern similar to:

```js
function* getResults() {
  try {
    var resource = acquire();
    for (const item of resource) yield process(item);
  } finally {
    release(resource);
  }
}
```

They would then do something like:

```js
var res = getResults();
var onlyCareAbout = Array.from(take(10, res));
// ignore res from this point on.
```

Now, in a `for...of` loop with a `break`, `return` would be called, freeing
the resource. In this case, however, the resource would stay "held up"
forever. Users can call `.return` explicitly on the generator, but as a
consumer of such an API I might not be aware that I need to.

> Even if we want to make GC observable via finalisation, then it should at
> least be done in a controlled and explicit manner rather than silently
> tacking it onto an unrelated feature. See the revived weakref proposal.

I tend to agree.
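The difference can be sketched with two hypothetical versions of `take` (the message above does not define it, so both helpers below are assumptions, and they return plain arrays for brevity). One pulls values by hand and never closes the source; the other relies on `for...of`, which calls `return` on the iterator when the loop is left early.

```javascript
let released = 0;

function* getResults() {
  try {
    for (const item of [1, 2, 3, 4, 5]) yield item * 10;
  } finally {
    released++; // stands in for release(resource)
  }
}

// Hypothetical take() that calls next() by hand and never closes the source.
function takeLeaky(n, iterator) {
  const out = [];
  while (out.length < n) {
    const { value, done } = iterator.next();
    if (done) break;
    out.push(value);
  }
  return out;
}

// Hypothetical take() built on for...of: leaving the loop early calls
// return() on the iterator, which runs the generator's finally block.
function takeClosing(n, iterable) {
  const out = [];
  if (n <= 0) return out;
  for (const value of iterable) {
    out.push(value);
    if (out.length >= n) break;
  }
  return out;
}

const leaky = takeLeaky(2, getResults());
const releasedAfterLeaky = released;   // generator is still suspended in try
const closing = takeClosing(2, getResults());
const releasedAfterClosing = released; // the break triggered return()
```

Both calls produce the same two values, but only the second one runs the `finally` block. From the consumer's side nothing signals the difference, which is the false expectation being discussed.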
Re: Garbage collection in generators
On 17 February 2016 at 09:08, Benjamin Gruenbaum wrote:

> In the following example:
>
> ```js
> function* foo() {
>   try {
>     yield 1;
>   } finally {
>     cleanup();
>   }
> }
> (function() {
>   var f = foo();
>   f.next();
>   // never reference f again
> })()
> ```
>
> - Is the iterator created by the function `foo` ever eligible for garbage
> collection?

The spec does not talk about GC, but in typical implementations you should
expect yes.

> - If it is - does it run the `finally` blocks?

No, definitely not. Try-finally has nothing to do with GC, it's just control
flow. If you starve a generator it's not going to get completed, just like
other control flow won't.

(Which is why some of us think that iterator `return` is a misfeature,
because it pretends to provide a guarantee that does not exist, and only
encourages bogus forms of resource management.)

> Related resources:
>
> - Python changed the behavior to "run `return` on gc" in 2.5
> https://docs.python.org/2.5/whatsnew/pep-342.html
> - C# doesn't run finalizers, but iterators are disposable and get aborted
> automatically by foreach (for... of) - on break. this is similar to what we
> do:
> http://blogs.msdn.com/b/dancre/archive/2008/03/14/yield-and-usings-your-dispose-may-not-be-called.aspx
> - PHP is debating this issue now, I was contacted by PHP internals people
> about it which is how I came into the problem in the first place:
> https://bugs.php.net/bug.php?id=71604
> - Related issue I opened on async/await :
> https://github.com/tc39/ecmascript-asyncawait/issues/89

Even if we want to make GC observable via finalisation, then it should at
least be done in a controlled and explicit manner rather than silently
tacking it onto an unrelated feature. See the revived weakref proposal.

Python's idea is just confused and crazy.

/Andreas
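The two cases can be put side by side in a small sketch. The `cleanups` counter is a made-up stand-in for `cleanup()`; everything else is plain ES6 generator behaviour.

```javascript
let cleanups = 0;

function* foo() {
  try {
    yield 1;
  } finally {
    cleanups++; // stands in for cleanup()
  }
}

// Abandoned after the first next(): the generator stays suspended at the
// yield, so control never reaches the finally block.
(function () {
  const f = foo();
  f.next();
  // never reference f again
})();
const afterAbandon = cleanups;

// Explicitly completed: return() resumes the generator with a return
// completion, so the finally block runs as ordinary control flow.
const g = foo();
g.next();
const result = g.return("stopped");
const afterReturn = cleanups;
```

After the abandoned call `cleanups` is still 0, and nothing in the language will change that later, whether or not the generator object is collected. Only the explicit `return` call bumps it to 1.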
Garbage collection in generators
In the following example:

```js
function* foo() {
  try {
    yield 1;
  } finally {
    cleanup();
  }
}
(function() {
  var f = foo();
  f.next();
  // never reference f again
})()
```

- Is the iterator created by the function `foo` ever eligible for garbage
  collection?
- If it is - does it run the `finally` blocks?

Related resources:

- Python changed the behavior to "run `return` on gc" in 2.5
  https://docs.python.org/2.5/whatsnew/pep-342.html
- C# doesn't run finalizers, but iterators are disposable and get aborted
  automatically by foreach (for... of) - on break. this is similar to what we
  do:
  http://blogs.msdn.com/b/dancre/archive/2008/03/14/yield-and-usings-your-dispose-may-not-be-called.aspx
- PHP is debating this issue now, I was contacted by PHP internals people
  about it which is how I came into the problem in the first place:
  https://bugs.php.net/bug.php?id=71604
- Related issue I opened on async/await :
  https://github.com/tc39/ecmascript-asyncawait/issues/89