Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?
On Aug 24, 2012, at 1:11 PM, David Jeske dav...@gmail.com wrote:
> (1) Why would a call to an instance method not hold this alive for the entire duration of the call?

`this` isn't special; it's just an implicit variable passed into the method. If the variable isn't used within the method call, then it's collectible. Rephrased, consider this:

    // caller:
    Foo (new StringBuilder ());

    // Implementation:
    static void Foo (StringBuilder b)
    {
        Thread.Sleep (1000);
    }

The variable `b` isn't used at all within Foo(). Consequently, the StringBuilder instance can be collected at any time, and no one will notice (as far as the GC is concerned). (The StringBuilder allocation could be omitted entirely, actually, if the runtime environment were smart enough to determine that it wasn't doing anything...)

Since `this` is just a variable, the GC treats it in the same way.

The issue isn't so much that the GC is treating P/Invoke specially; the issue is that it's not treating it specially at all, and P/Invoke introduces a different world (native code) which the GC doesn't know anything about. Consequently, the GC can (and will) collect instances that the GC knows are unreachable from managed code, but which may still be referenced from native code.

> It seems this could happen in more cases than just PInvoke. This seems to allow a finalizer to run before an object is done being used anytime the object instance is not stored (i.e. inside a statement of the form new Foo().Method();). If the finalizer triggers an IDisposable pattern, this could cause a managed resource to be torn down before it's done being used as well.

The managed resource can't be disposed before it's done being used AS LONG AS the GC knows about all uses of the managed resource. In your `new Foo().Method()` example, it IS possible that the GC will finalize the `Foo` instance before Method() has returned, but it will only do so AS LONG AS `this` is no longer referenced within Method().

Thus, if Method() were empty or didn't use any instance members at all (e.g. the above Foo() body), then the instance can be collected while Method() is executing. Furthermore, it won't matter, as there's no way for Method() to even know that's happening.

The real problem is that the GC doesn't know anything about native code, and thus can't ensure that no native code is using the resource.

> Why isn't this considered a bug in the .NET runtime?

How would you fix it? The .NET runtime has no way of knowing what native code is doing, so short of disassembling the native code (magic), what is .NET supposed to do?

> (2) Does the Mono GC have the same behavior?

Yes, because there's no other sane behavior.

With Boehm it may be less of an issue, as Boehm is a non-moving collector (so the memory won't be invalidated as quickly), and due to Boehm's and SGen's conservative stack-walking nature Mono is more likely to preserve managed objects which are referenced by native stack frames.

However, this can't be relied upon; Linux supports precise stack marking, which prevents conservative scanning of native stack frames. This has the wonderful performance advantage that less memory needs to be pinned, allowing the GC to be more efficient:

http://www.mono-project.com/Generational_GC#Precise_Stack_Marking

- Jon

___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
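The point that `this` is just another variable can be made observable with a small experiment. What follows is only a sketch, not anything posted in the thread: the `Flag` and `Tracked` types and their method names are hypothetical, and only `GC.Collect`, `GC.WaitForPendingFinalizers` and `GC.KeepAlive` are real runtime calls. Whether the first call reports an early finalization depends on the runtime, the JIT and the build configuration, so no particular output is claimed for it. The second call is expected to report false, because `GC.KeepAlive(this)` keeps the instance reachable up to that statement.

```csharp
using System;

// Hypothetical helper: records whether a Tracked instance was finalized.
sealed class Flag { public volatile bool Finalized; }

// Hypothetical type used only for this illustration.
sealed class Tracked
{
    readonly Flag flag;

    public Tracked(Flag flag) { this.flag = flag; }

    ~Tracked() { flag.Finalized = true; }

    // Forces a collection, waits for finalizers, and reports the flag.
    static bool CollectAndCheck(Flag f)
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return f.Finalized;
    }

    // Never touches `this`, so the runtime is free to treat the instance
    // as unreachable while this method is still executing.
    public bool RunWithoutKeepAlive(Flag f)
    {
        return CollectAndCheck(f);
    }

    // Same body, but `this` is kept reachable until the KeepAlive call.
    public bool RunWithKeepAlive(Flag f)
    {
        bool seen = CollectAndCheck(f);
        GC.KeepAlive(this);
        return seen;
    }

    static void Main()
    {
        var a = new Flag();
        Console.WriteLine("without KeepAlive, finalized during call: "
            + new Tracked(a).RunWithoutKeepAlive(a));

        var b = new Flag();
        Console.WriteLine("with KeepAlive, finalized during call: "
            + new Tracked(b).RunWithKeepAlive(b));
    }
}
```

An optimized build is the configuration most likely to show the early finalization; debug builds commonly extend variable lifetimes to the end of the method.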
Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?
On Fri, Aug 24, 2012 at 10:50 AM, Jonathan Pryor jonpr...@vt.edu wrote:
>> It seems this could happen in more cases than just PInvoke. This seems to allow a finalizer to run before an object is done being used anytime the object instance is not stored (i.e. inside a statement of the form new Foo().Method();). If the finalizer triggers an IDisposable pattern, this could cause a managed resource to be torn down before it's done being used as well.
>
> The managed resource can't be disposed before it's done being used AS LONG AS the GC knows about all uses of the managed resource.
>
> ... snip ...
>
> The real problem is that the GC doesn't know anything about native code, and thus can't ensure that no native code is using the resource.

Thanks very much for the detailed reply. It seems to me there is a race that has nothing to do with native code. Consider this example:

    class Foo : IDisposable {
        ManagedObject mo = new ManagedObject();

        ~Foo() {
            this.Dispose();
        }

        public void Dispose() {
            if (mo != null) {
                try { mo.Dispose(); } finally { mo = null; }
            }
        }

        void Problem() {
            mo.doSomething();
        }

        static void Main() {
            new Foo().Problem();
        }
    }

If I understand the MS.NET article, as soon as mo.doSomething enters the vcall, this is no longer referenced. That means that during ManagedObject.doSomething, Foo could be finalized, and thus disposed, and since Dispose explicitly disposes mo, the code would dispose mo while it's still inside mo.doSomething().

Did I miss something?

>> Why isn't this considered a bug in the .NET runtime?
>
> How would you fix it? The .NET runtime has no way of knowing what native code is doing, so short of disassembling the native code (magic), what is .NET supposed to do?

Ohh, I don't think the problem is the way this is handled for native code. I think the above interaction in IDisposable seems like a problem too. To me this seems like a premature finalization bug, caused because this isn't considered referenced for the entire body of instance methods.

>> (2) Does the Mono GC have the same behavior?
>
> Yes, because there's no other sane behavior. However, this can't be relied upon; Linux supports precise stack marking, which prevents conservative scanning of native stack frames. This has the wonderful performance advantage that less memory needs to be pinned, allowing the GC to be more efficient:
>
> http://www.mono-project.com/Generational_GC#Precise_Stack_Marking

I'm sorry for my naivety. Why does allowing unused function arguments to be collected before a function returns have such important effects on memory usage?
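For reference, the race described in the message above is the kind of problem the conventional .NET Dispose(bool) pattern is meant to sidestep: the finalizer path leaves other managed objects alone, and only an explicit Dispose() call tears them down. The following is a minimal sketch under stated assumptions, not code from the thread. `ManagedObject` is a hypothetical stand-in (the thread never shows the real type), `IsDisposed` is added purely so the behaviour can be checked, and the finalizer is kept only to mirror the original sample.

```csharp
using System;

// Hypothetical stand-in for the ManagedObject type in the sample above.
sealed class ManagedObject : IDisposable
{
    public bool Disposed { get; private set; }

    public void DoSomething()
    {
        if (Disposed) throw new ObjectDisposedException("ManagedObject");
    }

    public void Dispose() { Disposed = true; }
}

class Foo : IDisposable
{
    ManagedObject mo = new ManagedObject();

    // Added for illustration only: true once Dispose() has released mo.
    public bool IsDisposed { get { return mo == null; } }

    // Finalizer path: passes false, so `mo` is never touched here.
    ~Foo() { Dispose(false); }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // nothing left for the finalizer to do
    }

    protected virtual void Dispose(bool disposing)
    {
        // Managed members are released only on an explicit Dispose() call.
        if (disposing && mo != null)
        {
            mo.Dispose();
            mo = null;
        }
        // Unmanaged cleanup (none in this sketch) would go here.
    }

    public void Problem()
    {
        mo.DoSomething();
        // Matters once the finalizer releases something DoSomething()
        // depends on; it keeps `this` reachable until the call returns.
        GC.KeepAlive(this);
    }

    static void Main()
    {
        using (var foo = new Foo()) foo.Problem();   // deterministic cleanup
        new Foo().Problem();                         // no early teardown of mo
    }
}
```

With this shape, a finalizer that runs early has nothing managed to tear down, so `mo` cannot be disposed from the finalizer thread while `DoSomething()` is still running.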
Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?
Am I wrong in thinking that in

    void Problem() { mo.doSomething(); }

mo is contained within the context of the method body of Problem() and therefore cannot be disposed of until the method body of Problem() has finished executing? This means that mo will continue to live for the duration of the call to doSomething(), because the Problem() method body holds onto mo until after doSomething() returns.

From: mono-devel-list-boun...@lists.ximian.com [mailto:mono-devel-list-boun...@lists.ximian.com] On Behalf Of David Jeske
Sent: Friday, August 24, 2012 1:27 PM
To: Jonathan Pryor
Cc: mono-devel-list@lists.ximian.com
Subject: Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?

... snip ...
Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?
mo actually is this.mo, so you do have a reference to this.

Diego Frata
diego.fr...@gmail.com

On Fri, Aug 24, 2012 at 5:47 PM, Lepisto, Stephen P stephen.p.lepi...@intel.com wrote:
> Am I wrong in thinking that in
>
>     void Problem() { mo.doSomething(); }
>
> mo is contained within the context of the method body of Problem() and therefore cannot be disposed of until the method body of Problem() has finished executing? This means that mo will continue to live for the duration of the call to doSomething(), because the Problem() method body holds onto mo until after doSomething() returns.
>
> ... snip ...
Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?
On Fri, Aug 24, 2012 at 8:31 PM, Jonathan Pryor jonpr...@vt.edu wrote:
>> I'm sorry for my naivety. Why does allowing unused function arguments to be collected before a function returns have such important effects on memory usage?
>
> Java. :-)
>
> The context is the JVM, and large methods. Many JVM implementations used to do as you suggested, and wouldn't collect a variable until the method referencing the variable returned. This even applied to local variables! Instead of having precise lifetime semantics (as determined by the instruction pointer), it only cared about stack frames.
>
> The result of this behavior is that developers would write huge methods which allocated lots of objects, all of which would be considered live even when a local was no longer being used. Thus came a body of guidelines that you should null out instance/local variables so that the GC could actually collect intra-method garbage:
>
> http://stackoverflow.com/questions/473685
> http://stackoverflow.com/a/503714/83444
>
> Needing to null out a local variable is, of course, insane -- why can't the GC figure this out! -- so .NET (and modern JVMs!) now precisely track which variables are in-scope and out-of-scope, and will allow collection of any-and-all out-of-scope variables even within the method.

Thanks for that detailed description. I still don't see why function arguments are handled as aggressively as local variables. One could argue that the contract of a function call implies that the arguments are referenced by the caller until after the call is completed... For example, I would expect that in new Foo().Bar();, the Foo instance is not eligible for collection until after Bar returns.

However, at least I now understand why this idiosyncrasy exists. Thanks very much!
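The intra-method lifetime tracking described in the quoted explanation can be probed with a `WeakReference`. This is a sketch under assumptions rather than a definitive test: `LocalLifetimeDemo` and its method names are made up for illustration, and the result of `AliveAfterLastUse()` is deliberately left unstated because it varies with the runtime, the JIT and whether optimizations are enabled. `AliveWhileHeld()` is expected to return true, since `GC.KeepAlive` keeps the buffer reachable across the collection.

```csharp
using System;

// Hypothetical demo type; only GC and WeakReference are real APIs.
static class LocalLifetimeDemo
{
    // The buffer is still needed after the collection (by KeepAlive),
    // so it must survive the GC.Collect() call.
    public static bool AliveWhileHeld()
    {
        var buffer = new byte[1024];
        var weak = new WeakReference(buffer);
        GC.Collect();
        bool alive = weak.IsAlive;
        GC.KeepAlive(buffer);   // buffer is reachable up to this statement
        return alive;
    }

    // The local is never nulled out, yet it is not used after the
    // WeakReference is created. A runtime that tracks liveness precisely
    // may collect the buffer here; one that keeps locals alive for the
    // whole stack frame will not.
    public static bool AliveAfterLastUse()
    {
        var buffer = new byte[1024];
        var weak = new WeakReference(buffer);
        GC.Collect();
        return weak.IsAlive;    // runtime-dependent
    }

    static void Main()
    {
        Console.WriteLine("held via KeepAlive: " + AliveWhileHeld());
        Console.WriteLine("after last use:     " + AliveAfterLastUse());
    }
}
```

The same reasoning applies to arguments and to `this`: the runtime looks at whether a reference is used again, not at which stack frame declared it.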
Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?
On Aug 24, 2012, at 4:26 PM, David Jeske dav...@gmail.com wrote:
> Thanks very much for the detailed reply. It seems to me there is a race that has nothing to do with native code.

Native code just makes it easier to reason about, but as you mention it is quite applicable to managed code. My apologies for not considering that angle.

The answer is largely the same, though: you have two threads using the same instance, one of which (the finalizer) is disposing of the instance, and one of which is invoking a method on that instance. If you weren't dealing with the GC but still had the same scenario -- two threads using the same instance -- how would you fix it? By introducing locking, or otherwise ordering the operations so that they can't overlap.

The same is true with the GC, i.e. you need to ensure that the threads don't stomp on each other, via manual programmer assistance.

    void Problem()
    {
        mo.doSomething();
        GC.KeepAlive(this);
    }

The above GC.KeepAlive() will prevent the GC from finalizing the Foo instance (and thus the Foo.mo instance) until after `mo.doSomething()` completes.

That's the fix, but why is it necessary? Why can't the GC figure this out?

Because auto-parallelism is hard, and the GC isn't fully involved, _you_ are; consider your previous sample app, but let's provide an implementation for ManagedObject:

    class ManagedObject : IDisposable {
        static readonly List<ManagedObject> instances = new List<ManagedObject>();

        public ManagedObject ()
        {
            lock (instances)
                instances.Add(this);
        }

        public static ManagedObject[] GetInstances ()
        {
            lock (instances)
                return instances.ToArray ();
        }

        public void Dispose()
        {
            // remove? eh...
        }
    }

This is for illustrative purposes only; the point is that ManagedObject could do _anything_, and the above implementation will result in disposed instances within the static ManagedObject.instances list (and, depending on timing, in the hands of any callers of the GetInstances() method). The GC will _never_ collect them -- they're rooted! -- but they've been invalidated via your Dispose() call. (Sure, ManagedObject.Dispose() could remove itself from the list; complicate the implementation as appropriate to make that infeasible. ;-)

All the GC does is track which instances are still live and which are collectible. That's (mostly) it. The fact that the GC may introduce multi-threaded access to member variables is largely beyond its purview; as such, the onus is on the developer to clear it up.

But here's the real rub: even if the GC weren't introducing multi-threaded access to a member variable, it _still_ can't be held responsible for complicated object graphs like the above. Foo isn't referenced by anything, and thus is disposed -- even if it's not at the same time that Foo.Problem() is executing -- but the side effects of the finalizer invocation are WAY beyond the scope of the GC.

It's all too easy for an instance to be disposed/finalized while other code is still holding it. The GC doesn't protect you from this; you, the developer, have to protect your code against it.

Given that you the programmer are on the hook once you introduce Dispose() and finalizers, having the GC be more proactive at freeing resources doesn't greatly change the game.

If you want things to be easy, avoid IDisposable and finalizers entirely.

> I'm sorry for my naivety. Why does allowing unused function arguments to be collected before a function returns have such important effects on memory usage?

Java. :-)

The context is the JVM, and large methods. Many JVM implementations used to do as you suggested, and wouldn't collect a variable until the method referencing the variable returned. This even applied to local variables! Instead of having precise lifetime semantics (as determined by the instruction pointer), it only cared about stack frames.

The result of this behavior is that developers would write huge methods which allocated lots of objects, all of which would be considered live even when a local was no longer being used. Thus came a body of guidelines that you should null out instance/local variables so that the GC could actually collect intra-method garbage:

http://stackoverflow.com/questions/473685
http://stackoverflow.com/a/503714/83444

Needing to null out a local variable is, of course, insane -- why can't the GC figure this out! -- so .NET (and modern JVMs!) now precisely track which variables are in-scope and out-of-scope, and will allow collection of any-and-all out-of-scope variables even within the method.

- Jon
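The "rooted but disposed" state described in the message above can be reproduced deterministically, with no finalizer involved at all. This is a sketch that assumes a hypothetical `RegisteredObject` type, modelled loosely on the ManagedObject sample; the `IsDisposed` property and the `DoSomething()` guard are additions made only so the state can be observed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variation on the ManagedObject sample: every instance
// registers itself in a static list, which keeps it reachable forever.
sealed class RegisteredObject : IDisposable
{
    static readonly List<RegisteredObject> instances =
        new List<RegisteredObject>();

    public bool IsDisposed { get; private set; }

    public RegisteredObject()
    {
        lock (instances)
            instances.Add(this);
    }

    public static RegisteredObject[] GetInstances()
    {
        lock (instances)
            return instances.ToArray();
    }

    // A common defensive guard: fail loudly when used after disposal.
    public void DoSomething()
    {
        if (IsDisposed)
            throw new ObjectDisposedException("RegisteredObject");
    }

    // Deliberately does not remove the instance from the list.
    public void Dispose()
    {
        IsDisposed = true;
    }

    static void Main()
    {
        new RegisteredObject().Dispose();

        // The instance is still reachable through the static list,
        // so the GC will never collect it, yet it is no longer usable.
        foreach (var o in GetInstances())
            Console.WriteLine("reachable; disposed = " + o.IsDisposed);
    }
}
```

Reachability and validity are tracked by different parties: the GC only answers the first question, while the `IsDisposed` check (or an equivalent) is the developer's way of answering the second.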