Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?

2012-08-24 Thread Jonathan Pryor
On Aug 24, 2012, at 1:11 PM, David Jeske dav...@gmail.com wrote:
 (1) Why would a call to an instance method not hold this alive for the 
 entire duration of the call? 

`this` isn't special, it's just an implicit variable passed into the method. If 
the variable isn't used within the method call, then it's collectible.

Rephrased, consider this:

// caller:
Foo (new StringBuilder ());

// Implementation:
static void Foo(StringBuilder b)
{
    Thread.Sleep (1000);
}

The variable `b` isn't used at all within Foo(). Consequently, the 
StringBuilder instance can be collected at any time, and no one will notice (as 
far as the GC is concerned). (The StringBuilder allocation could be omitted 
entirely, actually, if the runtime environment were smart enough to determine 
that it wasn't doing anything...)

Since `this` is just a variable, the GC treats it in the same way. The issue 
isn't so much that the GC is treating P/Invoke specially; the issue is that 
it's not treating it specially at all, and P/Invoke introduces a different 
world (native code) which the GC doesn't know anything about. Consequently, 
the GC can (and will) collect instances that the GC knows are unreachable from 
managed code, but may still be referenced from native code.
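
To make the native-code case concrete, here is a minimal sketch (not from the original message) of a wrapper whose finalizer frees unmanaged memory. The `NativeBuffer` class and its members are hypothetical names; `Marshal.AllocHGlobal`, `Marshal.FreeHGlobal`, `Marshal.WriteByte`, `Marshal.ReadByte` and `GC.KeepAlive` are the real framework calls. Once `Fill()` has copied `ptr` into a local, nothing in the method needs `this` any more, so without the `GC.KeepAlive(this)` call the finalizer would be free to release the memory while the loop is still touching it.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
sealed class NativeBuffer
{
    const int Size = 16;
    readonly IntPtr ptr = Marshal.AllocHGlobal(Size);

    // The finalizer releases the unmanaged block.
    ~NativeBuffer()
    {
        Marshal.FreeHGlobal(ptr);
    }

    // Writes `value` into every byte and returns the sum read back.
    public int Fill(byte value)
    {
        IntPtr p = ptr;   // last field read: `this` is not needed after this line
        int sum = 0;
        for (int i = 0; i < Size; i++)
        {
            Marshal.WriteByte(p, i, value);
            sum += Marshal.ReadByte(p, i);
        }
        // Keeps `this` reachable up to this point, so the finalizer
        // cannot free `p` while the loop above is still using it.
        GC.KeepAlive(this);
        return sum;
    }
}

static class NativeBufferDemo
{
    static void Main()
    {
        Console.WriteLine(new NativeBuffer().Fill(2));
    }
}
```

Whether the premature finalization would actually be observed without the `GC.KeepAlive` call depends on the runtime, the JIT and the build settings; the call simply removes the possibility.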

 It seems this could happen in more cases than just PInvoke. This seems to 
 allow a finalizer to run before an object is done being used anytime the 
 object instance is not stored. (i.e. inside a statement of the form new 
 Foo().Method();) If the finalizer triggers an IDisposable pattern, this could 
 cause a managed resource to be torn down before it's done being used as well.

The managed resource can't be disposed before it's done being used AS LONG AS 
the GC knows about all uses of the managed resource.

In your `new Foo().Method()` example, it IS possible that the GC will finalize 
the `Foo` instance before Method() has returned, but it will only do so AS LONG 
AS `this` is no longer referenced within Method(). Thus, if Method() were empty 
or didn't use any instance members at all (e.g. the above Foo() body), then the 
instance can be collected while Method() is executing. Furthermore, it won't 
matter, as there's no way for Method() to even know that's happening.

The real problem is that the GC doesn't know anything about native code, and 
thus can't ensure that no native code is using the resource.

 Why isn't this considered a bug in the .NET runtime?

How would you fix it? The .NET runtime has no way of knowing what native code 
is doing, so short of disassembling the native code (magic), what is .NET 
supposed to do?

 (2) Does the Mono GC have the same behavior?

Yes, because there's no other sane behavior.

With Boehm it may be less of an issue, as Boehm is a non-moving collector (so the 
memory won't be invalidated as quickly), and because both Boehm and SGen scan 
stacks conservatively, Mono is more likely to preserve managed objects 
which are referenced from native stack frames.

However, this can't be relied upon; Linux supports precise stack marking, 
which prevents conservative scanning of native stack frames. This has the 
wonderful performance advantage that less memory needs to be pinned, allowing 
the GC to be more efficient:

http://www.mono-project.com/Generational_GC#Precise_Stack_Marking

 - Jon

___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?

2012-08-24 Thread David Jeske
On Fri, Aug 24, 2012 at 10:50 AM, Jonathan Pryor jonpr...@vt.edu wrote:

  It seems this could happen in more cases than just PInvoke. This seems
 to allow a finalizer to run before an object is done being used anytime
 the object instance is not stored. (i.e. inside a statement of the form
 new Foo().Method();) If the finalizer triggers an IDisposable pattern, this
 could cause a managed resource to be torn down before it's done being used
 as well.

 The managed resource can't be disposed before it's done being used AS LONG
 AS the GC knows about all uses of the managed resource.


... snip ...

The real problem is that the GC doesn't know anything about native code,
 and thus can't ensure that no native code is using the resource.


Thanks very much for the detailed reply. It seems to me there is a race
that has nothing to do with native code. Consider this example:

class Foo : IDisposable {
    ManagedObject mo = new ManagedObject();

    ~Foo() { this.Dispose(); }

    public void Dispose() {
        if (mo != null) {
            try { mo.Dispose(); } finally { mo = null; }
        }
    }

    void Problem() { mo.doSomething(); }

    static void Main() { new Foo().Problem(); }
}

If I understand the MS.NET article, as soon as mo.doSomething() enters the
vcall, `this` is no longer referenced. That means that during
ManagedObject.doSomething(), Foo could be finalized, and thus Disposed, and
since the Dispose explicitly Disposes mo, the code would Dispose mo while
it's still inside mo.doSomething(). Did I miss something?



  Why isn't this considered a bug in the .NET runtime?

 How would you fix it? The .NET runtime has no way of knowing what native
 code is doing, so short of disassembling the native code (magic), what is
 .NET supposed to do?


Ohh, I don't think the problem is the way this is handled for native code.
I think the above interaction in IDisposable seems like a problem too. To
me this seems like a premature finalization bug caused because `this`
isn't considered referenced for the entire body of instance methods.

 (2) Does the Mono GC have the same behavior?

 Yes, because there's no other sane behavior.



 However, this can't be relied upon; Linux supports precise stack
 marking, which prevents conservative scanning of native stack frames. This
 has the wonderful performance advantage that less memory needs to be
 pinned, allowing the GC to be more efficient:

 http://www.mono-project.com/Generational_GC#Precise_Stack_Marking


I'm sorry for my naivety. Why does allowing unused function arguments to be
collected before a function returns have such important effects on memory
usage?


Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?

2012-08-24 Thread Lepisto, Stephen P
Am I wrong in thinking that in

   void Problem() { mo.doSomething(); }

mo is contained within the context of the method body of Problem() and 
therefore cannot be disposed of until the method body of Problem() has finished 
executing? This means that mo will continue to live for the life of the call 
to doSomething() because the Problem() method body holds onto mo until after 
doSomething() returns.




Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?

2012-08-24 Thread Diego Frata
mo is actually this.mo, so you do have a reference to this.


Diego Frata
diego.fr...@gmail.com


On Fri, Aug 24, 2012 at 5:47 PM, Lepisto, Stephen P 
stephen.p.lepi...@intel.com wrote:

 Am I wrong in thinking that in

    void Problem() { mo.doSomething(); }

 mo is contained within the context of the method body of Problem() and
 therefore cannot be disposed of until the method body of Problem() has finished
 executing? This means that mo will continue to live for the life of the
 call to doSomething() because the Problem() method body holds onto mo
 until after doSomething() returns.



Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?

2012-08-24 Thread David Jeske
On Fri, Aug 24, 2012 at 8:31 PM, Jonathan Pryor jonpr...@vt.edu wrote:

  I'm sorry for my naivety. Why does allowing unused function arguments to
 be collected before a function returns have such important effects on
 memory usage?

 Java. :-)

 The context is the JVM, and large methods. Many JVM implementations used
 to do as you suggested, and wouldn't collect a variable until the method
 referencing the variable returned. This even applied to local variables!
 Instead of having precise lifetime semantics (as determined by the
 instruction pointer), it only cared about stack frames.

 The result of this behavior is that developers would write huge methods
 which allocated lots of objects, all of which would be considered live
 even when a local was no longer being used. Thus came a body of guidelines
 that you should null out instance/local variables so that the GC could
 actually collect intra-method garbage:

 http://stackoverflow.com/questions/473685
 http://stackoverflow.com/a/503714/83444

 Needing to null out a local variable is, of course, insane -- why can't
 the GC figure this out! -- so .NET (and modern JVMs!) now precisely track
 which variables are in-scope and out-of-scope, and will allow collection of
 any-and-all out-of-scope variables even within the method.


Thanks for that detailed description.

I still don't see why function arguments are handled as aggressively
as local variables. One could argue that the contract of a function call
implies that the arguments are referenced by the caller until after the
call is completed. For example, I would expect that in new Foo().Bar();, Foo is
not eligible for collection until after Bar returns.

However, at least I now understand why this idiosyncrasy exists. Thanks
very much!


Re: [Mono-dev] Why does .NET object lifetime not extend into an instance method call?

2012-08-24 Thread Jonathan Pryor
On Aug 24, 2012, at 4:26 PM, David Jeske dav...@gmail.com wrote:
 Thanks very much for the detailed reply. It seems to me there is a race that 
 has nothing to do with native code.

Native code just makes it easier to reason about, but as you mention it is 
quite applicable to managed code. My apologies for not considering that angle.

The answer is largely the same, though; you have two threads using the same 
instance, one of which (the finalizer) is disposing of the instance, and one of 
which is invoking a method on that instance.

If you weren't dealing with the GC but still had the same scenario -- two 
threads using the same instance -- how would you fix it? By introducing locking, or 
otherwise ordering the operations so that they can't overlap.

The same is true with the GC, i.e. you need to ensure that the threads don't 
stomp on each other, via manual programmer assistance.

void Problem()
{
    mo.doSomething();
    GC.KeepAlive(this);
}

The above GC.KeepAlive() will prevent the GC from finalizing the Foo instance 
(and thus the Foo.mo instance) until after `mo.doSomething()` completes.

That's the fix, but why is it necessary? Why can't the GC figure this out?

Because auto-parallelism is hard, and the GC isn't fully involved, _you_ are; 
consider your previous sample app, but let's provide an implementation for 
ManagedObject:

using System;
using System.Collections.Generic;

class ManagedObject : IDisposable {
    // Every instance ever created stays rooted in this static list.
    static readonly List<ManagedObject> instances = new List<ManagedObject>();

    public ManagedObject ()
    {
        lock (instances)
            instances.Add(this);
    }

    public static ManagedObject[] GetInstances ()
    {
        lock (instances)
            return instances.ToArray ();
    }

    public void Dispose()
    {
        // remove? eh...
    }
}

This is for illustrative purposes only; the point is that ManagedObject could 
do _anything_, and the above implementation will result in disposed instances 
within the static ManagedObject.instances list (and, depending on timing, any 
callers of the GetInstances() method). The GC will _never_ collect them -- 
they're rooted! -- but they've been invalidated via your Dispose() call. (Sure, 
ManagedObject.Dispose() could remove itself from the list; complicate the 
implementation as appropriate to make that infeasible. ;-)

All the GC does is track which instances are still live and which are 
collectible. That's (mostly) it. The fact that the GC may introduce 
multi-threaded access to member variables is largely beyond its purview; as 
such, the onus is on the developer to clear it up.

But here's the real rub: even if the GC weren't introducing multi-threaded 
access to a member variable, it _still_ can't be held responsible for 
complicated object graphs like the above. Foo isn't referenced by anything, 
and thus is disposed -- even if it's not at the same time that Foo.Problem() is 
executing -- but the side effects of the finalizer invocation are WAY beyond 
the scope of the GC. It's all too easy for an instance to be disposed/finalized 
while other code is still holding it. The GC doesn't protect you from this; 
you, the developer, have to protect your code against it.

Given that you the programmer are on the hook once you introduce Dispose() and 
finalizers, having the GC be more proactive at freeing resources doesn't 
greatly change the game. If you want things to be easy, avoid IDisposable and 
finalizers entirely.
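
For the managed case discussed above, one way to sidestep the race, sketched here under assumptions rather than taken from the thread, is to dispose deterministically with a `using` statement. The compiler-generated `finally` block calls `Dispose()` on the local, so the instance stays referenced until `Problem()` has returned. `ManagedObject` below is a hypothetical stand-in that merely records whether it has been disposed; `GC.SuppressFinalize` is the real framework call that cancels the pending finalizer.

```csharp
using System;

// Hypothetical stand-in for the thread's ManagedObject.
class ManagedObject : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose() { Disposed = true; }

    public void doSomething()
    {
        if (Disposed)
            throw new ObjectDisposedException("ManagedObject");
    }
}

class Foo : IDisposable
{
    ManagedObject mo = new ManagedObject();

    ~Foo() { Dispose(); }

    public void Dispose()
    {
        if (mo != null)
        {
            try { mo.Dispose(); } finally { mo = null; }
        }
        // Disposed explicitly, so the finalizer no longer needs to run.
        GC.SuppressFinalize(this);
    }

    public void Problem() { mo.doSomething(); }

    static void Main()
    {
        // `foo` is used by the implicit Dispose() call at the end of the
        // block, so it cannot be finalized while Problem() is running.
        using (var foo = new Foo())
        {
            foo.Problem();
        }
        Console.WriteLine("done");
    }
}
```

This only helps callers that actually use `using` (or call `Dispose()` themselves); a bare `new Foo().Problem()` still relies on the finalizer and would need the `GC.KeepAlive(this)` guard shown earlier.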

 I'm sorry for my naivety. Why does allowing unused function arguments to be 
 collected before a function returns have such important effects on memory 
 usage? 

Java. :-)

The context is the JVM, and large methods. Many JVM implementations used to 
do as you suggested, and wouldn't collect a variable until the method 
referencing the variable returned. This even applied to local variables! 
Instead of having precise lifetime semantics (as determined by the 
instruction pointer), it only cared about stack frames.

The result of this behavior is that developers would write huge methods which 
allocated lots of objects, all of which would be considered live even when 
a local was no longer being used. Thus came a body of guidelines that you 
should null out instance/local variables so that the GC could actually collect 
intra-method garbage:

http://stackoverflow.com/questions/473685
http://stackoverflow.com/a/503714/83444

Needing to null out a local variable is, of course, insane -- why can't the GC 
figure this out! -- so .NET (and modern JVMs!) now precisely track which 
variables are in-scope and out-of-scope, and will allow collection of 
any-and-all out-of-scope variables even within the method.
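
As an illustration of that last point, here is a sketch with hypothetical names (`LivenessDemo`, `Sum`, `Process`). The array `big` is last used in the call to `Sum`, so a runtime that tracks liveness precisely may reclaim it during the rest of the method, with no `big = null;` assignment needed. Whether and when it is actually reclaimed depends on the runtime, the JIT and the build settings.

```csharp
using System;

static class LivenessDemo
{
    static int Sum(byte[] data)
    {
        int total = 0;
        foreach (byte b in data)
            total += b;
        return total;
    }

    public static int Process()
    {
        byte[] big = new byte[1024];
        for (int i = 0; i < big.Length; i++)
            big[i] = 1;

        int checksum = Sum(big);   // last use of `big`

        // From here on `big` is never read again, so the GC is allowed
        // to treat the array as garbage; no `big = null;` is required.
        GC.Collect();
        return checksum;
    }

    static void Main()
    {
        Console.WriteLine(Process());
    }
}
```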

 - Jon
