Re: [Mono-dev] RFC: GC precise scanning of stacks

2009-10-17 Thread Rodrigo Kumpera
Hey Dick,

Your patch won't work because it deals with neither on-stack temporaries
nor spilled variables, and it can make the runtime somewhat slower since it
marks all locals holding managed objects as volatile.

It would not work for the top frame either, because of how threads are
stopped by Boehm or SGen. Both collectors park threads at arbitrary
points[1] without any sort of safepoint mechanism.

There are a few ways to implement precise stack scanning for unmanaged code,
though none are pretty:

-Make the whole runtime use gchandles to manipulate managed objects. Safe
parking can then be done in a fairly non-intrusive way. Coding this way is a
major PITA, since all accesses have to go through accessor functions. The
main issue with this approach is the __huge__ effort needed to fix all the
code that plays with managed objects.

-Only scan unmanaged frames from the runtime or from DSOs that have
registered icalls. This is a pretty decent compromise and should lead to far
fewer false positives.
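
To make the gchandle option concrete, here is a minimal sketch of what
handle-based access could look like from the runtime's side. Every name in
it (Handle, handle_new, handle_get_val, handle_set_ref, relocate) is
hypothetical and is not Mono's gchandle API. The point is only that native
code never keeps a raw object pointer in a local, so the collector has no
need to scan unmanaged frames and is free to move objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical managed object: one scalar and one reference field. */
typedef struct Obj { int val; struct Obj *ref; } Obj;

/* Hypothetical handle table: the only root set visible to native code. */
#define MAX_HANDLES 64
static Obj *handle_table[MAX_HANDLES];
static size_t handle_count;

typedef size_t Handle;

/* Register an object and hand back an index instead of a raw pointer. */
static Handle handle_new(Obj *o)
{
    assert(handle_count < MAX_HANDLES);
    handle_table[handle_count] = o;
    return handle_count++;
}

/* Accessors: every read or write of a managed object goes through these. */
static int handle_get_val(Handle h)
{
    return handle_table[h]->val;
}

static void handle_set_ref(Handle h, Handle target)
{
    handle_table[h]->ref = handle_table[target];
}

/* Stand-in for a moving collector: copy the object and patch the table.
 * Native code holding the Handle is unaffected by the move. */
static void relocate(Handle h, Obj *new_location)
{
    *new_location = *handle_table[h];
    handle_table[h] = new_location;
}
```

The accessor calls are exactly the part that makes this painful to retrofit:
every existing "obj->field" in the runtime would have to become a call like
these.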

For managed code, we need to extend the JIT to either produce stack maps
that tell, at each safepoint[2], which stack slots are used for managed
pointers and which are unknown (callee-saved regs, for example); or we can
just make sure stack slots for managed pointers are not reused for scalars,
have a single description of the stack, and live with some false positives
due to uninitialized variables.
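
A rough sketch of the first option, assuming a per-safepoint bitmap over
frame slots. The StackMapEntry layout, the 32-slot limit and the function
names are all made up for illustration; this is not what the Mono JIT
emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack map entry: one per safepoint (call site) in a method. */
typedef struct {
    uint32_t native_offset;  /* offset of the call site in the method's code */
    uint32_t ref_slots;      /* bit i set: frame slot i holds a managed pointer */
    uint32_t unknown_slots;  /* bit i set: slot i must be scanned conservatively */
} StackMapEntry;

typedef void (*MarkFn)(void *slot_value, int precise, void *user);

/* Find the map entry for the safepoint the frame is stopped at. */
static const StackMapEntry *find_entry(const StackMapEntry *map, size_t n,
                                       uint32_t offset)
{
    for (size_t i = 0; i < n; i++)
        if (map[i].native_offset == offset)
            return &map[i];
    return NULL;
}

/* Visit only the slots the map describes; scalar slots are skipped.
 * Returns the number of slots handed to the mark callback. */
static int scan_frame(void **frame, size_t nslots, const StackMapEntry *e,
                      MarkFn mark, void *user)
{
    int visited = 0;
    assert(e != NULL);
    for (size_t i = 0; i < nslots && i < 32; i++) {
        uint32_t bit = (uint32_t)1 << i;
        if (e->ref_slots & bit) {
            mark(frame[i], 1, user);   /* known managed pointer */
            visited++;
        } else if (e->unknown_slots & bit) {
            mark(frame[i], 0, user);   /* e.g. a saved callee-saved register */
            visited++;
        }
    }
    return visited;
}

/* Example callback that just counts what it is given. */
typedef struct { int precise; int conservative; } ScanStats;

static void count_slot(void *value, int precise, void *user)
{
    ScanStats *s = user;
    (void)value;
    if (precise) s->precise++; else s->conservative++;
}
```

The second option amounts to having a single such entry per method instead
of one per safepoint, at the price of treating not-yet-initialized
reference slots as live.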

We could use a shadow stack for the JITed code, which is quite simple to
implement, but it has the drawback of producing slower code.
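
For reference, a shadow stack can be sketched in a few lines. The names
(shadow_enter, shadow_push, shadow_leave) and the fixed-size array are
assumptions made for this example only. The extra stores on every method
entry, exit and reference local are where the slowdown would come from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical side stack holding the addresses of live reference slots. */
#define SHADOW_STACK_MAX 1024
static void **shadow_stack[SHADOW_STACK_MAX];
static size_t shadow_top;

/* Method prologue: remember where this frame's entries start. */
static size_t shadow_enter(void)
{
    return shadow_top;
}

/* Register one local that may hold a managed pointer. */
static void shadow_push(void **slot)
{
    assert(shadow_top < SHADOW_STACK_MAX);
    shadow_stack[shadow_top++] = slot;
}

/* Method epilogue: drop everything this frame registered. */
static void shadow_leave(size_t mark)
{
    assert(mark <= shadow_top);
    shadow_top = mark;
}

/* What the collector would do: walk the shadow stack, never the real one.
 * Here it only counts the non-NULL roots. */
static size_t shadow_count_live(void)
{
    size_t live = 0;
    for (size_t i = 0; i < shadow_top; i++)
        if (*shadow_stack[i] != NULL)
            live++;
    return live;
}
```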


[1] SGen requires parking outside of allocators but otherwise it can be at
arbitrary points.
[2] Safepoints can pretty much be just at method call points since the top
frame will most likely be conservatively scanned.



On Mon, Sep 28, 2009 at 10:28 AM, Dick Porter dpor...@codicesoftware.com wrote:

 Hi all

 We think some of our 'leak' issues can be attributed to libgc's
 false-positive identification of pointers.

 Attached is a proof-of-concept patch to libgc (and a simple
 demonstration program) that I hope will be the start of GC precise stack
 scanning.  The code should apply easily to sgen as well.

 It basically adds an extra variable to the stack which contains specific
 markers and references to all the pointers that will contain GC-alloced
 memory.  There is an optional failsafe mode that will fall back to the
 current 'all stack is scanned' code if the markers are not seen.

 This code will cover objects on unmanaged stacks but I don't know what
 will be needed for managed code.  I presume the JIT can add the same
 sort of marker to the stack?

 So, comments?  Is this technique going to be workable?

 - Dick


 ___
 Mono-devel-list mailing list
 Mono-devel-list@lists.ximian.com
 http://lists.ximian.com/mailman/listinfo/mono-devel-list



[Mono-dev] RFC: GC precise scanning of stacks

2009-09-28 Thread Dick Porter
Hi all

We think some of our 'leak' issues can be attributed to libgc's
false-positive identification of pointers.

Attached is a proof-of-concept patch to libgc (and a simple
demonstration program) that I hope will be the start of GC precise stack
scanning.  The code should apply easily to sgen as well.

It basically adds an extra variable to the stack which contains specific
markers and references to all the pointers that will contain GC-alloced
memory.  There is an optional failsafe mode that will fall back to the
current 'all stack is scanned' code if the markers are not seen.
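
As a rough illustration of the scanning side, separate from the attached
patch, the idea can be tried on a fake stack held in an array. The sketch
assumes that each word between the markers is the address of a local
pointer variable, which is how the patch's scan loop treats it. The names
scan_precise, RootFn and count_root are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same marker values as the patch; the low bit is set in both. */
#define BEGIN_MARKER ((uintptr_t)0xF00FF00F)
#define END_MARKER   ((uintptr_t)0xD00DD00D)

typedef void (*RootFn)(void **root_slot, void *user);

/* Walk [lo, hi) word by word. Inside a BEGIN..END block every word is
 * taken to be the address of a pointer variable and is handed to visit().
 * Returns the number of blocks found; 0 means the caller should fall back
 * to scanning the whole range conservatively (the failsafe mode). */
static int scan_precise(const uintptr_t *lo, const uintptr_t *hi,
                        RootFn visit, void *user)
{
    int blocks = 0;
    const uintptr_t *p = lo;

    while (p < hi) {
        if (*p++ != BEGIN_MARKER)
            continue;
        blocks++;
        while (p < hi && *p != END_MARKER)
            visit((void **)*p++, user);
        /* p now sits on END_MARKER or on hi; the outer loop steps past it. */
    }
    return blocks;
}

/* Example callback: count the registered slots that are non-NULL. */
static void count_root(void **slot, void *user)
{
    if (*slot != NULL)
        (*(int *)user)++;
}
```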

This code will cover objects on unmanaged stacks but I don't know what
will be needed for managed code.  I presume the JIT can add the same
sort of marker to the stack?

So, comments?  Is this technique going to be workable?

- Dick

diff -urBbw gc6.6-orig/include/gc.h gc6.6/include/gc.h
--- gc6.6-orig/include/gc.h	2005-05-20 18:50:58.0 +0100
+++ gc6.6/include/gc.h	2009-09-28 11:17:45.0 +0100
@@ -859,6 +859,20 @@
 #   define GC_PTR_STORE(p, q) *((p) = (q))
 #endif
 
+#ifdef GC_PRECISE_STACK
+/* Keep the least significant bit set so these don't look like pointers */
+#   define GC_PRECISE_STACK_BEGIN_MARKER 0xF00FF00F
+#   define GC_PRECISE_STACK_END_MARKER 0xD00DD00D
+/* This macro is very gcc specific, with the 'unused' attribute to
+ * shut up the unused-variable warning, the volatile placement to foil the
+ * optimiser, and the '##vars' to suppress the leading comma when the args
+ * list is empty
+ */
+#   define GC_STACK_REFERENCE(x, vars...) __attribute__((unused)) void * volatile __GC_stack_references_ ## x[] = {(void *)GC_PRECISE_STACK_BEGIN_MARKER, ##vars, (void *)GC_PRECISE_STACK_END_MARKER};
+#else
+#   define GC_STACK_REFERENCE(x, vars...)
+#endif
+
 /* Functions called to report pointer checking errors */
 GC_API void (*GC_same_obj_print_proc) GC_PROTO((GC_PTR p, GC_PTR q));
 
diff -urBbw gc6.6-orig/pthread_stop_world.c gc6.6/pthread_stop_world.c
--- gc6.6-orig/pthread_stop_world.c	2005-09-09 18:54:32.0 +0100
+++ gc6.6/pthread_stop_world.c	2009-09-28 11:11:48.0 +0100
@@ -256,6 +256,10 @@
 /* On IA64, we also need to scan the register backing store. */
 IF_IA64(ptr_t bs_lo; ptr_t bs_hi;)
 pthread_t me = pthread_self();
+#if GC_PRECISE_STACK
+ptr_t stack_ptr;
+int found_markers;
+#endif
 
 if (!GC_thr_initialized) GC_thr_init();
 #if DEBUG_THREADS
@@ -284,7 +288,7 @@
 hi = GC_stackbottom;
 	IF_IA64(bs_lo = BACKING_STORE_BASE;)
 }
-#if DEBUG_THREADS
+#if defined(DEBUG_THREADS) || defined(DEBUG_PRECISE_STACK)
 GC_printf3("Stack for thread 0x%lx = [%lx,%lx)\n",
 	(unsigned long) p -> id,
 		(unsigned long) lo, (unsigned long) hi);
@@ -292,10 +296,70 @@
 	if (0 == lo) ABORT("GC_push_all_stacks: sp not set!\n");
 #   ifdef STACK_GROWS_UP
 	  /* We got them backwards! */
+#	  ifdef GC_PRECISE_STACK
+	  found_markers = 0;
+
+	  for (stack_ptr = hi; stack_ptr <= lo; stack_ptr += sizeof(ptr_t)) {
+		word content = *(word *)stack_ptr;
+	  	if (content == GC_PRECISE_STACK_BEGIN_MARKER) {
+			ptr_t stack_ptr_end = stack_ptr;
+#			ifdef DEBUG_PRECISE_STACK
+			GC_printf1("Found precise begin marker at 0x%lx\n", stack_ptr);
+#			endif
+			found_markers = 1;
+			do {
+				stack_ptr_end += sizeof(ptr_t);
+				content = *(word *)stack_ptr_end;
+				if (content != GC_PRECISE_STACK_END_MARKER) {
+					GC_push_all_stack ((ptr_t)content, (ptr_t)(content + sizeof(ptr_t)));
+				}
+			} while (content != GC_PRECISE_STACK_END_MARKER &&
+				 stack_ptr_end < lo);
+			stack_ptr = stack_ptr_end;
+		}
+	  }
+
+#	  ifdef GC_PRECISE_STACK_FAILSAFE
+	  if (!found_markers) {
+		  GC_push_all_stack(hi, lo);
+	  }
+#	  endif
+#	  else
   GC_push_all_stack(hi, lo);
+#	  endif
+#   else
+#	  ifdef GC_PRECISE_STACK
+	  found_markers = 0;
+
+	  for (stack_ptr = lo; stack_ptr <= hi; stack_ptr += sizeof(ptr_t)) {
+		word content = *(word *)stack_ptr;
+	  	if (content == GC_PRECISE_STACK_BEGIN_MARKER) {
+			ptr_t stack_ptr_end = stack_ptr;
+#			ifdef DEBUG_PRECISE_STACK
+			GC_printf1("Found precise begin marker at 0x%lx\n", stack_ptr);
+#			endif
+			found_markers = 1;
+			do {
+				stack_ptr_end += sizeof(ptr_t);
+				content = *(word *)stack_ptr_end;
+				if (content != GC_PRECISE_STACK_END_MARKER) {
+					GC_push_all_stack ((ptr_t)content, (ptr_t)(content + sizeof(ptr_t)));
+				}
+			} while (content != GC_PRECISE_STACK_END_MARKER &&
+				 stack_ptr_end < hi);
+			stack_ptr = stack_ptr_end;
+		}
+	  }
+
+#	  ifdef GC_PRECISE_STACK_FAILSAFE
+	  if (!found_markers) {
+		  GC_push_all_stack(lo, hi);
+	  }
+#	  endif
 #   else
   GC_push_all_stack(lo, hi);
 #	endif
+#	endif
 #	ifdef IA64
 # if DEBUG_THREADS
 GC_printf3("Reg stack for thread 0x%lx = [%lx,%lx)\n",

#define GC_PRECISE_STACK

#include <gc.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
	int val;
	void *ptr;
} Object;

void 

Re: [Mono-dev] RFC: GC precise scanning of stacks

2009-09-28 Thread Miguel de Icaza
Hello Dick,

 Attached is a proof-of-concept patch to libgc (and a simple
 demonstration program) that I hope will be the start of GC precise stack
 scanning.  The code should apply easily to sgen as well.

Thanks. This is a nice start. I think there should be a bit more checking
for the markers, something along the lines of having a size argument and
checking that mem [start + size] == end_marker as well, just to avoid false
positives, give or take a few more checks.
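
Something like the following is what I have in mind, as a sketch only. It
assumes a block layout of begin marker, entry count, the entries, then end
marker; that layout and the name validate_block are assumptions for this
example and are not part of the posted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BEGIN_MARKER ((uintptr_t)0xF00FF00F)
#define END_MARKER   ((uintptr_t)0xD00DD00D)

/* Assumed layout: [BEGIN][count][entry 0 .. entry count-1][END].
 * Returns the number of entries if a well-formed block starts at p and
 * fits inside [p, hi); returns -1 otherwise, so a stray word that merely
 * equals BEGIN_MARKER is rejected. */
static long validate_block(const uintptr_t *p, const uintptr_t *hi)
{
    size_t avail;
    uintptr_t count;

    assert(p <= hi);
    avail = (size_t)(hi - p);

    /* The smallest possible block is BEGIN, count = 0, END. */
    if (avail < 3 || p[0] != BEGIN_MARKER)
        return -1;

    count = p[1];
    if (count > avail - 3)          /* size would run past the stack range */
        return -1;
    if (p[2 + count] != END_MARKER) /* the mem[start + size] check */
        return -1;

    return (long)count;
}
```

A scanner would call this on each candidate word and only trust the entries
when the result is not -1.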

The challenge is to make the JIT compiler group all of the managed
object references in a contiguous space and then decorate that block
with these markers.

I like the idea myself. It will not be 100% precise, but it will get us
very, very close.

The VM team of course needs to weigh in.
