Re: std.allocator: primitives for helping GC

2014-05-01 Thread Dmitry Olshansky via Digitalmars-d

01-May-2014 08:05, Andrei Alexandrescu пишет:

So far allocators have been quite GC-agnostic; deallocation would be
needed for each allocation, and there was little structure to help tracing.

The primitive resolveInternalPointer has made a step toward more precise
structuring of heaps, and now it's time to look at some real GC helpers.

These are markAllAsUnused, markAsUsed, and doneMarking. Find their
description here:

http://erdani.com/d/phobos-prerelease/std_allocator.html

There are a few proof of concept implementations here:

https://github.com/andralex/phobos/blob/allocator/std/allocator.d

The basic idea is that the GC would initiate a collection by calling
markAllAsUnused. That would cause the underlying untyped allocator to
softly mark all memory as free, but without actually messing it up.
(Many allocators are capable of implementing such a primitive cheaply.)


It would be interesting to define where GC sits and how it integrates
with allocators to begin with. Given that currently GCAllocator is 
mentioned
somewhere at the bottom of heap layers I'm really at loss about how 
these helpers fit into the picture.



Then, the GC would trace objects starting from roots. Whenever it finds
a used pointer, it would mark the corresponding memory as allocated by
using markAsUsed, effectively undoing the soft freeing for it.



How GC determines which allocator an object belongs to ?



--
Dmitry Olshansky


Re: std.allocator: primitives for helping GC

2014-05-01 Thread via Digitalmars-d
On Thursday, 1 May 2014 at 04:06:02 UTC, Andrei Alexandrescu 
wrote:
So far allocators have been quite GC-agnostic; deallocation 
would be needed for each allocation, and there was little 
structure to help tracing.


The primitive resolveInternalPointer has made a step toward 
more precise structuring of heaps, and now it's time to look at 
some real GC helpers.


These are markAllAsUnused, markAsUsed, and doneMarking. Find 
their description here:


http://erdani.com/d/phobos-prerelease/std_allocator.html

There are a few proof of concept implementations here:

https://github.com/andralex/phobos/blob/allocator/std/allocator.d

The basic idea is that the GC would initiate a collection by 
calling markAllAsUnused. That would cause the underlying 
untyped allocator to softly mark all memory as free, but 
without actually messing it up. (Many allocators are capable of 
implementing such a primitive cheaply.)


Then, the GC would trace objects starting from roots. Whenever 
it finds a used pointer, it would mark the corresponding memory 
as allocated by using markAsUsed, effectively undoing the soft 
freeing for it.


At the end of the process, what's left is the memory 
effectively in use. Everything else got free by construction.


It seems to me that these three primitives (together with 
resolveInternalPointer) are sufficient for conservative 
tracing. I plan to make tracing more precise by adding more 
structure, but the scanning logic will stay the same.


Now is the time to destroy. I might be missing something by a 
mile.


This marking will usually be implemented by setting a flag near 
the memory in question, which is not COW and cache friendly, 
because it tends to write all over the place.


For this reason, some GCs use a temporary bitmap for marking, and 
leave the marked memory untouched. By pushing the marking to the 
specific allocators, this can no longer be implemented globally.


But I guess a GC is not required to use these helper functions, 
so I guess it's fine...


And just to be sure: doneMarking is _not_ supposed to actually 
free the memory (and call the destructors), is it? That's still 
the GC's job, right?


Re: std.allocator: primitives for helping GC

2014-05-01 Thread Orvid King via Digitalmars-d
On 4/30/14, Andrei Alexandrescu via Digitalmars-d
digitalmars-d@puremagic.com wrote:
 So far allocators have been quite GC-agnostic; deallocation would be
 needed for each allocation, and there was little structure to help tracing.

 The primitive resolveInternalPointer has made a step toward more precise
 structuring of heaps, and now it's time to look at some real GC helpers.

 These are markAllAsUnused, markAsUsed, and doneMarking. Find their
 description here:

 http://erdani.com/d/phobos-prerelease/std_allocator.html

 There are a few proof of concept implementations here:

 https://github.com/andralex/phobos/blob/allocator/std/allocator.d

 The basic idea is that the GC would initiate a collection by calling
 markAllAsUnused. That would cause the underlying untyped allocator to
 softly mark all memory as free, but without actually messing it up.
 (Many allocators are capable of implementing such a primitive cheaply.)

 Then, the GC would trace objects starting from roots. Whenever it finds
 a used pointer, it would mark the corresponding memory as allocated by
 using markAsUsed, effectively undoing the soft freeing for it.

 At the end of the process, what's left is the memory effectively in use.
 Everything else got free by construction.

 It seems to me that these three primitives (together with
 resolveInternalPointer) are sufficient for conservative tracing. I plan
 to make tracing more precise by adding more structure, but the scanning
 logic will stay the same.

 Now is the time to destroy. I might be missing something by a mile.


 Andrei


I actually do have a design in mind for integrating your allocators
with my GC design, but there are really only 3 methods that a
user-land allocator needs to implement. Below are a few snippets from
my current working design describing the `markAsReferenced`, `sweep`,
and `free` methods, as implemented by the GC allocators. A user-space
allocator will need to call `GC.takeOwnership` passing in the pointer
to the memory and it's length in order to tell the GC that it wants to
handle the scanning and sweeping of the block of memory itself.


A method by the name of `markAsReferenced` accepting a single `size_t`
containing a pointer to an object, or a pointer to the interior of an
object that, if valid, lies within the allocator’s page. This should
mark the object, and any heap-allocated value referenced by the object
by passing a pointer to the object to `GC.markAsReferenced`, so that a
subsequent call to `sweep` doesn’t result in one of the values
referenced by this object being freed by another allocator. This
should treat the final possible pointer as a special case to allow the
compiler to perform tail-call optimizations.

A method by the name of `sweep` accepting no parameters, returning a
`size_t` with the number of objects freed. This return value will only
ever be used for statistics, so, while it is encouraged for an
allocator to implement this, it is permissible to unconditionally
return `0`, as the return value will never be essential to the
functioning of the GC. This method is to be invoked only after every
root has been recursively marked as referenced. This method must not
move any marked heap-allocated objects unless the global setting
asyncSweep is disabled. An allocator is free to do whatever it wants
with the freed allocations, however all freed allocations need to be
passed to `GC.finalize` which will return true if the value is
actually free due to pending finalizers. It is encouraged for
allocator implementations to take lock contention into account when
implementing this method, as the Allocator Dispatch may request an
allocation while the sweep is still being performed. It is recommended
for an allocator to treat a completely un-marked heap as a special
case, so that, if there are finalizers pending, time is not wasted
attempting to add them to the finalizer list. If there are no
remaining live allocations belonging to an allocator, that allocator
should pass a pointer to itself to `GC.releaseAllocator`, so that it
can be released if it is no longer needed.

A method by the name of `free` accepting a single pointer to an
allocation owned by this allocator. This method should assume that the
finalizer has already been called if applicable, and is free to use
the block of memory in whatever way it wants. It is not required for
an allocator to immediately make the memory available to be
re-allocated, as it is permitted for it to wait until the next time
that `sweep` is called. While generally errors should not be thrown in
the GC, if a pointer to an object is passed to an allocator’s free
method that is not in fact owned by that allocator, the allocator
should throw an Error, due to the fact there are likely even bigger
problems. It will be possible to catch this error via a global
handler.



Re: std.allocator: primitives for helping GC

2014-05-01 Thread Andrei Alexandrescu via Digitalmars-d

On 5/1/14, 8:00 AM, Marc Schütz schue...@gmx.net wrote:


This marking will usually be implemented by setting a flag near the
memory in question, which is not COW and cache friendly, because it
tends to write all over the place.


Depends. HeapBlockWithInternalPointers uses the same flag as the 
allocation flag (nice!), and all flags are together at the front of the 
managed chunk. One perk is that once tracing is done, there's nothing 
else to do - free memory stays free.



For this reason, some GCs use a temporary bitmap for marking, and leave
the marked memory untouched. By pushing the marking to the specific
allocators, this can no longer be implemented globally.


Yah, probably I'll implement such a tracer too. For more generality (to 
work on discontiguous blocks of varied granularity) I'm thinking of 
having it use an interval tree instead of a bitmap.



But I guess a GC is not required to use these helper functions, so I
guess it's fine...

And just to be sure: doneMarking is _not_ supposed to actually free the
memory (and call the destructors), is it? That's still the GC's job, right?


Yah, doneMarking would only do whatever low-level postprocessing the 
untype GC would need to do. Calling dtors will be done right after that.



Andrei



Re: std.allocator: primitives for helping GC

2014-05-01 Thread Andrei Alexandrescu via Digitalmars-d

On 5/1/14, 12:43 AM, Dmitry Olshansky wrote:

01-May-2014 08:05, Andrei Alexandrescu пишет:

So far allocators have been quite GC-agnostic; deallocation would be
needed for each allocation, and there was little structure to help
tracing.

The primitive resolveInternalPointer has made a step toward more precise
structuring of heaps, and now it's time to look at some real GC helpers.

These are markAllAsUnused, markAsUsed, and doneMarking. Find their
description here:

http://erdani.com/d/phobos-prerelease/std_allocator.html

There are a few proof of concept implementations here:

https://github.com/andralex/phobos/blob/allocator/std/allocator.d

The basic idea is that the GC would initiate a collection by calling
markAllAsUnused. That would cause the underlying untyped allocator to
softly mark all memory as free, but without actually messing it up.
(Many allocators are capable of implementing such a primitive cheaply.)


It would be interesting to define where GC sits and how it integrates
with allocators to begin with. Given that currently GCAllocator is
mentioned
somewhere at the bottom of heap layers I'm really at loss about how
these helpers fit into the picture.


Well currently GCAllocator is only included for completeness. The 
purpose of std.allocator/std.typed_allocator is to be a complete 
redesign, not an adaptation, of the current GC.



Then, the GC would trace objects starting from roots. Whenever it finds
a used pointer, it would mark the corresponding memory as allocated by
using markAsUsed, effectively undoing the soft freeing for it.



How GC determines which allocator an object belongs to ?


The idea is, for each root (whether conservative or typed), the GC first 
calls resolveInternalPointer. That gives the memory chunk encompassing 
that pointer. The GC uses that chunk to retrieve metadata about it and 
then passes it to setAsUsed.



Andrei



Re: std.allocator: primitives for helping GC

2014-05-01 Thread Dmitry Olshansky via Digitalmars-d

01-May-2014 20:41, Andrei Alexandrescu пишет:

On 5/1/14, 12:43 AM, Dmitry Olshansky wrote:

01-May-2014 08:05, Andrei Alexandrescu пишет:

So far allocators have been quite GC-agnostic; deallocation would be
needed for each allocation, and there was little structure to help
tracing.

The primitive resolveInternalPointer has made a step toward more precise
structuring of heaps, and now it's time to look at some real GC helpers.

These are markAllAsUnused, markAsUsed, and doneMarking. Find their
description here:

http://erdani.com/d/phobos-prerelease/std_allocator.html

There are a few proof of concept implementations here:

https://github.com/andralex/phobos/blob/allocator/std/allocator.d

The basic idea is that the GC would initiate a collection by calling
markAllAsUnused. That would cause the underlying untyped allocator to
softly mark all memory as free, but without actually messing it up.
(Many allocators are capable of implementing such a primitive cheaply.)


It would be interesting to define where GC sits and how it integrates
with allocators to begin with. Given that currently GCAllocator is
mentioned
somewhere at the bottom of heap layers I'm really at loss about how
these helpers fit into the picture.


Well currently GCAllocator is only included for completeness. The
purpose of std.allocator/std.typed_allocator is to be a complete
redesign, not an adaptation, of the current GC.



So a replacement for current GC built on top of std.allocator.


Then, the GC would trace objects starting from roots. Whenever it finds
a used pointer, it would mark the corresponding memory as allocated by
using markAsUsed, effectively undoing the soft freeing for it.



How GC determines which allocator an object belongs to ?


The idea is, for each root (whether conservative or typed), the GC first
calls resolveInternalPointer. That gives the memory chunk encompassing
that pointer. The GC uses that chunk to retrieve metadata about it and
then passes it to setAsUsed.



The Q is about which _allocator_ an object belongs to, in case there are 
many heaps. How to work given an allocator and a pointer is more or less 
obvious.




Andrei




--
Dmitry Olshansky


Re: std.allocator: primitives for helping GC

2014-05-01 Thread Andrei Alexandrescu via Digitalmars-d

On 5/1/14, 10:25 AM, Dmitry Olshansky wrote:

01-May-2014 20:41, Andrei Alexandrescu пишет:
So a replacement for current GC built on top of std.allocator.


That's the plan. There's a lot more work between here and there.


How GC determines which allocator an object belongs to ?


The idea is, for each root (whether conservative or typed), the GC first
calls resolveInternalPointer. That gives the memory chunk encompassing
that pointer. The GC uses that chunk to retrieve metadata about it and
then passes it to setAsUsed.



The Q is about which _allocator_ an object belongs to, in case there are
many heaps.


Oh I see. All allocators that compose other allocators dispatch/forward 
calls to resolveInternalPointer appropriately. See in the code how some 
(like the Bucketizer) use a rather inefficient linear search through 
buckets. But there are simple ways to improve that.



Andrei




Re: std.allocator: primitives for helping GC

2014-05-01 Thread Andrei Alexandrescu via Digitalmars-d

On 5/1/14, 8:13 AM, Orvid King via Digitalmars-d wrote:

I actually do have a design in mind for integrating your allocators
with my GC design,

[snip]

Just wanted to mention that I'll need to get to this a bit later, after 
getting to the higher-level bits of design. Thanks!


Andrei



std.allocator: primitives for helping GC

2014-04-30 Thread Andrei Alexandrescu via Digitalmars-d
So far allocators have been quite GC-agnostic; deallocation would be 
needed for each allocation, and there was little structure to help tracing.


The primitive resolveInternalPointer has made a step toward more precise 
structuring of heaps, and now it's time to look at some real GC helpers.


These are markAllAsUnused, markAsUsed, and doneMarking. Find their 
description here:


http://erdani.com/d/phobos-prerelease/std_allocator.html

There are a few proof of concept implementations here:

https://github.com/andralex/phobos/blob/allocator/std/allocator.d

The basic idea is that the GC would initiate a collection by calling 
markAllAsUnused. That would cause the underlying untyped allocator to 
softly mark all memory as free, but without actually messing it up. 
(Many allocators are capable of implementing such a primitive cheaply.)


Then, the GC would trace objects starting from roots. Whenever it finds 
a used pointer, it would mark the corresponding memory as allocated by 
using markAsUsed, effectively undoing the soft freeing for it.


At the end of the process, what's left is the memory effectively in use. 
Everything else got free by construction.


It seems to me that these three primitives (together with 
resolveInternalPointer) are sufficient for conservative tracing. I plan 
to make tracing more precise by adding more structure, but the scanning 
logic will stay the same.


Now is the time to destroy. I might be missing something by a mile.


Andrei