Re: Migrating an existing more modern GC to D's gc.d

2018-05-24 Thread Per Nordlöw via Digitalmars-d
On Thursday, 24 May 2018 at 13:13:03 UTC, Steven Schveighoffer 
wrote:
Really though, the issues with D's GC are partly the fault of 
the language itself rather than the GC design. Having certain 
aspects of the language precludes certain GCs. Java as a 
language is much more conducive to more advanced GC designs.


I'm hoping for a tough long-term deprecation process that 
alleviates these issues, even though it will cause big breakage. 
I believe it will be worth it.


Re: Migrating an existing more modern GC to D's gc.d

2018-05-24 Thread Steven Schveighoffer via Digitalmars-d

On 5/24/18 8:35 AM, Chris wrote:

On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
How difficult would it be to migrate an existing modern 
GC-implementation into D's?


Which kinds of GC's would be of interest?

Which attempts have been made already?


IBM has open sourced its JVM:

https://www.eclipse.org/openj9/

They claim they have good GCs. So maybe someone knowledgeable wants to 
have a look at it.


It's GPL, Apache, or EPL. I'm not sure about EPL, but I know that the 
former 2 are not convertible to Boost, so we couldn't accept a port from 
there.


Really though, the issues with D's GC are partly the fault of the 
language itself rather than the GC design. Having certain aspects of the 
language precludes certain GCs. Java as a language is much more 
conducive to more advanced GC designs.


-Steve


Re: Migrating an existing more modern GC to D's gc.d

2018-05-24 Thread Chris via Digitalmars-d

On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
How difficult would it be to migrate an existing modern 
GC-implementation into D's?


Which kinds of GC's would be of interest?

Which attempts have been made already?


IBM has open sourced its JVM:

https://www.eclipse.org/openj9/

They claim they have good GCs. So maybe someone knowledgeable 
wants to have a look at it.


Re: Migrating an existing more modern GC to D's gc.d

2018-04-13 Thread H. S. Teoh via Digitalmars-d
On Sat, Apr 14, 2018 at 01:40:58AM +0200, Timon Gehr via Digitalmars-d wrote:
> On 13.04.2018 23:40, Jonathan M Davis wrote:
> > On Friday, April 13, 2018 22:36:31 Timon Gehr via Digitalmars-d wrote:
> > > On 10.04.2018 10:56, Jonathan M Davis wrote:
> > > > CTFE only ever happens when it must happen. The compiler never
> > > > does it as an optimization.
> > > 
> > > The frontend doesn't. The backend might.
> > 
> > The optimizer may do constant folding or inline the code so far that
> > it just gives the result, but it doesn't do actual CTFE. That's all
> > in the frontend.
> > 
> > - Jonathan M Davis
> > 
> 
> CTFE just stands for "compile-time function evaluation". Claiming that
> the compiler never does this as an optimization is a bit misleading,
> but fine.

CTFE, as currently implemented in the compiler front-end (i.e., common
across dmd, gdc, ldc), is actually only invoked when a value is
*required* at compile-time, e.g., as a template argument or enum.  While
the CTFE code did grow out of the constant-folding code, the two are
actually distinct, and the front-end never calls CTFE when performing
constant-folding (even though CTFE could be construed to be a souped-up
form of constant-folding).  This is a rather fine distinction in the
current implementation that I wasn't aware of until recently.

Certain backends, like ldc's, may also perform their own "compile-time
function evaluation", e.g., on the LLVM IR, as part of their
optimization pass.  For example, the LDC optimizer can literally execute
LLVM IR at compile-time and replace an entire function-call tree with a
single instruction that loads the computed value as a literal.  This has
nothing to do with CTFE (as we know it in D) per se, but is a feature of
the LDC optimizer.


T

-- 
It only takes one twig to burn down a forest.


Re: Migrating an existing more modern GC to D's gc.d

2018-04-13 Thread Timon Gehr via Digitalmars-d

On 13.04.2018 23:40, Jonathan M Davis wrote:

On Friday, April 13, 2018 22:36:31 Timon Gehr via Digitalmars-d wrote:

On 10.04.2018 10:56, Jonathan M Davis wrote:

CTFE only ever happens when it must happen. The compiler never does it
as an optimization.


The frontend doesn't. The backend might.


The optimizer may do constant folding or inline the code so far that it just
gives the result, but it doesn't do actual CTFE. That's all in the frontend.

- Jonathan M Davis



CTFE just stands for "compile-time function evaluation". Claiming that 
the compiler never does this as an optimization is a bit misleading, but 
fine.


Re: Migrating an existing more modern GC to D's gc.d

2018-04-13 Thread Jonathan M Davis via Digitalmars-d
On Friday, April 13, 2018 22:36:31 Timon Gehr via Digitalmars-d wrote:
> On 10.04.2018 10:56, Jonathan M Davis wrote:
> > CTFE only ever happens when it must happen. The compiler never does it
> > as an optimization.
>
> The frontend doesn't. The backend might.

The optimizer may do constant folding or inline the code so far that it just
gives the result, but it doesn't do actual CTFE. That's all in the frontend.

- Jonathan M Davis



Re: Migrating an existing more modern GC to D's gc.d

2018-04-13 Thread Timon Gehr via Digitalmars-d

On 10.04.2018 10:56, Jonathan M Davis wrote:

CTFE only ever happens when it must happen. The compiler never does it as an
optimization.


The frontend doesn't. The backend might.


Re: Migrating an existing more modern GC to D's gc.d

2018-04-12 Thread David Bennett via Digitalmars-d
On Wednesday, 11 April 2018 at 19:38:59 UTC, Dmitry Olshansky 
wrote:

On Tuesday, 10 April 2018 at 07:22:14 UTC, David Bennett wrote:
People cast from thread local to shared? ...okay that's no 
fun...  :(


I can understand the other way, that's why I was leaning on the 
conservative side and putting more stuff in the global pools.


Well you might want to build something as thread-local and then 
publish as shared.


Yeah, I can see that if you're trying to share types like classes, 
shared would get in the way quite quickly.


I think it could be __to_shared(ptr, length) to let GC know 
that block should be added to global set of sorts. That will 
foobar the GC design quite a bit but to have per thread GCs I’d 
take that risk.


Yeah I had this idea also, the runtime gets a hook on 
cast(shared) and the GC then just sets a flag and that part of 
memory will never be freed inside a thread-local mark/sweep. No 
move needed.
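
To make the "flag on cast(shared)" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `toSharedSketch`, `isPinned`, `pinnedBlocks` and `Node` are hypothetical names, and a real implementation would set a bit inside the GC's own block metadata rather than keep a separate table. No such hook exists in druntime today.

```d
// Hypothetical stand-in for GC-side metadata: the set of blocks that a
// thread-local mark/sweep must never free.
__gshared bool[void*] pinnedBlocks;

// Hypothetical type used only in this sketch.
class Node
{
    int payload;
}

/// Hypothetical hook: what the compiler could call on `cast(shared)`.
/// It records the block as pinned and then performs the plain cast.
/// The object itself is not moved or copied.
shared(T) toSharedSketch(T)(T obj) if (is(T == class))
{
    synchronized
    {
        pinnedBlocks[cast(void*) obj] = true;
    }
    return cast(shared(T)) obj;
}

/// True if a thread-local collection would have to skip this block.
bool isPinned(const(void)* p)
{
    bool result;
    synchronized
    {
        result = (cast(void*) p in pinnedBlocks) !is null;
    }
    return result;
}
```

Because only a flag is set, every existing thread-local pointer to the object stays valid, which is the "no move needed" property. The sketch does nothing about the transitivity of shared or about `__gshared` members.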


But then keeping in mind transitive nature of shared... Maybe 
not ;)


Yeah shared is quite locked down so should have less ways people 
could foil my plans.


It's __gshared that I'm worried about now, i.e. if you had a class 
(stored in the global pool) and then assigned a thread-local object 
to one of its members. When a thread-local mark/sweep happened it 
wouldn't see the ref in the global pool and the member might get 
collected...


---
class A{}

class B{
    __gshared A a;
    this(A a){
        this.a=a;
    }
}

void main()
{
    A a = new A();
    B b = new B(a);
}
---

Currently my idea of storing classes with __gshared members would 
put B in the global pool, but there's no cast so A would not be 
hooked with __to_shared(). I guess the compiler could in theory 
inject the same __to_shared() in this case also, but it would be 
a lot harder and would probably be a mess as there's no cast to 
hook.


So maybe with __gshared it should be on the thread-local pool but 
marked as global.. but you might be able to mix shared and 
__gshared in a way that wouldn't work.


Maybe it should work the other way around - keep all in global 
pool, and have per-thread ref-sets of some form. Tricky anyway.


Would be worth some thought, I'll keep it in mind.

For now, I'm seeing if I can just make it so each thread has its 
own Bin list. This way the data is stored so that the 
thread-local stuff is generally packed closer together and there's 
a higher chance of having a whole free page after a global 
mark/sweep.


Is there a good benchmark for the GC I can run to see if I'm 
actually improving things?


Re: Migrating an existing more modern GC to D's gc.d

2018-04-11 Thread Ikeran via Digitalmars-d

On Tuesday, 10 April 2018 at 06:47:53 UTC, Jonathan M Davis wrote:
As it stands, it's impossible to have thread-local memory 
pools. It's quite legal to construct an object as shared or 
thread-local and cast it to the other. In fact, it's _highly_ 
likely that that's how any shared object of any complexity is 
going to be constructed. Similarly, it's extremely common to 
allocate an object as mutable and then cast it to immutable 
(either using assumeUnique or by using a pure function where 
the compiler does the cast implicitly for you if it can 
guarantee that the return value is unique), and immutable 
objects are implicitly shared. At minimum, there would have to 
be runtime hooks to do something like move an object between 
pools when it is cast to shared or immutable (or back) in order 
to ensure that an object was in the right pool, but if that 
requires copying the object rather than just moving the memory 
block, then it can't be done, because every pointer or 
reference pointing to that object would have to be rewritten 
(which isn't supported by the language).


It's a bit easier than that. When you cast something to shared or 
immutable, or allocate it as shared or immutable, you pin the 
object on the local heap. When the thread-local collector runs, 
it won't collect that object, since another thread might know 
about it. Then, when you run the global collector, it will 
determine which shared objects are still reachable and unpin 
things as appropriate.


That unpinning process requires a way to look up the owning 
thread for a piece of memory, which can be done in logarithmic 
time relative to the number of contiguous segments of address 
space.
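
As a rough illustration of that lookup, the sketch below does a binary search over a sorted array of contiguous address ranges, which gives the logarithmic bound mentioned above. The `Segment` layout, the integer thread id and the name `ownerOf` are assumptions made for the example; they are not taken from druntime.

```d
// Hypothetical record: one contiguous range of address space and its owner.
struct Segment
{
    size_t start;     // first address in the segment
    size_t end;       // one past the last address in the segment
    int ownerThread;  // hypothetical thread id
}

/// Returns the owning thread id for `addr`, or -1 if no segment contains it.
/// `segments` must be sorted by `start` and must not overlap.
int ownerOf(const(Segment)[] segments, size_t addr)
{
    size_t lo = 0;
    size_t hi = segments.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (addr < segments[mid].start)
            hi = mid;          // address lies below this segment
        else if (addr >= segments[mid].end)
            lo = mid + 1;      // address lies above this segment
        else
            return segments[mid].ownerThread;
    }
    return -1;
}
```

The global collector could call something like this for each shared object it finds unreachable, so that the unpin request goes to the heap that owns the block.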


Casting away from shared would not call any runtime functions; 
even if it were guaranteed that the cast were done on the 
allocating thread, it's likely that there exists another 
reference to the item in another thread.


This would discourage the use of immutable, since it wouldn't 
benefit from thread-local heaps.


Re: Migrating an existing more modern GC to D's gc.d

2018-04-11 Thread Dmitry Olshansky via Digitalmars-d

On Tuesday, 10 April 2018 at 07:22:14 UTC, David Bennett wrote:
On Tuesday, 10 April 2018 at 06:43:28 UTC, Dmitry Olshansky 
wrote:

On Tuesday, 10 April 2018 at 06:10:10 UTC, David Bennett wrote:
I was thinking about messing with the GC in my free time just 
yesterday... how hard would it be:


[snip]


You lost immutable: thread-local data is often cast to 
immutable, sometimes by the compiler.

See assumeUnique and its ilk in Phobos.

Same with shared - it’s still often the case that you allocate 
thread-local then cast to shared.


People cast from thread local to shared? ...okay that's no 
fun...  :(


I can understand the other way, that's why I was leaning on the 
conservative side and putting more stuff in the global pools.


Well you might want to build something as thread-local and then 
publish as shared.



That is indeed something we should at some point have. Needs 
cooperation from the language such as explicit functions for 
shared<->local conversions that run-time is aware of.




So the language could (in theory) inject a __move_to_global(ref 
local, ref global) when casting to shared and the GC would need 
to update all the references in the local pages to point to the 
new global address?


I think it could be __to_shared(ptr, length) to let GC know that 
block should be added to global set of sorts. That will foobar 
the GC design quite a bit but to have per thread GCs I’d take 
that risk.


But then keeping in mind transitive nature of shared... Maybe 
not ;)


Maybe it should work the other way around - keep all in global 
pool, and have per-thread ref-sets of some form. Tricky anyway.





Re: Migrating an existing more modern GC to D's gc.d

2018-04-11 Thread Paulo Pinto via Digitalmars-d

On Tuesday, 10 April 2018 at 18:31:28 UTC, Jacob Carlborg wrote:

On 2018-04-10 08:47, Jonathan M Davis wrote:

Regardless, I think that it's clear that in order to do anything with
thread-local pools, we'd have to lock down the type system even further to
disallow casts to or from shared or immutable, and that would really be a
big problem given the inherent restrictions on those types and how shared is
intended to be used.


Apple's GC for Objective-C (before it had ARC) was using 
thread-local pools. I wonder how they managed to do that in a 
language that doesn't have a type system that differentiates 
between TLS and shared memory.


They were doing it quite badly.

One of the reasons that always gets lost when discussing the 
merits of ARC over GC in Objective-C, is that Apple never managed 
to make the GC work without issues given its underlying C 
semantics.


So naturally having the compiler do what developers were already 
doing by hand with Framework derived classes was a safer way than 
ensuring Objective-C's GC would never crash.


Apple used to have a GC caveats document that was taken down 
from their site long ago.


This is one of the few surviving ones,

https://developer.apple.com/library/content/releasenotes/Cocoa/RN-ObjectiveC/#//apple_ref/doc/uid/TP40004309-CH1-DontLinkElementID_1


Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread Jacob Carlborg via Digitalmars-d

On 2018-04-10 08:47, Jonathan M Davis wrote:


Regardless, I think that it's clear that in order to do anything with
thread-local pools, we'd have to lock down the type system even further to
disallow casts to or from shared or immutable, and that would really be a
big problem given the inherent restrictions on those types and how shared is
intended to be used.


Apple's GC for Objective-C (before it had ARC) was using thread-local 
pools. I wonder how they managed to do that in a language that doesn't 
have a type system that differentiates between TLS and shared memory.


--
/Jacob Carlborg


Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread Steven Schveighoffer via Digitalmars-d

On 4/10/18 4:37 AM, David Bennett wrote:

On Tuesday, 10 April 2018 at 08:10:32 UTC, Jonathan M Davis wrote:


Yes. They expect it to work, and as the language is currently 
designed, it works perfectly well. In fact, it's even built into the 
language. e.g.


    int[] foo() pure
    {
        return [1, 2, 3, 4];
    }

    void main()
    {
        immutable arr = foo();
    }

compiles thanks to the fact that the compiler can guarantee from the 
signature of foo that its return value is unique.




Oh is that run at runtime? I thought D was just smart and did it using 
CTFE.


Well, D could be smart enough and call a runtime function that says it's 
moving data from thread-local to shared (or vice versa).






We also have std.exception.assumeUnique (which is just a cast to 
immutable) as a way to document that you're guaranteeing that a 
reference to an object is unique and therefore can be safely cast to 
immutable.




Can't say I've used std.exception.assumeUnique, but I guess other people 
have a use for it as it exists.


Would be nice if you could inject type checking information at compile 
time without affecting the storage class. But that's a bit OT now.


assumeUnique is a library function, it could be instrumented to do the 
right thing.


I think it's possible to do this in D, but you need language support.

-Steve


Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread Jonathan M Davis via Digitalmars-d
On Tuesday, April 10, 2018 08:37:47 David Bennett via Digitalmars-d wrote:
> On Tuesday, 10 April 2018 at 08:10:32 UTC, Jonathan M Davis wrote:
> > Yes. They expect it to work, and as the language is currently
> > designed, it works perfectly well. In fact, it's even built
> > into the language. e.g.
> >
> > int[] foo() pure
> > {
> >     return [1, 2, 3, 4];
> > }
> >
> > void main()
> > {
> >     immutable arr = foo();
> > }
> >
> > compiles thanks to the fact that the compiler can guarantee
> > from the signature of foo that its return value is unique.
>
> Oh is that run at runtime? I thought D was just smart and did it
> using CTFE.

CTFE only ever happens when it must happen. The compiler never does it as an
optimization. So, if you did

enum arr = foo();

or

static arr = foo();

then it would use CTFE, because an enum's value must be known at compile
time, and if a static variable is directly initialized instead of
initialized via a static constructor, its value must be known at compile
time. But if you're initializing a variable whose value does not need to be
known at compile time, then no CTFE occurs.

It would be a serious rabbit hole for the compiler to attempt CTFE when it
wasn't told to, particularly since it can't look at a function and know
whether it's going to work with CTFE or not. It has to actually call it with
a specific set of arguments to find out (and depending on what the function
does, it might even work with CTFE with some arguments and not with others -
e.g. if a particular branch of an if statement works with CTFE while another
does an operation that doesn't work with CTFE).
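
A small sketch of the distinction, using the built-in `__ctfe` variable, which is true only while the compile-time interpreter is evaluating a function. The function names are made up for the example and are not from any library.

```d
// Hypothetical helper for this sketch: reports whether it is being
// evaluated by the compile-time interpreter.
bool evaluatedByCtfe() pure nothrow @nogc @safe
{
    return __ctfe;
}

// An enum's value must be known at compile time, so this call goes
// through CTFE and `__ctfe` is true while it runs.
enum forcedAtCompileTime = evaluatedByCtfe();
static assert(forcedAtCompileTime);

bool calledAtRunTime()
{
    // A plain local does not need a compile-time value, so the front end
    // does not use CTFE here and the call really happens at run time.
    bool result = evaluatedByCtfe();
    return result;
}
```

The same function yields different answers depending only on whether the context requires a compile-time value, which is the rule described above.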

- Jonathan M Davis



Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread David Bennett via Digitalmars-d

On Tuesday, 10 April 2018 at 08:10:32 UTC, Jonathan M Davis wrote:


Yes. They expect it to work, and as the language is currently 
designed, it works perfectly well. In fact, it's even built 
into the language. e.g.


    int[] foo() pure
    {
        return [1, 2, 3, 4];
    }

    void main()
    {
        immutable arr = foo();
    }

compiles thanks to the fact that the compiler can guarantee 
from the signature of foo that its return value is unique.




Oh is that run at runtime? I thought D was just smart and did it 
using CTFE.




We also have std.exception.assumeUnique (which is just a cast 
to immutable) as a way to document that you're guaranteeing 
that a reference to an object is unique and therefore can be 
safely cast to immutable.




Can't say I've used std.exception.assumeUnique, but I guess other 
people have a use for it as it exists.


Would be nice if you could inject type checking information at 
compile time without affecting the storage class. But that's a bit 
OT now.




Because of how restrictive shared and immutable are, you 
frequently have to build them from thread-local, mutable data. 
And while it's preferable to have as little in your program be 
shared as possible and to favor solutions such as doing message 
passing with std.concurrency, there are situations where you 
pretty much need to have complex shared objects. And since D is 
a systems language, we're a lot more restricted in the 
assumptions that we can make in comparison to a language such 
as Java or C#.




Yeah, I agree that any solution should keep in mind that D is a 
systems language and should allow you to do stuff when you need 
to.


Oh, I just had a much simpler idea that shouldn't have any 
issues, I'll see if that makes the GC faster to allocate. 
(everything else is the same)


Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread Jonathan M Davis via Digitalmars-d
On Tuesday, April 10, 2018 07:55:00 David Bennett via Digitalmars-d wrote:
> On Tuesday, 10 April 2018 at 06:47:53 UTC, Jonathan M Davis wrote:
> > As it stands, it's impossible to have thread-local memory
> > pools. It's quite legal to construct an object as shared or
> > thread-local and cast it to the other. In fact, it's _highly_
> > likely that that's how any shared object of any complexity is
> > going to be constructed. Similarly, it's extremely common to
> > allocate an object as mutable and then cast it to immutable
> > (either using assumeUnique or by using a pure function where
> > the compiler does the cast implicitly for you if it can
> > guarantee that the return value is unique), and immutable
> > objects are implicitly shared.
>
> (Honest question:) Do people really cast from local to
> shared/immutable and expect it to work?
> (whenever I cast something more complex than a size_t I almost
> expect it to blow up... or break sometime in the future)

Yes. They expect it to work, and as the language is currently designed, it
works perfectly well. In fact, it's even built into the language. e.g.

    int[] foo() pure
    {
        return [1, 2, 3, 4];
    }

    void main()
    {
        immutable arr = foo();
    }

compiles thanks to the fact that the compiler can guarantee from the
signature of foo that its return value is unique. We also have
std.exception.assumeUnique (which is just a cast to immutable) as a way to
document that you're guaranteeing that a reference to an object is unique
and therefore can be safely cast to immutable.
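
For comparison with the pure-function case, here is a minimal sketch of the explicit route through `std.exception.assumeUnique`. The function name `makeSquares` is made up for the example. The array is built as ordinary mutable, thread-local data and only becomes immutable (and therefore implicitly shared) at the very end.

```d
import std.exception : assumeUnique;

/// Builds a table in mutable memory, then publishes it as immutable.
immutable(int)[] makeSquares(size_t n)
{
    auto buf = new int[](n);
    foreach (i, ref e; buf)
        e = cast(int)(i * i);

    // assumeUnique casts to immutable and also sets `buf` to null, so the
    // mutable reference cannot be used by accident afterwards. The promise
    // that no other mutable reference exists is the programmer's, not the
    // compiler's.
    return assumeUnique(buf);
}
```

From the GC's point of view the block was allocated by one thread with no hint that it would later be visible to others, which is why a thread-local pool cannot rely on the type at allocation time.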

> That said, I can understand building a shared object from
> parts of local data... though I try to keep my thread barriers as
> thin as possible myself. (meaning I tend to copy stuff to the
> shared and have as few shared's as possible)

Because of how restrictive shared and immutable are, you frequently have to
build them from thread-local, mutable data. And while it's preferable to
have as little in your program be shared as possible and to favor solutions
such as doing message passing with std.concurrency, there are situations
where you pretty much need to have complex shared objects. And since D is a
systems language, we're a lot more restricted in the assumptions that we can
make in comparison to a language such as Java or C#.

- Jonathan M Davis



Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread David Bennett via Digitalmars-d

On Tuesday, 10 April 2018 at 06:47:53 UTC, Jonathan M Davis wrote:
As it stands, it's impossible to have thread-local memory 
pools. It's quite legal to construct an object as shared or 
thread-local and cast it to the other. In fact, it's _highly_ 
likely that that's how any shared object of any complexity is 
going to be constructed. Similarly, it's extremely common to 
allocate an object as mutable and then cast it to immutable 
(either using assumeUnique or by using a pure function where 
the compiler does the cast implicitly for you if it can 
guarantee that the return value is unique), and immutable 
objects are implicitly shared.




(Honest question:) Do people really cast from local to 
shared/immutable and expect it to work?
(whenever I cast something more complex than a size_t I almost 
expect it to blow up... or break sometime in the future)


That said, I can understand building a shared object from 
parts of local data... though I try to keep my thread barriers as 
thin as possible myself. (meaning I tend to copy stuff to the 
shared and have as few shared's as possible)




At minimum, there would have to be runtime hooks to do 
something like move an object between pools when it is cast to 
shared or immutable (or back) in order to ensure that an object 
was in the right pool, but if that requires copying the object 
rather than just moving the memory block, then it can't be 
done, because every pointer or reference pointing to that 
object would have to be rewritten (which isn't supported by the 
language).




A hook for local to cast(shared) could work... but would require 
a DIP I guess. I was hoping to make a more incremental 
improvement to the GC.




Also, it would be a disaster for shared, because the typical 
way to use shared is to protect the shared object with a mutex, 
cast away shared so that it can be operated on as thread-local 
within that section of code, and then before the mutex is 
released, all thread-local references then need to be gone. e.g.



    synchronized(mutex)
    {
        auto threadLocal = cast(MyType)mySharedObject;

        // do something with threadLocal...

        // threadLocal leaves scope and is gone without being cast back
    }

// all references to the shared object should now be shared



Yeah, that's why I was still scanning all thread stacks and pages 
when marking global data.

So a shared -> local is a no op but the other way needs thought.



You really _don't_ want the shared object to move between pools
because of that cast (since it would hurt performance), and in such a
situation, you don't usually cast back to shared. Rather, you have a shared
reference, cast it to get a thread-local reference, and then let the
thread-local reference leave scope. So, the same object temporarily has both
a thread-local and a shared reference to it, and if it were moved to the
thread-local pool with the cast, it would never be moved back when the
thread-local references left scope and the mutex was released.

Having synchronized classes as described in TDPL would make the 
above code cleaner in the cases where a synchronized class 
would work, but the basic concept is the same. It would still 
be doing a cast underneath the hood, and it would still have 
the same problems. It just wouldn't involve explicit casting. 
shared's design inherently requires casting away shared, so it 
just plain isn't going to play well with anything that doesn't 
play well with such casts - such as having thread-local heaps.




I would think a shared class would never be marked as a 
THREAD_LOCAL as it has a shared member.




Also, IIRC, at one point, Daniel Murphy explained to me some 
problem with classes with regards to the virtual table or the 
TypeInfo that inherently wouldn't work with trying to move it 
between threads. Unfortunately, I don't remember the details 
now, but I do remember that there's _something_ there that 
wouldn't work with thread-local heaps. And if anyone were to 
seriously try it, I expect that he could probably come up with 
the reasons again.


Regardless, I think that it's clear that in order to do 
anything with thread-local pools, we'd have to lock down the 
type system even further to disallow casts to or from shared or 
immutable, and that would really be a big problem given the 
inherent restrictions on those types and how shared is intended 
to be used. So, while it's a common idea as to how the GC could 
be improved, and it would be great if we could do it, I think 
that it goes right along with all of the other ideas that 
require stuff like read and write barriers everywhere and thus 
will never be in D's GC.


- Jonathan M Davis


Yeah I thought it would have issues, thanks for your feedback!

I'll see if I can come up with a better idea that doesn't break 
as much stuff.


Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread David Bennett via Digitalmars-d

On Tuesday, 10 April 2018 at 06:43:28 UTC, Dmitry Olshansky wrote:

On Tuesday, 10 April 2018 at 06:10:10 UTC, David Bennett wrote:
I was thinking about messing with the GC in my free time just 
yesterday... how hard would it be:


[snip]


You lost immutable: thread-local data is often cast to 
immutable, sometimes by the compiler.

See assumeUnique and its ilk in Phobos.

Same with shared - it’s still often the case that you allocate 
thread-local then cast to shared.


People cast from thread local to shared? ...okay that's no fun...  
:(


I can understand the other way, that's why I was leaning on the 
conservative side and putting more stuff in the global pools.




Lastly - thanks to 0-typesafety of delegates it’s trivial to 
share a single GC-backed stack with multiple threads. So what 
you deemed thread-local might be used in other thread, 
transitively so.


Oh, that's a good point I didn't think of!



D is thread-local except when it’s not.



If thats possible we could also Just(TM) scan the current 
thread stack and mark/sweep only those pages. (without a stop 
the world)




That is indeed something we should at some point have. Needs 
cooperation from the language such as explicit functions for 
shared<->local conversions that run-time is aware of.




So the language could (in theory) inject a __move_to_global(ref 
local, ref global) when casting to shared and the GC would need 
to update all the references in the local pages to point to the 
new global address?


And when a thread ends we could give the pages to the global 
pool without a mark/sweep.


The idea is it works like it does currently unless something 
is invisible to other threads, Or am i missing something 
obvious? (quite likely)


Indeed, there are ugly details: while this would allow per-thread 
GC in principle, it will in general crash and burn on most 
non-trivial programs.


Okay, thanks for the points, they were very clear, so I assume you 
have spent a lot more brain power on this than I have.


Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread Jonathan M Davis via Digitalmars-d
On Tuesday, April 10, 2018 06:10:10 David Bennett via Digitalmars-d wrote:
> On Tuesday, 10 April 2018 at 05:26:28 UTC, Dmitry Olshansky wrote:
> > On Monday, 9 April 2018 at 19:50:16 UTC, H. S. Teoh wrote:
> >> Last I remembered, you were working on a GC prototype for D?
> >
> > Still there, but my spare time is super limited lately, the
> > other project preempted that for the moment.
> >
> >> Any news on that, or have you basically given it up?
> >
> > Might try to hack to the finish line in one good night, it was
> > pretty close to complete. Debugging would be fun though ;)
>
> I was thinking about messing with the GC in my free time just
> yesterday... how hard would it be:
>
> Add a BlkAttr.THREAD_LOCAL, and set it from the runtime if the
> type or its members are not shared or __gshared.
>
> Then we could store BlkAttr.THREAD_LOCAL memory in different
> pages (per thread) without having to lock a mutex. (if we need
> to get a new page from the global pool we lock a mutex for that)
>
> If that's possible we could also Just(TM) scan the current thread
> stack and mark/sweep only those pages. (without a stop the world)
>
> And when a thread ends we could give the pages to the global pool
> without a mark/sweep.
>
> The idea is it works like it does currently unless something is
> invisible to other threads, Or am i missing something obvious?
> (quite likely)
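
To make the proposal above concrete, this is one way the "type or its members are not shared" test could be expressed at compile time. It is only a sketch under stated assumptions: `looksThreadLocal`, `noSharedField`, `Plain` and `HasShared` are hypothetical names, `GC.BlkAttr` in `core.memory` has no `THREAD_LOCAL` member, and the check looks only at direct fields. It cannot see `__gshared` members, which are not instance fields.

```d
import std.traits : Fields;

/// True when neither T nor any of its direct fields is `shared`.
/// A runtime following the proposal could use this to decide whether an
/// allocation is a candidate for a per-thread page.
template looksThreadLocal(T)
{
    static if (is(T == shared))
        enum looksThreadLocal = false;
    else static if (is(T == struct) || is(T == class))
        enum looksThreadLocal = noSharedField!(Fields!T);
    else
        enum looksThreadLocal = true;
}

// Helper: recursively checks a list of field types.
private template noSharedField(Ts...)
{
    static if (Ts.length == 0)
        enum noSharedField = true;
    else
        enum noSharedField = !is(Ts[0] == shared)
            && noSharedField!(Ts[1 .. $]);
}

// Hypothetical types used only to exercise the check.
struct Plain     { int a; int* p; }
struct HasShared { shared int counter; }

static assert( looksThreadLocal!Plain);
static assert(!looksThreadLocal!HasShared);
static assert(!looksThreadLocal!(shared Plain));
```

A check like this says what the type looked like at allocation time. It says nothing about later casts to shared or immutable.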

As it stands, it's impossible to have thread-local memory pools. It's quite
legal to construct an object as shared or thread-local and cast it to the
other. In fact, it's _highly_ likely that that's how any shared object of
any complexity is going to be constructed. Similarly, it's extremely common
to allocate an object as mutable and then cast it to immutable (either using
assumeUnique or by using a pure function where the compiler does the cast
implicitly for you if it can guarantee that the return value is unique), and
immutable objects are implicitly shared. At minimum, there would have to be
runtime hooks to do something like move an object between pools when it is
cast to shared or immutable (or back) in order to ensure that an object was
in the right pool, but if that requires copying the object rather than just
moving the memory block, then it can't be done, because every pointer or
reference pointing to that object would have to be rewritten (which isn't
supported by the language).

Also, it would be a disaster for shared, because the typical way to use
shared is to protect the shared object with a mutex, cast away shared so
that it can be operated on as thread-local within that section of code, and
then before the mutex is released, all thread-local references then need to
be gone. e.g.

    synchronized(mutex)
    {
        auto threadLocal = cast(MyType)mySharedObject;

        // do something with threadLocal...

        // threadLocal leaves scope and is gone without being cast back
    }

// all references to the shared object should now be shared

You really _don't_ want the shared object to move between pools
because of that cast (since it would hurt performance), and in such a
situation, you don't usually cast back to shared. Rather, you have a shared
reference, cast it to get a thread-local reference, and then let the
thread-local reference leave scope. So, the same object temporarily has both
a thread-local and a shared reference to it, and if it were moved to the
thread-local pool with the cast, it would never be moved back when the
thread-local references left scope and the mutex was released.

Having synchronized classes as described in TDPL would make the above code
cleaner in the cases where a synchronized class would work, but the basic
concept is the same. It would still be doing a cast underneath the hood, and
it would still have the same problems. It just wouldn't involve explicit
casting. shared's design inherently requires casting away shared, so it just
plain isn't going to play well with anything that doesn't play well with
such casts - such as having thread-local heaps.
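
For anyone who wants to try the pattern, here is a self-contained sketch of it. Counter, bump and runWorkers are hypothetical names chosen for this example; Mutex and Thread are the druntime classes from core.sync.mutex and core.thread. Note that the cast happens on every iteration and nothing is ever cast back, which is exactly why a pool move triggered by the cast would be both slow and one-directional.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

struct Counter { int value; }

shared Counter counter;      // the shared object
__gshared Mutex mutex;       // the mutex protecting it

void bump()
{
    foreach (_; 0 .. 1000)
    {
        synchronized (mutex)
        {
            // Cast away shared while the lock is held.
            auto local = cast(Counter*) &counter;
            ++local.value;
        } // local leaves scope here; it is never cast back to shared
    }
}

int runWorkers(int n)
{
    mutex = new Mutex;
    (cast(Counter*) &counter).value = 0;

    Thread[] workers;
    foreach (_; 0 .. n)
    {
        auto t = new Thread(&bump);
        t.start();
        workers ~= t;
    }
    foreach (w; workers)
        w.join();

    // All workers are joined, so reading without the lock is safe here.
    return (cast(Counter*) &counter).value;
}

void main()
{
    assert(runWorkers(4) == 4000);
}
```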

Also, IIRC, at one point, Daniel Murphy explained to me some problem with
classes with regards to the virtual table or the TypeInfo that inherently
wouldn't work with trying to move it between threads. Unfortunately, I don't
remember the details now, but I do remember that there's _something_ there
that wouldn't work with thread-local heaps. And if anyone were to seriously
try it, I expect that he could probably come up with the reasons again.

Regardless, I think that it's clear that in order to do anything with
thread-local pools, we'd have to lock down the type system even further to
disallow casts to or from shared or immutable, and that would really be a
big problem given the inherent restrictions on those types and how shared is
intended to be used. So, while it's a common idea as to how the GC could be
improved, and it would be great if we could do it, I think that it goes
right along with all of the other ideas that require stuff like read and
write barriers 

Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread Dmitry Olshansky via Digitalmars-d

On Tuesday, 10 April 2018 at 06:10:10 UTC, David Bennett wrote:
On Tuesday, 10 April 2018 at 05:26:28 UTC, Dmitry Olshansky 
wrote:

On Monday, 9 April 2018 at 19:50:16 UTC, H. S. Teoh wrote:

Last I remembered, you were working on a GC prototype for D?


Still there, but my spare time is super limited lately, the 
other project preempted that for the moment.



Any news on that, or have you basically given it up?


Might try to hack to the finish line in one good night, it was 
pretty close to complete. Debugging would be fun though ;)


I was thinking about messing with the GC in my free time just 
yesterday... how hard would it be:


Add a BlkAttr.THREAD_LOCAL, and set it from the runtime if the 
type or its members are not shared or __gshared.


Then we could store BlkAttr.THREAD_LOCAL memory in different 
pages (per thread) without having to take a mutex. (If we 
need to get a new page from the global pool, we take a mutex for 
that.)


You've lost immutable here: thread-local data is often cast to 
immutable, sometimes by the compiler.

See assumeUnique and its ilk in Phobos.

Same with shared - it’s still often the case that you allocate 
thread-local then cast to shared.


Lastly - thanks to the zero type safety of delegates, it’s trivial to 
share a single GC-backed stack with multiple threads. So what you 
deemed thread-local might be used in another thread, transitively 
so.


D is thread-local except when it’s not.
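
A minimal sketch of the delegate hole, for illustration. The function name runOnOtherThread is made up; Thread and its delegate-taking constructor are the real druntime API. Nothing in the code is marked shared, yet the closure context holding counter is written from a second thread.

```d
import core.thread : Thread;

int runOnOtherThread()
{
    int counter = 0;   // looks thread-local; lives in a GC-allocated closure

    // The delegate captures counter by reference. The type system does not
    // require shared here, so the compiler accepts this as-is.
    auto t = new Thread({ counter = 42; });
    t.start();
    t.join();

    return counter;    // the write from the other thread is visible here
}

void main()
{
    assert(runOnOtherThread() == 42);
}
```

A per-thread heap that treated the closure context as private to the creating thread would have been wrong about this block.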



If that's possible, we could also Just(TM) scan the current 
thread stack and mark/sweep only those pages (without a stop 
the world).




That is indeed something we should have at some point. It needs 
cooperation from the language, such as explicit functions for 
shared<->local conversions that the run-time is aware of.


And when a thread ends we could give the pages to the global 
pool without a mark/sweep.


The idea is it works like it does currently unless something is 
invisible to other threads, Or am i missing something obvious? 
(quite likely)


Indeed, there are ugly details: while a per-thread GC would work 
in principle, in general it will crash and burn on most 
non-trivial programs.






Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread David Bennett via Digitalmars-d

On Tuesday, 10 April 2018 at 06:10:10 UTC, David Bennett wrote:
I was thinking about messing with the GC in my free time just 
yesterday... how hard would it be:


[snip]

The idea is it works like it does currently unless something is 
invisible to other threads, Or am i missing something obvious? 
(quite likely)


Forgot to mention that a non-thread-local mark/sweep would still 
scan all thread stacks and pages like it does currently, as a 
thread-local could hold a pointer to the global data (i.e. a copy 
of __gshared, void*).


The only way I can think of to break this idea is using cast() or 
sending something to a C function that then goes and adds 
pointers in global data to the thread-local stuff...


Re: Migrating an existing more modern GC to D's gc.d

2018-04-10 Thread David Bennett via Digitalmars-d

On Tuesday, 10 April 2018 at 05:26:28 UTC, Dmitry Olshansky wrote:

On Monday, 9 April 2018 at 19:50:16 UTC, H. S. Teoh wrote:

Last I remembered, you were working on a GC prototype for D?


Still there, but my spare time is super limited lately, the 
other project preempted that for the moment.



Any news on that, or have you basically given it up?


Might try to hack to the finish line in one good night, it was 
pretty close to complete. Debugging would be fun though ;)


I was thinking about messing with the GC in my free time just 
yesterday... how hard would it be:


Add a BlkAttr.THREAD_LOCAL, and set it from the runtime if the 
type or its members are not shared or __gshared.


Then we could store BlkAttr.THREAD_LOCAL memory in different 
pages (per thread) without having to take a mutex. (If we need 
to get a new page from the global pool, we take a mutex for that.)
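
For context, this is roughly how the existing block attributes are set and queried through core.memory today. BlkAttr.THREAD_LOCAL does not exist; the sketch uses the real NO_SCAN attribute to show the mechanism the proposal would extend. The helper name isNoScan is made up for the example.

```d
import core.memory : GC;

// Query a block's attribute bits and test for one of them.
bool isNoScan(const void* p)
{
    return (GC.getAttr(p) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    // Attributes are passed at allocation time as a bit mask.
    void* raw = GC.malloc(64, GC.BlkAttr.NO_SCAN);
    void* scanned = GC.malloc(64);

    assert(isNoScan(raw));
    assert(!isNoScan(scanned));
}
```

A hypothetical THREAD_LOCAL bit would be set the same way, but unlike NO_SCAN it would have to stay correct across later casts to shared or immutable.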


If that's possible, we could also Just(TM) scan the current thread 
stack and mark/sweep only those pages (without a stop the world).


And when a thread ends we could give the pages to the global pool 
without a mark/sweep.


The idea is it works like it does currently unless something is 
invisible to other threads, Or am i missing something obvious? 
(quite likely)


Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread Dmitry Olshansky via Digitalmars-d

On Tuesday, 10 April 2018 at 03:59:33 UTC, Ikeran wrote:

On Monday, 9 April 2018 at 19:43:00 UTC, Dmitry Olshansky wrote:

None of the even close to advanced GCs are pluggable


Eclipse OMR contains a pluggable GC, and it's used in OpenJ9,


Or rather, Eclipse OMR is a toolkit for runtimes/VMs, and the GC plugs 
into that. I encourage you to try to implement D-like 
semantics with this run-time and you’ll see just how pluggable it 
is.



which claims to be an enterprise-grade JVM.


I once used OpenJ9, which was IBM J9 I think, right before the open 
sourcing. It was about 2x slower than HotSpot; I didn’t dig too 
deep into the precise reason. The fact that it was on Power8 made 
this especially surprising, as I thought IBM would take advantage of 
their own hardware.




Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread Dmitry Olshansky via Digitalmars-d

On Monday, 9 April 2018 at 19:50:16 UTC, H. S. Teoh wrote:
On Mon, Apr 09, 2018 at 07:43:00PM +, Dmitry Olshansky via 
Digitalmars-d wrote:

On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:

[...]

> Which kinds of GC's would be of interest?

I believe we can get away with parallel mark-sweep + 
snapshot-based concurrency. It has some limitations but in D 
land with GC not being the single source of memory it should 
work fine.


> Which attempts have been made already?

I still think that a mostly precise Immix-style GC would also 
work; it won’t be a 1:1 porting job though. Many things to 
figure out.


Last I remembered, you were working on a GC prototype for D?


Still there, but my spare time is super limited lately, the other 
project preempted that for the moment.



Any news on that, or have you basically given it up?


Might try to hack to the finish line in one good night, it was 
pretty close to complete. Debugging would be fun though ;)


Will likely try to complete it at DConf hackathon, I’d be glad 
should anyone want to help.




T





Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread Ikeran via Digitalmars-d

On Monday, 9 April 2018 at 19:43:00 UTC, Dmitry Olshansky wrote:

None of the even close to advanced GCs are pluggable


Eclipse OMR contains a pluggable GC, and it's used in OpenJ9, 
which claims to be an enterprise-grade JVM.


Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread Ali via Digitalmars-d

On Monday, 9 April 2018 at 23:21:23 UTC, Nordlöw wrote:

Through allocators solely or will the GC adapt in some way?


Here is the relevant line from the vision document
"@nogc: Use of D without a garbage collector, most likely by 
using reference counting and related methods Unique/Weak 
references) for reclamation of resources. This task is made 
challenging by the safety requirement. We believe we have an 
attack in the upcoming allocators/collections combos."


And the link to the vision document 
https://wiki.dlang.org/Vision/2018H1


In general, I do recommend you read the document carefully, and 
it is important to note what is in it and what is not in it


Obviously, there is no mention on working on the GC
There is also no direct mention of changing Phobos or modifying 
Phobos


Also it might be important to read the vision document in order
priority number 1 is "1. Lock down the language definition"
this very much aligns with many comments I've seen here from 
Andrei or Walter
that they are more interested in seeing the existing features 
used, rather than adding new features


The vision document doesn't seem to introduce any new features, 
mostly improvements to existing features, or making existing 
features more usable


Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread Jonathan M Davis via Digitalmars-d
On Monday, April 09, 2018 23:21:23 Nordlöw via Digitalmars-d wrote:
> On Monday, 9 April 2018 at 20:20:39 UTC, Ali wrote:
> > On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
> >> How difficult would it be to migrate an existing modern
> >> GC-implementation into D's?
> >>
> >> Which kinds of GC's would be of interest?
> >>
> >> Which attempts have been made already?
> >
> > I think the priority is not having pluggable GC's, or a better
> > GC, but to fully support @nogc and deterministic and manual
> > memory management
> > which as I understood is on the roadmap
>
> Through allocators solely or will the GC adapt in some way?

I don't think that there are any plans to fundamentally change how the GC
works from the language perspective. The implementation may be improved or
replaced, but the GC isn't going anywhere, and any code that uses the GC
should continue to be able to do so as it has. Certainly, we're not getting
rid of or marginalizing the GC. We just want to make sure that code doesn't
use the GC when it doesn't need to or doesn't seriously benefit from using
the GC. More of Phobos should be @nogc than is currently, but it's never
going to be the case that all of Phobos is @nogc. There are real benefits to
using the GC, and we don't want to throw that away. We just don't want to
rely on it when it doesn't make sense.

There has been some discussion of adding some sort of RC capabilities to the
language with the idea that a type could be designed to be RC-ed that way,
but I don't think that the details have been sorted out yet, and I'm not
sure that it's even clear whether that's going to involve anything other
than GC-allocated memory (e.g. if the GC is used, then it can take care of
circular references, whereas if it isn't, then we have to get into weak
references and all of the complications that go with that). I believe that
Walter started looking into it, but I don't know how far he got before he
got sidetracked.

In particular, as I understand it, Walter's work with scope and DIP 1000 was
primarily motivated by whatever he was trying to do with RC, because without
something like DIP 1000, it becomes much harder (if not impossible) to do RC
in a fully @safe manner. So, whatever we end up seeing with regards to RC
support in the language is going to have to wait until DIP 1000 has been
fully sorted out, which will probably be a while.

Also, any work that's done to improve the GC at this point isn't something
that's going to be done by Walter. So, improvement to the GC is the sort of
thing that's likely to happen in parallel to any language improvements like
adding better RC support.

- Jonathan M Davis




Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread Nordlöw via Digitalmars-d

On Monday, 9 April 2018 at 20:20:39 UTC, Ali wrote:

On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
How difficult would it be to migrate an existing modern 
GC-implementation into D's?


Which kinds of GC's would be of interest?

Which attempts have been made already?


I think the priority is not having pluggable GC's, or a better 
GC, but to fully support @nogc and deterministic and manual 
memory management

which as I understood is on the roadmap


Through allocators solely or will the GC adapt in some way?


Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread Ali via Digitalmars-d

On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
How difficult would it be to migrate an existing modern 
GC-implementation into D's?


Which kinds of GC's would be of interest?

Which attempts have been made already?


I think the priority is not having pluggable GC's, or a better 
GC, but to fully support @nogc and deterministic and manual 
memory management

which as I understood is on the roadmap


Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread H. S. Teoh via Digitalmars-d
On Mon, Apr 09, 2018 at 07:43:00PM +, Dmitry Olshansky via Digitalmars-d 
wrote:
> On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
[...]
> > Which kinds of GC's would be of interest?
> 
> I believe we can get away with parallel mark-sweep + snapshot-based
> concurrency. It has some limitations but in D land with GC not being
> the single source of memory it should work fine.
> 
> > Which attempts have been made already?
> 
> I still think that a mostly precise Immix-style GC would also work; it
> won’t be a 1:1 porting job though. Many things to figure out.

Last I remembered, you were working on a GC prototype for D?  Any news
on that, or have you basically given it up?


T

-- 
Life is complex. It consists of real and imaginary parts. -- YHL


Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread Dmitry Olshansky via Digitalmars-d

On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
How difficult would it be to migrate an existing modern 
GC-implementation into D's?


Which one? None of the even close to advanced GCs are pluggable; 
most, in addition to being hardwired to a runtime/VM codebase, 
also rely on things like:
- particular object layout as in object header (Java, Dart + many 
JavaScript engines certainly do this)

- safe points and custom stackmaps
- some use tagged pointers and forbid explicit pointer arithmetic
- most heavily rely on GC pointers not being mixed with non-GC 
pointers
- generational ones need write barriers (pieces of code that 
guard each assignment of reference)

- most concurrent ones use read-barriers as well
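
To illustrate what a write barrier means in practice, here is a toy card-marking sketch. Everything in it is an assumption made up for the example (the Node type, the fixed heap array standing in for an old-generation region, the 512-byte card size, the names initCards and writeRef); it is not how druntime or any particular collector does it. The point is that every reference store has to go through writeRef, which D cannot guarantee for arbitrary pointer writes.

```d
enum cardShift = 9;                 // 512-byte cards (assumed size)

struct Node { Node* next; }

__gshared Node[256] heap;           // stand-in for an old-generation region
__gshared ubyte[] cards;            // one dirty flag per card

void initCards()
{
    cards = new ubyte[]((heap.sizeof >> cardShift) + 1);
}

// The barrier: perform the store, then mark the card containing the slot
// so a later young-generation collection knows to rescan it.
void writeRef(ref Node* slot, Node* value)
{
    slot = value;
    auto offset = (cast(size_t) &slot) - (cast(size_t) heap.ptr);
    cards[offset >> cardShift] = 1;
}

void main()
{
    initCards();
    auto young = new Node;
    writeRef(heap[100].next, young);

    assert(heap[100].next is young);
    assert(cards[(100 * Node.sizeof) >> cardShift] == 1);
}
```

In a language like Java the compiler emits the equivalent of writeRef for every reference store. In D, a plain assignment through a raw pointer, a memcpy, or C code would bypass it.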


Which kinds of GC's would be of interest?


I believe we can get away with parallel mark-sweep + 
snapshot-based concurrency. It has some limitations but in D land 
with GC not being the single source of memory it should work fine.



Which attempts have been made already?


I still think that a mostly precise Immix-style GC would also work; 
it won’t be a 1:1 porting job though. Many things to figure out.




Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread Per Nordlöw via Digitalmars-d

On Monday, 9 April 2018 at 18:39:11 UTC, Jack Stouffer wrote:

On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
How difficult would it be to migrate an existing modern 
GC-implementation into D's?


Considering no one has done it, very.


What's the reason for this being so hard?

A programming model that is too lax and enables (has enabled) too 
much bit-fiddling with pointers (classes)?


Re: Migrating an existing more modern GC to D's gc.d

2018-04-09 Thread Jack Stouffer via Digitalmars-d

On Monday, 9 April 2018 at 18:27:26 UTC, Per Nordlöw wrote:
How difficult would it be to migrate an existing modern 
GC-implementation into D's?


Considering no one has done it, very.


Which kinds of GC's would be of interest?


There have been threads about this. I'd do a search for "precise GC" 
in general.



Which attempts have been made already?


https://github.com/dlang/druntime/pull/1603