Re: Discusssion on the Discussion of the Design for a new GC
Am Wed, 30 Apr 2014 19:10:04 + schrieb Orvid King blah38...@gmail.com: Just thought of a minor addition to the guidelines. While discussion of the naming of the public API should occur on the newsgroup, discussion of the names of locals, or non-public APIs should occur on Github. Any objections/concerns/improvements to this? No and I think it comes natural that we don't need to discuss internal variable names here. :) -- Marco
Re: Discusssion on the Discussion of the Design for a new GC
Just thought of a minor addition to the guidelines. While discussion of the naming of the public API should occur on the newsgroup, discussion of the names of locals, or non-public APIs should occur on Github. Any objections/concerns/improvements to this?
Re: Discusssion on the Discussion of the Design for a new GC
On 23.04.2014 20:35, Messenger wrote: On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? What is the state of Rainer Schütze's precise gc? Duplication of effort and all that. The implementation relies on correct RTInfo generation, but that still doesn't work as this pull request is sitting there for 8 months now without getting much review: https://github.com/D-Programming-Language/dmd/pull/2480 Coincidentally, I updated the PR just a couple of days ago. The precise GC changes for druntime are here: https://github.com/rainers/druntime/tree/gcx_precise2
Re: Discusssion on the Discussion of the Design for a new GC
On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? What specific problems do you hope to solve with the new design? Is it only to improve the speed at which the GC runs to completion on a single pass? What about premptability so the GC could run as a low priority task, and be interrupted at any time to run something at a higher priority? Is that out of scope? Mike
Re: Discusssion on the Discussion of the Design for a new GC
On 4/23/14, Joakim via Digitalmars-d digitalmars-d@puremagic.com wrote: It might help if you put forth a tentative proposal, that the D goons can then proceed to destroy... I mean, critically evaluate. Btw guys, what's stopping us from simply porting over Leandro's CDGC to D2? I've looked at the codebase and it doesn't seem to be that large (the following link is the most recent version of the GC I could find): http://www.dsource.org/projects/tango/browser/trunk/tango/core/rt/gc/cdgc I know there was some talk about a missing Windows implementation, but if we had a Posix implementation and it turned out to be really good I don't think it would take a long time before we step up and make a Windows implementation.
Re: Discusssion on the Discussion of the Design for a new GC
On Wednesday, 23 April 2014 at 18:35:26 UTC, Messenger wrote: On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? What is the state of Rainer Schütze's precise gc? Duplication of effort and all that. Also CDGC http://www.llucax.com.ar/proj/dgc/
Re: Discusssion on the Discussion of the Design for a new GC
On Wednesday, 23 April 2014 at 18:35:26 UTC, Messenger wrote: On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? What is the state of Rainer Schütze's precise gc? Duplication of effort and all that. While I do admire the work that has gone into it, there are multiple limitations in his implementation that I believe will prevent his implementation to being the best for D. The single biggest thing is that it was created before a preliminary std.allocators was published, and doesn't take them into account in it's design. I believe it is possible to make these a huge strength of the GC, because it is possible to use them, with minor extensions to the design, to even implement a generational compacting allocator, along with the allocation speed that it comes with. I believe it's design also forgoes the possibility of async collection, which would fail to address the existence of realtime and server applications. I do however intend to have a heap-precise allocator as the default. The extensibility of my design would even allow for DMD style allocation, where it simply never deallocates anything. I'll post a full design soon, but I have to type it all out first, and make sure that it doesn't just end up as some incoherent babbling, and also so that it will make sense to everyone else who's reading it.
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 10:14:07 UTC, Kagamin wrote: On Wednesday, 23 April 2014 at 18:35:26 UTC, Messenger wrote: On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? What is the state of Rainer Schütze's precise gc? Duplication of effort and all that. Also CDGC http://www.llucax.com.ar/proj/dgc/ Ah, I hadn't realized he had actually implemented the concurrent GC he gave a talk on, I will make sure to look over the code before writing out the design. I was also planning on watching the full talk before posting the design, to make sure that it doesn't prevent any aspects of the design from working.
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 14:15:45 UTC, Orvid King wrote: On Thursday, 24 April 2014 at 10:14:07 UTC, Kagamin wrote: On Wednesday, 23 April 2014 at 18:35:26 UTC, Messenger wrote: On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? What is the state of Rainer Schütze's precise gc? Duplication of effort and all that. Also CDGC http://www.llucax.com.ar/proj/dgc/ Ah, I hadn't realized he had actually implemented the concurrent GC he gave a talk on, I will make sure to look over the code before writing out the design. I was also planning on watching the full talk before posting the design, to make sure that it doesn't prevent any aspects of the design from working. Leandro's CDGC is used as default GC in Sociomantic applications, so it is not only implemented but also extensively production-tested :) I have pinged him to give some feedback about issues he has encountered when trying to port it to D2.
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 07:09:27 UTC, Mike wrote: On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? What specific problems do you hope to solve with the new design? Is it only to improve the speed at which the GC runs to completion on a single pass? What about premptability so the GC could run as a low priority task, and be interrupted at any time to run something at a higher priority? Is that out of scope? Mike The goals of my design are to first off reduce the impact of the GC on D as a whole, I would love it if I could get it down to the same, or less impact as the MS.net GC. Secondly I want to make extending, or customizing the GC for each particular use-case significantly easier. Reducing the number of false positives would also be a good outcome. Overall I intend to significantly improve on D's current GC.
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 14:22:17 UTC, Dicebot wrote: On Thursday, 24 April 2014 at 14:15:45 UTC, Orvid King wrote: Ah, I hadn't realized he had actually implemented the concurrent GC he gave a talk on, I will make sure to look over the code before writing out the design. I was also planning on watching the full talk before posting the design, to make sure that it doesn't prevent any aspects of the design from working. Leandro's CDGC is used as default GC in Sociomantic applications, so it is not only implemented but also extensively production-tested :) I have pinged him to give some feedback about issues he has encountered when trying to port it to D2. In that case I'll have to take a much deeper look at it than I was originally planning, I will however definitely attempt to include a concurrent collection mode with my GC design, especially as there is already a well-tested version.
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 14:22:17 UTC, Dicebot wrote: On Thursday, 24 April 2014 at 14:15:45 UTC, Orvid King wrote: On Thursday, 24 April 2014 at 10:14:07 UTC, Kagamin wrote: On Wednesday, 23 April 2014 at 18:35:26 UTC, Messenger wrote: On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? What is the state of Rainer Schütze's precise gc? Duplication of effort and all that. Also CDGC http://www.llucax.com.ar/proj/dgc/ Ah, I hadn't realized he had actually implemented the concurrent GC he gave a talk on, I will make sure to look over the code before writing out the design. I was also planning on watching the full talk before posting the design, to make sure that it doesn't prevent any aspects of the design from working. Leandro's CDGC is used as default GC in Sociomantic applications, so it is not only implemented but also extensively production-tested :) I have pinged him to give some feedback about issues he has encountered when trying to port it to D2. I didn't find any particular issues with the porting, I just didn't get far enough with the porting. My idea was to add patches on top of the current implementation instead of rolling out a complete new implementation from the scratch, since there were a few changes to the GC in D2, the porting should be done with a lot of care. There is still also the problem with threading, see https://www.dsource.org/projects/tango/ticket/2087 for details. BTW, if someone wants to take over the porting, and also do it patch by patch I recommend consulting this repo: http://git.llucax.com.ar/w/software/dgc/cdgc.git (here is where the complete history of the changes live, not in tango). And of course, feel free to contact me for any questions advice you might need!
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 14:15:45 UTC, Orvid King wrote: On Thursday, 24 April 2014 at 10:14:07 UTC, Kagamin wrote: On Wednesday, 23 April 2014 at 18:35:26 UTC, Messenger wrote: On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? What is the state of Rainer Schütze's precise gc? Duplication of effort and all that. Also CDGC http://www.llucax.com.ar/proj/dgc/ Ah, I hadn't realized he had actually implemented the concurrent GC he gave a talk on Haha, you thought I was just inventing the numbers?!? :D
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 06:10:00 UTC, Rainer Schuetze wrote: On 23.04.2014 20:35, Messenger wrote: What is the state of Rainer Schütze's precise gc? Duplication of effort and all that. The implementation relies on correct RTInfo generation, but that still doesn't work as this pull request is sitting there for 8 months now without getting much review: https://github.com/D-Programming-Language/dmd/pull/2480 So this PR is the main blocker to a working heap-precise GC? Given all of the talk and energy spent around the GC issue I'd have thought this would be on some super duper high priority fast path for inclusion. Nothing wrong with other GC projects but this seems like the lowest hanging fruit. Coincidentally, I updated the PR just a couple of days ago. The precise GC changes for druntime are here: https://github.com/rainers/druntime/tree/gcx_precise2
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 17:10:20 UTC, Brian Rogoff wrote: The implementation relies on correct RTInfo generation, but that still doesn't work as this pull request is sitting there for 8 months now without getting much review: https://github.com/D-Programming-Language/dmd/pull/2480 So this PR is the main blocker to a working heap-precise GC? Given all of the talk and energy spent around the GC issue I'd have thought this would be on some super duper high priority fast path for inclusion. Nothing wrong with other GC projects but this seems like the lowest hanging fruit. Another issue is that in 64-bit address space precise GC gives no advantage as false pointers have low probability, so precise GC seems to be not worth the effort.
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 17:33:08 UTC, Kagamin wrote: Another issue is that in 64-bit address space precise GC gives no advantage as false pointers have low probability, so precise GC seems to be not worth the effort. Except a precise gc is type-aware, no? And you could basically ask it to please print out everything that you have currently allocated so I can debug what's allocating.
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 17:33:08 UTC, Kagamin wrote: Another issue is that in 64-bit address space precise GC gives no advantage as false pointers have low probability, so precise GC seems to be not worth the effort. I disagree, in the current state of things there is benefit for 64 bit systems: Currently there are many ways to accidentally allocate without the NO_SCAN flag for data without pointers. The overhead of scanning this non-pointer data can often be significant enough to warrant hunting for mis-flagged allocations.
Re: Discusssion on the Discussion of the Design for a new GC
On Thursday, 24 April 2014 at 18:39:34 UTC, Messenger wrote: Except a precise gc is type-aware, no? And you could basically ask it to please print out everything that you have currently allocated so I can debug what's allocating. Depends on the implementation. One would want to not store data which doesn't help with collection - a pointers bitmask is enough for scanning, but not enough to know the type.
Discusssion on the Discussion of the Design for a new GC
So, in order to get the ball rolling on the new GC I intend to implement for D, I want to facilitate a lively discussion of the design of it, so that it can be designed to be both robust and free of design flaws. To keep the discussion from getting derailed, I want to lay out a few guidelines, but want to get feedback on those guidelines before I actually implement them. My current draft of them is as follows: First we’ll start with a brief overview of the development process: A PR will be created for DMD, DRuntime, and, although it may stay empty, Phobos. A new commit will be created for each update of the implementation, this includes bug fixes, and continuing work on the implementation, in as many iterations as are required. This is done to allow progressive review of the code rather than trying to review the PRs as a whole, because, as it is likely to include several thousand lines of changes to the code, it would be impractical to review all at once. No force push should ever be done to the PRs except to fix a typo in or clarify a detail of the commit message for the newest commit. If there is a typo in a commit message, or it is not very clear on what was actually done, and another commit has already been pushed, the typo or un-clear message shall remain for all eternity. The suggested remedy in this case is to make a note of the typo or clarify the commit message with a comment on the commit. PRs to the PRs are welcome, it is however encouraged to coordinate any work you do with the others actively working on the GC. The primary outlet for this should be the IRC, however, should the need arise, the mailing list is a valid venue for this. Github should be used as the primary outlet for discussion of actual code, due to the ease of referencing code, as well as the ability to tell if a comment is about a piece of code that was already changed. The mailing list should be used exclusively for discussion of the design. It should not be used for discussing snippets of code in the actual implementation. It can, and should be, used to discuss snippets of code that may demonstrate a flaw, weakness, or strength in the design. The IRC should be used for rapid-fire QA, or bringing someone up-to-date with the discussion and progression of the GC so far. Discussion about inconsistencies in the coding style of the implementation (whitespaces, newlines, etc.) should reside exclusively on the IRC, as they are things that a future reader of the discussions doesn’t really care about. If a discussion of the overall code style used in the implementation is required, a thread should be created on the mailing list. The IRC should not be used to facilitate a design discussion. The reason for this is twofold, firstly the IRC has a limited audience, thus limited feedback, and secondly, I want the discussion of the design to stand as documentation for why the GC is designed the way it is. Now, on to the guidelines for the design discussion. ARC does not exist. We are implementing a GC, however, if the opportunity arises to allow an efficient implementation of interfacing with an external ARC platform, such as what is used in Objective C, discussion of that interfacing mechanism is permitted. If DMD support is needed, it exists. This means that if the GC needs DMD to be capable of something such as scope analysis in order to make a particular optimization, then DMD should be assumed to be capable of doing that. While language additions may be proposed, the design must still be able to function should the additions not be done, as the additions should only be to allow for additional optimization opportunities. For instance, re-introducing scoped class locals. After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results?
Re: Discusssion on the Discussion of the Design for a new GC
On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: So, in order to get the ball rolling on the new GC I intend to implement for D, I want to facilitate a lively discussion of the design of it, so that it can be designed to be both robust and free of design flaws. To keep the discussion from getting derailed, I want to lay out a few guidelines, but want to get feedback on those guidelines before I actually implement them. My current draft of them is as follows: First we’ll start with a brief overview of the development process: A PR will be created for DMD, DRuntime, and, although it may stay empty, Phobos. A new commit will be created for each update of the implementation, this includes bug fixes, and continuing work on the implementation, in as many iterations as are required. This is done to allow progressive review of the code rather than trying to review the PRs as a whole, because, as it is likely to include several thousand lines of changes to the code, it would be impractical to review all at once. No force push should ever be done to the PRs except to fix a typo in or clarify a detail of the commit message for the newest commit. If there is a typo in a commit message, or it is not very clear on what was actually done, and another commit has already been pushed, the typo or un-clear message shall remain for all eternity. The suggested remedy in this case is to make a note of the typo or clarify the commit message with a comment on the commit. PRs to the PRs are welcome, it is however encouraged to coordinate any work you do with the others actively working on the GC. The primary outlet for this should be the IRC, however, should the need arise, the mailing list is a valid venue for this. Github should be used as the primary outlet for discussion of actual code, due to the ease of referencing code, as well as the ability to tell if a comment is about a piece of code that was already changed. The mailing list should be used exclusively for discussion of the design. It should not be used for discussing snippets of code in the actual implementation. It can, and should be, used to discuss snippets of code that may demonstrate a flaw, weakness, or strength in the design. The IRC should be used for rapid-fire QA, or bringing someone up-to-date with the discussion and progression of the GC so far. Discussion about inconsistencies in the coding style of the implementation (whitespaces, newlines, etc.) should reside exclusively on the IRC, as they are things that a future reader of the discussions doesn’t really care about. If a discussion of the overall code style used in the implementation is required, a thread should be created on the mailing list. The IRC should not be used to facilitate a design discussion. The reason for this is twofold, firstly the IRC has a limited audience, thus limited feedback, and secondly, I want the discussion of the design to stand as documentation for why the GC is designed the way it is. Now, on to the guidelines for the design discussion. ARC does not exist. We are implementing a GC, however, if the opportunity arises to allow an efficient implementation of interfacing with an external ARC platform, such as what is used in Objective C, discussion of that interfacing mechanism is permitted. If DMD support is needed, it exists. This means that if the GC needs DMD to be capable of something such as scope analysis in order to make a particular optimization, then DMD should be assumed to be capable of doing that. While language additions may be proposed, the design must still be able to function should the additions not be done, as the additions should only be to allow for additional optimization opportunities. For instance, re-introducing scoped class locals. After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? Wow, this takes the D forums tradition of all talk, no code, or as the original saying goes, all hat, no cattle, to a new peak. ;) I think most would agree with you on most of these guidelines, better to get onto the actual design. It might help if you put forth a tentative proposal, that the D goons can then proceed to destroy... I mean, critically evaluate.
Re: Discusssion on the Discussion of the Design for a new GC
On Wednesday, 23 April 2014 at 15:33:36 UTC, Orvid King wrote: After all of that, I intend to include a base draft of the design of the GC, along with opening the PRs and committing the starting API. So, is there something I’m missing? Am I overlooking the obvious? Is there a more practical way to produce the same results? What is the state of Rainer Schütze's precise gc? Duplication of effort and all that.
Re: Discusssion on the Discussion of the Design for a new GC
Am Wed, 23 Apr 2014 18:35:25 + schrieb Messenger d...@shoot.me: What is the state of Rainer Schütze's precise gc? Duplication of effort and all that. +1. And I hope you know what you are up to :D. Some people may expect a magic pill to emerge from your efforts that makes the GC approx. as fast as manual memory management for typical uses or at least as good as the one in Java. We must not forget that Java has just-in-time compilation and no raw pointer access. They might have found clever ways to make use of features/restrictions in Java, that are not available to D. Memory compaction is one from the top of my head. I only know for sure that you are looking into using some ideas from TCMalloc. Other than that, what are the exact problems you are trying to solve? That would be good to know, since different goals might require different implementations. E.g. a precise GC is generally slowed down by checking data types, but it doesn't keep memory alive because some float variable happens to look like a pointer to it. What are the limitations of garbage collection? As an example: If someone loads some million items graph structure into memory, can they still make any assumption about the run time of GC.alloc()? Can generational collection be implemented? -- Marco