Re: Battle-plan for CTFE
On Sunday, 17 July 2016 at 15:57:28 UTC, Rory McGuire wrote: Thanks, that would be great; however, I think it's a while off before it can work on your CTFE implementation. It uses Pegged, so getting the Pegged JSON parser or pegged.peg working would be a first step. Pegged uses templates. There is some intersection where it can profit from CTFE improvements, but for the most part it's a different sub-system in the compiler. That said, templates are next on my TODO list.
Re: Battle-plan for CTFE
On Sun, Jul 17, 2016 at 3:28 PM, Stefan Koch via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> On Sunday, 17 July 2016 at 13:04:33 UTC, Rory McGuire wrote:
>> On Sun, Jul 17, 2016 at 2:41 PM, Stefan Koch via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
>> [...]
>>
>> A commit does pretty much the same thing. Tags/branches just make UIs show a "menu" for selecting a change; not important, really. I had just wondered if you were marking these milestones. Someone could always branch off of any of your commits if they wanted to try an alternative approach (or learn from you).
>>
>> Thanks for all the work you are doing. I love CTFE. I have a lot of big vibe (diet) templates, and my own Jade CTFE implementation, so I'm really looking forward to faster compile times.
>>
>> R
>
> If you let me have a look at your jade-impl, I can probably work on speeding it up. Vibe.d is still going to take a while. I need to figure all this UTF stuff out to make string handling robust.

Thanks, that would be great; however, I think it's a while off before it can work on your CTFE implementation. It uses Pegged, so getting the Pegged JSON parser or pegged.peg working would be a first step. If you were just meaning to show me any improvements I could make, the source code is at: https://github.com/rjmcguire/jaded

Cheers, R
Re: Battle-plan for CTFE
On Sunday, 17 July 2016 at 13:04:33 UTC, Rory McGuire wrote: On Sun, Jul 17, 2016 at 2:41 PM, Stefan Koch via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote: [...] A commit does pretty much the same thing. Tags/branches just make UIs show a "menu" for selecting a change; not important, really. I had just wondered if you were marking these milestones. Someone could always branch off of any of your commits if they wanted to try an alternative approach (or learn from you). Thanks for all the work you are doing. I love CTFE. I have a lot of big vibe (diet) templates, and my own Jade CTFE implementation, really looking forward to faster compile times. R If you let me have a look at your jade-impl, I can probably work on speeding it up. Vibe.d is still going to take a while. I need to figure all this UTF stuff out to make string handling robust.
Re: Battle-plan for CTFE
On Sun, Jul 17, 2016 at 2:41 PM, Stefan Koch via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> On Sunday, 17 July 2016 at 12:12:17 UTC, Rory McGuire wrote:
>> Nice! How are you keeping track of changes? Are you tagging / branching each time you get something new working?
>
> I just push a new commit. And add code to the gist.
>
> Why would I branch? It all goes towards the same goal.

A commit does pretty much the same thing. Tags/branches just make UIs show a "menu" for selecting a change; not important, really. I had just wondered if you were marking these milestones. Someone could always branch off of any of your commits if they wanted to try an alternative approach (or learn from you).

Thanks for all the work you are doing. I love CTFE. I have a lot of big vibe (diet) templates, and my own Jade CTFE implementation, really looking forward to faster compile times.

R
Re: Battle-plan for CTFE
On Sunday, 17 July 2016 at 12:12:17 UTC, Rory McGuire wrote: Nice! How are you keeping track of changes? Are you tagging / branching each time you get something new working? I just push a new commit, and add code to the gist. Why would I branch? It all goes towards the same goal.
Re: Battle-plan for CTFE
Nice! How are you keeping track of changes? Are you tagging / branching each time you get something new working?

On Sun, Jul 17, 2016 at 12:12 PM, Stefan Koch via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> On Friday, 15 July 2016 at 20:36:46 UTC, Stefan Koch wrote:
>> I decided to keep a gist updated to represent the current state of what the new engine can handle.
>> https://gist.github.com/UplinkCoder/89faa06311e417aa93ea99bc92934d3e
>
> Internal changes to the bytecode engine make the manipulation of values on the CTFE stack possible.
>
> This allows the following code to compile properly now:
>
>     struct V3 { int x; int y; int z; }
>     int fn() {
>         auto b = V3(1, 2, 3);
>         b.y += 19;
>         return b.y;
>     }
>
>     static assert(fn() == 21);
Re: Battle-plan for CTFE
On Friday, 15 July 2016 at 20:36:46 UTC, Stefan Koch wrote: I decided to keep a gist updated to represent the current state of what the new engine can handle. https://gist.github.com/UplinkCoder/89faa06311e417aa93ea99bc92934d3e

Internal changes to the bytecode engine make the manipulation of values on the CTFE stack possible. This allows the following code to compile properly now:

    struct V3 { int x; int y; int z; }
    int fn() {
        auto b = V3(1, 2, 3);
        b.y += 19;
        return b.y;
    }

    static assert(fn() == 21);
Re: Battle-plan for CTFE
On Thursday, 14 July 2016 at 00:45:53 UTC, Stefan Koch wrote: On Saturday, 9 July 2016 at 20:09:14 UTC, Stefan Koch wrote: I decided to keep a gist updated to represent the current state of what the new engine can handle. https://gist.github.com/UplinkCoder/89faa06311e417aa93ea99bc92934d3e This is the current state after approx. 50 hours of work. String indexing works now. The next step is getting structs right. _Basic_ support for structs has just landed in my branch; only structs with int or uint members are supported for now, and no arrays. But this is coming soon :)
Re: Battle-plan for CTFE
On Saturday, 9 July 2016 at 20:09:14 UTC, Stefan Koch wrote: I decided to keep a gist updated to represent the current state of what the new engine can handle. https://gist.github.com/UplinkCoder/89faa06311e417aa93ea99bc92934d3e This is the current state after approx. 50 hours of work. String indexing works now. The next step is getting structs right.
Re: Battle-plan for CTFE
On Monday, 9 May 2016 at 16:57:39 UTC, Stefan Koch wrote: Hi Guys, I have been looking into DMD now to see what I can do about CTFE. [ ] I will post more details as soon as I dive deeper into the code. I decided to keep a gist updated to represent the current state of what the new engine can handle. https://gist.github.com/UplinkCoder/89faa06311e417aa93ea99bc92934d3e This is the current state after approx. 50 hours of work.
Re: Battle-plan for CTFE
p.s. it is, btw, completely possible to add at least something like a "ctfe tracer" (something like the old BASIC "trace on" command), or even some kind of gdb-like debugger, to the ctfe engine. it won't, of course, find its way into mainline, but it may still be a fun project to do.
Re: Battle-plan for CTFE
On Friday, 8 July 2016 at 23:31:38 UTC, Stefan Koch wrote: and mine is segfaulting in some bizarre ways (i failed my basic ++ and -- math, and so did the stack ;-). still, it is almost working, with support for both compiled and interpreted function calls, an almost full range of integer math, and some string ops. druntime still compiles and passes unittests, but phobos segfaulted somewhere deep in my CTFE engine. i really need to re-check all the added opcodes and codegen. writing that was surprisingly easy. and sometimes simply surprising: "how did you come to me with this, DMD?! eh? aha, i see now, thank you, grep." ;-) some stats: the virtual machine is <30KB of code now. the compiler is ~70KB, but there is a lot of copypasta in there (you know, it happens when the code grows organically). if someone is interested in interop between frontend and backend, writing an engine like this is a nice way to make yourself familiar with it, as the CTFE interpreter effectively gets an already-semanticed AST, with most things correctly lowered and so on. so you only have to copy the dummy CtfeCompiler from dinterpret.d (it only counts vars there, otherwise doing nothing) and start extending it. even when we get the new shiny engine from Stefan, i still recommend that anyone who wishes to understand how compiled and processed code looks inside the frontend at least try to implement some simple bytecode compiler. seeing that you are effectively replacing compiler parts, and that it is still able to compile complex unittests, is... yeah, go find the word. ;-)
Re: Battle-plan for CTFE
On Friday, 8 July 2016 at 12:17:29 UTC, Rene Zwanenburg wrote: On Friday, 8 July 2016 at 11:32:10 UTC, Stefan Koch wrote: I forgot to mention I posted a short article about the CTFE design on my blog. https://codesoldier.blogspot.com Feel free to comment or give suggestions. Thanks! Posts like these are always interesting to read. I noticed a few mistakes: I have been working a rewrite -> I have been working on a rewrite; complicated corer-case -> complicated corner-case; that have to handled correctly -> that have to be handled correctly; a cononical from -> a canonical form. I addressed the spelling mistakes. Thanks for pointing them out. Also, more complex SwitchStatements work now.
Re: Battle-plan for CTFE
On Friday, 8 July 2016 at 11:32:10 UTC, Stefan Koch wrote: I forgot to mention I posted a short article about the CTFE design on my blog. https://codesoldier.blogspot.com Feel free to comment or give suggestions. Thanks! Posts like these are always interesting to read. I noticed a few mistakes: I have been working a rewrite -> I have been working on a rewrite; complicated corer-case -> complicated corner-case; that have to handled correctly -> that have to be handled correctly; a cononical from -> a canonical form.
Re: Battle-plan for CTFE
On Thursday, 7 July 2016 at 17:47:28 UTC, Stefan Koch wrote: On Thursday, 7 July 2016 at 13:55:44 UTC, Stefan Koch wrote: I just made a PR to fix it for CTFE. It's a hack, but then again ... the whole handling of PowExp is a hack. Clarification: now it works on literals. It is still not available during CTFE, and the full non-hackish fix will take a while. I forgot to mention I posted a short article about the CTFE design on my blog. https://codesoldier.blogspot.com Feel free to comment or give suggestions.
Re: Battle-plan for CTFE
On Thursday, 7 July 2016 at 13:55:44 UTC, Stefan Koch wrote: I just made a PR to fix it for CTFE. It's a hack, but then again ... the whole handling of PowExp is a hack. Clarification: now it works on literals. It is still not available during CTFE, and the full non-hackish fix will take a while.
Re: Battle-plan for CTFE
On Tuesday, 10 May 2016 at 00:36:21 UTC, Walter Bright wrote: On 5/9/2016 2:32 PM, Andrej Mitrovic via Digitalmars-d-announce wrote: On 5/9/16, Stefan Koch via Digitalmars-d-announce wrote: I was shocked to discover that the PowExpression actually depends on Phobos! I seem to remember having to implement diagnostics in DMD asking the user to import std.math. I'm fairly confident it had something to do with power expressions. Yah, it's a mess. :) The pow stuff should just be done in dmd without reference to the library. I just made a PR to fix it for CTFE. It's a hack, but then again ... the whole handling of PowExp is a hack.
Re: Battle-plan for CTFE
yay. ackermann(3, 7) in 70 milliseconds! and it does TCO. the default interpreter is unable to execute that -- the recursion is too deep.
Re: Battle-plan for CTFE
just... can't... stop... some current statistics:

druntime: total interpreted calls: 27,146; total compiled calls: 1,330
phobos: total interpreted calls: 6,492,826; total compiled calls: 19,975

i can't do function calls from compiled code yet, so the number of compiled calls is low, and no speedup can be seen. also, i can only do integer math and string concatenation. yes, phobos has a LOT of CTFE! ;-) these are statistics from "make -f posix.mak unittest" (yes, i can pass all unittests too!) still, not that bad for a toy implementation! ;-)
Re: Battle-plan for CTFE
On Wednesday, 6 July 2016 at 08:16:32 UTC, Rory McGuire wrote: Thought as much, wondered if the ideas about how they work were being shared; often one idea sparks another. sure, some ideas flow; mostly very high-level. for example, the proper fallback to the original interpreter was invented by Stefan, but i was faster to make it work. ;-) after all, we aren't competitors. he is still working hard on the full implementation, and i am just having some fun, investigating the creation of DMD backends along the way.
Re: Battle-plan for CTFE
On Wed, Jul 6, 2016 at 8:09 AM, ketmar via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> On Wednesday, 6 July 2016 at 05:54:48 UTC, Rory McGuire wrote:
>> Are you sharing this code
>
> yes. we are chatting with Stefan in IRC, and the repository is public (i mean, the link was there ;-). yet it won't compile with "vanilla" dmd anyway -- i require my dmd fork ("aliced"). and i don't really want to make it more public, as i have no intention of turning that into a full-blown engine.

Nice! Sounds like fun.

> also, not only are our codebases completely different, but our designs and approaches are drastically different. we just can't share/reuse the code; those projects are as compatible as VLIW and Z80. ;-)

Thought as much, wondered if the ideas about how they work were being shared; often one idea sparks another.

>> how you did it with the GSOC intern?
>
> Stefan is not a GSoC intern. ;-)

doh! my bad.
Re: Battle-plan for CTFE
On Tuesday, 5 July 2016 at 21:11:39 UTC, deadalnix wrote: On Monday, 4 July 2016 at 07:29:49 UTC, ZombineDev wrote: Nice work! Any chance that you could also improve AliasSeq algorithms, like those in std.meta, to compile faster and use less memory during compilation? Or is that too different from CTFE? Not that I oppose this, but please keep it focused. Doing a bytecode interpreter is a huge task on its own, and this seems largely orthogonal. I was actually wondering whether they're orthogonal or some of the CTFE improvements could be reused. I was not saying that he should work on both if they're not related.
Re: Battle-plan for CTFE
On Wednesday, 6 July 2016 at 05:54:48 UTC, Rory McGuire wrote: Are you sharing this code i mean "my code is publicly available, but we aren't sharing any code between our projects".
Re: Battle-plan for CTFE
On Wednesday, 6 July 2016 at 05:54:48 UTC, Rory McGuire wrote: Are you sharing this code yes. we are chatting with Stefan in IRC, and the repository is public (i mean, the link was there ;-). yet it won't compile with "vanilla" dmd anyway -- i require my dmd fork ("aliced"). and i don't really want to make it more public, as i have no intention of turning that into a full-blown engine. also, not only are our codebases completely different, but our designs and approaches are drastically different. we just can't share/reuse the code; those projects are as compatible as VLIW and Z80. ;-) how you did it with the GSOC intern? Stefan is not a GSoC intern. ;-)
Re: Battle-plan for CTFE
On Tue, Jul 5, 2016 at 11:54 PM, ketmar via Digitalmars-d-announce <digitalmars-d-announce@puremagic.com> wrote:
> and just to make sure that my approach is working: the bytecode compiler now compiles simple free functions without args and returning int (yep, there are some in phobos! ;-), while passing everything else to the old interpreter. it works, and all phobos unittests pass (and some functions were actually executed by the VM).
>
> that's where i should stop, i think, before it goes too far. ;-)

Are you sharing this code / how you did it with the GSOC intern?
Re: Battle-plan for CTFE
and just to make sure that my approach is working: the bytecode compiler now compiles simple free functions without args and returning int (yep, there are some in phobos! ;-), while passing everything else to the old interpreter. it works, and all phobos unittests pass (and some functions were actually executed by the VM). that's where i should stop, i think, before it goes too far. ;-)
Re: Battle-plan for CTFE
On Monday, 4 July 2016 at 07:29:49 UTC, ZombineDev wrote: Nice work! Any chance that you could also improve AliasSeq algorithms, like those in std.meta, to compile faster and use less memory during compilation? Or is that too different from CTFE? Not that I oppose this, but please keep it focused. Doing a bytecode interpreter is a huge task on its own, and this seems largely orthogonal.
Re: Battle-plan for CTFE
On Tue, Jul 05, 2016 at 05:44:14PM +, Ola Fosheim Grøstad via Digitalmars-d-announce wrote: > On Tuesday, 5 July 2016 at 16:40:05 UTC, ketmar wrote: > > so, we played a little game with Stefan: i wrote a simple > > stack-based VM implementation for our beloved bug6498, and got > > What is «bug6498»? https://issues.dlang.org/show_bug.cgi?id=6498 T -- Give me some fresh salted fish, please.
Re: Battle-plan for CTFE
On Tuesday, 5 July 2016 at 16:40:05 UTC, ketmar wrote: so, we played a little game with Stefan: i wrote a simple stack-based VM implementation for our beloved bug6498, and got What is «bug6498»?
Re: Battle-plan for CTFE
On Monday, 4 July 2016 at 07:33:48 UTC, ZombineDev wrote: BTW, do you plan to handle stuff like exceptions, virtual calls and associative arrays and if so how? Sounds quite challenging. on the lower level, it's all just arrays and jumps. ;-)
Re: Battle-plan for CTFE
so, we played a little game with Stefan: i wrote a simple stack-based VM implementation for our beloved bug6498, and got 270 msecs with a debug build, 140 msecs with a release build. it can be made even faster with some simple peephole optimizer (it has two hacks to get fair times on 6498, but this is not a good peephole opt ;-). of course, it scales linearly too. don't expect me to join the race, though: i was just curious how well a stack-based VM can do.
Re: Battle-plan for CTFE
On Monday, 4 July 2016 at 07:33:48 UTC, ZombineDev wrote: On Monday, 4 July 2016 at 07:29:49 UTC, ZombineDev wrote: Nice work! Any chance that you could also improve AliasSeq algorithms, like those in std.meta to compile faster and use less memory during compilation? Or is that too different from CTFE? BTW, do you plan to handle stuff like exceptions, virtual calls and associative arrays and if so how? Sounds quite challenging. Templates do interact with CTFE. They will improve as well as a consequence of CTFE being less wasteful. I will do it bit by bit. For now getting a simple parser to work with the new engine is the goal. Everything else comes later.
Re: Battle-plan for CTFE
On Monday, 4 July 2016 at 07:29:49 UTC, ZombineDev wrote: On Monday, 4 July 2016 at 01:54:29 UTC, Stefan Koch wrote: On Thursday, 30 June 2016 at 14:26:29 UTC, Stefan Koch wrote: On Thursday, 30 June 2016 at 14:17:30 UTC, Timon Gehr wrote: Sorry, I had missed this. I see you were able to make progress. It's fine. Honestly the hardest thing was to actually get started. Once I see something working the excitement carries me forward :) I would appreciate a critical opinion, anytime! Another update: _very_ basic SwitchStatements work now :) I expect a simple project like a compile-time bf interpreter to work by the end of next week. Nice work! Any chance that you could also improve AliasSeq algorithms, like those in std.meta, to compile faster and use less memory during compilation? Or is that too different from CTFE? BTW, do you plan to handle stuff like exceptions, virtual calls and associative arrays, and if so, how? Sounds quite challenging.
Re: Battle-plan for CTFE
On Monday, 4 July 2016 at 01:54:29 UTC, Stefan Koch wrote: On Thursday, 30 June 2016 at 14:26:29 UTC, Stefan Koch wrote: On Thursday, 30 June 2016 at 14:17:30 UTC, Timon Gehr wrote: Sorry, I had missed this. I see you were able to make progress. It's fine. Honestly the hardest thing was to actually get started. Once I see something working the excitement carries me forward :) I would appreciate a critical opinion, anytime! Another update: _very_ basic SwitchStatements work now :) I expect a simple project like a compile-time bf interpreter to work by the end of next week. Nice work! Any chance that you could also improve AliasSeq algorithms, like those in std.meta, to compile faster and use less memory during compilation? Or is that too different from CTFE?
Re: Battle-plan for CTFE
On Thursday, 30 June 2016 at 14:26:29 UTC, Stefan Koch wrote: On Thursday, 30 June 2016 at 14:17:30 UTC, Timon Gehr wrote: Sorry, I had missed this. I see you were able to make progress. It's fine. Honestly the hardest thing was to actually get started. Once I see something working the excitement carries me forward :) I would appreciate a critical opinion, anytime! Another update: _very_ basic SwitchStatements work now :) I expect a simple project like a compile-time bf interpreter to work by the end of next week.
Re: Battle-plan for CTFE
On Thursday, 30 June 2016 at 14:17:30 UTC, Timon Gehr wrote: Sorry, I had missed this. I see you were able to make progress. It's fine. Honestly the hardest thing was to actually get started. Once I see something working the excitement carries me forward :) I would appreciate a critical opinion, anytime!
Re: Battle-plan for CTFE
On 08.06.2016 03:20, Stefan Koch wrote: On Friday, 3 June 2016 at 15:46:27 UTC, Stefan Koch wrote: On Monday, 9 May 2016 at 16:57:39 UTC, Stefan Koch wrote: I will post more details as soon as I dive deeper into the code. Okay I briefly evaluated the current IR dmd uses for backend communication, and it seems usable for the purposes of a CTFE-Interpreter/JIT. The Memory-Management issue is mostly orthogonal to CTFE. But nonetheless very important. I have to admit currently I am a bit stuck. @Timon Gehr cooperation would be highly appreciated. Sorry, I had missed this. I see you were able to make progress.
Re: Battle-plan for CTFE
On Thursday, 30 June 2016 at 07:15:30 UTC, Nordlöw wrote: On Thursday, 30 June 2016 at 03:17:38 UTC, Stefan Koch wrote: Until then you can see my progress at https://github.com/UplinkCoder/dmd/tree/newCTFE I will try to always keep the branch in a healthy state. I can't wait to see the benchmarks. Keep up the good work! Currently the interpreter is about 10x-15x slower than native execution, and about 2x-4000x faster than the old interpreter :) Code I can currently compile:

    uint test2(int x) {
        uint r = 0;
        const uint end = 9;
        for (int i = 0; i < end; ++i) {
            for (int j = 0; j < 9; ++j) {
                r += j;
                r += i;
            }
        }
        return r;
    }

dmd's interpreter and mine return the same value :)
Re: Battle-plan for CTFE
On 06/30/2016 05:17 AM, Stefan Koch wrote: > Both. Actually I could not imagine fixing the memory problem without > doing IR interpretation. > I will tackle compiling more difficult code later today. > As soon as I can run my compiletime brainfuck I will open a PR. > > Until then you can see my progress at > https://github.com/UplinkCoder/dmd/tree/newCTFE > I will try to always keep the branch in a healthy state. Looks good and will definitely lead to a proper CTFE interpreter. You might want to go back and look at https://github.com/MartinNowak/dmd/blob/28ffb0ab4fa6950f60c085f33f8a2ce23df7c0cd/src/interpret.c, it could already do a few more things than your interpreter atm. [¹]: https://github.com/MartinNowak/dmd/commits/ctfe
Re: Battle-plan for CTFE
On Thursday, 30 June 2016 at 01:32:47 UTC, Martin Nowak wrote: On Thursday, 30 June 2016 at 01:20:08 UTC, Stefan Koch wrote: First small code example compiles!

    int bug6498(int x) {
        int n = 0;
        while (n < x) {
            n++;
        }
        return n;
    }

evaluation of bug6498(100_000_00) took 226 msecs. evaluation of bug6498(100_000_000) took 2228 msecs. The memory allocated by the Evaluator is exactly 12 bytes. The speedup comes from interpreting the IR or fixing the memory leaking? Both. Actually I could not imagine fixing the memory problem without doing IR interpretation. I will tackle compiling more difficult code later today. As soon as I can run my compiletime brainfuck I will open a PR. Until then you can see my progress at https://github.com/UplinkCoder/dmd/tree/newCTFE I will try to always keep the branch in a healthy state.
Re: Battle-plan for CTFE
On Thursday, 30 June 2016 at 01:20:08 UTC, Stefan Koch wrote: First small code example compiles!

    int bug6498(int x) {
        int n = 0;
        while (n < x) {
            n++;
        }
        return n;
    }

evaluation of bug6498(100_000_00) took 226 msecs. evaluation of bug6498(100_000_000) took 2228 msecs. The memory allocated by the Evaluator is exactly 12 bytes. The speedup comes from interpreting the IR or fixing the memory leaking?
Re: Battle-plan for CTFE
First small code example compiles!

    int bug6498(int x) {
        int n = 0;
        while (n < x) {
            n++;
        }
        return n;
    }

evaluation of bug6498(100_000_00) took 226 msecs. evaluation of bug6498(100_000_000) took 2228 msecs. The memory allocated by the Evaluator is exactly 12 bytes.
Re: Battle-plan for CTFE
On Friday, 3 June 2016 at 15:46:27 UTC, Stefan Koch wrote: On Monday, 9 May 2016 at 16:57:39 UTC, Stefan Koch wrote: I will post more details as soon as I dive deeper into the code. Okay I briefly evaluated the current IR dmd uses for backend communication, and it seems usable for the purposes of a CTFE-Interpreter/JIT. The Memory-Management issue is mostly orthogonal to CTFE. But nonetheless very important. I have to admit currently I am a bit stuck. @Timon Gehr cooperation would be highly appreciated.
Re: Battle-plan for CTFE
On Monday, 9 May 2016 at 16:57:39 UTC, Stefan Koch wrote: I will post more details as soon as I dive deeper into the code. Okay I briefly evaluated the current IR dmd uses for backend communication, and it seems usable for the purposes of a CTFE-Interpreter/JIT. The Memory-Management issue is mostly orthogonal to CTFE. But nonetheless very important.
Re: Battle-plan for CTFE
On Saturday, 28 May 2016 at 12:27:26 UTC, Stefan Koch wrote: On Friday, 27 May 2016 at 23:31:24 UTC, Stefan Koch wrote: On Monday, 9 May 2016 at 16:57:39 UTC, Stefan Koch wrote: Hi Guys, I have been looking into DMD now to see what I can do about CTFE. I will post more details as soon as I dive deeper into the code. Update:

    int bug6498(int x) {
        int n = 0;
        while (n < x)
            ++n;
        return n;
    }

    static assert(bug6498(10_000_000) == 10_000_000);

This compiles now for me. Although it still takes 10 seconds ;) Big Whoop! Just make the number slightly larger and it fails again. but smaller than 2_147_483_647, right :)
Re: Battle-plan for CTFE
On Friday, 27 May 2016 at 23:31:24 UTC, Stefan Koch wrote: On Monday, 9 May 2016 at 16:57:39 UTC, Stefan Koch wrote: Hi Guys, I have been looking into DMD now to see what I can do about CTFE. I will post more details as soon as I dive deeper into the code. Update:

    int bug6498(int x) {
        int n = 0;
        while (n < x)
            ++n;
        return n;
    }

    static assert(bug6498(10_000_000) == 10_000_000);

This compiles now for me. Although it still takes 10 seconds ;) Big Whoop! Just make the number slightly larger and it fails again.
Re: Battle-plan for CTFE
On Monday, 9 May 2016 at 16:57:39 UTC, Stefan Koch wrote: Hi Guys, I have been looking into DMD now to see what I can do about CTFE. I will post more details as soon as I dive deeper into the code. Update:

    int bug6498(int x) {
        int n = 0;
        while (n < x)
            ++n;
        return n;
    }

    static assert(bug6498(10_000_000) == 10_000_000);

This compiles now for me. Although it still takes 10 seconds ;)
Re: Battle-plan for CTFE
On Saturday, 21 May 2016 at 21:20:54 UTC, Martin Nowak wrote: On 05/21/2016 11:18 PM, Martin Nowak wrote: The debugging metaphor would be comparing a program that only uses pointer arithmetic against one that is memory safe, the former can randomly write everywhere from anywhere, the latter could use the wrong reference. It's also similar to comparing assembly code to C. I am just working on a change that introduces a temporary-Stack to the memory management of dmd. So far it crashes :)
Re: Battle-plan for CTFE
On 05/21/2016 11:18 PM, Martin Nowak wrote: > The debugging metaphor would be comparing a program that only uses > pointer arithmetic against one that is memory safe, the former can > randomly write everywhere from anywhere, the latter could use the wrong > reference. It's also similar to comparing assembly code to C.
Re: Battle-plan for CTFE
On 05/18/2016 04:59 PM, Daniel Murphy wrote:
> The bytecode generator and bytecode interpreter can be debugged (and tested!) independently. So the total amount of code will increase but the components themselves will be better isolated and easier to work with.

It's simpler to debug an AST interpreter working with a stack of high-level values than it is to debug a bytecode interpreter where lots of context has been converted to jumping goto code. Just to illustrate my point, here are an AST and a BC interpreter for a very simplistic functional language I wrote recently. The latter is still missing the actual interpretation.
https://github.com/MartinNowak/CC1/commit/ed28b8966de86e7449f93ce4e4cf7aed3082180b
https://github.com/MartinNowak/CC1/commit/899e67cf7038050b86eed533c9165bd2ba06e609
There is nothing simpler about a BC interpreter. Instead you have to deal with converting control flow and computing addressing. The debugging metaphor would be comparing a program that only uses pointer arithmetic against one that is memory safe; the former can randomly write everywhere from anywhere, the latter could at worst use the wrong reference.

> I don't think a possible future need for a JIT is a good reason to avoid a bytecode interpreter.

It's a very good reason, b/c once you work on a JIT, there is no benefit for BC left, i.e. all the extra work for nothing. That said, I doubt we'll need a JIT anytime soon.

> A large part of the work of adding a new (JIT) backend is pinning down the semantics, and adding a bytecode interpreter will certainly help to do that.

The semantic problem is already solved, in a file called interpret.d, by something that's an AST interpreter which just happens to use the wrong value structures and leaks memory. Converting that to BC will be quite difficult; cleaning it up and changing it to use a better stack and deterministic memory management is rather straightforward.
Last but not least, don't forget that we have the same situation since over 3 years already. It has always been similarly easy to write a better interpreter, it just never happened b/c the ambitions never matched the available resources. -Martin
Re: Battle-plan for CTFE
On 05/18/2016 07:50 PM, Stefan Koch wrote:
> Indeed.
>
> I am currently designing an IR to feed into the CTFE evaluator. I am aware that this could potentially make it harder to get things merged since DMD already has the glue layer.

As a compat layer between different interpreters, or as a compat layer between all backends? Adding another translation might not be acceptable, at least for real backends.

> However I do think that the benefits outweigh the drawbacks by far. Especially when one looks at the possibility to eventually plug llvm or the gcc-jit in.

Indeed, but it heavily increases the chance that your project lands on the pile of unfinished D projects.

> My CTFE needs are rather heavyweight, so I will go for a solution that can support this. I believe the pressure on CTFE performance will increase as soon as the performance increases, since this will enable many more things, e.g. running a query optimizer at compile time.

That might be true, but scripting languages are still fast enough to be used everywhere. You won't need native CTFE performance for it to be an enabling technique.

-Martin
Re: Battle-plan for CTFE
On 19/05/2016 3:50 AM, Stefan Koch wrote: I am currently designing an IR to feed into the CTFE Evaluator. I am aware that this could potentially make it harder to get things merged since DMD already has the glue-layer. It's always more difficult to justify merging more complexity. But if you can present a working and superior solution, the specific implementation is less important. It is still important that it matches the existing style of the compiler, especially with respect to adding dependencies. Also be aware that even with agreement on the eventual goal, it is still very slow to get big changes approved, e.g. ddmd took more than twice as long as it should have. This is why I suggested looking for incremental improvements, so you can overlap getting earlier things merged and developing later parts. I would be on the lookout for things that can be justified on their own (untangling AssignExp :) ) and try to push those through first.
Re: Battle-plan for CTFE
On 18/05/2016 9:01 AM, Martin Nowak wrote: Yes, this https://github.com/dlang/dmd/blob/7d00095301c4780b41addcfeb50f4743a9a6c5d4/src/dinterpret.d#L3418 is really ugly and complex, but you won't get rid of this inherent complexity. The e2ir code for AssignExp looks almost the same https://github.com/dlang/dmd/blob/7d00095301c4780b41addcfeb50f4743a9a6c5d4/src/e2ir.c#L2466. IMO this is a different problem, that AssignExp is stupidly complex and needs to be split up. You might imagine that it's easier to work with syntax trees than to start from scratch but I'm certain that's not true. I'm pretty sure that the simplest approach is to use the simplest possible machine-independent bytecode that you can come up with. I had got to the point of starting that, but I just couldn't face doing it in C++. All I'm saying is that interpreting the AST to generate bytecode is going to be as complex as interpreting the AST directly, but then you still need a bytecode interpreter and later might still want a JIT. The bytecode generator and bytecode interpreter can be debugged (and tested!) independently. So the total amount of code will increase but the components themselves will be better isolated and easier to work with. I don't think a possible future need for a JIT is a good reason to avoid a bytecode interpreter. A large part of the work of adding a new (JIT) backend is pinning down the semantics, and adding a bytecode interpreter will certainly help to do that. Using dedicated value types during interpretation instead of recycling the AST for that will also make the transitions clearer and get rid of some of the lifetime complexities in the current code. Meh, sure. But this feels just as difficult as switching to a simple bytecode, without all the benefits.
Re: Battle-plan for CTFE
On 05/17/2016 12:42 PM, Don Clugston wrote:
> There's no need for grandiose plans, as if there is some
> almost-insurmountable problem to be solved. THIS IS NOT DIFFICULT. With
> the interface cleaned up, it is the well-studied problem of creating an
> interpreter. Everyone knows how to do this, it's been done thousands of
> times. The complete test suite is there for you. Someone just needs to
> do it.

Yes, exactly my words.

> I think I took the approach of using syntax trees about as far as it can
> go. It's possible, but it's really vile. Look at the code for doing
> assignments. Bleagh. The only thing in its favour is that originally it
> was the only implementation that was possible at all. Even the first,
> minimal step towards creating a ctfe backend -- introducing a
> syntax-tree-validation step -- simplified parts of the code immensely.

Yes, this https://github.com/dlang/dmd/blob/7d00095301c4780b41addcfeb50f4743a9a6c5d4/src/dinterpret.d#L3418 is really ugly and complex, but you won't get rid of this inherent complexity. The e2ir code for AssignExp looks almost the same https://github.com/dlang/dmd/blob/7d00095301c4780b41addcfeb50f4743a9a6c5d4/src/e2ir.c#L2466.

> You might imagine that it's easier to work with syntax trees than to
> start from scratch but I'm certain that's not true. I'm pretty sure that
> the simplest approach is to use the simplest possible
> machine-independent bytecode that you can come up with. I had got to the
> point of starting that, but I just couldn't face doing it in C++.

All I'm saying is that interpreting the AST to generate bytecode is going to be as complex as interpreting the AST directly, but then you still need a bytecode interpreter and later might still want a JIT. Using dedicated value types during interpretation instead of recycling the AST for that will also make the transitions clearer and get rid of some of the lifetime complexities in the current code.

-Martin
Re: Battle-plan for CTFE
On Sunday, 15 May 2016 at 10:29:21 UTC, Martin Nowak wrote: On 05/10/2016 08:45 AM, Jacob Carlborg wrote: I was listening to a discussion Don and Daniel had about the current implementation of CTFE. They talked about using a byte code interpreter. Even implementing a really crappy byte code interpreter would be a huge improvement. No need for a byte-code interpreter, it mostly just adds overhead and complexity over an AST interpreter. If you want to go really fast you need some sort of JIT anyhow, but a proper interpreter will be orders of magnitude faster than the current implementation. I might refer you to http://dconf.org/2013/talks/chevalier_boisvert.pdf page 59 ff. +1. One needs to walk the tree anyway to generate bytecode, which makes it impossible to be faster for a one-time execution. For frequent executions a JIT is preferable, which leaves bytecode the favorite choice for more than one, but not too many, executions.
Re: Battle-plan for CTFE
On Tuesday, 17 May 2016 at 10:42:30 UTC, Don Clugston wrote: TL;DR: CTFE is actually a backend, so don't be afraid of creating a glue layer for it. My point exactly. The AST is not something I want to handle while inside the interpreter. It introduces too much complexity. There needs to be some kind of further lowering.
Re: Battle-plan for CTFE
On Sunday, 15 May 2016 at 12:17:30 UTC, Daniel Murphy wrote: On 15/05/2016 9:57 PM, Martin Nowak wrote: On 05/15/2016 01:58 PM, Daniel Murphy wrote: The biggest advantage of bytecode is not the interpreter speed, it's that by lowering you can substitute VarExps etc with actual references to memory without modifying the AST. By working with something lower level than the AST, you should end up with something much less complex and with fewer special cases. Which is a bad assessment, you can stick variable indexes into VarDeclaration (we already do that) and thereby access them in O(1). Converting control flow and references into byte code is far from trivial, we're talking about another s2ir and e2ir here. -Martin For simple types that's true. For more complicated reference types... Variable indexes are not enough, you also need heap memory, but slices and pointers (and references) can refer to values either on the heap or the stack, and you can have a slice of a member static array of a class on the stack, etc. Then there are closures... Neither e2ir or s2ir are actually that complex. A lot of the mess there comes from the backend IR interface being rather difficult to work with. We can already save a big chunk of complexity by not having to translate the frontend types. E.g. implementing the logic in the interpreter to correctly unwind through destructors is unlikely to be simpler than lowering to an IR. Exactly. I think the whole idea of trying to avoid a glue layer is a mistake. CTFE is a backend. It really is. And it should be treated as one. A very simple one, of course. Once you do this, you'll find all sorts of commonalities with the existing glue layers. We should end up with at least 4 backends: DMD, GDC, LDC, and CTFE. Many people here are acting like this is something complicated, and making dangerous suggestions like using Phobos inside the compiler. 
(I think everyone who has fixed a compiler bug that was discovered in Phobos, will know what a nightmare that would be. The last thing compiler development needs is another level of complexity in the compiler). As I've tried to explain, the problems with CTFE historically were never with the CTFE engine itself. They were always with the interface between CTFE and the remainder of the compiler -- finding every case where CTFE can be called, finding all the bizarre cases (tuple variables, variables without a stack because they are local variables declared in comma expressions in global scope, local 'ref' variables, etc), finding all the cases where the syntax trees were invalid... There's no need for grandiose plans, as if there is some almost-insurmountable problem to be solved. THIS IS NOT DIFFICULT. With the interface cleaned up, it is the well-studied problem of creating an interpreter. Everyone knows how to do this, it's been done thousands of times. The complete test suite is there for you. Someone just needs to do it. I think I took the approach of using syntax trees about as far as it can go. It's possible, but it's really vile. Look at the code for doing assignments. Bleagh. The only thing in its favour is that originally it was the only implementation that was possible at all. Even the first, minimal step towards creating a ctfe backend -- introducing a syntax-tree-validation step -- simplified parts of the code immensely. You might imagine that it's easier to work with syntax trees than to start from scratch but I'm certain that's not true. I'm pretty sure that the simplest approach is to use the simplest possible machine-independent bytecode that you can come up with. I had got to the point of starting that, but I just couldn't face doing it in C++. TL;DR: CTFE is actually a backend, so don't be afraid of creating a glue layer for it.
Re: Battle-plan for CTFE
On Monday, 16 May 2016 at 12:13:14 UTC, Martin Nowak wrote: Last time people forced me to spend several hours on reimplementing and debugging a BitArray implementation Ouch. src/tk/vec.(h|c) already contained an implementation.
Re: Battle-plan for CTFE
On Tuesday, 10 May 2016 at 11:31:33 UTC, Stefan Koch wrote: Yes I do know the llvm jit, it is slow as a three-legged dog. But I do plan for a way of plugging it in. This is not a main priority however. What about libjit?
Re: Battle-plan for CTFE
On 16/05/2016 9:20 PM, Martin Nowak wrote: On Monday, 16 May 2016 at 10:01:47 UTC, Kagamin wrote: Wasn't it possible to enable GC for the entire compiler? There can be a hybrid approach: 1) first allocate from bump heap 2) when it reaches, say, 200MB, switch to GC. Well, I wouldn't use D's GC for that dedicated heap. Allocations of CTFE values are completely independent and can be freed once the evaluation is finished. Maybe you wouldn't, but you certainly could...
Re: Battle-plan for CTFE
On 05/16/2016 03:03 PM, Martin Nowak wrote:
> ~this()
> {
>     if (impl.onHeap && --impl.heap.refCount == 0)
>         heapAllocator.free(impl.heap);
> }

Of course missing the increment for copies.

this(this)
{
    if (impl.onHeap)
        ++impl.heap.refCount;
}
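[Editor's note] The two halves Martin sketches above (decrement-and-free in the destructor, increment on copy) combine into the usual intrusive ref-counting pattern. Here is a minimal sketch in C++ (D's `this(this)` postblit becomes a copy constructor); the `Heap` type and field names are invented for illustration, not taken from dmd:

```cpp
#include <cassert>

// Hypothetical heap payload with an intrusive reference count, mirroring
// the impl.onHeap / impl.heap.refCount fields in the sketch above.
struct Heap { int refCount = 1; };

struct Value {
    bool onHeap = false;
    Heap* heap = nullptr;

    Value() = default;
    explicit Value(Heap* h) : onHeap(true), heap(h) {}

    // The copy increment that was missing (D's this(this) postblit).
    Value(const Value& rhs) : onHeap(rhs.onHeap), heap(rhs.heap) {
        if (onHeap) ++heap->refCount;
    }
    Value& operator=(const Value&) = delete; // keep the sketch minimal

    // The destructor decrement: the last owner frees the payload.
    ~Value() {
        if (onHeap && --heap->refCount == 0) delete heap;
    }
};
```

Copying bumps the count, each destruction drops it, and the payload is freed exactly once when the last copy dies.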
Re: Battle-plan for CTFE
On 05/16/2016 01:36 PM, Andrei Alexandrescu wrote: > > A reap would be great there! std.experimental.allocator offers that and > a variety of others. -- Andrei Yes indeed, a malloc backed Region Allocator w/ a FreeList or a BitmappedBlock would be a good starting point. That might finally be a compelling enough case to start using phobos in dmd. Last time people forced me to spend several hours on reimplementing and debugging a BitArray implementation [¹]. [¹]: https://github.com/dlang/dmd/pull/5426#discussion_r52833955
Re: Battle-plan for CTFE
On 5/16/16 7:20 AM, Martin Nowak wrote: On Monday, 16 May 2016 at 10:01:47 UTC, Kagamin wrote: Wasn't it possible to enable GC for the entire compiler? There can be a hybrid approach: 1) first allocate from bump heap 2) when it reaches, say, 200MB, switch to GC. Well, I wouldn't use D's GC for that dedicated heap. Allocations of CTFE values are completely independent and can be freed once the evaluation is finished. A reap would be great there! std.experimental.allocator offers that and a variety of others. -- Andrei
Re: Battle-plan for CTFE
On Monday, 16 May 2016 at 10:01:47 UTC, Kagamin wrote: Wasn't it possible to enable GC for the entire compiler? There can be a hybrid approach: 1) first allocate from bump heap 2) when it reaches, say, 200MB, switch to GC. Well, I wouldn't use D's GC for that dedicated heap. Allocations of CTFE values are completely independent and can be freed once the evaluation is finished.
Re: Battle-plan for CTFE
On Sunday, 15 May 2016 at 13:25:42 UTC, Martin Nowak wrote: So we do need a GC or RC for arrays, structs, classes (anything heapish). Values for those could be allocated by a simple bump/region allocator or a dedicated allocator that supports individual freeing (via RC or GC). Wasn't it possible to enable GC for the entire compiler? There can be a hybrid approach: 1) first allocate from bump heap 2) when it reaches, say, 200MB, switch to GC.
Re: Battle-plan for CTFE
On Sunday, 15 May 2016 at 15:09:17 UTC, Stefan Koch wrote: You want it ? Write it. I don't «want» anything.
Re: Battle-plan for CTFE
On Sunday, 15 May 2016 at 14:56:02 UTC, Ola Fosheim Grøstad wrote: Well, this looks really bad. But a solver would get you much more than an interpreter. E.g. proving that asserts always hold etc. You want it ? Write it.
Re: Battle-plan for CTFE
On 05/15/2016 04:00 PM, Daniel Murphy wrote:
> The problem is, if index refers to a single variable on the stack, then
> it's insufficient to refer to a variable inside an aggregate on the
> stack. Then you need to start building constructs for member of struct
> in array of struct pointers and it gets fairly messy... It's all
> solvable, I'm not sure the simplicity would survive.

Taking what I said further down below, there would be one union Value type (untagged b/c the compiler knows what it is). Then an Array could be implemented as HeapValueRef[], a hash table as HeapValueRef[ValueRef], a struct/class as HeapValueRef[] (with HeapValueRef being a pointer or int to a heap-allocated Value and ValueRef being a pointer or int to either a stack value or a heap value). A D Pointer/Ref would just be a ValueRef in the interpreter, and the aforementioned int (2B + for stack, 2B - for heap) encoding should still work for that. Depending on how much pointer arithmetic we support, Pointer must contain a ValueRef to the outermost aggregate and needs to store the type of that and an additional byte offset. The type and offset could then be used to compute the actual pointee. While that sounds a bit complex, it's merely https://en.wikipedia.org/wiki/Offsetof.

> Flow control is really not where the complexity lies IMO. The weird
> ways in which different types of reference types can combine leads to
> either very complicated or very low level descriptions of memory.

Maybe you're right, but it'll be hard to figure out w/o an actual implementation. And the AST still looks like a lot less to code and execute.
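[Editor's note] The "outermost aggregate plus byte offset" pointer Martin describes is the offsetof idea in miniature. A minimal C++ sketch (all names invented; `void*` stands in for the ValueRef, and the type information the interpreter would carry is elided):

```cpp
#include <cassert>
#include <cstddef>

// A hypothetical example aggregate; in the interpreter this would be a
// struct/class value laid out by the frontend.
struct S { int a; int b; };

// Sketch of the pointer representation described above: a reference to
// the outermost aggregate plus a byte offset into it.
struct InterpPointer {
    void* base;          // stands in for the ValueRef to the outermost aggregate
    std::size_t offset;  // byte offset of the pointee within it
};

// Computing the actual pointee from base + offset (here hard-wired to int;
// the real interpreter would consult the stored type).
int* derefInt(const InterpPointer& p) {
    return reinterpret_cast<int*>(static_cast<char*>(p.base) + p.offset);
}
```

Pointer arithmetic within the aggregate then becomes arithmetic on `offset`, and only the outermost base needs to be tracked by the GC/RC scheme.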
Re: Battle-plan for CTFE
On 05/15/2016 02:54 PM, Daniel Murphy wrote: > > We really should have discussed this last week! I talked about it w/ Stefan, and asked him to volunteer for an implementation, that's why we have this thread ;). In any case I'm convinced that the simple-first strategy has a much higher chance to be implemented this year, whereas the bug report [¹] for slow CTFE is already 5 years old. [¹]: https://issues.dlang.org/show_bug.cgi?id=6498
Re: Battle-plan for CTFE
On 15/05/2016 11:25 PM, Martin Nowak wrote: On 05/15/2016 02:17 PM, Daniel Murphy wrote: For simple types that's true. For more complicated reference types... Variable indexes are not enough, you also need heap memory, but slices and pointers (and references) can refer to values either on the heap or the stack, and you can have a slice of a member static array of a class on the stack, etc. Then there are closures... So we do need a GC or RC for arrays, structs, classes (anything heapish). Values for those could be allocated by a simple bump/region allocator or a dedicated allocator that supports individual freeing (via RC or GC). In any case struct Pointer { int index; /* 2B positive values for stack, 2B negative for heap */ } wouldn't be much more complicated than a raw pointer (and a bit simpler to garbage collect). The problem is, if index refers to a single variable on the stack, then it's insufficient to refer to a variable inside an aggregate on the stack. Then you need to start building constructs for member of struct in array of struct pointers and it gets fairly messy... It's all solvable, I'm not sure the simplicity would survive. Neither e2ir or s2ir are actually that complex. A lot of the mess there comes from the backend IR interface being rather difficult to work with. Think of a simple switch statement where you even need to introduce relocations (or keep a list of fixup addresses) b/c you don't know the jump addresses in advance. This is not exactly difficult to do. In a visitor you simply test the cases and execute the first case body. Not to mention that we can reuse existing solutions from the current interpreter (e.g. for gotos see https://github.com/dlang/dmd/blob/0757504342e48e272372b7ac52cda5a333b2a2bc/src/dinterpret.d#L1014 and https://github.com/dlang/dmd/blob/0757504342e48e272372b7ac52cda5a333b2a2bc/src/dinterpret.d#L1094). Flow control is really not where the complexity lies IMO. 
The weird ways in which different types of reference types can combine leads to either very complicated or very low level descriptions of memory.
Re: Battle-plan for CTFE
On Sunday, 15 May 2016 at 12:54:33 UTC, Daniel Murphy wrote: We really should have discussed this last week! I agree. Maybe we can have a skype conference or something in the next days. About the whole to BC or not to BC discussion. As Daniel already outlined, the main purpose is not speed, but having a simple lowered representation to interpret. Many AST interpreters switched to a byte-code because it's much easier to maintain.
Re: Battle-plan for CTFE
On 05/15/2016 02:13 PM, Ola Fosheim Grøstad wrote:
> Well, you can, but it won't bring improvements to the language down the
> line.

Maybe you don't know the actual problem of the current interpreter? It leaks memory like hell b/c it allocates new AST nodes for almost every expression evaluation. Even an interpreter that is as slow as ruby will fly during compile time in comparison to the current one. Let me illustrate the point.

---
import std.algorithm, std.array, std.random, std.range;

enum count = 2 ^^ 10;
enum sorted = {
    auto gen = Random(123);
    return generate!(() => uniform(byte.min, byte.max, gen))
        .take(count).array.sort().release;
}();
pragma(msg, sorted.length);
---

---
count = 2 ** 10
nums = Enumerator.new do |yielder|
  prng = Random.new(123)
  loop do
    yielder.yield prng.rand(-128 .. 127)
  end
end.take(count).sort
print nums.length
---

   N   | CTFE Time | CTFE Mem | Ruby Time | Ruby Mem
-------|-----------|----------|-----------|---------
2^^10  |     0.16s |      82M |     0.11s |     9.3M
2^^11  |     0.22s |     110M |     0.12s |     9.3M
2^^12  |     0.4s  |     190M |     0.12s |     9.4M
2^^13  |     0.7s  |     450M |     0.12s |     9.5M
2^^14  |     1.5s  |     1.4G |     0.12s |     9.7M
2^^15  |     3.7s  |     4.8G |     0.13s |    10.0M
2^^16  |     5:30m |      15G |     0.13s |    10.8M

D's CTFE grows O(N^2) b/c it leaks for almost every operation. We don't currently need a superfast interpreter, even the simplest possible interpreter will allow so much more that we're more likely limited by the lack of I/O before we need a faster interpreter.
Re: Battle-plan for CTFE
On 05/15/2016 02:17 PM, Daniel Murphy wrote:
> For simple types that's true. For more complicated reference types...
>
> Variable indexes are not enough, you also need heap memory, but slices
> and pointers (and references) can refer to values either on the heap or
> the stack, and you can have a slice of a member static array of a class
> on the stack, etc. Then there are closures...

So we do need a GC or RC for arrays, structs, classes (anything heapish). Values for those could be allocated by a simple bump/region allocator or a dedicated allocator that supports individual freeing (via RC or GC). In any case

struct Pointer
{
    int index; /* 2B positive values for stack, 2B negative for heap */
}

wouldn't be much more complicated than a raw pointer (and a bit simpler to garbage collect).

> Neither e2ir or s2ir are actually that complex. A lot of the mess there
> comes from the backend IR interface being rather difficult to work with.

Think of a simple switch statement where you even need to introduce relocations (or keep a list of fixup addresses) b/c you don't know the jump addresses in advance. In a visitor you simply test the cases and execute the first case body. Not to mention that we can reuse existing solutions from the current interpreter (e.g. for gotos see https://github.com/dlang/dmd/blob/0757504342e48e272372b7ac52cda5a333b2a2bc/src/dinterpret.d#L1014 and https://github.com/dlang/dmd/blob/0757504342e48e272372b7ac52cda5a333b2a2bc/src/dinterpret.d#L1094).
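[Editor's note] The index-encoded Pointer proposed above (non-negative indexes address interpreter stack slots, negative indexes the CTFE heap) can be sketched in a few lines. This C++ illustration invents the `Value` type and slot tables; it only shows the encoding, not the aggregate-member case Daniel raises:

```cpp
#include <cassert>
#include <vector>

// Invented interpreter state: one slot table per memory region.
using Value = long long;

std::vector<Value> stackSlots; // one slot per enumerated VarDeclaration
std::vector<Value> heapSlots;  // bump/region-allocated CTFE heap

// The encoding from the post: index >= 0 -> stack, index < 0 -> heap.
struct Pointer { int index; };

Value& deref(Pointer p) {
    return p.index >= 0 ? stackSlots[p.index]
                        : heapSlots[-p.index - 1];
}
```

A garbage collector only has to scan the two slot tables for negative indexes, which is what makes this simpler to collect than raw pointers.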
Re: Battle-plan for CTFE
On 15/05/2016 9:57 PM, Martin Nowak wrote: On 05/15/2016 01:58 PM, Daniel Murphy wrote: The biggest advantage of bytecode is not the interpreter speed, it's that by lowering you can substitute VarExps etc with actual references to memory without modifying the AST. By working with something lower level than the AST, you should end up with something much less complex and with fewer special cases. Which is a bad assessment, you can stick variable indexes into VarDeclaration (we already do that) and thereby access them in O(1). Converting control flow and references into byte code is far from trivial, we're talking about another s2ir and e2ir here. -Martin We really should have discussed this last week!
Re: Battle-plan for CTFE
On 15/05/2016 9:57 PM, Martin Nowak wrote: On 05/15/2016 01:58 PM, Daniel Murphy wrote: The biggest advantage of bytecode is not the interpreter speed, it's that by lowering you can substitute VarExps etc with actual references to memory without modifying the AST. By working with something lower level than the AST, you should end up with something much less complex and with fewer special cases. Which is a bad assessment, you can stick variable indexes into VarDeclaration (we already do that) and thereby access them in O(1). Converting control flow and references into byte code is far from trivial, we're talking about another s2ir and e2ir here. -Martin For simple types that's true. For more complicated reference types... Variable indexes are not enough, you also need heap memory, but slices and pointers (and references) can refer to values either on the heap or the stack, and you can have a slice of a member static array of a class on the stack, etc. Then there are closures... Neither e2ir or s2ir are actually that complex. A lot of the mess there comes from the backend IR interface being rather difficult to work with. We can already save a big chunk of complexity by not having to translate the frontend types. E.g. implementing the logic in the interpreter to correctly unwind through destructors is unlikely to be simpler than lowering to an IR.
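[Editor's note] To make "unwind through destructors" concrete: whatever representation the interpreter uses, an early exit (return, break, exception) must run the pending destructors of every scope being left, in reverse order. A toy C++ model of the bookkeeping, with all names invented:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Each entered scope records the destructor calls it owes; leaving scopes
// early runs them in reverse declaration order, innermost scope first.
struct Interp {
    std::vector<std::vector<std::function<void()>>> scopes;

    void enterScope() { scopes.emplace_back(); }
    void registerDtor(std::function<void()> dtor) {
        scopes.back().push_back(std::move(dtor));
    }

    // Leave scopes until only `remaining` are left.
    void unwindTo(std::size_t remaining) {
        while (scopes.size() > remaining) {
            auto& s = scopes.back();
            for (auto it = s.rbegin(); it != s.rend(); ++it) (*it)();
            scopes.pop_back();
        }
    }
};
```

Whether this bookkeeping lives in an AST visitor or is lowered into explicit cleanup blocks in an IR is exactly the trade-off being debated above.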
Re: Battle-plan for CTFE
On Sunday, 15 May 2016 at 12:00:43 UTC, Martin Nowak wrote: On 05/15/2016 01:55 PM, Ola Fosheim Grøstad wrote: If you are going to have fast evaluation of loops/recursion then you need to use a solver. And well, doing worse than O(log N) at compile time is a very bad idea. Why not start with the most difficult case first? Then the simple cases will resolve themselves for free, most likely. Why not do something that takes about a month and is much more likely to succeed? Well, you can, but it won't bring improvements to the language down the line. If typical CTFE can be rewritten as simple tail recursion then you probably can find existing libraries that will do it for you.
Re: Battle-plan for CTFE
On 05/15/2016 02:02 PM, Stefan Koch wrote:
> Correct. A ByteCode Interpreter will add even more implementation
> overhead, and the benefit is only realizable if the ByteCode is a
> standard format that can be read by other backends such as a JIT.

This indeed would be an interesting proposal, an interpretable IR that is portable between the backends. But I guess we won't find such an IR, or at least need to lower it into RealIR for dmd/gcc/llvm. Sounds like an interesting candidate for the next GSoC.
Re: Battle-plan for CTFE
On 05/15/2016 01:55 PM, Ola Fosheim Grøstad wrote: > If you are going to have fast evaluation of loops/recursion then you > need to use a solver. And well, doing worse than O(log N) at compile > time is a very bad idea. > > Why not start with the most difficult case first? Then the simple cases > will resolve themselves for free, most likely. Why not do something that takes about a month and is much more likely to succeed? If someone has more time and needs an even faster interpreter she can write a new one, or add optimizations or JIT to the simple interpreter.
Re: Battle-plan for CTFE
On 05/15/2016 01:58 PM, Daniel Murphy wrote: > The biggest advantage of bytecode is not the interpreter speed, it's > that by lowering you can substitute VarExps etc with actual references > to memory without modifying the AST. > > By working with something lower level than the AST, you should end up > with something much less complex and with fewer special cases. Which is a bad assessment, you can stick variable indexes into VarDeclaration (we already do that) and thereby access them in O(1). Converting control flow and references into byte code is far from trivial, we're talking about another s2ir and e2ir here. -Martin
Re: Battle-plan for CTFE
On Sunday, 15 May 2016 at 10:29:21 UTC, Martin Nowak wrote: On 05/10/2016 08:45 AM, Jacob Carlborg wrote: I was listening to a discussion Don and Daniel had about the current implementation of CTFE. They talked about using a byte code interpreter. Even implementing a really crappy byte code interpreter would be a huge improvement. No need for a byte-code interpreter, it mostly just adds overhead and complexity over an AST interpreter. If you want to go really fast you need some sort of JIT anyhow, but a proper interpreter will be orders of magnitude faster than the current implementation. I might refer you to http://dconf.org/2013/talks/chevalier_boisvert.pdf page 59 ff. Correct. A ByteCode Interpreter will add even more implementation overhead, and the benefit is only realizable if the ByteCode is a standard format that can be read by other backends such as a JIT.
Re: Battle-plan for CTFE
On 15/05/2016 8:29 PM, Martin Nowak wrote: No need for a byte-code interpreter, it mostly just adds overhead and complexity over an AST interpreter. If you want to go really fast you need some sort of JIT anyhow, but a proper interpreter will be orders of magnitude faster than the current implementation. The biggest advantage of bytecode is not the interpreter speed, it's that by lowering you can substitute VarExps etc with actual references to memory without modifying the AST. By working with something lower level than the AST, you should end up with something much less complex and with fewer special cases. The current goal is not a full JIT, just something that manages memory in a less insane way.
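[Editor's note] Daniel's point about substituting VarExps with memory references can be shown with a toy lowering. In this invented bytecode (not dmd's), a variable is just a slot index into frame memory; `x = x + 1` with `x` in slot 0 lowers to `LOAD 0, PUSH_IMM 1, ADD, STORE 0`:

```cpp
#include <cassert>
#include <vector>

// Invented minimal opcode set: after lowering, a VarExp is just a slot
// index into frame memory rather than an AST node.
enum Op { LOAD, PUSH_IMM, ADD, STORE };
struct Insn { Op op; long long arg; };

// Execute straight-line bytecode against a frame of variable slots.
void run(const std::vector<Insn>& code, std::vector<long long>& slots) {
    std::vector<long long> st; // operand stack
    for (const Insn& i : code) {
        switch (i.op) {
        case LOAD:     st.push_back(slots[i.arg]); break; // read variable
        case PUSH_IMM: st.push_back(i.arg); break;        // literal
        case ADD: { long long b = st.back(); st.pop_back(); st.back() += b; break; }
        case STORE:    slots[i.arg] = st.back(); st.pop_back(); break; // write variable
        }
    }
}
```

The interpreter loop never touches the AST, which is the whole argument: name resolution happens once, at lowering time, instead of on every execution.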
Re: Battle-plan for CTFE
On 05/09/2016 06:57 PM, Stefan Koch wrote:
> I was shocked to discover that the PowExpression actually depends on
> phobos! (depending on the exact codePath it may or may not compile...)
> which led to me prematurely stating that it worked at ctfe
> [http://forum.dlang.org/thread/ukcoibejffinknrbz...@forum.dlang.org]

There is a really old bug report for that [Issue 3749 – cannot evaluate yl2x (log) and exp functions at compile time](https://issues.dlang.org/show_bug.cgi?id=3749). The lack of exp is really limiting for many nice table precomputation use-cases in scientific contexts.

-Martin
Re: Battle-plan for CTFE
On Sunday, 15 May 2016 at 10:29:21 UTC, Martin Nowak wrote: On 05/10/2016 08:45 AM, Jacob Carlborg wrote: overhead and complexity over an AST interpreter. If you want to go really fast you need some sort of JIT anyhow, but a proper interpreter will be orders of magnitude faster than the current implementation. If you are going to have fast evaluation of loops/recursion then you need to use a solver. And well, doing worse than O(log N) at compile time is a very bad idea. Why not start with the most difficult case first? Then the simple cases will resolve themselves for free, most likely.
Re: Battle-plan for CTFE
On 05/13/2016 06:32 PM, Stefan Koch wrote:
> I would like to work on a solution that does scale. The Problem is
> not making a byteCode-interpreter. That part is relatively easy.
> Currently I am trying to get a detailed understanding of dmd and
> its data-structures (mainly its AST).
>
> Generating the byte-code seems to be non-trivial.
>
> I wonder to what extent the glue layer can be of help...

Seems like I have to repeat this once more, b/c everyone including me didn't get it in the first place. We don't need a bytecode interpreter, it mostly adds overhead and a complicated second layer between walking the AST and interpreting it (think of transforming a for loop with goto into linear bytecode, almost as complicated as in the glue layer). What we basically need is a stack of values, a stack of frames (for function calls and variables in scopes), and an AST visitor that does the interpretation. It would be most helpful for the success of this to follow common CS examples like [¹], [²], or [³]. With realistic expectations we might have a working implementation in a month or so. With more requirements like bytecode, using dmd's backend, or JIT we end up with a long newsgroup discussion ;).

Tricky things for a CTFE interpreter include:

- enumerating VarDeclarations (they already have a ctfeAdrOnStack field) in each scope, and referring to outer variables from nested scopes. At best just use a contiguous stack and set the stack pointer to the last frame pointer when leaving a scope.

- getting function calls right. Push arguments; on return, shift the top of the stack under the arguments and pop them (caller cleanup). If possible detect and support tail recursion.

- converting AST values to and from Interpreter Values. Literals and constant VarExps from the AST need to be converted to an interpreter Value before being pushed on the stack. The result of interpretation (top of stack) needs to be converted back to an AST literal. 
Using separate data types (instead of creating AST values in the interpreter) will be a major performance improvement over using AST values (e.g. 16 vs. ~100 bytes). It also creates a clear boundary between Interpreter and AST values. Currently quite some complexity is thrown at cleaning interpreter-generated AST values, and distinguishing interpreter-owned from normal AST values (which allows some optimizations) [⁴]. We don't need a tagged union b/c the AST already contains the type information, but a tag could be helpful for debugging [⁵]. Values can be deleted when popped from the stack (no ref counting needed I think).

- Implementing more complex data structures (arrays, strings, hash tables, aggregates). Use Value[], Value[Value], and a dedicated String (char[]/wchar[]/dchar[]). For structs/classes field indexes are known => use fixed-size Value[]. Value.class_ needs to hold a reference to the actual class instance for vtable calls.

Last time I was working on this (also on a bytecode interpreter) the entry point was fairly clear [⁶] (thanks to Don).

-Martin

[¹]: [The Interpreter In An Undergraduate Compilers Course](http://arxiv.org/pdf/1412.0426.pdf)
[²]: [L8: Interpreters & Visitors](http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-005-elements-of-software-construction-fall-2011/lecture-notes/MIT6_005F11_lec08.pdf)
[³]: [PA 2: Interpreter](https://sites.google.com/a/bodik.org/cs164/projects/pa2)
[⁴]: https://github.com/dlang/dmd/search?q=ownedByCtfe
[⁵]: https://github.com/MartinNowak/dmd/blob/28ffb0ab4fa6950f60c085f33f8a2ce23df7c0cd/src/interpret.c#L73
[⁶]: https://github.com/MartinNowak/dmd/blob/28ffb0ab4fa6950f60c085f33f8a2ce23df7c0cd/src/interpret.c#L693
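[Editor's note] The shape Martin outlines (an AST visitor, small dedicated interpreter Values instead of AST nodes, variables pre-resolved to frame slots) can be condensed into a toy sketch. This is C++ rather than dmd's actual classes, and every name here is invented for illustration:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// A small dedicated interpreter value (vs. a ~100-byte AST literal).
struct Value { long long i; };

struct Expr {
    virtual ~Expr() = default;
    // The current frame is just a vector of Values.
    virtual Value eval(std::vector<Value>& frame) = 0;
};

// An AST literal, converted once into an interpreter Value.
struct IntLit : Expr {
    long long v;
    explicit IntLit(long long v) : v(v) {}
    Value eval(std::vector<Value>&) override { return {v}; }
};

// A variable pre-resolved to a frame slot (the ctfeAdrOnStack idea).
struct VarExp : Expr {
    int slot;
    explicit VarExp(int s) : slot(s) {}
    Value eval(std::vector<Value>& frame) override { return frame[slot]; }
};

struct AddExp : Expr {
    std::unique_ptr<Expr> l, r;
    AddExp(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : l(std::move(l)), r(std::move(r)) {}
    Value eval(std::vector<Value>& frame) override {
        return {l->eval(frame).i + r->eval(frame).i};
    }
};
```

Function calls would push a new frame onto a frame stack before evaluating the callee's body, and the converted result (top of stack) would be turned back into an AST literal only at the very end.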
Re: Battle-plan for CTFE
On 05/10/2016 08:45 AM, Jacob Carlborg wrote:
> I was listening to a discussion Don and Daniel had about the current
> implementation of CTFE. They talked about using a byte code interpreter.
> Even implementing a really crappy byte code interpreter would be a huge
> improvement.

No need for a byte-code interpreter, it mostly just adds overhead and complexity over an AST interpreter. If you want to go really fast you need some sort of JIT anyhow, but a proper interpreter will be orders of magnitude faster than the current implementation. I might refer you to http://dconf.org/2013/talks/chevalier_boisvert.pdf page 59 ff.
Re: Battle-plan for CTFE
On Wednesday, May 11, 2016 07:06:59 maik klein via Digitalmars-d-announce wrote:
> What is the current problem with ctfe?

The biggest problem is that it uses up a _lot_ of memory and is generally slow. For instance, as I understand it, every time it mutates a variable, it actually allocates a new one to hold the new state. Combine this with the fact that the compiler doesn't actually ever free memory (since it's normally more efficient for it to work that way), and you risk running out of memory while compiling. CTFE is a fantastic feature, but it evolved over time rather than being designed up front, and it's suffered a lot because of that. Don did a _lot_ of work to improve it, but he wasn't able to continue working on it, and until now, no one has ever really stepped up to finish the job. Don's post gives a good background on why CTFE is the way it is and some of what he did to make it as solid as it is now: http://forum.dlang.org/post/jmvsbhdpsjgeykpuk...@forum.dlang.org But having someone like Stefan reimplement it will be _huge_, and the D community will be _very_ grateful. - Jonathan M Davis
Re: Battle-plan for CTFE
On Friday, 13 May 2016 at 13:59:57 UTC, Don Clugston wrote: I think I need to explain the history of CTFE. Originally, we had constant-folding. Then constant-folding was extended to do things like slicing a string at compile time. Constant folding leaks memory like the Exxon Valdez leaks oil, but that's OK because it only ever happens once. Then, the constant folding was extended to include function calls, for loops, etc. All using the existing constant-folding code. Now the crappy memory usage is a problem. But it's OK because the CTFE code was kind of a proof-of-concept thing anyway. [...] Thanks for the explanation, and for doing so much work on CTFE. I would like to work on a solution that does scale. The problem is not making a bytecode interpreter. That part is relatively easy. Currently I am trying to get a detailed understanding of dmd and its data structures (mainly its AST). Generating the byte code seems to be non-trivial. I wonder to what extent the glue layer can be of help...
Re: Battle-plan for CTFE
On 13.05.2016 15:59, Don Clugston wrote: All that's needed is a very simple bytecode interpreter. Here is the one I have hacked together: https://github.com/tgehr/d-compiler/blob/master/interpret.d This file does both constant folding and byte-code interpretation for most of the language. I still need to implement exception handling. I'll let you know when it passes interpret3.d. :)
Re: Battle-plan for CTFE
On Monday, 9 May 2016 at 16:57:39 UTC, Stefan Koch wrote: Hi guys, I have been looking into DMD now to see what I can do about CTFE. Unfortunately it is a pretty big mess to untangle. Code responsible for CTFE is in at least 3 files [dinterpret.d, ctfeexpr.d, constfold.d]. I was shocked to discover that the PowExpression actually depends on Phobos! (Depending on the exact code path it may or may not compile...) Yes. This is because of lowering. Walter said in his DConf talk that lowering was a success; actually, it's a quick-and-dirty hack that inevitably leads to a disaster. Lowering always needs to be reverted. Which led me to prematurely state that it worked at CTFE [http://forum.dlang.org/thread/ukcoibejffinknrbz...@forum.dlang.org] My plan is as follows. Add a new file for my CTFE interpreter and update it gradually to take over more and more of the cases the code in the files mentioned above was used for. Do data-flow analysis on the code that is to be CTFE'd so we can tell beforehand if we need to store state in the CTFE stack or not. You don't need dataflow analysis. The CtfeCompile pass over the semantic tree was intended to determine how many variables are required by each function. Or, barring proper data-flow analysis: ref-counting the variables on the CTFE stack could also be a solution. I will post more details as soon as I dive deeper into the code. The current implementation stores persistent state for every CTFE invocation, while caching nothing. Not even the compiled form of a function body. Because it cannot relax purity. No. Purity is not why it doesn't save the state. It's because of history. I think I need to explain the history of CTFE. Originally, we had constant-folding. Then constant-folding was extended to do things like slicing a string at compile time. Constant folding leaks memory like the Exxon Valdez leaks oil, but that's OK because it only ever happens once. 
Then, the constant folding was extended to include function calls, for loops, etc., all using the existing constant-folding code. Now the crappy memory usage is a problem. But it's OK because the CTFE code was kind of a proof-of-concept thing anyway. Now, everyone asks, why doesn't it use some kind of byte-code interpreter or something? Well, the reason is, it just wasn't possible. There was actually no single CTFE entry point. Instead, it was a complete mess. For example, with template arguments, the compiler would first try to run CTFE on the argument, with error messages suppressed. If that succeeded, it was a template value argument. If it generated errors, it would then see if it was a type. If that failed as well, it assumed it was a template alias argument. The other big problem was that CTFE was also often called on a function which had semantic errors. So, here is what I did with CTFE: (1) Implemented all the functionality, so that CTFE code can be developed. The valuable legacy of this, which I am immensely proud of, is the file "interpret3.d" in the test suite. It is very comprehensive. If a CTFE implementation passes the test suite, it's good to go. The CTFE implementation itself is practically worthless. Its value was to get the test suite developed. (2) Created a single entry point for CTFE. This involved working out rules for every place that CTFE is actually required, removing the horrid speculative execution of CTFE. It made sure that functions had actually been semantically analyzed before they were executed (there were really horrific cases where the function had its semantic tree modified while it was being executed!!) Getting to this point involved about 60 pull requests and six months of nearly full-time work. Finally it was possible to consider a byte-code interpreter or JITer. We reached this point around Nov 2012. (3) Added a 'ctfeCompile' step which runs over the semantic tree the first time the function is executed at compile time. 
Right now it does nothing much except check that the semantic tree is valid. This detected many errors in the rest of the compiler. We reached this point around March 2013. My intention was to extend the ctfeCompile step to a byte-code generator. But then I had to stop working on it and concentrate on looking after my family. Looking at the code without knowing the history, you'll think the obvious way to do this would be with a byte-code generator or JITer, and wonder why the implementation is so horrible. But for most of the history, that kind of implementation was just not possible. People come up with all these elaborate schemes to speed up CTFE. It's totally not necessary. All that's needed is a very simple bytecode interpreter.
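For a sense of scale, the "very simple bytecode interpreter" Don has in mind fits in a few dozen lines. Here is a hypothetical stack-machine sketch in C++ (opcodes and encoding invented for illustration; nothing here is dmd's design):

```cpp
#include <cstddef>
#include <vector>

// Tiny stack machine: opcodes and immediate operands share one stream.
enum Op : long { PUSH, ADD, MUL, RET };

long run(const std::vector<long> &code) {
    std::vector<long> stack;
    for (std::size_t pc = 0; pc < code.size(); ++pc) {
        switch (code[pc]) {
        case PUSH:                        // next word is an immediate
            stack.push_back(code[++pc]);
            break;
        case ADD: {                       // pop two, push their sum
            long b = stack.back(); stack.pop_back();
            stack.back() += b;
            break;
        }
        case MUL: {                       // pop two, push their product
            long b = stack.back(); stack.pop_back();
            stack.back() *= b;
            break;
        }
        case RET:                         // top of stack is the result
            return stack.back();
        }
    }
    return 0;
}
```

A one-time pass over the semantic tree (the role Don intended for ctfeCompile) would emit such a stream once per function, after which repeated CTFE calls only touch flat arrays of machine words.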
Re: Battle-plan for CTFE
On Monday, 9 May 2016 at 16:57:39 UTC, Stefan Koch wrote: Hi guys, I have been looking into DMD now to see what I can do about CTFE. Unfortunately it is a pretty big mess to untangle. Code responsible for CTFE is in at least 3 files [dinterpret.d, ctfeexpr.d, constfold.d]. I was shocked to discover that the PowExpression actually depends on Phobos! (Depending on the exact code path it may or may not compile...) which led me to prematurely state that it worked at CTFE [http://forum.dlang.org/thread/ukcoibejffinknrbz...@forum.dlang.org] My plan is as follows. Add a new file for my CTFE interpreter and update it gradually to take over more and more of the cases the code in the files mentioned above was used for. Do data-flow analysis on the code that is to be CTFE'd so we can tell beforehand if we need to store state in the CTFE stack or not. Or, barring proper data-flow analysis: ref-counting the variables on the CTFE stack could also be a solution. I will post more details as soon as I dive deeper into the code. What is the current problem with CTFE? Before I switched from C++ to D a few months ago I was heavily using Boost.Hana in C++. I tried to emulate Hana in D, which worked quite well, but the compile-time performance was absolutely horrific: https://maikklein.github.io/2016/03/01/metaprogramming-typeobject/ After that I tried a few other things and compared the compile times with https://github.com/boostorg/hana/tree/master/benchmark, which I could never beat. The fastest thing, if I remember correctly, was string mixins, but they used up too much memory. But I have to say that I don't know much about the D internals and therefore don't know how I would optimize CT code execution.
Re: Battle-plan for CTFE
On Mon, May 09, 2016 at 05:36:21PM -0700, Walter Bright via Digitalmars-d-announce wrote: > On 5/9/2016 2:32 PM, Andrej Mitrovic via Digitalmars-d-announce wrote: > >On 5/9/16, Stefan Koch via Digitalmars-d-announce > >wrote: > >>I was shocked to discover that the PowExpression actually depends on > >>phobos! > > > >I seem to remember having to implement diagnostics in DMD asking for > >the user to import std.math. I'm fairly confident it had something to > >do with power expressions. > > > >Yah, it's a mess. :) > > > > The pow stuff should just be done in dmd without reference to the > library. +1. It makes little sense for a language built-in operator to require std.math, when the whole point behind the druntime/phobos split is to make it possible for implementations *not* to implement Phobos. T -- Some days you win; most days you lose.
Re: Battle-plan for CTFE
On Tuesday, 10 May 2016 at 07:44:54 UTC, Rory McGuire wrote: On Tue, May 10, 2016 at 8:45 AM, Jacob Carlborg via Digitalmars-d-announce wrote: I was listening to a discussion Don and Daniel had about the current implementation of CTFE. They talked about using a byte code interpreter. Even implementing a really crappy byte code interpreter would be a huge improvement. -- /Jacob Carlborg Does anyone know whether it would be worthwhile keeping the LLVM JIT in mind when writing this new CTFE engine? Yes, I do know the LLVM JIT; it is slow as a three-legged dog. But I do plan for a way of plugging it in. This is not a main priority, however.
Re: Battle-plan for CTFE
On Tue, May 10, 2016 at 8:45 AM, Jacob Carlborg via Digitalmars-d-announce wrote: > > I was listening to a discussion Don and Daniel had about the current > implementation of CTFE. They talked about using a byte code interpreter. > Even implementing a really crappy byte code interpreter would be a huge > improvement. > > -- > /Jacob Carlborg Does anyone know whether it would be worthwhile keeping the LLVM JIT in mind when writing this new CTFE engine?
Re: Battle-plan for CTFE
On Monday, 9 May 2016 at 18:20:46 UTC, Robert burner Schadek wrote: awesome news :-) thank you I very much agree.
Re: Battle-plan for CTFE
On 2016-05-09 18:57, Stefan Koch wrote: Hi guys, I have been looking into DMD now to see what I can do about CTFE. Unfortunately it is a pretty big mess to untangle. Code responsible for CTFE is in at least 3 files [dinterpret.d, ctfeexpr.d, constfold.d]. I was shocked to discover that the PowExpression actually depends on Phobos! (Depending on the exact code path it may or may not compile...) which led me to prematurely state that it worked at CTFE [http://forum.dlang.org/thread/ukcoibejffinknrbz...@forum.dlang.org] My plan is as follows. Add a new file for my CTFE interpreter and update it gradually to take over more and more of the cases the code in the files mentioned above was used for. Do data-flow analysis on the code that is to be CTFE'd so we can tell beforehand if we need to store state in the CTFE stack or not. Or, barring proper data-flow analysis: ref-counting the variables on the CTFE stack could also be a solution. I will post more details as soon as I dive deeper into the code. I was listening to a discussion Don and Daniel had about the current implementation of CTFE. They talked about using a byte code interpreter. Even implementing a really crappy byte code interpreter would be a huge improvement. -- /Jacob Carlborg
Re: Battle-plan for CTFE
On 5/9/16 7:57 PM, Stefan Koch wrote: Hi guys, I have been looking into DMD now to see what I can do about CTFE. Unfortunately it is a pretty big mess to untangle. Code responsible for CTFE is in at least 3 files [dinterpret.d, ctfeexpr.d, constfold.d]. I was shocked to discover that the PowExpression actually depends on Phobos! (Depending on the exact code path it may or may not compile...) which led me to prematurely state that it worked at CTFE [http://forum.dlang.org/thread/ukcoibejffinknrbz...@forum.dlang.org] My plan is as follows. Add a new file for my CTFE interpreter and update it gradually to take over more and more of the cases the code in the files mentioned above was used for. Do data-flow analysis on the code that is to be CTFE'd so we can tell beforehand if we need to store state in the CTFE stack or not. Or, barring proper data-flow analysis: ref-counting the variables on the CTFE stack could also be a solution. I will post more details as soon as I dive deeper into the code. Thanks Stefan, that's a good start! (This is probably better for the general forum.) -- Andrei
Re: Battle-plan for CTFE
On 5/9/2016 2:32 PM, Andrej Mitrovic via Digitalmars-d-announce wrote: On 5/9/16, Stefan Koch via Digitalmars-d-announce wrote: I was shocked to discover that the PowExpression actually depends on phobos! I seem to remember having to implement diagnostics in DMD asking for the user to import std.math. I'm fairly confident it had something to do with power expressions. Yah, it's a mess. :) The pow stuff should just be done in dmd without reference to the library.
Re: Battle-plan for CTFE
On Monday, May 09, 2016 13:24:56 Walter Bright via Digitalmars-d-announce wrote: > On 5/9/2016 9:57 AM, Stefan Koch wrote: > >[...] > > The memory consumption problem, at least, can be resolved by using stack > temporaries instead of allocating new nodes. This was already done in > constfold.d, but not in the rest of the interpreter. > > Doing that will (I predict) also double its speed right there. Previously, Don stated that he thought that simply making it so that CTFE didn't allocate a new integer every time it mutated one would be a _huge_ speed-up by itself (which presumably is what you're talking about with regards to allocating new nodes). Unfortunately, he got too busy to actually do that work, and no one else stepped in to do it. But if Stefan is going to step up and improve CTFE, that's fantastic. It's one of D's best features, but it's also one of its most problematic. Fixing that would be huge. - Jonathan M Davis
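Walter's "stack temporaries instead of allocating new nodes" suggestion, and the per-mutation allocation Don flagged, can be contrasted in a hypothetical sketch (names like `IntExp` and `foldAddAllocating` are invented for illustration, not dmd's actual constfold code):

```cpp
// Stand-in for an AST integer-literal node.
struct IntExp { long value; };

// Old style: each fold heap-allocates a fresh node. Since the compiler
// never frees, every CTFE mutation leaks its predecessor node.
IntExp *foldAddAllocating(const IntExp *a, const IntExp *b) {
    return new IntExp{a->value + b->value};
}

// Suggested style: fold into a caller-provided stack temporary, so a
// loop that mutates a variable a million times allocates nothing.
void foldAddInPlace(IntExp &result, const IntExp &a, const IntExp &b) {
    result.value = a.value + b.value;
}
```

The second form is what "stack temporaries" buys: the result lives in storage the caller already owns, which is why it attacks both the memory blow-up and (via allocator pressure) the speed problem at once.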
Re: Battle-plan for CTFE
On Monday, 9 May 2016 at 20:24:56 UTC, Walter Bright wrote: On 5/9/2016 9:57 AM, Stefan Koch wrote: [...] The memory consumption problem, at least, can be resolved by using stack temporaries instead of allocating new nodes. This was already done in constfold.d, but not in the rest of the interpreter. Doing that will (I predict) also double its speed right there. Thanks, your advice is most helpful and a good first stop-gap. Still, the current state of CTFE is almost unmaintainable, and I would really prefer a clean-slate approach. SDC benefits extremely from the extra level of indirection; however, I do understand that SDC and DMD have different goals regarding compile speed. Also, I feel that good code has found its most beautiful shape when its simplicity makes it inevitable; the CTFE mechanism has not reached that point yet, imo.
Re: Battle-plan for CTFE
On 5/9/16, Stefan Koch via Digitalmars-d-announce wrote: > I was shocked to discover that the PowExpression actually depends > on phobos! I seem to remember having to implement diagnostics in DMD asking for the user to import std.math. I'm fairly confident it had something to do with power expressions. Yah, it's a mess. :)