Re: Battle-plan for CTFE

2016-05-17 Thread Don Clugston via Digitalmars-d-announce

On Sunday, 15 May 2016 at 12:17:30 UTC, Daniel Murphy wrote:

On 15/05/2016 9:57 PM, Martin Nowak wrote:

On 05/15/2016 01:58 PM, Daniel Murphy wrote:
The biggest advantage of bytecode is not the interpreter speed, it's that by lowering you can substitute VarExps etc. with actual references to memory without modifying the AST.

By working with something lower level than the AST, you should end up with something much less complex and with fewer special cases.


Which is a bad assessment: you can stick variable indexes into VarDeclaration (we already do that) and thereby access them in O(1). Converting control flow and references into byte code is far from trivial; we're talking about another s2ir and e2ir here.

-Martin



For simple types that's true.  For more complicated reference types...


Variable indexes are not enough; you also need heap memory. But slices and pointers (and references) can refer to values either on the heap or the stack, and you can have a slice of a member static array of a class on the stack, etc. Then there are closures...
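Daniel's point about references can be made concrete with a toy sketch. Everything below (the `Region`, `CtfeSlice`, and `CtfeMemory` names and layout) is invented for illustration, not DMD's actual design: the idea is simply that a CTFE slice must record *which* memory region it points into, because a bare variable index cannot distinguish stack slots from heap allocations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: a CTFE slice cannot be a bare variable index,
// because it may point into either interpreter stack slots or
// interpreter heap allocations.
enum class Region { Stack, Heap };

struct CtfeSlice {
    Region region;  // which memory area the slice refers to
    size_t base;    // index of the first element in that area
    size_t length;  // number of elements in the slice
};

struct CtfeMemory {
    std::vector<int> stack;  // one slot per local variable index
    std::vector<int> heap;   // storage for CTFE-allocated arrays

    int &deref(const CtfeSlice &s, size_t i) {
        assert(i < s.length);  // CTFE must catch out-of-bounds access
        return (s.region == Region::Stack ? stack : heap)[s.base + i];
    }
};
```

A real implementation would of course need more (pointers into aggregates, closures, type information), but even this toy shows why "variable indexes in O(1)" alone doesn't cover reference types.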


Neither e2ir nor s2ir is actually that complex. A lot of the mess there comes from the backend IR interface being rather difficult to work with. We can already save a big chunk of complexity by not having to translate the frontend types. E.g. implementing the logic in the interpreter to correctly unwind through destructors is unlikely to be simpler than lowering to an IR.


Exactly. I think the whole idea of trying to avoid a glue layer is a mistake.
CTFE is a backend. It really is. And it should be treated as one. A very simple one, of course.
Once you do this, you'll find all sorts of commonalities with the existing glue layers.
We should end up with at least 4 backends: DMD, GDC, LDC, and CTFE.


Many people here are acting like this is something complicated, and making dangerous suggestions like using Phobos inside the compiler. (I think everyone who has fixed a compiler bug that was discovered in Phobos will know what a nightmare that would be. The last thing compiler development needs is another level of complexity in the compiler.)


As I've tried to explain, the problems with CTFE historically were never with the CTFE engine itself. They were always with the interface between CTFE and the remainder of the compiler -- finding every case where CTFE can be called, finding all the bizarre cases (tuple variables, variables without a stack because they are local variables declared in comma expressions in global scope, local 'ref' variables, etc.), finding all the cases where the syntax trees were invalid...


There's no need for grandiose plans, as if there is some almost-insurmountable problem to be solved. THIS IS NOT DIFFICULT. With the interface cleaned up, it is the well-studied problem of creating an interpreter. Everyone knows how to do this; it's been done thousands of times. The complete test suite is there for you. Someone just needs to do it.


I think I took the approach of using syntax trees about as far as it can go. It's possible, but it's really vile. Look at the code for doing assignments. Bleagh. The only thing in its favour is that originally it was the only implementation that was possible at all. Even the first, minimal step towards creating a CTFE backend -- introducing a syntax-tree-validation step -- simplified parts of the code immensely.


You might imagine that it's easier to work with syntax trees than to start from scratch, but I'm certain that's not true. I'm pretty sure that the simplest approach is to use the simplest possible machine-independent bytecode that you can come up with. I had got to the point of starting that, but I just couldn't face doing it in C++.


TL;DR: CTFE is actually a backend, so don't be afraid of creating a glue layer for it.





Re: Battle-plan for CTFE

2016-05-13 Thread Don Clugston via Digitalmars-d-announce

On Monday, 9 May 2016 at 16:57:39 UTC, Stefan Koch wrote:

Hi Guys,

I have been looking into DMD now to see what I can do about CTFE.

Unfortunately, it is a pretty big mess to untangle. Code responsible for CTFE is in at least 3 files: [dinterpret.d, ctfeexpr.d, constfold.d]. I was shocked to discover that PowExpression actually depends on Phobos! (Depending on the exact code path, it may or may not compile...)


Yes. This is because of lowering. Walter said in his DConf talk that lowering was a success; actually, it's a quick-and-dirty hack that inevitably leads to a disaster.

Lowering always needs to be reverted.

which led me to prematurely state that it worked at CTFE 
[http://forum.dlang.org/thread/ukcoibejffinknrbz...@forum.dlang.org]


My plan is as follows.

Add a new file for my CTFE interpreter and update it gradually to take over more and more of the cases the code in the files mentioned above was used for.

Do dataflow analysis on the code that is to be CTFE'd, so we can tell beforehand whether we need to store state on the CTFE stack or not.


You don't need dataflow analysis. The CtfeCompile pass over the semantic tree was intended to determine how many variables are required by each function.
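A pre-pass of this kind can be as simple as a single recursive walk that counts declarations, so the interpreter can allocate one fixed-size frame per call instead of doing dataflow analysis. This is an illustrative sketch over an invented toy AST (`Node`, `countLocals` are hypothetical names), not DMD's actual CtfeCompile code:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy AST node, not DMD's real one: just enough structure to show
// the counting pass.
struct Node {
    bool declaresVariable = false;
    std::vector<std::unique_ptr<Node>> children;
};

// Walk the function body once, before the first CTFE execution, and
// count how many local variables it declares. The result is the size
// of the stack frame the interpreter needs for this function.
size_t countLocals(const Node &n) {
    size_t total = n.declaresVariable ? 1 : 0;
    for (const auto &c : n.children)
        total += countLocals(*c);  // every declaration gets one slot
    return total;
}
```

Since the count is a property of the function body, it only needs to be computed once and can be cached alongside the function.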


Or, barring proper data-flow analysis: reference-counting the variables on the CTFE stack could also be a solution.


I will post more details as soon as I dive deeper into the code.


The current implementation stores persistent state for every CTFE invocation, while caching nothing, not even the compiled form of a function body.

Because it cannot relax purity.


No. Purity is not why it doesn't save the state. It's because of history.


I think I need to explain the history of CTFE.
Originally, we had constant-folding. Then constant-folding was extended to do things like slicing a string at compile time. Constant folding leaks memory like the Exxon Valdez leaks oil, but that's OK because it only ever happens once.
Then, the constant folding was extended to include function calls, for loops, etc. All using the existing constant-folding code. Now the crappy memory usage is a problem. But it's OK because the CTFE code was kind of a proof-of-concept thing anyway.


Now, everyone asks, why doesn't it use some kind of byte-code interpreter or something?
Well, the reason is, it just wasn't possible. There was actually no single CTFE entry point. Instead, it was a complete mess. For example, with template arguments, the compiler would first try to run CTFE on the argument, with error messages suppressed. If that succeeded, it was a template value argument. If it generated errors, it would then see if it was a type. If that failed as well, it assumed it was a template alias argument.
The other big problem was that CTFE was also often called on a function which had semantic errors.


So, here is what I did with CTFE:
(1) Implemented all the functionality, so that CTFE code can be developed. The valuable legacy of this, which I am immensely proud of, is the file "interpret3.d" in the test suite. It is very comprehensive. If a CTFE implementation passes the test suite, it's good to go.
The CTFE implementation itself is practically worthless. Its value was to get the test suite developed.


(2) Created a single entry point for CTFE. This involved working out rules for every place where CTFE is actually required, removing the horrid speculative execution of CTFE.
It made sure that functions had actually been semantically analyzed before they were executed (there were really horrific cases where a function had its semantic tree modified while it was being executed!).
Getting to this point involved about 60 pull requests and six months of nearly full-time work. Finally it was possible to consider a byte-code interpreter or JITer.


We reached this point around Nov 2012.

(3) Added a 'ctfeCompile' step which runs over the semantic tree the first time the function is executed at compile time. Right now it does nothing much except check that the semantic tree is valid. This detected many errors in the rest of the compiler.


We reached this point around March 2013.

My intention was to extend the ctfeCompile step into a byte-code generator. But then I had to stop working on it and concentrate on looking after my family.


Looking at the code without knowing the history, you'll think the obvious way to do this would be with a byte-code generator or JITer, and wonder why the implementation is so horrible. But for most of the history, that kind of implementation was just not possible.
People come up with all these elaborate schemes to speed up CTFE. It's totally not necessary. All that's needed is a very simple bytecode interpreter.