On Sunday, 15 May 2016 at 12:17:30 UTC, Daniel Murphy wrote:
On 15/05/2016 9:57 PM, Martin Nowak wrote:
On 05/15/2016 01:58 PM, Daniel Murphy wrote:
The biggest advantage of bytecode is not the interpreter speed, it's that by lowering you can substitute VarExps etc with actual references
to memory without modifying the AST.

By working with something lower level than the AST, you should end up
with something much less complex and with fewer special cases.

Which is a bad assessment, you can stick variable indexes into
VarDeclaration (we already do that) and thereby access them in O(1). Converting control flow and references into byte code is far from
trivial, we're talking about another s2ir and e2ir here.

-Martin


For simple types that's true. For more complicated reference types...

Variable indexes are not enough, you also need heap memory, but slices and pointers (and references) can refer to values either on the heap or the stack, and you can have a slice of a member static array of a class on the stack, etc. Then there are closures...

Neither e2ir or s2ir are actually that complex. A lot of the mess there comes from the backend IR interface being rather difficult to work with. We can already save a big chunk of complexity by not having to translate the frontend types. E.g. implementing the logic in the interpreter to correctly unwind through destructors is unlikely to be simpler than lowering to an IR.

Exactly. I think the whole idea of trying to avoid a glue layer is a mistake. CTFE is a backend. It really is. And it should be treated as one. A very simple one, of course. Once you do this, you'll find all sorts of commonalities with the existing glue layers. We should end up with at least 4 backends: DMD, GCD, LDC, and CTFE.

Many people here are acting like this is something complicated, and making dangerous suggestions like using Phobos inside the compiler. (I think everyone who has fixed a compiler bug that was discovered in Phobos, will know what a nightmare that would be. The last thing compiler development needs is another level of complexity in the compiler).

As I've tried to explain, the problems with CTFE historically were never with the CTFE engine itself. They were always with the interface between CTFE and the remainder of the compiler -- finding every case where CTFE can be called, finding all the bizarre cases (tuple variables, variables without a stack because they are local variables declared in comma expressions in global scope, local 'ref' variables, etc), finding all the cases where the syntax trees were invalid...

There's no need for grandiose plans, as if there is some almost-insurmountable problem to be solved. THIS IS NOT DIFFICULT. With the interface cleaned up, it is the well-studied problem of creating an interpreter. Everyone knows how to do this, it's been done thousands of times. The complete test suite is there for you. Someone just needs to do it.

I think I took the approach of using syntax trees about as far as it can go. It's possible, but it's really vile. Look at the code for doing assignments. Bleagh. The only thing in its favour is that originally it was the only implementation that was possible at all. Even the first, minimal step towards creating a ctfe backend -- introducing a syntax-tree-validation step -- simplified parts of the code immensely.

You might imagine that it's easier to work with syntax trees than to start from scratch but I'm certain that's not true. I'm pretty sure that the simplest approach is to use the simplest possible machine-independent bytecode that you can come up with. I had got to the point of starting that, but I just couldn't face doing it in C++.

TL;DR: CTFE is actually a backend, so don't be afraid of creating a glue layer for it.


Reply via email to