On Sun, Jun 20, 2021 at 08:06:26PM -0600, ben via cctalk wrote:
[...]
> My latest gripe, is I still am looking for a algorithm to generate code
> for a single accumulator machine for an arithmetic expression. Parenthesis
> need to evaluated first and temporary variables allotted, thus a two pass
> algorithm. Everything is single pass. Recursive decent can parse but can't
> generate 'correct' code. A-(B+C) is LD B ADD C ST T1 LD A SUB T1, not
> LD A ST T1 LD B ST T2 LD C ADD T2 NEGATE ADD T1
TL;DR: you're probably setting so many implicit design constraints that such an algorithm cannot exist.

That accumulator machine sounds a bit like the 6502, for which there are plenty of BASIC interpreters that can parse and evaluate that expression. BBC BASIC is my favourite, but Apple's or Microsoft's implementations also do the job. The ones I've looked at were recursive-descent; other, more advanced parsers take too much precious memory. The modern hotness for parsing is parser combinators, but they're really just a fancy way to write recursive-descent parsers that are easier to read and reason about. They can certainly parse something as simple as "A-(B+C)", since recursive descent can.

Code generation is a whole different can of worms and, unlike the well-trodden path of parsing, is still not a solved problem in computer science. All of the compiler textbooks I've checked, including the infamous Dragon book, handwave code generation: they introduce the tree-matching algorithm, which works well for straight-line code on their made-up educational CPU architectures but leaves a lot on the table with real-world CPUs. This limitation probably doesn't matter for your CPU.

It seems that you want to produce code directly from parsing. This shortcut can work if you're producing code for a stack machine such as FORTH, but not for a register machine (unless you use your registers as a stack). A common wheeze on the 6502 is to implement a virtual stack machine interpreter and have the compiler target that.

To do code generation "properly", you need a multi-pass algorithm: parse the source into a tree, optionally do tree-to-tree transformations to do type checks and/or apply optimisations, and then apply the tree-matching algorithm to that to produce machine code. This is all well described in any good compiler textbook. Note that except for trivial expressions, you'll need a stack and/or other registers to squirrel away intermediate results.
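To make the "parse into a tree first" step concrete, here is a sketch in Python (names and tuple representation are my own, not taken from any of the interpreters mentioned) of a recursive-descent parser for expressions with variables, the four operators, and parentheses. It does exactly one pass over the input and produces a tree rather than code:

```python
def tokenize(src):
    # Single-letter variables and single-character operators; skip spaces.
    return [ch for ch in src if not ch.isspace()]

def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expr():
        # expr := term (('+'|'-') term)*
        nonlocal pos
        node = term()
        while peek() in ('+', '-'):
            op = tokens[pos]; pos += 1
            node = (op, node, term())
        return node

    def term():
        # term := factor (('*'|'/') factor)*
        nonlocal pos
        node = factor()
        while peek() in ('*', '/'):
            op = tokens[pos]; pos += 1
            node = (op, node, factor())
        return node

    def factor():
        # factor := variable | '(' expr ')'
        nonlocal pos
        tok = tokens[pos]; pos += 1
        if tok == '(':
            node = expr()
            pos += 1            # consume the closing ')'
            return node
        return tok              # a variable name

    return expr()

print(parse(tokenize("A-(B+C)")))   # ('-', 'A', ('+', 'B', 'C'))
```

Note that the parentheses vanish during parsing: they only steer how the tree is built, so the later code-generation pass never has to think about them.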
Consider "(A*B)-(C*D)": you need to save the result of one of the multiplications before doing the other and then the subtraction.

In the case of the tree-matching algorithm, if your CPU (or rather, the model given to the tree-matcher) is too limited to evaluate a given expression, it will fail definitively and hand you the subtree that it cannot match. You then have the "joy" of fixing the model, possibly by adding virtual instructions which are really library calls. For a simple CPU, such calls might account for most of the generated code, at which point a virtual machine becomes a good idea.

The book "Writing interactive compilers and interpreters" may be more suitable if you're looking for quick hacks to get up and running on a small machine, assuming you can lay your hands on a copy.
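For a single accumulator the temporary-spilling pass can be a very small tree walk. A sketch in Python (mnemonics and temporary naming are my own invention, not a real tree-matcher): whenever the right operand is itself a subtree, evaluate it first, spill the accumulator to a fresh temporary, then evaluate the left side and apply the operator against the temporary. On the tree for A-(B+C) this yields precisely LD B / ADD C / ST T1 / LD A / SUB T1.

```python
# Trees are tuples like ('-', 'A', ('+', 'B', 'C')); leaves are variable names.
OPS = {'+': 'ADD', '-': 'SUB', '*': 'MUL', '/': 'DIV'}

def gen(tree, code, temps):
    if isinstance(tree, str):           # leaf: load the variable
        code.append('LD ' + tree)
        return
    op, left, right = tree
    if isinstance(right, str):          # right is a leaf: no temporary needed
        gen(left, code, temps)
        code.append(OPS[op] + ' ' + right)
    else:                               # right is a subtree: spill it first
        gen(right, code, temps)
        temps[0] += 1
        t = 'T%d' % temps[0]
        code.append('ST ' + t)
        gen(left, code, temps)
        code.append(OPS[op] + ' ' + t)

def compile_tree(tree):
    code, temps = [], [0]
    gen(tree, code, temps)
    return code

print(compile_tree(('-', 'A', ('+', 'B', 'C'))))
# ['LD B', 'ADD C', 'ST T1', 'LD A', 'SUB T1']

print(compile_tree(('-', ('*', 'A', 'B'), ('*', 'C', 'D'))))
# ['LD C', 'MUL D', 'ST T1', 'LD A', 'MUL B', 'SUB T1']
```

The second call shows the "(A*B)-(C*D)" case: one product is computed and spilled to T1 before the other begins, which is the behaviour a single-pass, code-as-you-parse scheme struggles to produce.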
