Dag Sverre Seljebotn wrote:

> The big refactoring you refer to is related to separating the type
> analysis and type coercion phases
I'm not sure whether you would gain much by doing this. The place where you discover that a coercion is needed is during type analysis, where you look at the types of the operands, decide whether they're compatible, see whether they need to be converted to a common type, etc. You also work out what the result type will be and annotate the node with it.

To move the creation of coercion nodes into a separate pass, you would have to leave an annotation on the node saying that a coercion is needed. But if some phase between then and the actual coercion pass rearranges the parse tree, these annotations may no longer be correct. So you would have to disallow any other phases between type analysis and coercion, or at least put some restrictions on what they can do, such as not altering the structure of the parse tree, or doing anything else that could invalidate the calculated types.

What sort of things were you intending to do in between type analysis and coercion? Could they still be done under these restrictions?

More generally, trying to split things up into more phases can sometimes make things more complicated rather than simpler. As an example, I'm currently thinking about eliminating the allocate_temps subphase of expression analysis and combining it with the code generation phase. The reason is that there's currently a rather non-obvious dependency between these phases: the order in which temp variables are allocated and released during allocate_temps has to exactly match the order in which code is generated that creates and disposes of the references that will be put in those temps. This makes the code for these two phases rather tricky to both write and maintain.

They're separate phases at the moment because I was initially writing the generated code directly to the output file, so I had to know what temp variable declarations would be needed before starting to write any of the body code for a function.
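One way to remove that constraint is to buffer declarations and body code separately so that a temp can be declared at the exact moment code generation discovers it needs one. This is only a toy sketch of the idea; the class and method names are invented for illustration and are not Cython's actual code writer:

```python
import io

# Sketch of the two-buffer idea (names invented for this example):
# declarations and executable code go to separate buffers, and the
# buffers are joined once the whole function has been generated.
class CodeWriter:
    def __init__(self):
        self.decls = io.StringIO()   # temp variable declarations
        self.body = io.StringIO()    # executable statements
        self._counter = 0

    def allocate_temp(self, c_type):
        # Declaring on demand keeps allocation order trivially in sync
        # with generation order -- they are the same traversal.
        name = "__tmp%d" % self._counter
        self._counter += 1
        self.decls.write("%s %s;\n" % (c_type, name))
        return name

    def putln(self, line):
        self.body.write(line + "\n")

    def function_code(self):
        return self.decls.getvalue() + self.body.getvalue()

w = CodeWriter()
t = w.allocate_temp("PyObject *")
w.putln("%s = PyNumber_Add(a, b);" % t)
print(w.function_code())
```

With this arrangement there is no separate allocation pass whose ordering could drift out of step with generation, which is the whole point of the merge.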
However, I'm currently writing the declaration and executable code to separate buffers and combining them afterwards, so there's probably no need for a separate allocate_temps pass any more, and combining it with code generation is likely to simplify quite a lot of things.

> It would be better (see below) to have "deeper" cuts, i.e. so that one
> could say "now the entire tree has been analysed", "now the entire tree
> is ready for code generation", rather than some parts (functions etc.)
> being in a separate state.

The reason for doing functions that way is that it seemed wasteful to keep all the symbol tables for the local scopes around longer than necessary.

That decision was probably influenced by an earlier project in which I wrote a compiler for a Modula-like language that ran on machines much smaller than we have today. It used a 3-pass arrangement that kept all the symbol tables for everything between passes, with the result that it could only compile a module a few hundred lines long before running out of memory. That experience gave me an appreciation of why Wirth prefers to write single-pass compilers.

Although the memory issue probably isn't a concern nowadays, that experience led me to approach the problem with the mindset of using as few passes as possible. The only reason I used separate analysis and generation phases at all in the beginning was so that you can refer to C functions that are defined further down without needing forward declarations.

> Consider for instance trivial inner functions. One
> natural way to implement this is just "throwing the function out"
> ...
> so you *somehow* need an ugly kludge to
> get around this, and spend time thinking about that

I don't think it would be all that difficult to make a pass over the function body, just before generating code for it, that generates code for any nested functions.
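For the simple case being discussed, "throwing the function out" amounts to lifting nested function nodes to module level under a mangled name. The following is a hedged sketch only: the node class, the mangling scheme, and the function names are all invented for illustration and are not taken from Nodes.py:

```python
# Illustrative sketch: DefNode, mangle, and lift_nested_functions are
# invented names, not the actual Cython/Pyrex implementation.
class DefNode:
    def __init__(self, name, body=()):
        self.name = name
        self.body = list(body)   # child statement nodes
        self.c_name = None       # mangled C-level name, set when lifted

def mangle(outer_name, inner_name):
    # Invented scheme, e.g. f containing g -> __pyx_f_f_g
    return "__pyx_f_%s_%s" % (outer_name, inner_name)

def lift_nested_functions(module_functions):
    """Move nested DefNodes (assumed to have no references to
    intermediate scopes) to module level, recording a mangled C name
    so references in the enclosing function can be rewritten."""
    lifted = []
    for outer in module_functions:
        kept = []
        for stmt in outer.body:
            if isinstance(stmt, DefNode):
                stmt.c_name = mangle(outer.name, stmt.name)
                lifted.append(stmt)
            else:
                kept.append(stmt)
        outer.body = kept
    module_functions.extend(lifted)

funcs = [DefNode("f", body=[DefNode("g"), "some_statement"])]
lift_nested_functions(funcs)
# funcs now holds f and g at module level; g.c_name == "__pyx_f_f_g"
```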
In fact, I have a suspicion that for the special case you're talking about (no references to intermediate scopes) it would "just work", because the analysis phase will already have generated an appropriately-mangled C name for the function.

On the other hand, "throwing the function out" presents difficulties of its own. As you mentioned, some kind of name mangling would need to be done, which requires knowing which names are supposed to refer to that function, so you need some kind of symbol table functionality available in the pass where you do the throwing-out. The obvious thing is to use the real symbol table, but that means doing it after the declaration analysis phase, by which time the only problem remaining is how to get the C code generated in the right place -- which, as I've said, isn't really all that hard.

> (In reality there'll be full closures instead, which just means
> generating a "cdef class" (with state) at module scope rather than a
> function. I'd like to see code doing that in less than 150 lines without
> using any of my "new structure" proposals...)

I'll be impressed if you can do it in 150 lines whatever structure you're using. But in any case, I suspect that the hardest part of this will be coping with all the nested scope issues, and that it will be easiest to do that while the parse tree still reflects the lexical structure of the original code.

> Consider for instance the "with" statement... the only "natural"
> way to implement it within the current structure is to implement
> "WithStatementNode"

Not necessarily. It may well be feasible to implement it by assembling existing nodes, and I'll be looking into that.

> [1] See ParallelAssignmentNode in Nodes.py for an example -- excellent
> kludge

How so? It seems like exactly the sort of transformation you're advocating, the only difference being that it's implemented by code in Parser.py rather than in a separate pass.
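To make the semantics of that transformation concrete, here is a hedged sketch (working on plain names rather than the parse tree nodes the real Parser.py code manipulates): a parallel assignment reduces to simple assignments by routing every right-hand side through a temp, so that all RHSs are evaluated before any LHS is assigned.

```python
def expand_parallel(lhs_list, rhs_list):
    """Reduce a parallel assignment to simple (target, expr) pairs,
    preserving the rule that every RHS is evaluated before any LHS
    is assigned. Sketch only; temp naming is invented."""
    temps = ["__tmp%d" % i for i in range(len(rhs_list))]
    steps = list(zip(temps, rhs_list))    # evaluate all RHSs first
    steps += list(zip(lhs_list, temps))   # then assign to the LHSs
    return steps

# "a, b = b, a" becomes four simple assignments:
#   __tmp0 = b; __tmp1 = a; a = __tmp0; b = __tmp1
print(expand_parallel(["a", "b"], ["b", "a"]))
```

Without the temps, the naive expansion `a = b; b = a` would clobber `a` before it is read, which is exactly why the sub-assignments can't just be emitted one after another.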
> the parser has to output stuff in *one certain way*

I'm not sure what you mean by that. The parser doesn't *have* to perform that transformation -- it's just an optimisation. It could leave parallel assignments alone and let the back end generate a tuple packing and unpacking operation. That would work, albeit less efficiently.

If you're referring to the fact that the subnodes of a ParallelAssignmentNode have to be simple assignments, that's the whole purpose of the transformation -- to reduce a parallel assignment to a set of non-parallel ones. If one of the sub-assignments is itself a parallel assignment, it gets expanded as well, until only non-parallel ones are left.

If you're wondering why the ParallelAssignmentNode exists at all, it's because you can't just do the assignments one after another -- you have to evaluate all the right hand sides before assigning to any of the left hand sides. It's possible that a more general set of primitive nodes could be found that enables this to be expressed without needing a special ParallelAssignmentNode. I may revisit this after seeing what needs to be done for the with statement.

> and in a way which is unnatural for the parser, with a "local" view

Not sure what you mean by that, either. The parser has to know that the ParallelAssignmentNode exists and what kind of subnodes it expects, but whatever phase performed the transformation would need the same knowledge. Neither of them has to know how those nodes work internally.

Sorry this reply was so long, I didn't have time to write a shorter one. :-)

--
Greg

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
