Dag Sverre Seljebotn wrote:
> Because LetNodes can be introduced after what is currently the
> allocate_temps phase, the allocate_temps phase must then be moved (if we
> keep it; I'm basically saying "turn it into a pre-generation transform").
One of the undesirable things about the current scheme is that
there is a subtle coupling between the temp allocation and
code generation phases. Not only is it a source of error when
modifying things, it also leads to some bizarre-looking
code in the allocate_temps phase, such as allocating a temp
and then immediately releasing it, which looks redundant,
but is actually necessary.
Simply moving the allocate_temps phase to a different place
wouldn't do anything to remove the dependency. If it's folded
into the code generation phase, however, the general pattern
would become

  1. Generate code to evaluate subexpressions.
  2. Allocate temp for the result of this node if needed.
  3. Generate code to calculate result based on results
     of subexpressions.
  4. Release temps holding results of subexpressions.

which actually makes sense.
Note that for most expression nodes, most of this pattern
could be implemented by a general method in ExprNode which
calls a node-specific method to implement (3) (as is done now,
except it's split between the temp allocation and code
generation phases).
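
To make that concrete, here is a rough Python sketch of the
pattern. All the names in it (Code, allocate_temp, release_temp,
needs_temp, release_result, AddNode) are made up for
illustration and are not the actual compiler classes, and it
assumes the temp pool lives on the code object.

```python
class Code:
    """Hypothetical code object: collects output lines and owns the temp pool."""

    def __init__(self):
        self.lines = []
        self.free_temps = []
        self.temp_count = 0

    def putln(self, line):
        self.lines.append(line)

    def allocate_temp(self):
        # Reuse a released temp if one is available, else make a new one.
        if self.free_temps:
            return self.free_temps.pop()
        self.temp_count += 1
        return "__pyx_t_%d" % self.temp_count

    def release_temp(self, name):
        self.free_temps.append(name)


class ExprNode:
    subexprs = ()
    needs_temp = True  # assumed to be decided during type analysis

    def generate_evaluation_code(self, code):
        # (1) Generate code to evaluate subexpressions.
        for sub in self.subexprs:
            sub.generate_evaluation_code(code)
        # (2) Allocate a temp for this node's result if needed.
        if self.needs_temp:
            self.result = code.allocate_temp()
        # (3) Node-specific code computing the result.
        self.generate_result_code(code)
        # (4) Release temps holding results of subexpressions.
        for sub in self.subexprs:
            sub.release_result(code)

    def release_result(self, code):
        if self.needs_temp:
            code.release_temp(self.result)

    def generate_result_code(self, code):
        raise NotImplementedError


class NameNode(ExprNode):
    needs_temp = False  # the variable itself holds the result

    def __init__(self, name):
        self.result = name

    def generate_result_code(self, code):
        pass


class AddNode(ExprNode):
    def __init__(self, left, right):
        self.subexprs = (left, right)

    def generate_result_code(self, code):
        left, right = self.subexprs
        code.putln("%s = PyNumber_Add(%s, %s);"
                   % (self.result, left.result, right.result))


code = Code()
expr = AddNode(AddNode(NameNode("a"), NameNode("b")), NameNode("c"))
expr.generate_evaluation_code(code)
```

Only generate_result_code differs between node types here.
Because step (2) comes before step (4), the result temp can
never alias a temp still holding one of the operands.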
> - During analysis/transformation, you know "what you are doing" and how
> many temps you will need.
I'm not convinced you really need to know about temps during
analysis. To my mind, temps are analogous to registers in a
conventional compiler -- something very low-level that you
only deal with late in the pipeline. Tree transformations
should operate at a higher level of abstraction.
The abstraction I would choose is "extra local variable",
which is what the LocalNode represents. Whether it uses
the same mechanism as that for the intermediate results of
expressions is an implementation detail, as is whether it
makes use of an Entry in the symbol table.
> LetNode would be one particular instance of this pattern; with
> your suggestion it would have "custom" (non-reusable, in one sense of the
> word) code to remember which temps it should allocate;
Not sure what you mean by that. The way you re-use it is
by using a LetNode wherever you want extra locals.
I'm expecting that various things such as for-loops will
be implementable using LetNodes with suitable internal
plumbing, so the custom nodes for these statements would
disappear, along with the custom code in them implementing
the temp allocation and code generation patterns. That's
an increase in re-use, to my way of thinking.
> In a similar way, ExprNode
> would need to remember that it should allocate its result as a temporary,
> and so on.
Still not following. If temp allocation is folded into code
generation, I don't think it will be necessary for a node
to remember when it's using a temp across passes -- it can
be figured out at the point where that information is
needed (i.e. at (2) above).
Some nodes might still want to do so, since it usually
depends on things that are figured out during type
analysis, such as whether the result is a Python reference.
But this would be a private matter for each node to decide.
I don't see why any other node should need to know during
tree transformations.
> Entry seems like a more natural concept for this to me -- because it is a
> more "refined", more basic concept of a handle to a variable in the scope.
I don't see what you gain by doing this. Under my
proposal, there would only be one kind of thing for tree
transformations to deal with, i.e. Nodes. Under yours,
there would be two kinds of things -- Nodes and Entries.
Also, I think it unnecessarily limits the strategy for
implementing temps. The only reason the temp list is kept
in the symbol table at the moment is because that's the
only piece of state being passed around during the phase
when temp allocation is being done.
If I move temp allocation into the code generation phase,
the object being passed around is the code object rather than
the symbol table. So I would move the temp list from the
symbol table to the code object, at which point involving
an Entry makes little sense.
> one thing one might
> want to do is unrolling nested expressions. So you have a transform that
> basically unpacks every nested expression into assignments to temporaries
I'm not sure this would be a good thing to do. Currently,
temps are only used when they're really needed -- most
expressions involving non-Python results just turn into
an equivalent nested C expression in the generated code.
Putting every result into a temp would lead to a much
greater use of temps, most of them unnecessary. The C
compiler *might* be able to optimise them away, but I
wouldn't like to rely on it. I'd worry about many things
that would normally be kept in registers spilling over
into local variables.
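
A toy illustration of the difference, in Python. The
tuple-based tree and the function names are invented for this
example and have nothing to do with the real node classes; it
only shows the shape of the output in each case.

```python
def nested(expr):
    """Render an expression tree as a single nested C expression."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    return "(%s %s %s)" % (nested(left), op, nested(right))


def unrolled(expr, lines):
    """Render the same tree as one assignment to a temp per operation."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    lhs = unrolled(left, lines)
    rhs = unrolled(right, lines)
    temp = "t%d" % (len(lines) + 1)
    lines.append("%s = %s %s %s;" % (temp, lhs, op, rhs))
    return temp


tree = ("*", ("+", "a", "b"), ("-", "c", "d"))

nested_c = nested(tree)          # one C expression, no temps

stmts = []
result = unrolled(tree, stmts)   # three statements, three temps
```

The nested form leaves everything to the C compiler's own
expression handling; the unrolled form introduces a named
local for every intermediate value.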
> (this might be a good pre-step to some code analysis-algorithms that would
> benefit from being able to insert if-statements in the middle of nested
> expressions for instance)
I'd address that by adding a conditional expression node.
More generally, there might be benefit in removing or
reducing the distinction between statement and expression
nodes. Instead of trying to turn all expressions into
statements, try to turn all statements into expressions.
Part of this might involve implementing if-elif-else
statements as a chain of conditional expression nodes.
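
Roughly what I have in mind, as a Python sketch. CondExprNode
and chain_conditionals are hypothetical names, and plain
strings stand in for the test and body nodes.

```python
class CondExprNode:
    """Hypothetical conditional expression node: test ? true_val : false_val."""

    def __init__(self, test, true_val, false_val):
        self.test = test
        self.true_val = true_val
        self.false_val = false_val


def chain_conditionals(clauses, else_body):
    """Fold [(test, body), ...] plus an else body into nested CondExprNodes."""
    result = else_body
    # Build from the last clause outwards so the first test ends up outermost.
    for test, body in reversed(clauses):
        result = CondExprNode(test, body, result)
    return result


def render(node):
    """Show the chain as a C-style conditional expression."""
    if isinstance(node, CondExprNode):
        return "(%s ? %s : %s)" % (
            render(node.test), render(node.true_val), render(node.false_val))
    return node


chain = chain_conditionals(
    [("x < 0", "neg"), ("x == 0", "zero")], "pos")
text = render(chain)
```

Each later clause simply becomes the false branch of the one
before it, so no separate statement node is needed.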
> Do you have any "commits" or similar that would contain these
> changes relatively self-contained,
I can't remember exactly; have a look at the Mercurial
commit comments. I've been trying to keep the changes
lined up with commits, but I don't always completely
succeed.
> Unless you cache the function bodies it would need to be done right prior,
> so that you know which local variables to declare.
I would split the code stream for the function body
in two, one for the declarations and one for the statements.
That wouldn't be hard to do -- I'm already doing something
similar with the module-level declarations (the 'h'
attribute of the code object refers to another code
object that collects the declarations in a StringIO).
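
A minimal sketch of the two-stream idea in Python. The class
and method names (FunctionCode, allocate_temp, release_temp,
getvalue) are assumptions for illustration, not the existing
code object; it also assumes temps are pooled per C type on
the code object.

```python
from io import StringIO


class FunctionCode:
    """Hypothetical per-function code object with two output streams."""

    def __init__(self):
        self.decls = StringIO()   # local variable declarations
        self.body = StringIO()    # statements
        self.free_temps = {}      # C type -> list of released temp names
        self.temp_count = 0

    def putln(self, line):
        self.body.write("  " + line + "\n")

    def allocate_temp(self, ctype):
        free = self.free_temps.setdefault(ctype, [])
        if free:
            return free.pop()
        # First use of this temp: declare it in the declarations stream,
        # even though statements have already been written to the body.
        self.temp_count += 1
        name = "__pyx_t_%d" % self.temp_count
        self.decls.write("  %s %s;\n" % (ctype, name))
        return name

    def release_temp(self, ctype, name):
        self.free_temps.setdefault(ctype, []).append(name)

    def getvalue(self, header):
        # Declarations are emitted ahead of the statements.
        return "%s {\n%s%s}\n" % (
            header, self.decls.getvalue(), self.body.getvalue())


code = FunctionCode()
code.putln("/* body starts before any temp is known */")
t = code.allocate_temp("int")
code.putln("%s = a + b;" % t)
code.putln("return %s;" % t)
code.release_temp("int", t)
text = code.getvalue("static int f(int a, int b)")
```

Since the declarations go to their own stream, a temp can be
allocated at any point during code generation and still end
up declared at the top of the function.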
--
Greg