Re: [mono-cecil] Code Analysis Algorithm

Daniel Grunwald Wed, 16 Mar 2011 06:37:33 -0700

On 3/16/2011 14:18, Mort wrote:
> ...
> However these are some more complex situations which I would like to
> fix. My question is 'Does Mono-Cecil have any way of linking
> instructions that pop data to the instruction(s) that pushed it?'.
No, Mono.Cecil does not have that built-in.
> I found in the Cecil code some stack code which is used to calculate
> the maximum size of the stack. I noticed that the Cecil stack code
> resets the stack size to 0 after a branch instruction. Is this
> actually what happens or is that only relevant as far as calculating
> the maxstack size?
The CLR puts some limits on the code in order to ensure that a single
pass through the IL code is sufficient to analyze the stack size (and
types of stack elements).
The stack does not have to be empty at a branch. It only has to be
empty at an instruction that the analysis has not reached so far, e.g.
in the following example:
...
br loopHead
loopBody:  // stack must be empty here (because the instruction seems
           // unreachable to the simple single-pass analysis)
...
loopHead:  // stack does not have to be empty here
brtrue loopBody


> I would like to be able to take an instruction that pops something and
> maps it back to the instruction has pushed the item onto the stack.
> I'm having trouble sorting this out when branches are involved.
It is not always possible to create a straightforward mapping. See
http://community.sharpdevelop.net/blogs/dsrbecky/archive/2011/02/19/ilspy-ensuring-correctness.aspx
For analysis, I found it helpful to introduce temporary variables for
stack locations. Parameters, local variables and stack locations can be
all represented using a single type of virtual variable; this makes
analysis much easier.
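
As a rough sketch of that idea (in Python, with a made-up tuple
encoding rather than Mono.Cecil types), each stack position can be
given a name such as stack0, stack1, ... so that every instruction
reads and writes named variables. Naming by stack position is one
possible choice assumed here; the names to_variables, stackN and the
(opcode, pops, pushes, local) tuples are hypothetical.

```python
def to_variables(code):
    """Rewrite stack-based instructions as (opcode, inputs, outputs).

    code is a list of (opcode, pops, pushes, local) tuples, where local is
    the local variable name for ldloc/stloc and None otherwise.
    Only straight-line code is handled in this sketch.
    """
    depth = 0
    result = []
    for op, pops, pushes, local in code:
        if pops > depth:
            raise ValueError(f"stack underflow at {op}")
        # Popped slots are the top `pops` positions, listed bottom to top.
        inputs = [f"stack{i}" for i in range(depth - pops, depth)]
        depth -= pops
        # Pushed values land in the positions just above the new top.
        outputs = [f"stack{i}" for i in range(depth, depth + pushes)]
        depth += pushes
        # Locals become ordinary variables next to the stack slots.
        if op == "ldloc":
            inputs.append(local)
        elif op == "stloc":
            outputs.append(local)
        result.append((op, inputs, outputs))
    return result


# c = a + b
example = [
    ("ldloc", 0, 1, "a"),
    ("ldloc", 0, 1, "b"),
    ("add", 2, 1, None),
    ("stloc", 1, 0, "c"),
]
for instruction in to_variables(example):
    print(instruction)
```

After this step "add" reads stack0 and stack1 and writes stack0, and
"stloc" copies stack0 into c, so locals and stack slots are handled by
the same kind of variable.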
> ...
>
> Any advice would be appreciated.
My advice for an optimizer would be: use a single representation for all
variables (parameters, local variables, stack locations). Then transform
those variables into Static Single Assignment (SSA) form, at least where
possible; it can be impossible in some cases when the address of a
variable is taken [ldloca].
SSA is highly useful in optimizers as it makes optimizations related to
data-flow trivial.
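
A small sketch of why this helps, in Python and for straight-line code
only (no branches, so no phi functions are needed; a real transform has
to insert those at control-flow merges). It assumes instructions are
already written as (opcode, inputs, outputs) over named variables; the
function names and the name_version scheme are invented for this
example and are not taken from the ILSpy code.

```python
def to_ssa(code):
    """Rename variables so that each one is assigned exactly once.

    code is a list of (opcode, inputs, outputs) over plain variable names.
    Version 0 of a variable stands for its value on entry to the code.
    """
    version = {}   # variable name -> latest version number
    ssa = []
    for op, inputs, outputs in code:
        # Uses refer to the latest version that has been assigned so far.
        new_inputs = [f"{v}_{version.get(v, 0)}" for v in inputs]
        new_outputs = []
        for v in outputs:
            # Every assignment creates a fresh version.
            version[v] = version.get(v, 0) + 1
            new_outputs.append(f"{v}_{version[v]}")
        ssa.append((op, new_inputs, new_outputs))
    return ssa


def definitions(ssa):
    """Map each SSA variable to the index of the instruction defining it."""
    return {name: index
            for index, (_, _, outputs) in enumerate(ssa)
            for name in outputs}


# c = a + b, with stack slots already turned into variables
example = [
    ("ldloc", ["a"], ["stack0"]),
    ("ldloc", ["b"], ["stack1"]),
    ("add", ["stack0", "stack1"], ["stack0"]),
    ("stloc", ["stack0"], ["c"]),
]
ssa = to_ssa(example)
defs = definitions(ssa)
# Which instruction produced the value that stloc consumes?
print(defs[ssa[3][1][0]])  # 2, the "add"
```

Because every SSA variable has exactly one definition, finding the
instruction that produced a consumed value is a single dictionary
lookup, even though stack0 is written twice in the original code.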

I implemented an SSA transform for a code analyzer I wrote some years
ago. The code is now available as part of ILSpy at
https://github.com/icsharpcode/ILSpy/tree/master/ICSharpCode.Decompiler/FlowAnalysis
(although ILSpy itself only uses the control flow graph, not the SSA
transform).
In fact, there are even a few (very simple) optimizations implemented in
SsaOptimization.cs.

Here's an example of a control flow graph with the IL transformed into
SSA form: http://www.danielgrunwald.de/coding/null/ForEach.ssa.png
That graph was created with
ControlFlowGraphBuilder.copyFinallyBlocks=true, which creates copies of
finally-blocks to separate the regular-exit and exceptional-exit cases.
This is useful for analysis, but I guess it isn't helpful in an
optimizer as you need to output the optimized code as a single finally
block again.

Daniel
