Re: [Mono-dev] [PATCH] A fastpath dead code elimination

2005-11-17 Thread Zoltan Varga
   Hi,

  The general JIT changes in the patch look harmless to me. There is a problem
tough: the patch makes tests/marhal2.exe crash when run with -O=all.

 Zoltan

On 11/15/05, Massimiliano Mantione [EMAIL PROTECTED] wrote:

 The attached patch implements deadce, but without using SSA.
 Instead, a simple alias analysis pass takes care of understanding
 when local variables [may] change value, and liveness computation
 has been modified to use this aliasing info if present.

 The alias analysis pass has O(n) complexity (n = code size), it is
 just a linear sweep on the list of instructions.
 Then, deadce operates one BB at a time, scanning the code linearly
 and using the liveness bits as start/end conditions (so it is O(n)
 as well).
 I positioned the deadce pass just between the liveness computation
 and the linear scan register allocator, so that:
 - it can reuse the liveness computation already computed for the
   linear regalloc, without requiring a new pass, and
 - it is downstream in the compilation pipeline, so it has the
   possibility to catch as many dead instructions as possible.

 The name fastpath deadce comes from Patrik Torstensson :-)
 He called it like this on IRC because it is on the fast compilation
 path (the one without SSA).

 But now, without other technical stuff, some numbers...
 First of all, the JIT overhead: the time to JIT mscorlib.dll with
 various compilation options, in seconds (time mono --compile-all):

 Options  -all  -all,ssa
 JIT time 1.05  1.35

 Here we see that just computing SSA has a 30% overhead with respect
 to a compilation with no optimizations at all.

 Options   default  ssa   deadce  ssa,deadce
 JIT time  1.32 1.55  1.451.6

 And here we see that the overhead of fastpath deadce on a default
 compilation is almost 10%, while ssa deadce has 21% (and note that
 in practice, since ssa does not work with aliasing, the new fastpath
 deadce is applied to more methods).

 So, the JIT overhead is reasonably small.

 And finally, some lovely benchmark...
 I tried SciMark, and here are the results:
 (MFlops, just the composite score)

 default205.92
 -O=ssa,deadce  210.61
 -O=deadce  207.50
 -O=consprop,copyprop,deadce207.72
 -O=consprop,copyprop,deadce,inline 209.82
 -O=inline  213.63
 -O=deadce,inline   214.52

 So, the idea is that fastpath deadce has some effect :-)
 It is not as effective as ssa deadce, but it is useful anyway and
 most of all it does not incur the same JIT time overhead.

 BTW, I spent a *lot* of time trying to be sure that those numbers
 are accurate. I had to kill evolution, gaim, beagle, the wireless
 card, and carefully monitor system load otherwise the results were
 erratic (just of a few percent, but this is what we're looking at).
 I then executed the benchmark several (10) times for every case
 and took the *best* score for each (with the idea that it is the
 one where other things interfered less), and anyway all the best
 scores were almost identical.

 A funny thing: these are the results I have on a SciMark.exe
 compiled with MS .NET 2.0:

 default: 136.65
 -O=consprop,copyprop,deadce: 137.52

 As you can see, they are *low*!!!
 I still have to understand what the 2.0 csc does exactly, but it
 looks a real disaster for us :-(
 Watch out this quirk when doing/examining benchmarks!

 OK, that's all... please approve the patch :-)
 We can always refine it later...

 Ciao,
   Massi



 ___
 Mono-devel-list mailing list
 Mono-devel-list@lists.ximian.com
 http://lists.ximian.com/mailman/listinfo/mono-devel-list




___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] [PATCH] A fastpath dead code elimination

2005-11-17 Thread Massimiliano Mantione
On Thu, 2005-11-17 at 13:23 +0100, Zoltan Varga wrote: 
   The general JIT changes in the patch look harmless to me. There is a problem
 tough: the patch makes tests/marhal2.exe crash when run with -O=all.

That looks more SSAPRE than the patch (or, better, it happens if
you specify -O=ssapre and disappears otherwise).
SSAPRE and fastdce *cannot* happen togather (if SSAPRE is in, then
SSA is built, and in the patch the JIT falls back to SSA deadce).
Anyway, I still have to fully debug this, but this doesn't seem to
involve fastdce here.

I'll investigate more...

Thanks,
  Massi


___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] [PATCH] A fastpath dead code elimination

2005-11-15 Thread Miguel de Icaza
Hello,

 The alias analysis pass has O(n) complexity (n = code size), it is
 just a linear sweep on the list of instructions.
 Then, deadce operates one BB at a time, scanning the code linearly
 and using the liveness bits as start/end conditions (so it is O(n)
 as well).

Is there is a threshold that will disable the optimization from running?

Miguel
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] [PATCH] A fastpath dead code elimination

2005-11-15 Thread Massimiliano Mantione

On Tue, 2005-11-15 at 15:14 -0500, Miguel de Icaza wrote:
 Hello,
 
  The alias analysis pass has O(n) complexity (n = code size), it is
  just a linear sweep on the list of instructions.
  Then, deadce operates one BB at a time, scanning the code linearly
  and using the liveness bits as start/end conditions (so it is O(n)
  as well).
 
 Is there is a threshold that will disable the optimization from running?

Of course :-)
If you do not provide the deadce flag, it will not be used.
And even if we'll decide to enable it by default, it will be enough
to specify -O=-deadce to switch it off.

I didn't explain much of how the patch is coded, and/or how it can
be used, so here are some more details.



With this patch applied, -O=deadce activates fastpath deadce, unless
some other flag (for now either ssapre or abcrem) required ssa.
In this case, the traditional ssa deadce is used.
However, I added one new flag: ssa, which forces ssa construction.
If this is used, then deadce will default to the ssa one.
Note, however, that if deadce has been specified, and ssa is in
some way required (at least one of ssapre, abcrem or ssa was
specified), it can *still* happen that some method will fall back to
use fastpath deadce: this happens if ssa was disabled because of
aliasing.

Now, I *know* this sounds confusing at first.
Here's the rationale: the idea is that the JIT should do what the
user means with the flags.
So, if the user asks deadce, the JIT does its best to do deadce.
This is the same thing that happens now for consprop and copyprop,
which have both ssa and fast versions, which are not equally
effective but are called the same.
And as soon as I'll commit the HSSA patch, there will be one hssa
flag as well (at least it works so in my tree).
So one will be able to choose betwen fastpath, ssa and hssa
deadce, using -O=deadce, -O=ssa,deadce or -O=hssa,deadce.
Of course in the long run ssa will go away and will be replaced
by hssa, but at least for testing in between this can work.

The alternative would be having one different flag for each kind
of optimization, like fastdeadce, hssadeadce...
However, this way the number of flags will get too high (and also
ugly) with hssa.
And consprop and copyprop already work this way (we use the same
flags for the ssa and fast versions), so I decided go go on with
the same philosophy.



About the implementation... there's not much to say, the algorithms
are classical, no rocket science here ;-)

I tried not too mess up too much the liveness code, which is for
sure the more impacted by this patch.
I made so that the same code works both with and without aliasing
information.
When aliasing has been computed, it is possible to obtain one list
of affected variables for each MonoInst, and liveness uses that
to understand the effect of the inst on each variable.
On the other hand, without aliasing info, the code relied on the
presence of MONO_SSA_* flags to see what each instruction was doing
and each instruction could have effect on just *one* local.
In this case I decided to keep one list node as local variable, and
use it to build a one element list of affected variables.
This way all the logic of liveness has been transformed to use
those affected variables lists, and a small preamble takes the
real list if aliasing has been computed, or builds the one element
list otherwise.
So the patch is a bit hard to read, but it's not so terrible IMHO.
And yes, I know that I'm slowing liveness computation a bit this
way, but it's just a bit, and we have the advantage of not forking
the liveness code!

If anything isn't clear, just ask ;-)

Ciao,
  Massi


___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list


Re: [Mono-dev] [PATCH] A fastpath dead code elimination

2005-11-15 Thread Ben Maurer
On Tue, 2005-11-15 at 15:14 -0500, Miguel de Icaza wrote:
 Hello,
 
  The alias analysis pass has O(n) complexity (n = code size), it is
  just a linear sweep on the list of instructions.
  Then, deadce operates one BB at a time, scanning the code linearly
  and using the liveness bits as start/end conditions (so it is O(n)
  as well).
 
 Is there is a threshold that will disable the optimization from running?

Massi's code is inside:

if (cfg-opt  MONO_OPT_LINEARS) {

We already have:

if ((cfg-num_varinfo  2000)  !mono_compile_aot) {
/* 
 * we disable some optimizations if there are too many variables
 * because JIT time may become too expensive. The actual number 
needs 
 * to be tweaked and eventually the non-linear algorithms 
should be fixed.
 */
cfg-opt = ~ (MONO_OPT_LINEARS | MONO_OPT_COPYPROP | 
MONO_OPT_CONSPROP);
cfg-disable_ssa = TRUE;
}

Obviously though, that comment about tuning still applies.

-- Ben

___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list