Hi Armin,

Armin Rigo wrote:

> I wonder how important this is at the moment.  Maybe the .NET JIT
> compiler is good enough to remove all this.  How does the resulting
> machine code look like?

I have not tried Microsoft's CLR yet; at this stage I'm using Mono under Linux, just because I'd like to stay in Windows as little as possible ;-).

BTW, Mono doesn't seem smart enough to optimize the code. Consider the following two IL methods: the first is generated by my compiler, the second is written by hand:

.method static public int32 slow(int32 a_1, int32 b_1) il managed
{
    .locals (int32 v6, int32 v12)

block0:
    ldarg.s 'a_1'
    ldarg.s 'b_1'
    add
    stloc.s 'v12'
    ldloc.s 'v12'
    stloc.s 'v6'
    br.s block1

block1:
    ldloc.s 'v6'
    ret
}

.method static public int32 fast(int32 a_1, int32 b_1) il managed
{
    ldarg.s 'a_1'
    ldarg.s 'b_1'
    add
    ret
}

I compiled both with Mono's ahead-of-time compiler with all optimizations enabled, then disassembled the result with "objdump -d"; here is an extract of the output:

Disassembly of section .text:

000004f0 <methods>:
 4f0:   55                      push   %ebp
 4f1:   8b ec                   mov    %esp,%ebp
 4f3:   8b 45 08                mov    0x8(%ebp),%eax
 4f6:   03 45 0c                add    0xc(%ebp),%eax
 4f9:   c9                      leave
 4fa:   c3                      ret
 4fb:   90                      nop
 4fc:   8d 74 26 00             lea    0x0(%esi,1),%esi
 500:   55                      push   %ebp
 501:   8b ec                   mov    %esp,%ebp
 503:   57                      push   %edi
 504:   8b 45 08                mov    0x8(%ebp),%eax
 507:   8b f8                   mov    %eax,%edi
 509:   03 7d 0c                add    0xc(%ebp),%edi
 50c:   8b c7                   mov    %edi,%eax
 50e:   8d 65 fc                lea    0xfffffffc(%ebp),%esp
 511:   5f                      pop    %edi
 512:   c9                      leave
 513:   c3                      ret
 514:   8d 74 26 00             lea    0x0(%esi,1),%esi

I don't know x86 assembly very well (to be honest, I don't know it at all ;-), but it seems that the 'fast' method spans from 4f0 to 4fc and the 'slow' method spans from 500 to 514, and I think the former should be more efficient than the latter, shouldn't it?

I don't know how smart the JIT and AOT compilers shipped with the MS CLR are, but perhaps it is worth the effort to generate smarter code, so that it runs efficiently even under Mono.

Of course, this is not the highest-priority task.
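
To make "smarter code" a bit more concrete, here is a minimal Python sketch of a peephole pass that would turn the 'slow' instruction sequence above into the 'fast' one. It is only an illustration, not what the compiler currently does: the (opcode, operand) tuple representation, the "label" pseudo-instruction and the names peephole, SLOW and FAST are all made up for this example, and only plain br/br.s branches and local loads/stores are handled.

```python
# Hypothetical representation: each IL instruction is an (opcode, operand)
# pair, and block labels are ("label", name) pseudo-instructions.
LOADS = ("ldloc", "ldloc.s")
READS = LOADS + ("ldloca", "ldloca.s")   # anything that observes a local
STORES = ("stloc", "stloc.s")


def _one_pass(code):
    # Operands mentioned by real instructions; a label whose name is not in
    # this set cannot be a branch target (conservative check).
    referenced = {arg for op, arg in code if op != "label"}
    # How many times each local is read anywhere in the method.
    reads = {}
    for op, arg in code:
        if op in READS:
            reads[arg] = reads.get(arg, 0) + 1

    out = []
    i = 0
    while i < len(code):
        op, arg = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        # Rule 1: a branch to the label that immediately follows is a no-op.
        if op in ("br", "br.s") and nxt == ("label", arg):
            i += 1
            continue
        # Rule 2: a label that nothing refers to can be dropped.
        if op == "label" and arg not in referenced:
            i += 1
            continue
        # Rule 3: "stloc X; ldloc X" leaves the stack unchanged, so both can
        # go if this load is the only read of X in the whole method.
        if (op in STORES and nxt is not None and nxt[0] in LOADS
                and nxt[1] == arg and reads.get(arg, 0) == 1):
            i += 2
            continue
        out.append(code[i])
        i += 1
    return out


def peephole(instrs):
    """Apply the three rules repeatedly until nothing changes."""
    code = list(instrs)
    while True:
        new = _one_pass(code)
        if new == code:
            return code
        code = new


SLOW = [
    ("label", "block0"),
    ("ldarg.s", "a_1"), ("ldarg.s", "b_1"), ("add", None),
    ("stloc.s", "v12"), ("ldloc.s", "v12"), ("stloc.s", "v6"),
    ("br.s", "block1"),
    ("label", "block1"),
    ("ldloc.s", "v6"), ("ret", None),
]
FAST = [("ldarg.s", "a_1"), ("ldarg.s", "b_1"), ("add", None), ("ret", None)]

if __name__ == "__main__":
    for instr in peephole(SLOW):
        print(instr)
```

The counts used by each pass are computed before any removal, so they can only overestimate the remaining reads and references; that keeps every rule on the safe side at the cost of a few extra passes.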

> It would probably make sense to write this as a function that takes a
> single block and produces a list of "complex expression" objects -- to
> be defined in a custom way, instead of trying to push this into the
> existing flow graph model.

I agree, I think this is the simplest solution.
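
Just to check that I understood the idea, here is a rough Python sketch of such a function. Everything in it is an assumption made for illustration: Op and Expr are simple stand-ins rather than the real flow graph classes, the operation names are invented, and it assumes that every operation is free of side effects, so that moving a computation into its single consumer does not change the result.

```python
from collections import Counter, namedtuple

# Hypothetical stand-ins, not the real flow graph model.
Op = namedtuple("Op", "result opname args")    # result = opname(*args)
Expr = namedtuple("Expr", "opname args")       # args: variable names or Exprs


def build_expressions(ops, exitvars=()):
    """Fold the operations of a single block into nested expressions.

    A result that is used exactly once inside the block, and is not needed
    after the block (not in exitvars), is inlined into its consumer.
    Everything else is returned as a (result, Expr) pair, in order.
    """
    uses = Counter(arg for op in ops for arg in op.args)
    pending = {}   # variable name -> Expr waiting to be inlined
    out = []
    for op in ops:
        # Replace each argument by its pending expression, if there is one.
        args = tuple(pending.pop(a) if a in pending else a for a in op.args)
        expr = Expr(op.opname, args)
        if uses[op.result] == 1 and op.result not in exitvars:
            pending[op.result] = expr
        else:
            out.append((op.result, expr))
    return out


if __name__ == "__main__":
    block = [
        Op("v1", "int_add", ("a", "b")),
        Op("v2", "int_mul", ("v1", "c")),
        Op("v3", "int_sub", ("v2", "a")),
    ]
    for result, expr in build_expressions(block, exitvars=("v3",)):
        print(result, "=", expr)
```

With such trees the code generator could evaluate each expression directly on the stack and emit a store only for the variables that remain named in the returned list.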

ciao Anto
