You must be using my old example, with gcc -O3 my new example does access
the array and is 20x faster than the ghc -O code. Scroll down for
"-->" that I inserted beside the instruction to see the array access.

        .file   "arrayloop.c"
        .version        "01.01"
gcc2_compiled.:
.text
        .p2align 2,0x90
.globl main
                .type            main,@function
main:
        pushl %ebp
        movl %esp,%ebp
        subl $80,%esp
        pushl %esi
        pushl %ebx
        xorl %ebx,%ebx
        xorl %ecx,%ecx
        leal -64(%ebp),%esi
        .p2align 2,0x90
.L6:
        movl %ecx,%edx
        testl %ecx,%ecx
        jge .L7
        leal 15(%ecx),%edx
.L7:
        andl $-16,%edx
        movl %ecx,%eax
        subl %edx,%eax
-->     addl (%esi,%eax,4),%ebx
        incl %ecx
        cmpl $99999999,%ecx
        jle .L6
        movl %ebx,%eax
        popl %ebx
        popl %esi
        leave
        ret
.Lfe1:
                .size            main,.Lfe1-main
        .ident  "GCC: (GNU) c 2.95.4 20020320 [FreeBSD]"

On Thu, 26 Jun 2003, Hal Daume wrote:

> This has been hinted to before: gcc gets rid of the loop.
>
> bash-2.05b$ gcc -O3 arr.c -S -o arr.s
> bash-2.05b$ cat arr.s
>       .file   "arr.c"
>       .def    ___main;        .scl    2;      .type   32;     .endef
>       .text
>       .align 2
>       .align 16
> .globl _main
>       .def    _main;  .scl    2;      .type   32;     .endef
> _main:
>       pushl   %ebp
>       movl    %esp, %ebp
>       subl    $72, %esp
>       xorl    %eax, %eax
>       andl    $-16, %esp
>       call    __alloca
>       call    ___main
>       xorl    %eax, %eax
>       .align 16
> L7:
>       incl    %eax
>       cmpl    $100000000, %eax
>       jle     L7
>       leave
>       ret
>
> the inside of the loop is exactly:
>
> L7:
>       incl    %eax
>       cmpl    $100000000, %eax
>       jle     L7
>
> which doesn't read the array!  essentially you are being unfare to ghc
> by making it seq the array element, but not doing the same to GCC.  even
> with -O0, gcc gets rid of the loop body.
>
> one solution: add in the assembly to actually read the array (i'm too
> lazy and busy to do this).
>
>  - hal
>
>
> --
>  Hal Daume III                                   | [EMAIL PROTECTED]
>  "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume
>
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Lex Stein
> > Sent: Thursday, June 26, 2003 3:46 PM
> > To: Sven Panne
> > Cc: [EMAIL PROTECTED]
> > Subject: Re: haskell array access
> >
> >
> >
> > Great, thanks. Those suggestions narrow the gap from GHC -O being 330x
> > slower than GCC -O3 to it being 20x slower. Here are the new results:
> >
> > gcc -O3      0.54s
> > ocamlopt     1.11s
> > ghc -O      10.76s
> > ocamlc      14.10s
> >
> > GHC is still pretty slow for native x86 instruction code. Is
> > there any way
> > to further explain the performance gap ? (new code below)
> >
> > Thanks!
> > Lex
> >
> > Haskell (GHC) source:
> >
> > import Array
> > k :: Array Int Int
> > k = Array.array (0,15) [(i, i+1) | i <- [0 .. 15] ]
> > acc :: Int -> Int
> > acc 0 = 0
> > acc n = seq (k Array.! (n `mod` 16)) (acc (n-1))
> > main = do
> >     print (acc 100000000)
> >
> > Caml test source:
> >
> > let a = Array.make 16 0;;
> > let rec do1 i z =
> >     if (i>100000000) then z else
> >         do1 (i+1) (z + a.(i mod 16));;
> > do1 0 0;;
> >
> > C (gcc) test source:
> >
> > int main () {
> >     int i, k = 0;
> >     int a [16];
> >     for (i=0; i<100000000; i++) {
> >         k += a [i % 16];
> >     }
> >     return (k);
> > }
> >
> >
> > On Fri, 27 Jun 2003, Sven Panne wrote:
> >
> > > Well, part of the answer is definitely that the Haskell
> > program is the
> > > *only* one which really uses the array elements. :-) I
> > guess that the
> > > compilers for the other languages simply remove the array
> > access from
> > > the generated code (gcc definitely does, only an empty loop
> > remains).
> > >
> > > Another reason is that the type signatures are missing, so
> > Integer is
> > > used instead of Int, which is a bit unfair. Adding
> > >
> > >     k :: Array Int Int
> > >     acc :: Int -> Int
> > >
> > > cuts down the time by more than factor 5 on my machine.
> > >
> > > Cheers,
> > >     S.
> > >
> > >
> > > _______________________________________________
> > > Haskell-Cafe mailing list
> > > [EMAIL PROTECTED]
> > > http://www.haskell.org/mailman/listinfo/haskell-cafe
> > >
> > _______________________________________________
> > Haskell-Cafe mailing list
> > [EMAIL PROTECTED]
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> _______________________________________________
> Haskell-Cafe mailing list
> [EMAIL PROTECTED]
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
_______________________________________________
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to