marlowsd:
>>>
>>> Simon Marlow has recently fixed FP performance for modern x86 chips in
>>> the native code generator in the HEAD. That was the last reason we know
>>> of to prefer via-C to the native code generators. But before we start
>>> the removal process, does anyone know of any other problems with the
>>> native code generators that need to be fixed first?
>>>
>>
>> Do we have the blessing of the DPH team, wrt. tight, numeric inner loops?
>>
>> As recently as last year -fvia-C -optc-O3 was still useful for some
>> microbenchmarks -- what's changed in that time, or is expected to change?
>
> If you have benchmarks that show a significant difference, I'd be
> interested to see them.

I've attached an example where there's a 40% variation (and it's a
floating point benchmark). Roman would be seeing similar examples in the
vector code.

I'm all in favor of dropping the C backend, but I'm also wary that we
don't have benchmarks to know what difference it is making.

Here's a simple program testing a tight, floating point loop:

    import Data.Array.Vector
    import Data.Complex

    main = print . sumU $ replicateU (1000000000 :: Int) (1 :+ 1 :: Complex Double)

Compiled with ghc 6.12, uvector-0.1.1.0 on a 64 bit linux box.

The -fvia-C -optc-O3 is about 40% faster than -fasm.
How does it fare with the new sse patches?

I've attached the assembly below for each case.

-- Don

------------------------------------------------------------------------

Fastest. 2.17s. About 40% faster than -fasm

    $ time ./sum-complex
    1.0e9 :+ 1.0e9
    ./sum-complex 2.16s user 0.00s system 99% cpu 2.175 total

    Main_mainzuzdszdwfold_info:
            leaq    32(%r12), %rax
            movq    %r12, %rdx
            cmpq    144(%r13), %rax
            movq    %rax, %r12
            ja      .L4
            cmpq    $1000000000, %r14
            je      .L9
    .L5:
            movsd   .LC0(%rip), %xmm0
            leaq    1(%r14), %r14
            addsd   %xmm0, %xmm5
            addsd   %xmm0, %xmm6
            movq    %rdx, %r12
            jmp     Main_mainzuzdszdwfold_info
    .L4:
            leaq    -24(%rbp), %rax
            movq    $32, 184(%r13)
            movq    %rax, %rbp
            movq    %r14, (%rax)
            movsd   %xmm5, 8(%rax)
            movsd   %xmm6, 16(%rax)
            movl    $Main_mainzuzdszdwfold_closure, %ebx
            jmp     *-8(%r13)
    .L9:
            movq    $ghczmprim_GHCziTypes_Dzh_con_info, -24(%rax)
            movsd   %xmm5, -16(%rax)
            movq    $ghczmprim_GHCziTypes_Dzh_con_info, -8(%rax)
            leaq    25(%rdx), %rbx
            movsd   %xmm6, 32(%rdx)
            leaq    9(%rdx), %r14
            jmp     *(%rbp)

------------------------------------------------------------------------

Second, 2.34s

    $ ghc-core sum-complex.hs -O2 -fvia-C -optc-O3
    $ time ./sum-complex
    1.0e9 :+ 1.0e9
    ./sum-complex 2.33s user 0.01s system 99% cpu 2.347 total

    Main_mainzuzdszdwfold_info:
            leaq    32(%r12), %rax
            cmpq    144(%r13), %rax
            movq    %r12, %rdx
            movq    %rax, %r12
            ja      .L4
            cmpq    $100000000, %r14
            je      .L9
    .L5:
            movsd   .LC0(%rip), %xmm0
            leaq    1(%r14), %r14
            movq    %rdx, %r12
            addsd   %xmm0, %xmm5
            addsd   %xmm0, %xmm6
            jmp     Main_mainzuzdszdwfold_info
    .L4:
            leaq    -24(%rbp), %rax
            movq    $32, 184(%r13)
            movl    $Main_mainzuzdszdwfold_closure, %ebx
            movsd   %xmm5, 8(%rax)
            movq    %rax, %rbp
            movq    %r14, (%rax)
            movsd   %xmm6, 16(%rax)
            jmp     *-8(%r13)
    .L9:
            movq    $ghczmprim_GHCziTypes_Dzh_con_info, -24(%rax)
            movsd   %xmm5, -16(%rax)
            movq    $ghczmprim_GHCziTypes_Dzh_con_info, -8(%rax)
            leaq    25(%rdx), %rbx
            movsd   %xmm6, 32(%rdx)
            leaq    9(%rdx), %r14
            jmp     *(%rbp)

------------------------------------------------------------------------

Native codegen, 3.57s

    ghc 6.12 -fasm -O2

    $ time ./sum-complex
    1.0e9 :+ 1.0e9
    ./sum-complex 3.57s user 0.01s system 99% cpu 3.574 total

    Main_mainzuzdszdwfold_info:
    .Lc1i7:
            addq    $32,%r12
            cmpq    144(%r13),%r12
            ja      .Lc1ia
            movq    %r14,%rax
            cmpq    $100000000,%rax
            jne     .Lc1id
            movq    $ghczmprim_GHCziTypes_Dzh_con_info,-24(%r12)
            movsd   %xmm5,-16(%r12)
            movq    $ghczmprim_GHCziTypes_Dzh_con_info,-8(%r12)
            movsd   %xmm6,(%r12)
            leaq    -7(%r12),%rbx
            leaq    -23(%r12),%r14
            jmp     *(%rbp)
    .Lc1ia:
            movq    $32,184(%r13)
            movl    $Main_mainzuzdszdwfold_closure,%ebx
            addq    $-24,%rbp
            movq    %r14,(%rbp)
            movsd   %xmm5,8(%rbp)
            movsd   %xmm6,16(%rbp)
            jmp     *-8(%r13)
    .Lc1id:
            movsd   %xmm6,%xmm0
            addsd   .Ln1if(%rip),%xmm0
            movsd   %xmm5,%xmm7
            addsd   .Ln1ig(%rip),%xmm7
            leaq    1(%rax),%r14
            movsd   %xmm7,%xmm5
            movsd   %xmm0,%xmm6
            addq    $-32,%r12
            jmp     Main_mainzuzdszdwfold_info

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
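
For anyone who wants to try the comparison without installing uvector, here is a rough stand-alone sketch of the same loop using only base. The name sumReplicate is made up for this example; it is not part of uvector, and it is only a hand-written guess at the loop shape that sumU . replicateU fuses to (one Int counter and two Double accumulators, as in the worker shown in the listings). There is no guarantee that GHC produces the same code for it as for the uvector version, so timings may differ.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Data.Complex (Complex (..))

-- | Sum n copies of a complex number with a strict accumulator loop.
-- Hypothetical helper for illustration only; not the uvector code.
-- The real and imaginary sums are carried as two separate strict
-- Doubles alongside an Int counter.
sumReplicate :: Int -> Complex Double -> Complex Double
sumReplicate n (x :+ y) = go 0 0 0
  where
    go :: Int -> Double -> Double -> Complex Double
    go !i !re !im
      | i >= n    = re :+ im
      | otherwise = go (i + 1) (re + x) (im + y)

main :: IO ()
main = print (sumReplicate 1000000000 (1 :+ 1 :: Complex Double))
```

Build it once per backend with the same flags as in the post (-O2 -fasm, and -O2 -fvia-C -optc-O3) and compare with time, as above.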