Wed Apr 30 10:41:26 PDT 2008  Don Stewart <[EMAIL PROTECTED]>
  * Add RULES for realToFrac from Int.
  
      {-# RULES
      "realToFrac/Int->Double"    realToFrac   = int2Double
      "realToFrac/Int->Float"     realToFrac   = int2Float
        #-}
  
  Note that this only matters for realToFrac. If you've been using
  fromIntegral to promote Ints to Doubles, things should be fine as they are.
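
  As a quick sanity check of the note above, here is a minimal sketch that
  compares the three conversion routes on a few Ints. The names
  viaRealToFrac, viaFromIntegral and viaPrimitive are made up for this
  example; int2Double is the function from GHC.Float that the new rule
  rewrites to. It only checks that the values agree and says nothing about
  which code GHC actually generates.

```haskell
import GHC.Float (int2Double)

-- Three ways to turn an Int into a Double; all should agree on the value.
-- (Function names here are illustrative, not part of the patch.)
viaRealToFrac :: Int -> Double
viaRealToFrac = realToFrac      -- Prelude default: fromRational . toRational

viaFromIntegral :: Int -> Double
viaFromIntegral = fromIntegral  -- the route the note says is already fine

viaPrimitive :: Int -> Double
viaPrimitive = int2Double       -- the direct conversion the rule rewrites to

main :: IO ()
main = mapM_ check [0, 1, -7, 40000000]
  where
    check i = print (i, viaRealToFrac i, viaFromIntegral i, viaPrimitive i)
```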
  
  The following program, using stream fusion to eliminate arrays:
  
      import Data.Array.Vector
  
      n = 40000000
  
      main = do
            let c = replicateU n (2::Double)
                a = mapU realToFrac (enumFromToU 0 (n-1) ) :: UArr Double
            print (sumU (zipWithU (*) c a))
  
  Yields this loop body without the RULE:
  
      case $wtoRational sc_sY4 of ww_aM7 { (# ww1_aM9, ww2_aMa #) ->
      case $wfromRat ww1_aM9 ww2_aMa of tpl_X1P { D# ipv_sW3 ->
      Main.$s$wfold
        (+# sc_sY4 1)
        (+# wild_X1i 1)
        (+## sc2_sY6 (*## 2.0 ipv_sW3))
  
  And with the rule:
  
      Main.$s$wfold
        (+# sc_sXT 1)
        (+# wild_X1h 1)
        (+## sc2_sXV (*## 2.0 (int2Double# sc_sXT)))
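
  For orientation, the computation in that loop can be written by hand
  roughly as follows. This is a sketch using only base, not the code GHC
  derives; dotLoop and go are hypothetical names, and the small n in main
  is chosen only so the result is easy to check by hand.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hand-written counterpart of the fused loop: accumulate 2 * i for i in [0 .. n-1].
-- The names dotLoop and go are illustrative, not taken from the generated Core.
dotLoop :: Int -> Double
dotLoop n = go 0 0
  where
    go :: Int -> Double -> Double
    go !i !acc
      | i >= n    = acc                                      -- index ran off the end
      | otherwise = go (i + 1) (acc + 2 * realToFrac i)      -- one multiply-add per step

main :: IO ()
main = print (dotLoop 10)   -- 2 * (0 + 1 + ... + 9) = 90.0
```

  Swapping realToFrac for fromIntegral in go gives the same values, so the
  sketch can be used to compare the two conversions on a given GHC.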
  
  With the rule, the running time of the program drops from 120 seconds to
  0.198 seconds with the native backend, and to 0.143 seconds with the
  C backend.
  
  
  And just so I don't forget, here's the difference in the resulting
  assembly (x86_64) between the native code generator and the C backend.
  
  With the native code generator (-fasm):
  
      Main_zdszdwfold_info:
        movq %rdi,%rax
        cmpq $40000000,%rax
        jne .LcZK
        jmp *(%rbp)
      .LcZK:
        cmpq $39999999,%rsi
        jg .LcZN
        cvtsi2sdq %rsi,%xmm0
        mulsd .LnZP(%rip),%xmm0
        movsd %xmm5,%xmm7
        addsd %xmm0,%xmm7
        incq %rax
        incq %rsi
        movq %rax,%rdi
        movsd %xmm7,%xmm5
        jmp Main_zdszdwfold_info
  
  With the C backend (-fvia-C -optc-O3) we get even better assembly:
  
      Main_zdszdwfold_info:
          cmpq    $40000000, %rdi
          je  .L9
      .L5:
          cmpq    $39999999, %rsi
          jg  .L9
          cvtsi2sdq   %rsi, %xmm0
          leaq    1(%rdi), %rdi
          addq    $1, %rsi
          addsd   %xmm0, %xmm0
          addsd   %xmm0, %xmm5
          jmp Main_zdszdwfold_info
      .L9:
          jmp *(%rbp)
  
  So this might make a useful test once the native codegen project starts up.
  
  

    M ./GHC/Float.lhs +2

View patch online:
http://darcs.haskell.org/packages/base/_darcs/patches/20080430174126-cba2c-4ce3875cd2f094dba4f58778642ee0799e725222.gz
_______________________________________________
Cvs-libraries mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-libraries
