#7436: Derived Foldable and Traversable instances become extremely inefficient 
due
to eta-expansion
---------------------------------+------------------------------------------
    Reporter:  shachaf           |       Owner:                         
        Type:  bug               |      Status:  new                    
    Priority:  normal            |   Milestone:                         
   Component:  Compiler          |     Version:  7.6.1                  
    Keywords:                    |          Os:  Unknown/Multiple       
Architecture:  Unknown/Multiple  |     Failure:  Runtime performance bug
  Difficulty:  Unknown           |    Testcase:                         
   Blockedby:                    |    Blocking:                         
     Related:                    |  
---------------------------------+------------------------------------------

Comment(by shachaf):

 I'd think what I said above about the eta-expansion happening at each
 iteration
 is sufficient to explain it.

 The reduction should happen something like this (not all at once, of
 course,
 but this is enough to demonstrate what I mean):

 {{{
 slow S Z
   $ S (S (S (S Z)))

 S $ slow (\a -> S a) Z
   $ S (S (S Z))

 S $ (\a -> S a)
   $ slow (\b -> (\a -> S a) b) Z
   $ S (S Z)

 S $ (\a -> S a)
   $ (\b -> (\a -> S a) b)
   $ slow (\c -> (\b -> (\a -> S a) b)) Z
   $ (S Z)

 S $ (\a -> S a)
   $ (\b -> (\a -> S a) b)
   $ (\c -> (\b -> (\a -> S a) b))
   $ slow (\d -> (\c -> (\b -> (\a -> S a) b))) Z
   $ Z

 S $ (\a -> S a)
   $ (\b -> (\a -> S a) b)
   $ (\c -> (\b -> (\a -> S a) b))
   $ Z
 }}}

 So we'd expect a tower of `1+2+3+...+n-1` reductions of the lambda.

 The profiling results seem to agree (sum [1..100000-1] = 4999950000):

 {{{
 shachaf@carbon:~/9$ ghc -prof -fprof-auto -rtsopts -O2 C.hs && time ./C
 +RTS -p
 ()

 real    1m3.477s
 user    1m3.372s
 sys     0m0.004s
 shachaf@carbon:~/9$ cat C.prof
         Wed Nov 21 06:21 2012 Time and Allocation Profiling Report
 (Final)

            C +RTS -p -RTS

         total time  =       49.32 secs   (49319 ticks @ 1000 us, 1
 processor)
         total alloc =  11,249,528 bytes  (excludes profiling overheads)

 COST CENTRE MODULE    %time %alloc

 slow.\      Main      100.0   14.2
 slow        Main        0.0   49.8
 mkNat       Main        0.0   35.6


                                                        individual
 inherited
 COST CENTRE  MODULE                  no.     entries  %time %alloc   %time
 %alloc

 MAIN         MAIN                     45           0    0.0    0.0   100.0
 100.0
  main        Main                     91           0    0.0    0.1     0.0
 0.1
  CAF         Main                     89           0    0.0    0.0   100.0
 99.6
   main       Main                     90           1    0.0    0.0   100.0
 99.6
    mkNat     Main                     95      100001    0.0   35.6     0.0
 35.6
    slow      Main                     94      100001    0.0   49.8   100.0
 64.0
     slow.\   Main                     96  4999950000  100.0   14.2   100.0
 14.2
    unNat     Main                     93      100001    0.0    0.0     0.0
 0.0
    main.n    Main                     92           1    0.0    0.0     0.0
 0.0
  CAF         GHC.IO.Handle.FD         88           0    0.0    0.3     0.0
 0.3
  CAF         GHC.Show                 87           0    0.0    0.0     0.0
 0.0
  CAF         GHC.Conc.Signal          86           0    0.0    0.0     0.0
 0.0
  CAF         GHC.IO.Encoding          76           0    0.0    0.0     0.0
 0.0
  CAF         GHC.IO.Encoding.Iconv    75           0    0.0    0.0     0.0
 0.0
 }}}

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/7436#comment:5>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
Glasgow-haskell-bugs mailing list
Glasgow-haskell-bugs@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to