#6110: Data.Vector.Unboxed performance regression of 7.4.1 relative to 7.0.4
-------------------------------------+--------------------------------------
 Reporter:  mdgabriel                |          Owner:
     Type:  bug                      |         Status:  new
 Priority:  normal                   |      Component:  Compiler
  Version:  7.4.1                    |       Keywords:  Vector Performance Regression
       Os:  Linux                    |   Architecture:  x86
  Failure:  Runtime performance bug  |       Testcase:
Blockedby:                           |       Blocking:
  Related:                           |
-------------------------------------+--------------------------------------
 == Problem ==

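 Data.Vector.Unboxed shows a severe performance regression in GHC 7.4.1
 relative to GHC 7.0.4.  The run time of the sum test program grows by a
 factor of about 2.4:[[BR]]
 (time of sum, GHC 7.4.1)/(time of sum, GHC 7.0.4) ~ 2.4[[BR]]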

 == System ==

 GNU/Linux 3.2.0-24-generic 38-Ubuntu i386[[BR]]

 == Compilers ==

 GHC 7.0.4[[BR]]
 GHC 7.4.1[[BR]]
 GCC 4.6.3 for a baseline[[BR]]

 == Main.hs ==

 {{{
 module Main where

 import System.Environment (getArgs)
 import qualified Data.Vector.Unboxed as U (generate, sum)

 main :: IO ()
 main = do args <- getArgs
           if length args == 1
             then putSum (read (head args) :: Int)
             else error "need a count operand"

 putSum :: Int -> IO ()
 putSum cnt = let v = U.generate cnt (\i -> fromIntegral i :: Double)
                  s = U.sum v
              in putStrLn ("Sum="++show s)

 }}}

 == GHC compilation ==

 > ghc --version[[BR]]
 7.4.1[[BR]]
 > ghc -O2 -Wall --make -o sum Main.hs[[BR]]
 [[BR]]
 > ghc --version[[BR]]
 7.0.4[[BR]]
 > ghc -O2 -Wall --make -o sum Main.hs[[BR]]

 == Baseline csum.c ==

 {{{

 #include <libgen.h>
 #include <stdio.h>
 #include <stdlib.h>

 int main(int argc, char **argv)
 {
   unsigned long i, size;
   double tot=0;

   if (argc != 2)
     {
       (void)fprintf(stderr, "usage: %s size\n", basename(argv[0]));
       return(1);
     }

   size = atol(argv[1]);

   for(i = 0; i < size; i++) tot += (double)i;

   (void)printf("Sum=%.15e\n", tot);

   return(0);
 }

 }}}

 == GCC baseline compilation ==

 > gcc --version[[BR]]
 4.6.3[[BR]]
 > gcc -O2 -Wall csum.c -o csum[[BR]]

 == Data: time sum-7.0.4 n ==

 n          seconds[[BR]]
 100000000  0.74[[BR]]
 200000000  1.46[[BR]]
 300000000  2.24[[BR]]
 400000000  2.94[[BR]]
 500000000  3.70[[BR]]
 600000000  4.40[[BR]]
 700000000  5.14[[BR]]
 800000000  5.89[[BR]]
 900000000  6.62[[BR]]
 1000000000 7.34[[BR]]

 == Data: time sum-7.4.1 n ==

 n          seconds[[BR]]
 100000000  1.74[[BR]]
 200000000  3.49[[BR]]
 300000000  5.24[[BR]]
 400000000  6.98[[BR]]
 500000000  8.73[[BR]]
 600000000  10.51[[BR]]
 700000000  12.22[[BR]]
 800000000  13.96[[BR]]
 900000000  15.75[[BR]]
 1000000000 17.51[[BR]]

 == Data: time csum-4.6.3 n ==

 n          seconds[[BR]]
 100000000  1.04[[BR]]
 200000000  2.10[[BR]]
 300000000  3.12[[BR]]
 400000000  4.19[[BR]]
 500000000  5.23[[BR]]
 600000000  6.26[[BR]]
 700000000  7.32[[BR]]
 800000000  8.37[[BR]]
 900000000  9.41[[BR]]
 1000000000 10.45[[BR]]

 == Linear in n ==

 y is the run time in seconds, n is the count operand[[BR]]
 [[BR]]
 GHC 7.0.4: y = (0.73/10^8) * n + 0.03[[BR]]
 GCC 4.6.3: y = (1.04/10^8) * n + 0.03[[BR]]
 GHC 7.4.1: y = (1.75/10^8) * n - 0.01[[BR]]

 Severe performance regression:[[BR]]
 GHC 7.4.1/GHC 7.0.4 ~ 1.75/0.73 ~ 2.4[[BR]]
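
 The linear fits can be re-derived from the three timing tables.  The
 following is a minimal sketch that assumes an ordinary least-squares fit;
 the report does not say which fitting method was used, so the coefficients
 need not match exactly.  The names fitLine, report and xs are hypothetical
 and chosen only for this illustration.  The x-axis is in units of 10^8
 elements, so each slope reads directly as seconds per 10^8 elements.

```haskell
module Main where

import Text.Printf (printf)

-- Seconds copied from the three timing tables in this ticket,
-- for n = 1e8, 2e8, ..., 1e9.
ghc704, ghc741, gcc463 :: [Double]
ghc704 = [0.74, 1.46, 2.24, 2.94, 3.70, 4.40, 5.14, 5.89, 6.62, 7.34]
ghc741 = [1.74, 3.49, 5.24, 6.98, 8.73, 10.51, 12.22, 13.96, 15.75, 17.51]
gcc463 = [1.04, 2.10, 3.12, 4.19, 5.23, 6.26, 7.32, 8.37, 9.41, 10.45]

-- Count operand in units of 10^8 elements.
xs :: [Double]
xs = [1 .. 10]

-- Ordinary least-squares fit of y = slope * x + intercept.
-- Assumption: the reporter's fitting method is not stated in the ticket.
fitLine :: [Double] -> [Double] -> (Double, Double)
fitLine xv yv = (slope, intercept)
  where
    n         = fromIntegral (length xv)
    sx        = sum xv
    sy        = sum yv
    sxy       = sum (zipWith (*) xv yv)
    sxx       = sum (zipWith (*) xv xv)
    slope     = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n

-- Print one fitted line in the same shape as the lines in the ticket.
report :: String -> [Double] -> IO ()
report name ys = printf "%-10s y = (%.3f/10^8) * n %+.3f\n" name m b
  where
    (m, b) = fitLine xs ys

main :: IO ()
main = do
  report "GHC 7.0.4" ghc704
  report "GCC 4.6.3" gcc463
  report "GHC 7.4.1" ghc741
  let slope = fst . fitLine xs
  printf "slope ratio 7.4.1/7.0.4 = %.2f\n" (slope ghc741 / slope ghc704)
```

 For the timings in this ticket the slope ratio comes out close to the 2.4
 quoted above.  The fitted intercepts are all within a few hundredths of a
 second of zero and may differ slightly from the ones listed.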

 == Notes ==

 1/ I discovered the problem in a slightly more complicated case when I
 recompiled a package that used some simple statistics.  The sum of
 [0..(n-1)] was the simplest case that I could think of to demonstrate the
 problem.

 2/ I tried a similar experiment with Data.List, Data.Array.Unboxed,
 Data.Vector.Storable.MMap, and Foreign.Marshal.Alloc.  In all cases,
 the GHC 7.4.1 version was faster than the GHC 7.0.4 version.

 3/ The same Data.Vector.Unboxed code is used in both cases, compiled and
 installed separately for each version of the GHC compiler.  Thus, the
 performance regression appears to be caused by the interaction between
 Data.Vector.Unboxed and the 7.4.1 compiler.

 4/ I am impressed that the GHC 7.0.4 sum is faster than the GCC 4.6.3 sum.
 I expected it to be close, but not faster.  Given this impressive result,
 I hope that the 7.0.4 performance can be recovered.
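
 One way to narrow down note 3 might be to time a hand-written loop that
 needs only base, under both compilers.  The sketch below is not part of
 the original measurements, and sumLoop is a hypothetical name.  It adds
 fromIntegral i for i in [0..(n-1)] with a strict accumulator, which is
 roughly the loop one would hope the vector library's stream fusion yields
 for U.sum applied to U.generate.  If this loop does not slow down under
 7.4.1, that would suggest the regression lies in how the vector code is
 optimised rather than in the Double arithmetic itself.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import System.Environment (getArgs)

-- Sum of (fromIntegral i) for i in [0 .. cnt-1], written as an explicit
-- tail-recursive loop.  The bang patterns keep the counter and the
-- accumulator strict so that no thunks build up.
sumLoop :: Int -> Double
sumLoop cnt = go 0 0
  where
    go :: Int -> Double -> Double
    go !i !acc
      | i >= cnt  = acc
      | otherwise = go (i + 1) (acc + fromIntegral i)

main :: IO ()
main = do
  args <- getArgs
  case args of
    [n] -> putStrLn ("Sum=" ++ show (sumLoop (read n)))
    _   -> error "need a count operand"
```

 It can be compiled with the same ghc -O2 command line as Main.hs and timed
 with the same count operands.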

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/6110>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler

_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
