#6110: Data.Vector.Unboxed performance regression of 7.4.1 relative to 7.0.4
-------------------------------------+--------------------------------------
Reporter: mdgabriel | Owner:
Type: bug | Status: new
Priority: normal | Component: Compiler
Version: 7.4.1 | Keywords: Vector Performance Regression
Os: Linux | Architecture: x86
Failure: Runtime performance bug | Testcase:
Blockedby: | Blocking:
Related: |
-------------------------------------+--------------------------------------
== Problem ==
Severe Data.Vector.Unboxed performance regression in GHC 7.4.1 relative to
GHC 7.0.4: the same program runs about 2.4 times slower.[[BR]]
(time of sum built with GHC 7.4.1)/(time of sum built with GHC 7.0.4) ~ 2.4[[BR]]
== System ==
GNU/Linux 3.2.0-24-generic 38-Ubuntu i386[[BR]]
== Compilers ==
GHC 7.0.4[[BR]]
GHC 7.4.1[[BR]]
GCC 4.6.3 for a baseline[[BR]]
== Main.hs ==
{{{
module Main where

import System.Environment (getArgs)
import qualified Data.Vector.Unboxed as U (generate, sum)

main :: IO ()
main = do args <- getArgs
          if length args == 1
            then putSum (read (head args) :: Int)
            else error "need a count operand"

putSum :: Int -> IO ()
putSum cnt = let v = U.generate cnt (\i -> fromIntegral i :: Double)
                 s = U.sum v
             in putStrLn ("Sum="++show s)
}}}
== GHC compilation ==
> ghc --version[[BR]]
7.4.1[[BR]]
> ghc -O2 -Wall --make -o sum Main.hs[[BR]]
[[BR]]
> ghc --version[[BR]]
7.0.4[[BR]]
> ghc -O2 -Wall --make -o sum Main.hs[[BR]]
== Baseline csum.c ==
{{{
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    unsigned long i, size;
    double tot=0;

    if (argc != 2)
    {
        (void)fprintf(stderr, "usage: %s size\n", basename(argv[0]));
        return(1);
    }
    size = atol(argv[1]);
    for(i = 0; i < size; i++) tot += (double)i;
    (void)printf("Sum=%.15e\n", tot);
    return(0);
}
}}}
== GCC baseline compilation ==
> gcc --version[[BR]]
4.6.3[[BR]]
> gcc -O2 -Wall csum.c -o csum[[BR]]
== Data: time sum-7.0.4 n ==
n seconds[[BR]]
100000000 0.74[[BR]]
200000000 1.46[[BR]]
300000000 2.24[[BR]]
400000000 2.94[[BR]]
500000000 3.70[[BR]]
600000000 4.40[[BR]]
700000000 5.14[[BR]]
800000000 5.89[[BR]]
900000000 6.62[[BR]]
1000000000 7.34[[BR]]
== Data: time sum-7.4.1 n ==
n seconds[[BR]]
100000000 1.74[[BR]]
200000000 3.49[[BR]]
300000000 5.24[[BR]]
400000000 6.98[[BR]]
500000000 8.73[[BR]]
600000000 10.51[[BR]]
700000000 12.22[[BR]]
800000000 13.96[[BR]]
900000000 15.75[[BR]]
1000000000 17.51[[BR]]
== Data: time csum-4.6.3 n ==
n seconds[[BR]]
100000000 1.04[[BR]]
200000000 2.10[[BR]]
300000000 3.12[[BR]]
400000000 4.19[[BR]]
500000000 5.23[[BR]]
600000000 6.26[[BR]]
700000000 7.32[[BR]]
800000000 8.37[[BR]]
900000000 9.41[[BR]]
1000000000 10.45[[BR]]
== Linear in n ==
y is in seconds[[BR]]
[[BR]]
GHC 7.0.4: y = (0.73/10^8) * n + 0.03[[BR]]
GCC 4.6.3: y = (1.04/10^8) * n + 0.03[[BR]]
GHC 7.4.1: y = (1.75/10^8) * n - 0.01[[BR]]
Severe performance regression:[[BR]]
GHC 7.4.1/GHC 7.0.4 ~ 1.75/0.73 ~ 2.4[[BR]]
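
The slopes above can be rechecked from the three timing tables with an ordinary least-squares fit. The following is a minimal sketch, assuming n is measured in units of 10^8 (so x = 1..10); the names fitLine and withN are illustrative and appear nowhere in the original measurements. Depending on how the slope is rounded, the intercepts may differ slightly from the ones quoted above.

```haskell
module Main where

-- | Ordinary least-squares fit of y = m*x + b; returns (m, b).
fitLine :: [(Double, Double)] -> (Double, Double)
fitLine pts = (m, b)
  where
    k    = fromIntegral (length pts)
    xBar = sum (map fst pts) / k
    yBar = sum (map snd pts) / k
    sxy  = sum [ (x - xBar) * (y - yBar) | (x, y) <- pts ]
    sxx  = sum [ (x - xBar) * (x - xBar) | (x, _) <- pts ]
    m    = sxy / sxx
    b    = yBar - m * xBar

-- | Pair each timing with n in units of 10^8 (1, 2, ..., 10).
withN :: [Double] -> [(Double, Double)]
withN = zip [1 ..]

-- Timings in seconds, copied from the three data tables.
ghc704, ghc741, gcc463 :: [Double]
ghc704 = [0.74, 1.46, 2.24, 2.94, 3.70, 4.40, 5.14, 5.89, 6.62, 7.34]
ghc741 = [1.74, 3.49, 5.24, 6.98, 8.73, 10.51, 12.22, 13.96, 15.75, 17.51]
gcc463 = [1.04, 2.10, 3.12, 4.19, 5.23, 6.26, 7.32, 8.37, 9.41, 10.45]

main :: IO ()
main = do
  let report name ts = putStrLn (name ++ ": " ++ show (fitLine (withN ts)))
  report "GHC 7.0.4" ghc704
  report "GCC 4.6.3" gcc463
  report "GHC 7.4.1" ghc741
  -- Ratio of the fitted slopes, i.e. the per-element slowdown.
  let (m704, _) = fitLine (withN ghc704)
      (m741, _) = fitLine (withN ghc741)
  putStrLn ("slope ratio 7.4.1/7.0.4: " ++ show (m741 / m704))
```

Each slope is in seconds per 10^8 elements, so the ratio of two slopes is the per-element slowdown regardless of the fixed start-up cost.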
== Notes ==
1/ I discovered the problem in a slightly more complicated case when I
recompiled a package that computes some simple statistics. The sum of
[0..(n-1)] is the simplest case I could think of that demonstrates the
problem.
2/ I tried a similar experiment with Data.List, Data.Array.Unboxed,
Data.Vector.Storable.MMap, and Foreign.Marshal.Alloc. In all cases,
the GHC 7.4.1 version was faster than the GHC 7.0.4 version.
3/ The same Data.Vector.Unboxed code was compiled and installed
separately for each version of GHC. Thus, the performance regression
appears to come from the interaction between Data.Vector.Unboxed and the
7.4.1 compiler.
4/ I am impressed that the GHC 7.0.4 sum is faster than the GCC 4.6.3 sum.
I expected it to be close, but not faster. Given this
impressive result, I certainly would hope that the same result can be
recovered once again.
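
As a possible reference point for notes 3/ and 4/, a hand-written strict loop computes the same sum without Data.Vector. If the vector library's fusion works as intended, U.sum over U.generate would ideally compile to a loop of roughly this shape, so timing this program under both compilers may help separate a library interaction from a general code generation change. This is only a sketch under that assumption: sumTo is a hypothetical name, and the program was not part of the measurements above.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import System.Environment (getArgs)

-- | Sum of 0, 1, ..., cnt-1 as Doubles, with a strict accumulator
-- so that no intermediate vector or list is built.
sumTo :: Int -> Double
sumTo cnt = go 0 0
  where
    go :: Int -> Double -> Double
    go !i !acc
      | i >= cnt  = acc
      | otherwise = go (i + 1) (acc + fromIntegral i)

main :: IO ()
main = do
  args <- getArgs
  case args of
    [a] -> putStrLn ("Sum=" ++ show (sumTo (read a)))
    _   -> error "need a count operand"
```

It takes the same count operand as Main.hs and can be built with the same ghc -O2 command line.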
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/6110>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs