Hello,
I've done some benchmarking of the ARM/LLVM registerised build in comparison
with ARM unregisterised builds using either the C code gen or the LLVM code gen.
The raw data are here:
http://www.gardas.roznovan.cz/ghc-on-arm/comparison-unreg-viaC-llvm-to-reg-llvm.html
From the performance point of view, our registerised build is a complete
fiasco so far, but while thinking about how to fix it, the number of
allocations on the registerised build caught my eye. In the example below,
the first column is unregisterised via-C as the reference, the second is
unregisterised LLVM and the third is registerised LLVM. I've chosen
benchmarks from opposite ends of the spectrum that also run for a
considerable time.
Allocations:
constraints    1071454876    +0.0%    +118.5%
wheel-sieve2     44021556    +0.0%    +417.5%
RunTime:
constraints         29.80    -6.2%     -15.0%
wheel-sieve1         9.93   -14.1%    +226.2%
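To make the relative columns easier to read, here is a minimal sketch that turns the percentages into absolute values and ratios. It assumes each percentage is a change relative to the first (unregisterised via-C) column; the helper name `absolute` and the dictionary layout are made up for illustration and are not part of nofib.

```python
def absolute(reference, delta_percent):
    """Apply a relative change such as '+118.5%' to the reference value.

    Assumption: the percentage is relative to the reference column.
    """
    delta = float(delta_percent.strip().rstrip("%"))
    return reference * (1 + delta / 100.0)


# (reference unreg via-C, unreg LLVM, reg LLVM), copied from the figures above
allocations = {
    "constraints": (1071454876, "+0.0%", "+118.5%"),
    "wheel-sieve2": (44021556, "+0.0%", "+417.5%"),
}
run_time = {
    "constraints": (29.80, "-6.2%", "-15.0%"),
    "wheel-sieve1": (9.93, "-14.1%", "+226.2%"),
}

for title, table in (("Allocations", allocations), ("RunTime", run_time)):
    print(title)
    for name, (ref, unreg_llvm, reg_llvm) in table.items():
        reg_abs = absolute(ref, reg_llvm)
        # Ratio of registerised LLVM to the via-C reference
        print(f"  {name}: reg LLVM = {reg_abs:.2f} ({reg_abs / ref:.3f}x reference)")
```

Read this way, a +417.5% entry means the registerised build allocates a bit more than five times what the reference build does.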
My idea is: because of the excessive number of allocations, the registerised
build loses the performance race with both unregisterised builds. Where the
number of allocations is not that different, it wins. Also, unreg via-C
versus unreg LLVM shows that LLVM generates faster ARM code than GNU C, so
the expectation is that it should do the same on the registerised build.
My question is: is it normal for a registerised build to allocate that much,
or is this an issue with our ARM port? If the latter, do you have any idea
where to look for the cause?
The full nofib comparison table is here for your reference:
http://www.gardas.roznovan.cz/ghc-on-arm/comparison-unreg-viaC-llvm-to-reg-llvm.html
Thanks!
Karel