Hello,

I've benchmarked the ARM/LLVM registerised build against ARM unregisterised builds using either the C code generator or the LLVM code generator. The raw data are here: http://www.gardas.roznovan.cz/ghc-on-arm/comparison-unreg-viaC-llvm-to-reg-llvm.html

From a performance point of view our registerised build is a fiasco so far, but while thinking about how to fix it, the number of allocations in the registerised build caught my eye. Example: the first column is unregisterised viaC as the reference, the second is unregisterised LLVM, and the third is registerised LLVM. I've chosen benchmarks from opposite ends of the spectrum that also run for a considerable time.

Allocations:
benchmark       unreg viaC      unreg LLVM      reg LLVM
constraints     1071454876      +0.0%           +118.5%
wheel-sieve2    44021556        +0.0%           +417.5%

RunTime:
benchmark       unreg viaC      unreg LLVM      reg LLVM
constraints     29.80           -6.2%           -15.0%
wheel-sieve1    9.93            -14.1%          +226.2%
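
To make the relative figures easier to compare, here is a small sketch that converts the percentage deltas back into absolute values. It assumes the percentages are relative to the first (unreg viaC) column, as described above. The helper name apply_delta is only for illustration and is not part of nofib.

```python
def apply_delta(reference, delta_percent):
    """Return the absolute value implied by a relative change against a reference."""
    return reference * (1.0 + delta_percent / 100.0)


# (benchmark, unreg viaC reference value, reg LLVM delta in percent),
# copied from the two tables above.
allocations = [
    ("constraints", 1071454876, 118.5),
    ("wheel-sieve2", 44021556, 417.5),
]
run_times = [
    ("constraints", 29.80, -15.0),
    ("wheel-sieve1", 9.93, 226.2),
]

for name, ref, delta in allocations:
    print(f"{name}: allocations on reg LLVM ~ {apply_delta(ref, delta):.0f}")

for name, ref, delta in run_times:
    print(f"{name}: run time on reg LLVM ~ {apply_delta(ref, delta):.2f}")
```

The same helper works for the unreg LLVM column by substituting its deltas.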


My theory is: because of the excessive number of allocations, the registerised build loses the performance race against both unregisterised builds. Where the number of allocations is not that different, it wins. Also, unreg viaC versus unreg LLVM shows that LLVM generates faster ARM code than GNU C, so I'd expect it to do the same on the registerised build.

My question is: is it normal for a registerised build to allocate that much, or is this an issue with our ARM port? If the latter, do you have any idea where to look for it?

The full nofib comparison table is here for your reference:
http://www.gardas.roznovan.cz/ghc-on-arm/comparison-unreg-viaC-llvm-to-reg-llvm.html

Thanks!
Karel

_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc
