Hi,
Following up on the threads on haskell and haskell-cafe, I'd like
to gather ideas, comments and suggestions for a standardized Haskell
Benchmark Suite.
The idea is to gather a bunch of programs written in Haskell that
are representative of the Haskell community (i.e. apps,
libraries, ...). Following the example of SPEC (apart from the fact
that the SPEC benchmarks aren't available for free), we would like
to build a database containing performance measurements for the
various benchmarks in the suite. Users should be able to submit
their own results. This will hopefully encourage people to take
performance into account when writing a Haskell program or library,
and it will also serve as a valuable tool for further optimizing both
applications written in Haskell and the various Haskell compilers
out there (GHC, jhc, nhc, ...).
This thread is meant to gather people's thoughts on this subject.
Which programs should we consider for the first version of the
Haskell benchmark suite?
How should we standardize them, and make them produce reliable
performance measurements?
Should we only use hardware performance counters, or also do more
thorough analysis such as data locality studies, ...?
Are there any papers available on this subject? (I know about the
paper being written for ICFP as we speak, which uses PAPI as a
tool.)
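To make the question about reliable measurements a bit more concrete:
as a first step, every benchmark could share a simple timing wrapper
along these lines. This is just a sketch using CPU time from the base
library; the 'timed' helper and the toy workload are my own
illustrations, and hardware counters (e.g. via PAPI) could be layered
on top later:

    module Main (main) where

    import System.CPUTime (getCPUTime)
    import Control.Exception (evaluate)

    -- Run an action, print the CPU time it took, and return its result.
    timed :: String -> IO a -> IO a
    timed label act = do
      start  <- getCPUTime
      result <- act
      end    <- getCPUTime
      -- getCPUTime reports picoseconds; print seconds for readability
      let secs = fromIntegral (end - start) / 1e12 :: Double
      putStrLn (label ++ ": " ++ show secs ++ " s")
      return result

    main :: IO ()
    main = do
      _ <- timed "sum [1..10^7]" (evaluate (sum [1 .. 10^7] :: Int))
      return ()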
I think that we should, as David Roundy pointed out, restrict
ourselves to code that is actually used frequently. However, I think
we should make a distinction between micro-benchmarks, which test
some specific item, and real-life benchmarks. When using
micro-benchmarks, the wrong conclusions may be drawn, because e.g.
code or data can be completely cached, there are no TLB misses after
startup, etc. I think that if somebody is interested in knowing how
Haskell performs, and whether they should use it for their
development, it is nice to know that e.g. Data.ByteString performs as
well as C, but it would be even nicer to see that large, real-life
apps can reach that same performance. There is more to the Haskell
runtime than simply executing application code, and these things
should also be taken into account.
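To illustrate the distinction, a micro-benchmark typically looks
something like the sketch below (using the criterion package as one
possible harness; the input file name is only an illustration). Note
that it runs straight into the caveat above: a small input will sit
entirely in the cache, so the numbers say little about locality in a
large application.

    module Main (main) where

    import Criterion.Main
    import qualified Data.ByteString.Char8 as B

    main :: IO ()
    main = do
      -- hypothetical input; small enough to be fully cached
      bs  <- B.readFile "input.txt"
      str <- readFile "input.txt"
      defaultMain
        [ bench "ByteString length" (whnf B.length bs)
        , bench "String length"     (whnf length str)
        ]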
Also, I think that including several compilers in the benchmark set is
a good idea, because, afaik, each can provide a different runtime
system as well. We know that in Java the choice of VM can have a
significant impact on behaviour at the microprocessor level. I think
that Haskell may have similar issues.
Also, similar to SPEC CPU, it would be nice to have input sets for
each benchmark that gets included in the suite. Furthermore, I think
that we should provide a rigorous analysis of the benchmarks on as
many platforms as is feasible. See e.g. the analysis done for the
DaCapo Java benchmark suite, published at OOPSLA 2006.
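As a rough sketch of what SPEC-style input sets could look like on the
Haskell side, each benchmark might select its workload from a
command-line argument. The set names and sizes below are purely
illustrative, not a proposed format:

    module Main (main) where

    import System.Environment (getArgs)

    -- Map a SPEC-style input set name to a workload size (illustrative values).
    inputSize :: String -> Int
    inputSize "test"  = 10 ^ (4 :: Int)
    inputSize "train" = 10 ^ (6 :: Int)
    inputSize "ref"   = 10 ^ (8 :: Int)
    inputSize other   = error ("unknown input set: " ++ other)

    main :: IO ()
    main = do
      [set] <- getArgs
      -- Stand-in workload; a real benchmark would run its actual kernel here.
      print (sum [1 .. inputSize set])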
-- Andy