On 02/14/11 16:47, Guy Hulbert wrote:
Some comments on what I plan to do after my 2 week hibernation.

On Sun, 2011-02-13 at 20:27 +0300, Richard Hainsworth wrote:
see http://shootout.alioth.debian.org/ for more info on the
algorithms.

There are many factors that can be considered when benchmarking,
including IO and OS.

It seemed to me it would be a good idea to fix on some "elegant" form of
the perl6 implementation. By keeping the program the same, it will be
possible to track how developments in the implementation affect speed and
memory use.
I am interested first in developing a generic framework around the work
already done for 'the benchmark game' (TBG*).  I will pretend that I am
starting from scratch and define a protocol for adding algorithms and
exchanging information.
TBG was designed to test languages, and it assumes a stable implementation of each language. Different implementations of one language are possible, but in a sense they are treated as different languages.

Also, the TBG tool allows a single program to be altered and remeasured against a static language, rather than the language changing and then being measured against a static program.

This may be relevant to what Steffen wrote, since in a sense the modules are part of the programming environment that changes.
I have been convinced that everything following has been done for TBG
but some of it is obscure.  The details are hidden in CVS.
Actually, all of the intelligence is hidden in the bencher/makefiles/xxxxxx.ini file. An explanation is in the .ini file and in the bencher/readme file. However, it can be very obscure.
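To give a flavour, a fragment in the style of those files might look like the sketch below. The section and key names here are guesses from memory, not copied from the real file, so check bencher/makefiles before relying on any of them:

    ; illustrative only; section/key names and placeholders are assumptions
    [tools]
    PERL6 = /usr/local/bin/perl6

    [commandlines]
    perl6 = $PERL6 %X %A   ; %X = source file, %A = arguments (assumed)

    [testrange]
    nbody = 50000 500000 5000000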

I am trying to get the c and cc entries working, as these require a compilation step prior to running the program.
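For example, a harness could give those entries a build step before anything is timed. A minimal Perl 6 sketch, where the compiler flags and the .exe suffix are my own assumptions rather than what bencher actually does:

    # Compile a c/cc source to a binary before the timed run.
    my %compiler = c => 'gcc -O3', cc => 'g++ -O3';

    sub build (Str $lang, IO::Path $src --> IO::Path) {
        my $cmd = %compiler{$lang} or die "no compiler for $lang";
        my $bin = ($src ~ '.exe').IO;             # nbody.c -> nbody.c.exe
        run |$cmd.words, $src.Str, '-o', $bin.Str;
        $bin;
    }

Interpreted entries such as perl6 would simply skip this step and be run directly.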
I'd like to set things up so everything is fully automated.  Perl6
developers (and users :-) should be able to just run the benchmarks
in a "reasonable way" (one which halts :-) after installing the
latest rakudo release.
Absolutely
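One cheap way to get the "halts" property is to put a hard time limit on every run. A Perl 6 sketch using GNU coreutils' timeout(1); the 60 second cap and the file names are arbitrary assumptions:

    # Run one benchmark under a time limit so the suite always terminates.
    my $proc   = run 'timeout', '60', 'perl6', 'nbody.pl', '100000', :out;
    my $output = $proc.out.slurp(:close);
    # timeout(1) exits with status 124 when the limit expires
    say $proc.exitcode == 124 ?? 'TIMED OUT' !! $output;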
(A) Protocol to specify algorithm.

1. Define an algorithm and provide a reference for it.
2. Define standard inputs and implement algorithm in 2 languages.
3. Generate and verify outputs corresponding to inputs.
4. Make code, input and output available.
TBG has directories for each benchmark, since the idea is to make it easy to add another language to a benchmark.

There is also a directory for input files, and one for the expected output files to be diffed against.

I think that we should have directories for each language, since the number of languages is small, and if a new implementation of a benchmark is added, we want to see when it appears in the historical record.
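However the tree ends up being arranged, step A3 amounts to: run the program on each input and diff against the stored output. A minimal Perl 6 sketch, assuming a hypothetical perl6/nbody/ directory that holds the program plus input/ and expect/ subdirectories:

    # Verify a benchmark implementation against stored expected outputs.
    sub verify (IO::Path $dir, Str $prog) {
        for $dir.add('input').dir.sort -> $in {
            my $want = $dir.add('expect').add($in.basename).slurp;
            my $got  = qqx{perl6 $dir/$prog < $in};
            say $in.basename, "\t", $got eq $want ?? 'ok' !! 'FAIL';
        }
    }
    verify 'perl6/nbody'.IO, 'nbody.pl';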
Details can be posted on github and descriptions on the perl6 wiki.

(B) Benchmark protocol per language.

1. Define hardware, operating system, language version.
2. Execute for a reasonable subset of inputs (some may be
too slow or fast to be interesting).
3. Generate standard metrics (see alioth).
We should also accumulate historical values.
Summaries can be posted on the perl6 wiki.
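A sketch of B3 plus the accumulation of historical values, in Perl 6; the file name and column layout are my own invention:

    # Time one run and append a row to a growing history file.  Analysing
    # the file is then a separate, later step.
    my $t0 = now;
    run 'perl6', 'nbody.pl', '100000';
    my $elapsed = now - $t0;
    'history.tsv'.IO.spurt:
        "{DateTime.now}\tnbody\t{sprintf '%.3f', $elapsed}\n", :append;

Each Rakudo release then just adds new rows that can be compared against the old ones.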

It should be possible to extend the standard metrics.  It should also be
possible to filter them in standard ways to make results clearer.
Collecting data should be separated from analysing the data.
Given the above, I would just define a protocol to exchange results.
One need only specify MD5 sums to verify/identify input and output. Some of
the algorithms in TBG have input and output so large that they are truncated
on the results pages; in such cases, publishing checksums (MD5 is
sufficient) of the results will be useful.
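Producing such a checksum is a one-liner. A Perl 6 sketch, assuming the md5sum utility is on the path (the file name is hypothetical):

    # The first field of md5sum's output is the hex digest.
    sub md5-of (IO::Path $file --> Str) {
        qqx{md5sum $file}.words[0];
    }
    say md5-of 'expect/nbody-500000.txt'.IO;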

I'm interested in automating B1 for other purposes.
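Most of B1 can be captured mechanically at run time. A sketch, where the commands are Linux-specific assumptions:

    # Record OS, CPU and Rakudo version alongside each batch of results.
    my %env =
        os     => qqx{uname -srm}.chomp,
        cpu    => qqx{grep -m1 'model name' /proc/cpuinfo}.split(':')[1].trim,
        rakudo => qqx{perl6 --version}.lines[0];
    say "{.key}\t{.value}" for %env.sort;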

[*] Personally, I have nothing against 'shootout' but it does no harm to
respect the wishes of the current maintainer of TBG on alioth.
