Hi Jan, hi Max,

I guess the main issue is missing documentation… Even so, there are class comments…

> On 01 Nov 2015, at 23:45, Jan Vrany <[email protected]> wrote:
>
> Hi Max,
>
> I looked at some version of SMark years ago and never used
> it extensively, so I might be wrong, but:
>
> * SMark executor does some magic with numbers.

Nope. It only does that if you ask for it. Granted, though, that is the default setting, because it is supposed to be convenient to use from within the image. The SMark design knows the concepts of reporter (how and what data to report), runner (how to execute benchmarks), suite (the benchmarks), and timer (should be named gauge or something; it can measure anything, it doesn't have to be time).

> It tries to calculate a number of iterations to run in order to get
> "statistically meaningful results". Maybe it's me, but I could not
> fully understand what it does and why it does it so.
> CalipeL does no magic - it gives you raw numbers (no average, no mean,
> rather a sequence of measurements).

See the ReBenchHarness; that gives you exactly that as an alternative default setting.

> * SMark, IIRC, requires benchmarks to inherit from some base class
>   (like SUnit).

Require is a strong word: as long as you implement the interface of SMarkSuite, you can inherit from wherever you want. It's Smalltalk, after all.

> Also, not sure if SMark allows you to specify a warmup phase (handy
> for example to measure peak performance when caches are filled or so).

There is the concept of #setup/teardown methods. And a runner can also do whatever it wants/needs to reach warmup. For instance, the SMarkCogRunner will make sure that all code is compiled before starting to measure.

> CalipeL, OTOH, uses method annotations to describe the benchmark,
> so one can turn a regular SUnit test method into a benchmark as simply
> as annotating it with <benchmark>.

OK, that's not possible in SMark.

> A warmup method and setup/teardown methods can be specified
> per-benchmark.

We have that, too.

> * SMark has no support for parametrization.

Well, there is the #problemSize parameter, but that is indeed rather simplistic.

> * SMark measures time only.

Nope, an SMarkTimer can measure whatever you want. (And it even has a class comment ;))

> * SMark had no support for "system" profilers and similar.

That's absent, true.

> * Finally, SMark spits out a report and that's it.

Well, reports and raw data. I use ReBench [1] and pipe the raw data directly into my latex/knitr/R tool chain to generate the graphs/numbers in my papers (see for example sec. 4 of [2], which is based on a latex file with embedded R).

So, I'd say there are some interesting differences. But much of what is mentioned seems to just be missing 'documentation'/communication ;)

Best regards
Stefan

[1] https://github.com/smarr/ReBench
[2] http://stefan-marr.de/papers/oopsla-marr-ducasse-meta-tracing-vs-partial-evaluation/
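
P.S.: Since missing documentation seems to be the core of the problem, here is a rough sketch of what a minimal SMark suite can look like. The selectors and the class-definition template below are written down from memory rather than copied out of an image, so please double-check them against the class comments before relying on them:

    SMarkSuite subclass: #MyBenchmarks
        instanceVariableNames: ''
        classVariableNames: ''
        category: 'MyBenchmarks'

    "Benchmark methods are picked up by their 'bench' prefix."
    MyBenchmarks >> benchSum
        | sum |
        sum := 0.
        1 to: 100000 do: [:i | sum := sum + i].
        ^ sum

    "Run the whole suite from a workspace, here with 10 iterations
     per benchmark; the default reporter prints the results."
    MyBenchmarks run: 10.

The runner, reporter, and timer mentioned above can then be exchanged independently of the suite, for instance to get a plain sequence of raw measurements instead of the aggregated numbers.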
