2017-09-21 0:34 GMT+02:00 Sebastian Graf <sgraf1...@gmail.com>:
> [...] The only real drawback I see is that instruction count might skew
> results, because AFAIK it doesn't properly take the architecture (pipeline,
> latencies, etc.) into account. It might be just OK for the average program,
> though.
It really depends on what you're trying to measure: the raw instruction
count is basically useless if you want a number with any connection to the
real time taken by the program. The average number of cycles per CPU
instruction varies by 2 orders of magnitude on modern architectures, see
e.g. the Skylake section in
http://www.agner.org/optimize/instruction_tables.pdf (IMHO a must-read for
anyone doing serious optimization/measurement work at the assembly level).
And these numbers don't even include the effects of caches, pipeline
stalls, branch prediction, execution units/ports, etc., which can easily
add another 1 or 2 orders of magnitude.

So what can one do? It basically boils down to a choice:

 * Use a stable number like the instruction count (the "Instructions Read"
   (Ir) events, in Valgrind's terminology), which has no real connection to
   the speed of a program.

 * Use a relatively volatile number like real time and/or cycles used,
   which is what your users will actually care about. If you put a
   non-trivial amount of work into your compiler, you can make these
   numbers a bit more stable (e.g. by making the code layout/alignment more
   stable), but you will still get quite different numbers if you switch to
   another CPU generation/manufacturer.

A bit tragic, but that's life in 2017... :-}
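For what it's worth, here is a minimal sketch (Linux-only, C) of how one
can read both numbers for the very same run via the perf_event_open(2)
interface, to see how far apart they actually are; "busy_work" is just a
made-up stand-in for the code under test:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <linux/perf_event.h>

    /* Open one hardware counter for the calling thread, on any CPU. */
    static int open_counter(uint64_t config)
    {
        struct perf_event_attr attr;
        memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_HARDWARE;
        attr.size = sizeof(attr);
        attr.config = config;
        attr.disabled = 1;        /* start it explicitly later */
        attr.exclude_kernel = 1;  /* user-space counts only */
        attr.exclude_hv = 1;
        return syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
    }

    static volatile double sink;

    static void busy_work(void)   /* stand-in for the real benchmark */
    {
        double x = 1.0;
        for (long i = 0; i < 10000000L; i++)
            x = x * 1.000001 + 0.000001;
        sink = x;
    }

    int main(void)
    {
        int fd_insns  = open_counter(PERF_COUNT_HW_INSTRUCTIONS);
        int fd_cycles = open_counter(PERF_COUNT_HW_CPU_CYCLES);
        if (fd_insns < 0 || fd_cycles < 0) {
            perror("perf_event_open");
            return 1;
        }

        ioctl(fd_insns,  PERF_EVENT_IOC_RESET,  0);
        ioctl(fd_cycles, PERF_EVENT_IOC_RESET,  0);
        ioctl(fd_insns,  PERF_EVENT_IOC_ENABLE, 0);
        ioctl(fd_cycles, PERF_EVENT_IOC_ENABLE, 0);

        busy_work();

        ioctl(fd_insns,  PERF_EVENT_IOC_DISABLE, 0);
        ioctl(fd_cycles, PERF_EVENT_IOC_DISABLE, 0);

        uint64_t n_insns = 0, n_cycles = 0;
        if (read(fd_insns,  &n_insns,  sizeof n_insns)  != sizeof n_insns ||
            read(fd_cycles, &n_cycles, sizeof n_cycles) != sizeof n_cycles) {
            perror("read");
            return 1;
        }

        /* The instruction count is very stable across runs; the cycle
         * count (and hence the IPC) is the number that actually moves. */
        printf("instructions: %llu\n", (unsigned long long)n_insns);
        printf("cycles:       %llu\n", (unsigned long long)n_cycles);
        printf("IPC:          %.2f\n", (double)n_insns / (double)n_cycles);
        return 0;
    }

Of course, "perf stat ./your-program" reports the same two counters without
writing any code, and Valgrind's Cachegrind/Callgrind tools are where the
stable Ir numbers come from in the first place.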