Justin Chang <[email protected]> writes:

> Last question
>
> I would like to report the efficiency of my code. That is, flops/s over the
> theoretical peak performance (on n-cores). Where the TPP is clock *
> FLOPS/cycle * n. My current machine is an Intel® Core™ i7-4790 CPU @ 3.60GHz
> and I am assuming that the FLOPS/cycle is 4.

This calculation is becoming obsolete because the vector clock rate is
slower than the scalar clock rate.  It is probably better to define peak
flops as the best measured performance for tuned DGEMM.
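
A rough sketch of that definition, in Python: time a kernel with a known
flop count several times and keep the best rate. The names
`best_flops_rate` and `naive_dgemm` are hypothetical, and the pure-Python
kernel is only a stand-in to keep the example self-contained. It will come
nowhere near peak; a real measurement would call a tuned DGEMM from an
optimized BLAS instead.

```python
import time


def naive_dgemm(a, b, n):
    """Stand-in kernel: C = A * B for n x n lists of lists (NOT tuned)."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        ai = a[i]
        ci = c[i]
        for k in range(n):
            aik = ai[k]
            bk = b[k]
            for j in range(n):
                ci[j] += aik * bk[j]
    return c


def best_flops_rate(kernel, flop_count, repeats=5):
    """Run kernel() several times and return the highest flops/s observed."""
    best = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        kernel()
        elapsed = time.perf_counter() - start
        if elapsed > 0.0:
            best = max(best, flop_count / elapsed)
    return best


n = 64
a = [[1.0] * n for _ in range(n)]
b = [[2.0] * n for _ in range(n)]

# A dense n x n product does one multiply and one add per inner iteration,
# so the flop count is 2 * n^3.
measured_peak = best_flops_rate(lambda: naive_dgemm(a, b, n), 2 * n**3)
print(f"best measured rate: {measured_peak:.3e} flops/s")
```

Taking the best of several runs rather than the mean filters out noise from
other processes, which is what you want when the number serves as a ceiling.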

> One of my serial test runs has achieved a FLOPS/s of 2.01e+09, which
> translates to an efficiency of almost 14%. I know these are crude
> measurements but would these manual flop counts be appropriate for this
> kind of measurement? Or would hardware counts from PAPI?

Hardware counters are notoriously inaccurate since they may count
speculative flops instead of useful flops.
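
For reference, the arithmetic in the quoted question can be checked with a
few lines of Python. This is only a sketch using the numbers given above
(3.60 GHz, an assumed 4 flops/cycle, one core, 2.01e+09 flops/s achieved);
the function names are hypothetical. The same `efficiency` function applies
if the peak is a measured DGEMM rate rather than the theoretical one.

```python
def theoretical_peak(clock_hz, flops_per_cycle, cores):
    """Theoretical peak = clock * flops/cycle * number of cores."""
    return clock_hz * flops_per_cycle * cores


def efficiency(achieved_flops_per_s, peak_flops_per_s):
    """Fraction of the chosen peak that the code actually achieved."""
    return achieved_flops_per_s / peak_flops_per_s


# Values taken from the question; 4 flops/cycle is the poster's assumption.
peak = theoretical_peak(3.60e9, 4, 1)
eff = efficiency(2.01e9, peak)
print(f"peak = {peak:.3e} flops/s, efficiency = {100 * eff:.1f}%")
# prints: peak = 1.440e+10 flops/s, efficiency = 14.0%
```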
