Hi Vince,
Perhaps the RETIRED_SSE_AVX_OPERATIONS unit masks you list below actually
count only ‘packed’ operations, and the description just left that out?
Phil
> On Aug 31, 2018, at 9:34 PM, Vince Weaver <vincent.wea...@maine.edu> wrote:
>
> I was trying to work out why the PAPI floating point event validation
> tests were failing on a Ryzen machine.
>
> PAPI is using:
>
> RETIRED_SSE_AVX_OPERATIONS:DP_ADD_SUB_FLOPS:DP_MULT_FLOPS:DP_MULT_ADD_FLOPS:DP_DIV_FLOPS
> 0x53f003
>
> which I think should cover most double-precision floating point, and it
> does record a lot of counts when run on Linpack. However our simple
> validation test does a matrix-matrix multiply that does
>
> mulsd (%rax),%xmm0
> addsd %xmm0,%xmm1
>
> both of which I would think would count as SSE double-precision, but the
> event doesn't increment (the total count for the test ends up being 0).
>
> Expected results *are* returned if we use
>
> RETIRED_MMX_FP_INSTRUCTIONS:SSE_INSTR:MMX_INSTR:X87_INSTR
> 0x5307cb
> instead
>
> Does anyone know why this might be happening?
>
> Also, somewhat related, in reading the "Open Source Register Reference for
> AMD Family 17h" document it mentions the special "Merge" events for
> combining results when the increment is too big. Does libpfm4 support
> this? I suppose not as it doesn't appear that Linux supports this?
>
> Thanks,
>
> Vince
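
For anyone comparing the two raw codes quoted above, here is a minimal sketch
that splits them into fields. It assumes the usual AMD PERF_CTL layout (event
select in bits 7:0 plus 35:32, unit mask in bits 15:8, USR/OS/INT/EN in bits
16, 17, 20 and 22). The function name decode_raw_event is made up for
illustration and is not part of libpfm4 or PAPI.

```python
def decode_raw_event(config):
    """Split a raw AMD event code into fields (bit layout is an assumption)."""
    # Event select is 12 bits wide: low 8 bits in 7:0, high 4 bits in 35:32.
    event = (config & 0xFF) | (((config >> 32) & 0xF) << 8)
    return {
        "event": event,
        "umask": (config >> 8) & 0xFF,      # unit mask, bits 15:8
        "usr": bool(config & (1 << 16)),    # count in user mode
        "os": bool(config & (1 << 17)),     # count in kernel mode
        "int": bool(config & (1 << 20)),    # interrupt on overflow
        "enable": bool(config & (1 << 22)), # counter enable
    }


if __name__ == "__main__":
    # The two codes from the message above.
    for code in (0x53F003, 0x5307CB):
        fields = decode_raw_event(code)
        print(f"{code:#x}: event={fields['event']:#04x} "
              f"umask={fields['umask']:#04x} "
              f"usr={fields['usr']} os={fields['os']}")
```

Under that layout, 0x53f003 is event 0x03 with unit mask 0xf0 (four mask bits
set, matching the four DP_* unit masks), and 0x5307cb is event 0xcb with unit
mask 0x07. Both carry the same USR/OS/INT/EN flags, so the two codes differ
only in event select and unit mask.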
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel