Hi Vince,

Perhaps the FLOPS unit masks in that event actually count only ‘packed’ 
operations, and the description just skipped that?

Phil


> On Aug 31, 2018, at 9:34 PM, Vince Weaver <vincent.wea...@maine.edu> wrote:
> 
> I was trying to work out why the PAPI floating point event validation 
> tests were failing on a Ryzen machine.
> 
> PAPI is using:
>       
> RETIRED_SSE_AVX_OPERATIONS:DP_ADD_SUB_FLOPS:DP_MULT_FLOPS:DP_MULT_ADD_FLOPS:DP_DIV_FLOPS
>       0x53f003
> 
> which I think should cover most double-precision floating point, and it
> does record a lot of counts when run on Linpack.  However our simple
> validation test does a matrix-matrix multiply that does
> 
>       mulsd  (%rax),%xmm0
>       addsd  %xmm0,%xmm1
> 
> both of which I would think would count as SSE double-precision, but the 
> event doesn't increment (the total count for the test ends up being 0).
> 
> Expected results *are* returned if we use
> 
>       RETIRED_MMX_FP_INSTRUCTIONS:SSE_INSTR:MMX_INSTR:X87_INSTR
>       0x5307cb
> instead
> 
> Does anyone know why this might be happening?
> 
> Also, somewhat related, in reading the "Open Source Register Reference for 
> AMD Family 17h" document it mentions the special "Merge" events for 
> combining results when the increment is too big.  Does libpfm4 support 
> this?  I suppose not as it doesn't appear that Linux supports this?
> 
> Thanks,
> 
> Vince
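
For reference, the two raw codes quoted above can be split into their fields. This is only a sketch: it assumes the codes follow the usual AMD PERF_CTL layout (event select in bits 0-7 plus bits 32-35, unit mask in bits 8-15, control flags above that), and the helper name decode_amd_raw_event is made up for illustration; it is not a libpfm4 or PAPI call.

```python
def decode_amd_raw_event(code):
    """Split a raw event code into fields, assuming the AMD PERF_CTL layout."""
    return {
        # Event select: low 8 bits, extended by bits 32-35 (assumed layout).
        "event_select": (code & 0xFF) | (((code >> 32) & 0xF) << 8),
        # Unit mask: selects which sub-events are counted.
        "unit_mask": (code >> 8) & 0xFF,
        "usr": bool(code & (1 << 16)),      # count in user mode
        "os": bool(code & (1 << 17)),       # count in kernel mode
        "edge": bool(code & (1 << 18)),     # edge detect
        "int": bool(code & (1 << 20)),      # interrupt on overflow
        "enable": bool(code & (1 << 22)),   # counter enable
        "invert": bool(code & (1 << 23)),   # invert counter-mask comparison
        "counter_mask": (code >> 24) & 0xFF,
    }


# The two codes from the message above.
for raw in (0x53F003, 0x5307CB):
    fields = decode_amd_raw_event(raw)
    print(f"{raw:#x}: event={fields['event_select']:#04x} "
          f"umask={fields['unit_mask']:#04x}")
```

Under that assumption, the two codes differ only in event select and unit mask; the control flags are the same, so the zero count does not look like a usr/os or enable-bit difference.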

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel