I was trying to work out why the PAPI floating point event validation tests were failing on a Ryzen machine.
PAPI is using:

    RETIRED_SSE_AVX_OPERATIONS:DP_ADD_SUB_FLOPS:DP_MULT_FLOPS:DP_MULT_ADD_FLOPS:DP_DIV_FLOPS 0x53f003

which I think should cover most double-precision floating point, and it does record a lot of counts when run on Linpack.

However, our simple validation test does a matrix-matrix multiply that does

    mulsd (%rax),%xmm0
    addsd %xmm0,%xmm1

both of which I would think would count as SSE double-precision, but the event doesn't increment (the total count for the test ends up being 0).

Expected results *are* returned if we use

    RETIRED_MMX_FP_INSTRUCTIONS:SSE_INSTR:MMX_INSTR:X87_INSTR 0x5307cb

instead.

Does anyone know why this might be happening?

Also, somewhat related: in reading the "Open Source Register Reference for AMD Family 17h" document, it mentions the special "Merge" events for combining results when the increment is too big. Does libpfm4 support this? I suppose not, as it doesn't appear that Linux supports this?

Thanks,
Vince