Hi guys,
I have instrumented my code on Nehalem with libpfm4 and now I am a little
bit confused.
I am reading Intel's "Intel 64 and IA-32 Architectures Optimization Reference
Manual". In Appendix B3.4 they describe a drill-down technique for performance
analysis. I think this is a good starting point for analysis.
Basically I want to map Intel's description in the document to the events of
libpfm4. This sounds easy... but it is not that easy for me :-)
Intel writes for example:
"Cycles_non_retiring_uops .... A constant issue rate of μops flowing through
the issue port. Thus, we define:
uops_rate = Dispatch_uops/Cycles_issuing_uops, where Dispatch_uops can be
measured with RS_UOPS_DISPATCHED, clearing the INV bit and the CMASK."
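
To make the quoted definition concrete, here is a minimal sketch of the ratio,
assuming both counts have already been read from the counters. The function
and parameter names are made up for illustration and are not part of libpfm4,
and the numbers in the example call are invented, not measurements.

```python
def uops_rate(dispatch_uops: int, cycles_issuing_uops: int) -> float:
    """Return Dispatch_uops / Cycles_issuing_uops, as in the quoted definition.

    Both arguments are raw counter values. Which libpfm4 events supply them
    on Nehalem is exactly the open question, so nothing is assumed here.
    """
    if cycles_issuing_uops <= 0:
        # A zero cycle count would make the rate undefined.
        raise ValueError("cycles_issuing_uops must be positive")
    return dispatch_uops / cycles_issuing_uops


if __name__ == "__main__":
    # Invented counter values, purely for illustration.
    print(uops_rate(3_000_000, 1_200_000))
```
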
Well, I don't see an RS_UOPS_DISPATCHED event in libpfm4. As far as I
understand the architecture, RS_UOPS_DISPATCHED could be expressed as the sum
of UOPS_RETIRED and UOPS_ISSUED.
What do you think?
Regards, Eduard
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel