On Thu August 24 2006 10:44 am, Stephane Eranian wrote:
> On Thu, Aug 24, 2006 at 10:10:17AM -0500, Kevin Corry wrote:
> > I've definitely been able to count things on P4 with pfmon.
>
> That's excellent. No offense for the comment, I think I was still under
> the impression it was very preliminary (from your own comments)
> and that it was not counting yet.
>
> Obviously, I was wrong, my apologies.

Not a problem. You're right that I did mention that the support is not 
complete - there are features of the P4 PMU that I still want to add support 
for: event filtering, event tagging, and event cascading. But the code that's 
there now is usable for basic event counting.

> But when I use instr_completed, it gets more flaky. But maybe I do
> not understand what the event actually measures:
>
> $ pfmon -iinstr_completed
> Name     : instr_completed
> Code     : 0x7
> Counters : [ 6 7 8 15 16 17 ]
> Desc     : Instructions that have completed and retired during a clock cycle
> Umask    : 0x01 : [NBOGUS] : Non-bogus instructions.
> Umask    : 0x02 : [BOGUS] : Bogus instructions.
>
> Does anyone get something meaningful out of this one?

I've noticed this as well, actually.

The IA32 Developers Manual, Appendix A lists all the events and related info. 
It has this to say about instr_retired and instr_completed:

instr_retired: This event counts instructions that are retired during a clock 
cycle. Mask bits specify bogus or non-bogus (and whether they are tagged 
using the front-end tagging mechanism).

instr_completed: This event counts instructions that have completed and 
retired during a clock cycle. Mask bits specify whether the instruction is 
bogus or non-bogus. This metric differs from instr_retired, since it counts 
instructions completed, rather than the number of times that instructions 
started.

On the surface, it sounds like these should provide similar counts. But here's 
the output I get for my simple test:

[EMAIL PROTECTED] /home/corry]$ pfmon -u -k -e instr_retired:NBOGUSNTAG \
dd if=/dev/sda of=/dev/null bs=1M count=100
100+0 records in
100+0 records out
41078793 instr_retired

[EMAIL PROTECTED] /home/corry]$ pfmon -u -k -e instr_completed:NBOGUS \
dd if=/dev/sda of=/dev/null bs=1M count=100
100+0 records in
100+0 records out
0 instr_completed

I don't really know how to explain this. The table in 
libpfm/lib/pentium4_events.h looks like it has the correct data for 
instr_completed. When I get some time I'll run the above example through a 
debugger and make sure the correct values are getting passed to the correct 
PMC registers in the kernel.

> > [EMAIL PROTECTED] /home/corry]$ pfmon -i global_power_events
> > Name   : global_power_events
> > Code   : 0x13
> > counter: [ 0 1 9 10 ]
> > Unit-mask 0: RUNNING
> > Desc   : Counts the time during which a processor is not stopped.
>
> In the rewritten AMD K8 event table, I have used the rule that if an event
> only has one unit mask, then we implement it without a unit mask, i.e., it
> is collapsed with the event name. It makes it much easier to use. On AMD
> K8, I have encoded the umask value starting at bit position 8. On P4, you
> probably need to use another trick, though.

Yeah, I had thought about that recently. I'll look into fixing this up. And 
since I don't think any of the P4 events actually count anything unless at 
least one mask is specified, it might not be a bad idea to designate a 
"default" mask if none is provided. What do you think?
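
To make the "default" mask idea concrete, here is a rough sketch in C. It
assumes a hypothetical table entry (struct event_desc) and helper
(resolve_masks); neither is the real libpfm structure or API. The mask names
and values for instr_completed come from the pfmon -i output quoted above.
Choosing NBOGUS as its default is only an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MASKS 4

/* Hypothetical table entry for illustration; not the real libpfm layout. */
struct event_desc {
    const char  *name;
    unsigned int num_masks;
    const char  *mask_names[MAX_MASKS];
    unsigned int mask_bits[MAX_MASKS];
    unsigned int default_mask;  /* used when the caller names no mask */
};

/* Mask values as shown by "pfmon -i"; the default (NBOGUS) is an assumption. */
static const struct event_desc instr_completed = {
    "instr_completed", 2, { "NBOGUS", "BOGUS" }, { 0x01, 0x02 }, 0x01
};

/*
 * OR together the bits of every requested mask name. If no names are
 * given, fall back to the event's default mask.
 * Returns 0 on success, -1 if a name is not in the event's mask list.
 */
static int resolve_masks(const struct event_desc *ev,
                         const char *const *names, size_t count,
                         unsigned int *out)
{
    size_t i;
    unsigned int j, bits = 0;

    assert(ev != NULL && out != NULL);

    if (count == 0) {
        *out = ev->default_mask;
        return 0;
    }

    for (i = 0; i < count; i++) {
        /* Linear search is fine here; mask lists are tiny. */
        for (j = 0; j < ev->num_masks; j++) {
            if (strcmp(names[i], ev->mask_names[j]) == 0)
                break;
        }
        if (j == ev->num_masks)
            return -1;
        bits |= ev->mask_bits[j];
    }

    *out = bits;
    return 0;
}
```

With something like this, a single-mask event such as global_power_events
could carry its only mask as the default, so the user would not have to type
it. It would not explain the zero count from instr_completed above, since
that run passed NBOGUS explicitly.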

-- 
Kevin Corry
[EMAIL PROTECTED]
http://www.ibm.com/linux/
http://evms.sourceforge.net/
_______________________________________________
perfmon mailing list
[email protected]
http://www.hpl.hp.com/hosted/linux/mail-archives/perfmon/
