Brendan Gregg <brendan.d.gr...@gmail.com> writes:
>
> Despite millions of samples, many NOPs are never seen. (See the
> Percent column.) I'm not using PEBS, but I suppose I should.

Yes, you should. You get better results with :p / :pp (PEBS), and the
best results (but not at the cycles level) with INST_RETIRED.PREC_DIST / ALL.

cycles:upp

         │    0000000000400400 <main>:
    0.05 │ 0:   nop
         │      nop
         │      nop
   19.78 │      nop
    0.17 │      nop
         │      nop
         │      nop
   20.16 │      nop
    0.12 │      nop
         │      nop
         │      nop
   20.49 │      nop
    0.07 │      nop
         │      nop
         │      nop
   18.88 │      nop
   20.30 │    ↑ jmp    0

cpu/event=0xc0,umask=0x1,name=inst_retired_prec_dist/pp

         │    0000000000400400 <main>:
    6.13 │ 0:   nop
         │      nop
    0.02 │      nop
    0.02 │      nop
   23.69 │      nop
         │      nop
         │      nop
    0.02 │      nop
   24.05 │      nop
         │      nop
         │      nop
    0.02 │      nop
   23.60 │      nop
         │      nop
         │      nop
         │      nop
   22.46 │    ↑ jmp    0

This is nearly as good as you can get here, because the machine can
retire four nops per cycle.

A common trick is to run PREC_DIST/ALL in parallel with other events
and correlate.

> I think Andi mentioned this to me last year -- that instruction
> profiling was no longer reliable.

It never was.

> Is this due to parallel and out-of-order execution? (ie, we're
> sampling the instruction pointer, but that's set to the resumption
> instruction, not the instructions being processed in the backend?).

Most problems are due to 'skid': it takes some time to trigger the
profiling interrupt after the event has fired. Without PEBS the skid is
quite high. With PEBS it's a lot better, because the information is
written out into the PEBS buffer faster, but the skid is still not zero
and can still be noticed. PREC_DIST/ALL uses some additional tricks to
reduce it further.

There are also other problems. For example, an event may not be tied
to an instruction, and some events have inherently large skid.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only