Em Mon, Nov 19, 2018 at 09:06:17PM -0800, Andi Kleen escreveu: > From: Andi Kleen <a...@linux.intel.com> > > This is a fix for another instance of the skid problem Milian > recently found [1]
Milian, have you tested this? Adrian, can I have your Reviewed-by or Acked-by? Thanks, - Arnaldo > The LBRs don't freeze at the exact same time as the PMI is triggered. > The perf script brstackinsn code that dumps LBR assembler > assumes that the last branch in the LBR leads to the sample point. > But with skid it's possible that the CPU executes one or more branches > before the sample, but which do not appear in the LBR. > > What happens then is either that the sample point is before > the last LBR branch. In this case the dumper sees a negative > length and ignores it. Or it the sample point is long after > the last branch. Then the dumper sees a very long block and dumps > it upto its block limit (16k bytes), which is noise in the output. > > On typical sample session this can happen regularly. > > This patch tries to detect and handle the situation. On the last > block that is dumped by the LBR dumper we always stop on the first > branch. If the block length is negative just scan forward to the > first branch. Otherwise scan until a branch is found. > > The PT decoder already has a function that uses the instruction > decoder to detect branches, so we can just reuse it here. > > Then when a terminating branch is found print an indication > and stop dumping. This might miss a few instructions, but at least > shows no runaway blocks. > > Cc: milian.wo...@kdab.com > Cc: adrian.hun...@intel.com > Signed-off-by: Andi Kleen <a...@linux.intel.com> > --- > tools/perf/builtin-script.c | 18 +++++++++++++++++- > tools/perf/util/dump-insn.c | 8 ++++++++ > tools/perf/util/dump-insn.h | 2 ++ > .../intel-pt-decoder/intel-pt-insn-decoder.c | 8 ++++++++ > 4 files changed, 35 insertions(+), 1 deletion(-) > > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c > index 4da5e32b9e03..11868bf39e66 100644 > --- a/tools/perf/builtin-script.c > +++ b/tools/perf/builtin-script.c > @@ -1049,9 +1049,18 @@ static int perf_sample__fprintf_brstackinsn(struct > perf_sample *sample, > > /* > * Print final block upto sample > + * > + * Due to pipeline delays the LBRs might be missing a branch > + * or two, which can result in very large or negative blocks > + * between final branch and sample. When this happens just > + * continue walking after the last TO until we hit a branch. > */ > start = br->entries[0].to; > end = sample->ip; > + if (end < start) { > + /* Missing jump. Scan 128 bytes for the next branch */ > + end = start + 128; > + } > len = grab_bb(buffer, start, end, machine, thread, &x.is64bit, > &x.cpumode, true); > printed += ip__fprintf_sym(start, thread, x.cpumode, x.cpu, &lastsym, > attr, fp); > if (len <= 0) { > @@ -1060,7 +1069,6 @@ static int perf_sample__fprintf_brstackinsn(struct > perf_sample *sample, > machine, thread, &x.is64bit, &x.cpumode, false); > if (len <= 0) > goto out; > - > printed += fprintf(fp, "\t%016" PRIx64 "\t%s\n", sample->ip, > dump_insn(&x, sample->ip, buffer, len, NULL)); > goto out; > @@ -1070,6 +1078,14 @@ static int perf_sample__fprintf_brstackinsn(struct > perf_sample *sample, > dump_insn(&x, start + off, buffer + off, len > - off, &ilen)); > if (ilen == 0) > break; > + if (arch_is_branch(buffer + off, len - off, x.is64bit) && > + start + off != sample->ip) { > + /* > + * Hit a missing branch. Just stop. > + */ > + printed += fprintf(fp, "\t... not reaching sample > ...\n"); > + break; > + } > } > out: > return printed; > diff --git a/tools/perf/util/dump-insn.c b/tools/perf/util/dump-insn.c > index 10988d3de7ce..2bd8585db93c 100644 > --- a/tools/perf/util/dump-insn.c > +++ b/tools/perf/util/dump-insn.c > @@ -13,3 +13,11 @@ const char *dump_insn(struct perf_insn *x __maybe_unused, > *lenp = 0; > return "?"; > } > + > +__weak > +int arch_is_branch(const unsigned char *buf __maybe_unused, > + size_t len __maybe_unused, > + int x86_64 __maybe_unused) > +{ > + return 0; > +} > diff --git a/tools/perf/util/dump-insn.h b/tools/perf/util/dump-insn.h > index 0e06280a8860..650125061530 100644 > --- a/tools/perf/util/dump-insn.h > +++ b/tools/perf/util/dump-insn.h > @@ -20,4 +20,6 @@ struct perf_insn { > > const char *dump_insn(struct perf_insn *x, u64 ip, > u8 *inbuf, int inlen, int *lenp); > +int arch_is_branch(const unsigned char *buf, size_t len, int x86_64); > + > #endif > diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c > b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c > index 54818828023b..1c0e289f01e6 100644 > --- a/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c > +++ b/tools/perf/util/intel-pt-decoder/intel-pt-insn-decoder.c > @@ -180,6 +180,14 @@ int intel_pt_get_insn(const unsigned char *buf, size_t > len, int x86_64, > return 0; > } > > +int arch_is_branch(const unsigned char *buf, size_t len, int x86_64) > +{ > + struct intel_pt_insn in; > + if (intel_pt_get_insn(buf, len, x86_64, &in) < 0) > + return -1; > + return in.branch != INTEL_PT_BR_NO_BRANCH; > +} > + > const char *dump_insn(struct perf_insn *x, uint64_t ip __maybe_unused, > u8 *inbuf, int inlen, int *lenp) > { > -- > 2.17.2 -- - Arnaldo