On Fri, Jul 10, 2020 at 06:11:04PM +0300, Adrian Hunter wrote:
> A single q option decodes ip from only FUP/TIP packets. Make it so that
> repeating the q option (i.e. qq) decodes only PSB+, getting ip if there is
> a FUP packet within PSB+ (i.e. between PSB and PSBEND).
> 
> Example:
> 
>  $ perf record -e intel_pt//u grep -rI pudding drivers
>  [ perf record: Woken up 52 times to write data ]
>  [ perf record: Captured and wrote 57.870 MB perf.data ]
>  $ time perf script --itrace=bi | wc -l
>  58948289
> 
>  real    1m23.863s
>  user    1m23.251s
>  sys     0m7.452s
>  $ time perf script --itrace=biq | wc -l
>  3385694
> 
>  real    0m4.453s
>  user    0m4.455s
>  sys     0m0.328s
>  $ time perf script --itrace=biqq | wc -l
>  1883
> 
>  real    0m0.047s
>  user    0m0.043s
>  sys     0m0.009s
> 
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
>  tools/perf/Documentation/perf-intel-pt.txt     | 15 +++++++++++++++
>  .../util/intel-pt-decoder/intel-pt-decoder.c   | 18 ++++++++++++++++++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
> index f9fe4a4040ba..d5a266d7f15b 100644
> --- a/tools/perf/Documentation/perf-intel-pt.txt
> +++ b/tools/perf/Documentation/perf-intel-pt.txt
> @@ -999,6 +999,21 @@ What *will* be decoded with the (single) q option:
>  Note the q option does not specify what events will be synthesized e.g. the p
>  option must be used also to show power events.
>  
> +Repeating the q option (double-q i.e. qq) results in even faster decoding and even
> +less detail.  The decoder decodes only extended PSB (PSB+) packets, getting the
> +instruction pointer if there is a FUP packet within PSB+ (i.e. between PSB and
> +PSBEND).  Note PSB packets occur regularly in the trace based on the psb_period
> +config term (refer config terms section).  There will be a FUP packet if the
> +PSB+ occurs while control flow is being traced.

An estimate of how frequent that is would be good.

If we assume one bit of trace per instruction, then a 2K period is roughly
16K instructions, and a 16K period is roughly 128K instructions.
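
For illustration, the arithmetic behind those numbers can be sketched as
below. This assumes the period is counted in bytes of trace data and uses
the one-bit-per-instruction figure from above, which is a guess, not a
measurement. The function name is made up for this example and is not part
of perf.

```python
def instructions_per_psb(period_bytes, bits_per_instruction=1.0):
    """Rough number of instructions traced between two PSB packets.

    Assumes the PSB period is given in bytes of trace and that each
    instruction costs bits_per_instruction bits of trace on average
    (1 bit is only a ballpark assumption).
    """
    return int(period_bytes * 8 / bits_per_instruction)


for period in (2 * 1024, 16 * 1024):
    estimate = instructions_per_psb(period)
    print(f"{period // 1024}K period: ~{estimate // 1024}K instructions")
```

A workload with denser trace output (more bits per instruction) would see
PSB+ correspondingly more often in terms of instructions.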

The estimate could be added to the documentation in a follow-on patch.

But overall the patches look good to me now.

(for the whole series)
Reviewed-by: Andi Kleen <[email protected]>
