Re: [PATCH 00/10] Stitch LBR call stack

Liang, Kan Mon, 07 Oct 2019 13:06:43 -0700



On 10/7/2019 2:24 PM, Ingo Molnar wrote:


* [email protected] <[email protected]> wrote:

Performance impact:
The processing time may increase with the LBR stitching approach
enabled. The impact depends on the number of samples with stitched LBRs.

For sqlite's tcltest,
perf record --call-graph lbr -- make tcltest
perf report --stitch-lbr

4.11% of the samples have stitched LBRs.
Total number of samples:                        2833728
The number of samples with stitched LBRs        116478

The processing time of perf report increases by 6.8%.
Without --stitch-lbr:                           55906106 usec
With --stitch-lbr:                              59728701 usec

For a simple test case, tchain_edit, with a call stack depth of 43,
perf record --call-graph lbr -- ./tchain_edit
perf report --stitch-lbr

99.9% of the samples have stitched LBRs.
Total number of samples:                        10915
The number of samples with stitched LBRs        10905

The processing time of perf report increases by 67.4%.
Without --stitch-lbr:                           11970508 usec
With --stitch-lbr:                              20036055 usec
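
The percentages quoted above follow directly from the raw counts and timings. A quick check, as a sketch only (the helper name pct_increase is made up for illustration and is not part of perf):

```python
def pct_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before

# perf report processing time in usec, without and with --stitch-lbr
tcltest_overhead = pct_increase(55906106, 59728701)
tchain_overhead = pct_increase(11970508, 20036055)

# share of samples that have stitched LBRs, in percent
tcltest_share = 100.0 * 116478 / 2833728
tchain_share = 100.0 * 10905 / 10915

print(round(tcltest_overhead, 1), round(tcltest_share, 2))
print(round(tchain_overhead, 1), round(tchain_share, 1))
```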


That cost seems pretty high, while the feature sounds useful - is there
any way to speed this up?

For each LBR entry, the perf tool calculates and generates an appended
node for callchain_cursor. The stitched LBR entries come from the
previous sample, so it looks like we don't need to do the calculation
again for them. That should speed up the whole process. I will do more
testing on it.
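
To illustrate the idea, here is a rough sketch in Python rather than the actual perf C code. All names (resolve, chain_recompute, chain_reuse) are hypothetical, and it assumes the stitched entries are simply the tail of the previous sample's chain. It only counts how often the per-entry calculation runs in each approach:

```python
# Hypothetical sketch; none of these names come from the perf source.

def resolve(ip, stats):
    """Stand-in for the per-entry calculation (counts how often it runs)."""
    stats["lookups"] += 1
    return "sym_%x" % ip

def chain_recompute(own_ips, stitched_ips, stats):
    """Recalculate every entry, including the stitched ones."""
    return [resolve(ip, stats) for ip in own_ips + stitched_ips]

def chain_reuse(own_ips, prev_nodes, n_stitched, stats):
    """Calculate only this sample's own entries and copy the stitched
    ones from the nodes already built for the previous sample."""
    own = [resolve(ip, stats) for ip in own_ips]
    stitched = prev_nodes[len(prev_nodes) - n_stitched:]
    return own + stitched

# Previous sample: four entries, all calculated once.
prev_ips = [0x10, 0x20, 0x30, 0x40]
prev_stats = {"lookups": 0}
prev_nodes = [resolve(ip, prev_stats) for ip in prev_ips]

# Current sample: two entries of its own plus two stitched from the
# tail of the previous sample (an assumption made for this example).
slow, fast = {"lookups": 0}, {"lookups": 0}
a = chain_recompute([0x50, 0x60], prev_ips[2:], slow)
b = chain_reuse([0x50, 0x60], prev_nodes, 2, fast)
```

Both approaches produce the same chain, but the second one runs the calculation only for the entries that are new in the current sample. How much that saves in perf itself still has to be measured.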


Thanks,
Kan
