On 05/27/2014 05:39 PM, Stephane Eranian wrote:
> I have been looking at those patches and ran some tests.
> And I found a few issues so far.
>
> I am running:
> $ perf record -j any_ret -e cycles:u test_program
> $ perf report -D
>
> Most entries are okay and match the filter, however some do not make sense:
>
> 3642586996762 0x15d0 [0x108]: PERF_RECORD_SAMPLE(IP, 2): 17921/17921:
> 0x10001170 period: 613678 addr: 0
> .... branch stack: nr:9
> ..... 0: 00000000100011cc -> 0000000010000e38
> ..... 1: 0000000010001150 -> 00000000100011bc
> ..... 2: 0000000010001208 -> 0000000010000e38
> ..... 3: 0000000010001160 -> 00000000100011f8
> ..... 4: 00000000100011cc -> 0000000010000e38
> ..... 5: 0000000010001150 -> 00000000100011bc
> ..... 6: 0000000010001208 -> 0000000010000e38
> ..... 7: 0000000010001160 -> 00000000100011f8
> ..... 8: 0000000000000000 -> 0000000010001160
> ^^^^^^
> Entry 8 does not make sense, unless 0x0 is a valid return branch
> instruction address.
> If an address is invalid, the whole entry needs to be eliminated. It
> is okay to have
> less than the max number of entries supported by HW.
Hey Stephane,
Okay. The same behaviour is also reflected in the test results that I
shared in the patchset. Here is that section.
(3) perf record -j any_ret -e branch-misses:u ./cprog
# Overhead  Command  Source Shared Object  Source Symbol      Target Shared Object  Target Symbol
# ........  .......  ....................  .................  ....................  .............
#
    15.61%  cprog    [unknown]             [.] 00000000       cprog                 [.] sw_3_1
     6.28%  cprog    cprog                 [.] symbol2        cprog                 [.] hw_1_2
     6.28%  cprog    cprog                 [.] ctr_addr       cprog                 [.] sw_4_1
     6.26%  cprog    cprog                 [.] success_3_1_3  cprog                 [.] sw_3_1
     6.24%  cprog    cprog                 [.] symbol1        cprog                 [.] hw_1_1
     6.24%  cprog    cprog                 [.] sw_4_2         cprog                 [.] callme
     6.21%  cprog    [unknown]             [.] 00000000       cprog                 [.] callme
     6.19%  cprog    cprog                 [.] lr_addr        cprog                 [.] sw_4_2
     3.16%  cprog    cprog                 [.] hw_1_2         cprog                 [.] callme
     3.15%  cprog    cprog                 [.] success_3_1_1  cprog                 [.] sw_3_1
     3.15%  cprog    cprog                 [.] sw_4_1         cprog                 [.] callme
     3.14%  cprog    cprog                 [.] callme         cprog                 [.] main
     3.13%  cprog    cprog                 [.] hw_1_1         cprog                 [.] callme
So a lot of the samples above have 0x0 as the "from" address. This
originates from the code section below, inside the function
"power_pmu_bhrb_read", where we hit two back-to-back target addresses.
We zero out the from address for the first target address and re-read
the second address. That's how we get zero as the from address. This is
how the HW captures the samples. I was reluctant to drop these samples,
but I agree that samples of this kind can be dropped if we need to.
        if (val & BHRB_TARGET) {
                /* Shouldn't have two targets in a
                   row.. Reset index and try again */
                r_index--;
                addr = 0;
        }
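
For illustration only, dropping such entries could look roughly like the
sketch below. It is not the code from the patchset: "struct branch_entry"
and "drop_invalid_entries" are hypothetical names, and the struct is a
simplified stand-in for the kernel's branch entry type. It assumes that a
zero "from" address is the only marker of an invalid entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a branch stack entry. */
struct branch_entry {
        uint64_t from;
        uint64_t to;
};

/*
 * Compact the array in place, keeping only the entries whose "from"
 * address is non-zero, and return the new number of entries.
 * Assumption: from == 0 is the only sign of an invalid entry.
 */
static size_t drop_invalid_entries(struct branch_entry *entries, size_t nr)
{
        size_t out = 0;

        for (size_t i = 0; i < nr; i++) {
                if (entries[i].from == 0)
                        continue;       /* skip the bogus entry */
                entries[out++] = entries[i];
        }
        return out;
}
```

With something like this, the reported count would simply be smaller than
the number of entries the HW supports, which is what was suggested above.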
> I also had cases where monitoring only at the user level, got me
> branch addresses in the
> 0xc0000000...... range. My test program is linked statically.
>
That's weird. I would need more information and details on this. BTW,
which system are you running on? Could you please share its
/proc/cpuinfo details?
> when eliminating the bogus entries, my tests yielded only return
> branch instruction addresses
> which is good. Will run more tests.
Sure. Thanks for the tests and comments.