Stephane,
So I was just messing around with the older setup (CentOS4 with
perfmon-3.2-070206 and the 2.6.20 kernel) and seem to have something *close*
to working for PEBS+event counters. Similar to your previous workaround
with perfmon-3.2-070725 and the 2.6.22 kernel, I went ahead and changed the
register types to what seemed to be the most generic ones. I also implemented
a cheap hack in which I call dispatch a 2nd time with the PEBS event plus the
other event counters, and then merge the events together (I guess this won't
work all of the time, based upon your last email). The changes are here:
arch/x86_64/perfmon/perfmon_core.c:
-------------------------------------------------------------------------------
static struct pfm_reg_desc pfm_core_pmc_desc[]={
/* pmc0 */ {
*** modified    .type = PFM_REG_I,
--> to          .type = PFM_REG_I64,
                .desc = "GLOBAL_CTRL",
                .dfl_val = 0,
                .rsvd_msk = 0xfffffff8fffffffcULL,
                .no_emul64_msk = 0,
                .hw_addr = MSR_CORE_PERF_GLOBAL_CTRL
           },
/* pmc1 */ {
*** modified    .type = PFM_REG_I,
--> to          .type = PFM_REG_I64,
                .desc = "PEBS_ENABLE",
                .dfl_val = 0,
                .rsvd_msk = 0xfffffffffffffffeULL,
                .no_emul64_msk = 0,
                .hw_addr = MSR_IA32_PEBS_ENABLE
           },
/* pmc2 */ {
*** modified    .type = PFM_REG_W,
--> to          .type = PFM_REG_I64,
                .desc = "FIXED_CTRL",
                .dfl_val = 0x888ULL,
                .rsvd_msk = 0xfffffffffffff444ULL,
                .no_emul64_msk = 0,
                .hw_addr = MSR_CORE_PERF_FIXED_CTR_CTRL
           },
/* pmc3 */ PMX_NA,
/* pmc4 */ {
                .type = PFM_REG_I64,
                .desc = "PERFEVTSEL0",
                .dfl_val = PFM_CORE_PMC_VAL,
                .rsvd_msk = PFM_CORE_PMC_RSVD,
                .no_emul64_msk = PFM_CORE_NO64,
                .hw_addr = MSR_P6_EVNTSEL0
           },
/* pmc5 */ {
*** modified    .type = PFM_REG_W64,
--> to          .type = PFM_REG_I64,
                .desc = "PERFEVTSEL1",
                .dfl_val = PFM_CORE_PMC_VAL,
                .rsvd_msk = PFM_CORE_PMC_RSVD,
                .no_emul64_msk = PFM_CORE_NO64,
                .hw_addr = MSR_P6_EVNTSEL1
           }
};
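As a quick sanity check on the .rsvd_msk values in this table, the sketch below inverts each mask to see which bits remain writable. It assumes that a set bit in .rsvd_msk marks a reserved bit; that reading fits the values shown but is an assumption, not something confirmed against the perfmon source. The helper names are made up for illustration and are not part of perfmon or libpfm.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: a 1 bit in rsvd_msk marks a reserved (non-writable) bit,
 * so the writable bits are simply the complement of the mask. */
static uint64_t writable_bits(uint64_t rsvd_msk)
{
    return ~rsvd_msk;
}

/* Hypothetical helper: returns nonzero if val sets any reserved bit. */
static int touches_reserved(uint64_t val, uint64_t rsvd_msk)
{
    return (val & rsvd_msk) != 0;
}

/* Self-check using the FIXED_CTRL entry quoted in the table:
 * the default value 0x888 must not set any reserved bit. */
static void check_fixed_ctrl_default(void)
{
    assert(!touches_reserved(0x888ULL, 0xfffffffffffff444ULL));
}
```

With the masks from the table, writable_bits() gives 0x700000003 for GLOBAL_CTRL, 0x1 for PEBS_ENABLE, and 0xbbb for FIXED_CTRL, and the FIXED_CTRL default of 0x888 stays within the writable bits.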
This seems to be *close* to what is needed. I am able to use PEBS on
MEM_LOAD_RETIRED:L2_MISS while counting INSTRUCTIONS_RETIRED and
UNHALTED_CLOCK_CYCLES. However, if I change to another event, such as
EXT_SNOOP:HITM, I am not able to call pfm_dispatch with
MEM_LOAD_RETIRED:L2_MISS and EXT_SNOOP:HITM. It doesn't seem that this
limitation should exist, since they use different sets of counters. I think
that it may be due to how I set the types for the PFM_REGs. I will fiddle
with it, but could you let me know if some of those .type settings obviously
don't make sense to you?
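
A merge step like the one in the hack described above (two dispatch calls, results combined) could look roughly like the sketch below. It is only an illustration under assumptions: struct pmc_setting and merge_pmc_settings() are hypothetical names standing in for whatever register/value pairs the dispatch call returns, and they are not libpfm types or functions. The sketch refuses to merge when both calls want the same PMC with different values, which is one way such a merge can fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one programmed PMC: register index plus value. */
struct pmc_setting {
    unsigned int reg_num;
    uint64_t     value;
};

/*
 * Append the entries of extra[] to merged[] (capacity cap, length *len).
 * Entries naming a register already present are accepted only if the
 * values agree. Returns 0 on success, -1 on a conflicting value, -2 if
 * merged[] is full. On failure, entries appended before the failing one
 * remain in merged[].
 */
static int merge_pmc_settings(struct pmc_setting *merged, size_t *len, size_t cap,
                              const struct pmc_setting *extra, size_t n_extra)
{
    assert(merged != NULL && len != NULL);
    assert(extra != NULL || n_extra == 0);

    for (size_t i = 0; i < n_extra; i++) {
        size_t j;

        /* Look for an existing entry using the same register. */
        for (j = 0; j < *len; j++) {
            if (merged[j].reg_num == extra[i].reg_num)
                break;
        }
        if (j < *len) {
            if (merged[j].value != extra[i].value)
                return -1;      /* same PMC, different programming */
            continue;           /* identical entry, nothing to add */
        }
        if (*len == cap)
            return -2;          /* no room left in merged[] */
        merged[(*len)++] = extra[i];
    }
    return 0;
}
```

The conflict check is the part that matters here: if two dispatch results assign different values to the same register, silently keeping either one would program the PMU differently from what one of the calls asked for.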
Thanks!
--alexshye
On 8/13/07, Alex Shye <[EMAIL PROTECTED]> wrote:
>
> Stephane,
>
> > > My more complex tools which do run-time analysis of the data hang
> > > indefinitely now. There is a good chance that it is a bug in my tool,
> > > but if you think of any interactions with the kernel change that could
> > > cause this, could you let me know?
> > >
> >
> > That could be due to the kernel. Are you using PFM_FL_NOTIFY_BLOCK?
> > You may also have an issue with your interactions with ptrace(),
> > especially if you are monitoring a multi-threaded application.
> >
>
> After exploring this further using the newest perfmon2 kernel
> patch/library, there is still a problem. However, I believe that the issue
> may be with PEBS because it appears without even using the extra workaround
> to simultaneously use the event counters. A few things I have noticed:
>
> 1) I am using a sampling buffer which fits ~100 samples in it. If I
> sample based upon "INST_RETIRED:ANY_P", things work as normal. I
> consistently get buffer overflows with the correct number of samples, no
> matter the sampling period. However, if I switch to
> "MEM_LOAD_RETIRED:L2_MISS", then the number of samples collected when
> processing the sampling buffer becomes very erratic. Could the PEBS event
> counter used make a difference like this? Could there be an issue with
> signals and perfmon?
>
> 2) Ultimately, I am trying to create a Pin tool which monitors itself
> using perfmon. I have a small example program which seems to work (except
> for the erratic num. samples/buffer mentioned before). I am using a mode in
> Pin which does minimal work -- it calls ptrace once at the beginning of
> execution to gain control, runs natively without JITing the code, and does
> not do anything to handle signals. If I take the same perfmon code and put
> it into my Pin tool, it only works at higher sampling periods. When I lower
> the sampling period to gain a reasonable number of samples, the Pin tool
> hangs. The hanging occurs more often on a multi-threaded app and not on a
> simple single threaded toy program I have written. Since Pin only calls
> ptrace once at the beginning, and my apps don't call ptrace, I'm not sure
> that is the issue.
>
> 3) I was using PFM_FL_NOTIFY_BLOCK. I have tried with and without it;
> sometimes it gets further without the flag, but it still ends up hanging
> nonetheless.
>
> 4) This problem using Pin tool+PEBS only showed up after moving from
> libpfm-3.2-070206 (with the 2.6.20 kernel) to libpfm-3.2-070725 (with the
> 2.6.22 kernel). PEBS worked well with Pin beforehand and I was able to
> collect samples at any reasonable sampling rate. There seems to be a bug in
> the kernel, perfmon, or the interaction between them when upgrading.
>
> Because the old setup seemed to work for me with PEBS, it may be faster
> for now to try using that. If it still applies, would you be able to
> suggest a kernel workaround for the old version of the kernel patch? It was
> in arch/x86_64/perfmon/perfmon_core.c and looked like:
>
> static struct pfm_reg_desc pfm_core_pmc_desc[]={
> /* pmc0 */ {
>                 .type = PFM_REG_I,
>                 .desc = "GLOBAL_CTRL",
>                 .dfl_val = 0,
>                 .rsvd_msk = 0xfffffff8fffffffcULL,
>                 .no_emul64_msk = 0,
>                 .hw_addr = MSR_CORE_PERF_GLOBAL_CTRL
>            },
> /* pmc1 */ {
>                 .type = PFM_REG_I,
>                 .desc = "PEBS_ENABLE",
>                 .dfl_val = 0,
>                 .rsvd_msk = 0xfffffffffffffffeULL,
>                 .no_emul64_msk = 0,
>                 .hw_addr = MSR_IA32_PEBS_ENABLE
>            },
> /* pmc2 */ {
>                 .type = PFM_REG_W,
>                 .desc = "FIXED_CTRL",
>                 .dfl_val = 0x888ULL,
>                 .rsvd_msk = 0xfffffffffffff444ULL,
>                 .no_emul64_msk = 0,
>                 .hw_addr = MSR_CORE_PERF_FIXED_CTR_CTRL
>            },
> /* pmc3 */ PMX_NA,
> /* pmc4 */ {
>                 .type = PFM_REG_I64,
>                 .desc = "PERFEVTSEL0",
>                 .dfl_val = PFM_CORE_PMC_VAL,
>                 .rsvd_msk = PFM_CORE_PMC_RSVD,
>                 .no_emul64_msk = PFM_CORE_NO64,
>                 .hw_addr = MSR_P6_EVNTSEL0
>            },
> /* pmc5 */ {
>                 .type = PFM_REG_W64,
>                 .desc = "PERFEVTSEL1",
>                 .dfl_val = PFM_CORE_PMC_VAL,
>                 .rsvd_msk = PFM_CORE_PMC_RSVD,
>                 .no_emul64_msk = PFM_CORE_NO64,
>                 .hw_addr = MSR_P6_EVNTSEL1
>            }
> };
>
> Thanks in advance for your help! It's been complicated trying to figure
> out how everything works together, but your help has been much appreciated :)
>
>
> --alexshye
>
>