Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-30 Thread Namhyung Kim
Hi Stephane,

On Mon, 29 Oct 2012 16:15:46 +0100, Stephane Eranian wrote:
> + /*
> +  * use the mapping table for bit 0-15
> +  */
> + val = pebs_data_source[dse.ld_dse];

I guess you meant bit 0-3, right?

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Andi Kleen
> I agree. I did that because it was the easiest thing I could think of.
> Discovered I had to deal with all of this just two days ago when I
> rebased. Wasn't too happy to have to deal with this at the last minute.

I'll repost, as soon as I debugged why Jiri's version of my 31/33 parser patch
doesn't work.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Stephane Eranian
On Mon, Oct 29, 2012 at 10:16 PM, Andi Kleen  wrote:
>> > Why do you need to replace the whole table?
>> >
>> Because I am extending them with one or two events based on cpu
>> model. That was the easiest way of doing this instead of playing
>> some kind of malloc+copy trick.
>
> I did malloc and copy.
>
>>
>> > BTW I still think my approach in the v4 Haswell patchkit
>> > is simpler and didn't rely on hardcoding these events.
>> >
>> I don't care about those events. As I found out, they are not even
>> used by perf because they are all hardcoded and that's what gets
>> used. I assume they are exposed for reference only. I don't object
>> to that. But I think the right mechanism would be one where you
>> can add events at boot time based on CPU model. It could be used
>> to add the common events as well in the common part of the init
>> code.
>
> Yes that's what I did.
>
> I don't think copying everything for everything new is a good
> approach.
>
I agree. I did that because it was the easiest thing I could think of.
Discovered I had to deal with all of this just two days ago when I
rebased. Wasn't too happy to have to deal with this at the last minute.

In any case, if you have something, then looks like I need to wait until
it's in before I can adjust this patch set.


> -Andi
>
> --
> a...@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Andi Kleen
> > Why do you need to replace the whole table?
> >
> Because I am extending them with one or two events based on cpu
> model. That was the easiest way of doing this instead of playing
> some kind of malloc+copy trick.

I did malloc and copy.

> 
> > BTW I still think my approach in the v4 Haswell patchkit
> > is simpler and didn't rely on hardcoding these events.
> >
> I don't care about those events. As I found out, they are not even
> used by perf because they are all hardcoded and that's what gets
> used. I assume they are exposed for reference only. I don't object
> to that. But I think the right mechanism would be one where you
> can add events at boot time based on CPU model. It could be used
> to add the common events as well in the common part of the init
> code.

Yes that's what I did. 

I don't think copying everything for everything new is a good 
approach.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Peter Zijlstra
On Mon, 2012-10-29 at 21:39 +0100, Stephane Eranian wrote:
> But I think the right mechanism would be one where you
> can add events at boot time based on CPU model. It could be used
> to add the common events as well in the common part of the init
> code. 

mlin once posted something like that, it did some scary things with
sysfs without using the driver model -- which might or might not be
'right' (TM).

If someone has the stomach to stare at that code and then convince
gregkh that its not 'wrong' I'm all for it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Stephane Eranian
On Mon, Oct 29, 2012 at 8:42 PM, Andi Kleen  wrote:
>> +
>> +struct attribute *nhm_events_attrs[] = {
>> + EVENT_PTR(CPU_CYCLES),
>> + EVENT_PTR(INSTRUCTIONS),
>> + EVENT_PTR(CACHE_REFERENCES),
>> + EVENT_PTR(CACHE_MISSES),
>> + EVENT_PTR(BRANCH_INSTRUCTIONS),
>> + EVENT_PTR(BRANCH_MISSES),
>> + EVENT_PTR(BUS_CYCLES),
>> + EVENT_PTR(STALLED_CYCLES_FRONTEND),
>> + EVENT_PTR(STALLED_CYCLES_BACKEND),
>> + EVENT_PTR(REF_CPU_CYCLES),
>> + EVENT_PTR(mem_ld_nhm),
>> + NULL,
>> +};
>
> I thought Jiri's patch already exports all the generic ones?
>
> Why do you need to replace the whole table?
>
Because I am extending them with one or two events based on cpu
model. That was the easiest way of doing this instead of playing
some kind of malloc+copy trick.

> BTW I still think my approach in the v4 Haswell patchkit
> is simpler and didn't rely on hardcoding these events.
>
I don't care about those events. As I found out, they are not even
used by perf because they are all hardcoded and that's what gets
used. I assume they are exposed for reference only. I don't object
to that. But I think the right mechanism would be one where you
can add events at boot time based on CPU model. It could be used
to add the common events as well in the common part of the init
code.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Andi Kleen
> +
> +struct attribute *nhm_events_attrs[] = {
> + EVENT_PTR(CPU_CYCLES),
> + EVENT_PTR(INSTRUCTIONS),
> + EVENT_PTR(CACHE_REFERENCES),
> + EVENT_PTR(CACHE_MISSES),
> + EVENT_PTR(BRANCH_INSTRUCTIONS),
> + EVENT_PTR(BRANCH_MISSES),
> + EVENT_PTR(BUS_CYCLES),
> + EVENT_PTR(STALLED_CYCLES_FRONTEND),
> + EVENT_PTR(STALLED_CYCLES_BACKEND),
> + EVENT_PTR(REF_CPU_CYCLES),
> + EVENT_PTR(mem_ld_nhm),
> + NULL,
> +};

I thought Jiri's patch already exports all the generic ones?

Why do you need to replace the whole table?

BTW I still think my approach in the v4 Haswell patchkit 
is simpler and didn't rely on hardcoding these events.

-Andi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Peter Zijlstra
On Mon, 2012-10-29 at 16:43 +0100, Stephane Eranian wrote:
> You meant fll, instead I think.

Oh, yes, too small font I guess.

> Well, that would work too, but I am trying to factorize the code
> with Precise Store which is a later patch. 

Yeah, just found that, its fine the way it is. Just looked funneh on
first reading.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Stephane Eranian
On Mon, Oct 29, 2012 at 4:38 PM, Peter Zijlstra  wrote:
> On Mon, 2012-10-29 at 16:15 +0100, Stephane Eranian wrote:
>> +   fll = event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT;
>> +
>> perf_sample_data_init(&data, 0, event->hw.last_period);
>>
>> +   data.period = event->hw.last_period;
>> +   sample_type = event->attr.sample_type;
>> +
>> +   /*
>> +* if PEBS-LL or PreciseStore
>> +*/
>> +   if (fll) {
>> +   if (sample_type & PERF_SAMPLE_ADDR)
>> +   data.addr = pebs->dla;
>> +
>> +   /*
>> +* Use latency for cost (only avail with PEBS-LL)
>> +*/
>> +   if (fll && (sample_type & PERF_SAMPLE_COST))
>> +   data.cost = pebs->lat;
>> +
>> +   /*
>> +* data.dsrc encodes the data source
>> +*/
>> +   if (sample_type & PERF_SAMPLE_DSRC) {
>> +   if (fll)
>> +   data.dsrc.val = load_latency_data(pebs->dse);
>> +   }
>> +   }
>
>
> Why does fl1 exist? the above looks really convoluted, all fl1 checks
> inside are pointless. Also 'fl1' is a bizarre name, 'pebs_ll' would be a
> better name I guess.
>
You meant fll, instead I think.

> What's wrong with:
>
> if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) {
> if (sample_type & PERF_SAMPLE_ADDR)
> data.addr = pebs->dla;
>
> if (sample_type & PERF_SAMPLE_COST)
> data.cost = perf->lat;
>
> if (sample_type & PERF_SAMPLE_DSRC)
> data.dsrc.val = load_latency_data(pebs->dse);
> }
>
Well, that would work too, but I am trying to factorize the code
with Precise Store which is a later patch. But we could clone
your proposal and do:

> if (event->hw.flags & PERF_X86_EVENT_PEBS_ST) {
> if (sample_type & PERF_SAMPLE_ADDR)
> data.addr = pebs->dla;
>
> if (sample_type & PERF_SAMPLE_DSRC)
> data.dsrc.val = precise_store_data(pebs->dse);
> }

I am fine with that too.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Stephane Eranian
On Mon, Oct 29, 2012 at 4:35 PM, Peter Zijlstra  wrote:
> On Mon, 2012-10-29 at 16:15 +0100, Stephane Eranian wrote:
>> +static u64 load_latency_data(u64 status)
>> +{
>> +   union intel_x86_pebs_dse dse;
>> +   u64 val;
>> +   int model = boot_cpu_data.x86_model;
>> +   int fam = boot_cpu_data.x86;
>> +
>> +   dse.val = status;
>> +
>> +   /*
>> +* use the mapping table for bit 0-15
>> +*/
>> +   val = pebs_data_source[dse.ld_dse];
>> +
>> +   /*
>> +* Nehalem models do not support TLB, Lock infos
>> +*/
>> +   if (fam == 0x6 && (model == 26 || model == 30
>> +   || model == 31 || model == 46)) {
>> +   val |= P(TLB, NA) | P(LOCK, NA);
>> +   return val;
>> +   }
>
> I'm so 100% sure we'll forget to add a nhm model number if we ever find
> we missed one.
>
> Could we either add a classification enum to x86_pmu that is set in the
> big model switch on init, or do this with your new constraints flags,
> where we have a different flag for NHM_LL vs SNB_LL or so?
>
Let's add something in x86_pmu. We could, for instance, generalize
x86_pmu.er_flags into x86_pmu.flags. Instead of adding yet another
flag field.

> Or if all else fails, add a quirk to the Intel Debugstore bits bitfield,
> something like pebs_ll_nhm.
>
>> +   /*
>> +* bit 4: TLB access
>> +* 0 = did not miss 2nd level TLB
>> +* 1 = missed 2nd level TLB
>> +*/
>> +   if (dse.ld_stlb_miss)
>> +   val |= P(TLB, MISS) | P(TLB, L2);
>> +   else
>> +   val |= P(TLB, HIT) | P(TLB,L1) | P(TLB, L2);
>> +
>> +   /*
>> +* bit 5: locked prefix
>> +*/
>> +   if (dse.ld_locked)
>> +   val |= P(LOCK, LOCKED);
>> +
>> +   return val;
>> +}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Peter Zijlstra
On Mon, 2012-10-29 at 16:15 +0100, Stephane Eranian wrote:
> +   fll = event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT;
> +
> perf_sample_data_init(&data, 0, event->hw.last_period);
>  
> +   data.period = event->hw.last_period;
> +   sample_type = event->attr.sample_type;
> +
> +   /*
> +* if PEBS-LL or PreciseStore
> +*/
> +   if (fll) {
> +   if (sample_type & PERF_SAMPLE_ADDR)
> +   data.addr = pebs->dla;
> +
> +   /*
> +* Use latency for cost (only avail with PEBS-LL)
> +*/
> +   if (fll && (sample_type & PERF_SAMPLE_COST))
> +   data.cost = pebs->lat;
> +
> +   /*
> +* data.dsrc encodes the data source
> +*/
> +   if (sample_type & PERF_SAMPLE_DSRC) {
> +   if (fll)
> +   data.dsrc.val = load_latency_data(pebs->dse);
> +   }
> +   } 


Why does fl1 exist? the above looks really convoluted, all fl1 checks
inside are pointless. Also 'fl1' is a bizarre name, 'pebs_ll' would be a
better name I guess.

What's wrong with:

if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) {
if (sample_type & PERF_SAMPLE_ADDR)
data.addr = pebs->dla;

if (sample_type & PERF_SAMPLE_COST)
data.cost = perf->lat;

if (sample_type & PERF_SAMPLE_DSRC)
data.dsrc.val = load_latency_data(pebs->dse);
}


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Peter Zijlstra
On Mon, 2012-10-29 at 16:15 +0100, Stephane Eranian wrote:
> +static u64 load_latency_data(u64 status)
> +{
> +   union intel_x86_pebs_dse dse;
> +   u64 val;
> +   int model = boot_cpu_data.x86_model;
> +   int fam = boot_cpu_data.x86;
> +
> +   dse.val = status;
> +
> +   /*
> +* use the mapping table for bit 0-15
> +*/
> +   val = pebs_data_source[dse.ld_dse];
> +
> +   /*
> +* Nehalem models do not support TLB, Lock infos
> +*/
> +   if (fam == 0x6 && (model == 26 || model == 30
> +   || model == 31 || model == 46)) {
> +   val |= P(TLB, NA) | P(LOCK, NA);
> +   return val;
> +   }

I'm so 100% sure we'll forget to add a nhm model number if we ever find
we missed one.

Could we either add a classification enum to x86_pmu that is set in the
big model switch on init, or do this with your new constraints flags,
where we have a different flag for NHM_LL vs SNB_LL or so?

Or if all else fails, add a quirk to the Intel Debugstore bits bitfield,
something like pebs_ll_nhm.

> +   /*
> +* bit 4: TLB access
> +* 0 = did not miss 2nd level TLB
> +* 1 = missed 2nd level TLB
> +*/
> +   if (dse.ld_stlb_miss)
> +   val |= P(TLB, MISS) | P(TLB, L2);
> +   else
> +   val |= P(TLB, HIT) | P(TLB,L1) | P(TLB, L2);
> +
> +   /*
> +* bit 5: locked prefix
> +*/
> +   if (dse.ld_locked)
> +   val |= P(LOCK, LOCKED);
> +
> +   return val;
> +} 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Stephane Eranian
On Mon, Oct 29, 2012 at 4:23 PM, Peter Zijlstra  wrote:
> On Mon, 2012-10-29 at 16:15 +0100, Stephane Eranian wrote:
>> +EVENT_ATTR_STR(mem-loads, mem_ld_nhm, "event=0x100b,umask=0x1,ldlat=3");
>> +EVENT_ATTR_STR(mem-loads, mem_ld_snb, "event=0xcd,umask=0x1,ldlat=3");
>
> I haven't fully grokked the macro magic yet, but event=0x100b seems
> wrong, event only takes 8 bit on Intel.

Yeah, my bad. I messed up on this one. should be:
event=0x0b,umask=0x10,ldlat=3
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch v1 04/10] perf/x86: add memory profiling via PEBS Load Latency

2012-10-29 Thread Peter Zijlstra
On Mon, 2012-10-29 at 16:15 +0100, Stephane Eranian wrote:
> +EVENT_ATTR_STR(mem-loads, mem_ld_nhm, "event=0x100b,umask=0x1,ldlat=3");
> +EVENT_ATTR_STR(mem-loads, mem_ld_snb, "event=0xcd,umask=0x1,ldlat=3"); 

I haven't fully grokked the macro magic yet, but event=0x100b seems
wrong, event only takes 8 bit on Intel.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/