Re: [perfmon2] [PATCH] perf_events: update PEBS event constraints

2011-03-02 Thread Peter Zijlstra
On Tue, 2011-03-01 at 15:50 +0200, Stephane Eranian wrote: > +static void intel_ds_init_pebs_constraints(void) > +{ > + /* > +* we only know how to deal with Family 6 > +*/ > + if (boot_cpu_data.x86 != 6) { > + x86_pmu.pebs = 0; > + return; >

Re: [perfmon2] [PATCH 1/2] perf_events: add cgroup support (v9)

2011-02-15 Thread Peter Zijlstra
On Mon, 2011-02-14 at 11:20 +0200, Stephane Eranian wrote: > + if (mode & PERF_CGROUP_SWOUT) { > + cpu_ctx_sched_out(cpuctx, EVENT_ALL); > + /* > +* must not be done before ctxswout dur

Re: [perfmon2] [RFC][PATCH] cgroup: Fix cgroup_subsys::exit callback

2011-02-11 Thread Peter Zijlstra
On Thu, 2011-02-10 at 10:04 +0800, Li Zefan wrote: > Since both sched and perf won't use the 2 args @cgrp and @old_cgrp, > don't bother to change the ->exit interface? symmetry with ->attach, both sched and perf use a common method to implement attach and exit, if one needs it the other would too

Re: [perfmon2] [PATCH 1/2] perf_events: add cgroup support (v8)

2011-02-09 Thread Peter Zijlstra
On Tue, 2011-02-08 at 23:31 +0100, Stephane Eranian wrote: > Peter, > > See comments below. > > > On Mon, Feb 7, 2011 at 5:10 PM, Peter Zijlstra wrote: > > Compile tested only, depends on the cgroup::exit patch > > > > --- linux-2.6.orig/include/linux/pe

Re: [perfmon2] [RFC][PATCH] cgroup: Fix cgroup_subsys::exit callback

2011-02-08 Thread Peter Zijlstra
On Mon, 2011-02-07 at 13:21 -0800, Paul Menage wrote: > On Mon, Feb 7, 2011 at 12:02 PM, Peter Zijlstra wrote: > > On Mon, 2011-02-07 at 11:28 -0800, Paul Menage wrote: > >> On Mon, Feb 7, 2011 at 8:10 AM, Peter Zijlstra > >> wrote: > >> > > >>

Re: [perfmon2] [PATCH 1/2] perf_events: add cgroup support (v8)

2011-02-07 Thread Peter Zijlstra
On Mon, 2011-02-07 at 11:29 -0800, Paul Menage wrote: > > This again means that all such notification handlers must poll state, > > which is ridiculous. > > > > Not necessarily - we could make it that a failed rmdir() sets a bit > that causes a notification again once the final refcount is dropped

Re: [perfmon2] [RFC][PATCH] cgroup: Fix cgroup_subsys::exit callback

2011-02-07 Thread Peter Zijlstra
On Mon, 2011-02-07 at 11:28 -0800, Paul Menage wrote: > On Mon, Feb 7, 2011 at 8:10 AM, Peter Zijlstra wrote: > > > > Make the ::exit method act like ::attach, it is after all very nearly > > the same thing. > > The major difference between attach and exit is that the

Re: [perfmon2] [PATCH 1/2] perf_events: add cgroup support (v8)

2011-02-07 Thread Peter Zijlstra
open(&attr, cgroup_fd, 1, -1, PERF_FLAG_PID_CGROUP); close(cgroup_fd); Signed-off-by: Stephane Eranian [ added perf_cgroup_{exit,attach} ] Signed-off-by: Peter Zijlstra LKML-Reference: --- include/linux/cgroup.h|1 include/linux/cgroup_subsys.h |4 include/linux/perf_eve

[perfmon2] [RFC][PATCH] cgroup: Fix cgroup_subsys::exit callback

2011-02-07 Thread Peter Zijlstra
On Thu, 2011-02-03 at 00:32 +0530, Balbir Singh wrote: > > No, just fixed. The callback as it exists isn't useful and leads to > > hacks like the above. --- Subject: cgroup: Fix cgroup_subsys::exit callback From: Peter Zijlstra Date: Mon Feb 07 17:02:20 CET 2011 Make the ::exi

Re: [perfmon2] [PATCH 0/2] perf_events: add support for Intel fixed counter 2

2011-02-04 Thread Peter Zijlstra
On Fri, 2011-02-04 at 14:00 +0200, Stephane Eranian wrote: > This series of patches solves this problem by introducing a custom > encoding for UNHALTED_REFERENCE_CYCLES (0xff3c) and improving > the constraint infrastructure to handle events which can ONLY be > measured on fixed counters. Right, s
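
The custom encoding 0xff3c quoted above can be read with the usual Intel raw-event layout, where bits 0-7 hold the event select and bits 8-15 hold the unit mask. A minimal sketch of that decoding follows; the helper names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Assumed raw-event layout: bits 0-7 = event select, bits 8-15 = unit mask. */
#define RAW_EVENT_SELECT_MASK 0x00ffULL
#define RAW_UMASK_SHIFT       8

/* Hypothetical helper: extract the event select code from a raw config. */
uint8_t raw_event_select(uint64_t config)
{
    return (uint8_t)(config & RAW_EVENT_SELECT_MASK);
}

/* Hypothetical helper: extract the unit mask from a raw config. */
uint8_t raw_umask(uint64_t config)
{
    return (uint8_t)((config >> RAW_UMASK_SHIFT) & 0xff);
}
```

Applied to 0xff3c, this splits the value into event select 0x3c and unit mask 0xff.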

Re: [perfmon2] [PATCH 1/2] perf_events: add cgroup support (v8)

2011-02-02 Thread Peter Zijlstra
On Wed, 2011-02-02 at 17:20 +0530, Balbir Singh wrote: > * Peter Zijlstra [2011-02-02 12:29:20]: > > > On Thu, 2011-01-20 at 15:39 +0100, Peter Zijlstra wrote: > > > On Thu, 2011-01-20 at 15:30 +0200, Stephane Eranian wrote: > > > > @@ -4259,8 +4261,20 @@ void cg

Re: [perfmon2] [PATCH 1/2] perf_events: add cgroup support (v8)

2011-02-02 Thread Peter Zijlstra
On Thu, 2011-01-20 at 15:39 +0100, Peter Zijlstra wrote: > On Thu, 2011-01-20 at 15:30 +0200, Stephane Eranian wrote: > > @@ -4259,8 +4261,20 @@ void cgroup_exit(struct task_struct *tsk, int > > run_callbacks) > > > > /* Reassign the task to the init_css_se

Re: [perfmon2] [PATCH 1/2] perf_events: add cgroup support (v8)

2011-01-20 Thread Peter Zijlstra
On Thu, 2011-01-20 at 15:30 +0200, Stephane Eranian wrote: > @@ -4259,8 +4261,20 @@ void cgroup_exit(struct task_struct *tsk, int > run_callbacks) > > /* Reassign the task to the init_css_set. */ > task_lock(tsk); > + /* > +* we mask interrupts to prevent: > +

Re: [perfmon2] [PATCH 4/5] perf_events: add cgroup support (v7)

2011-01-06 Thread Peter Zijlstra
On Wed, 2011-01-05 at 22:39 +0100, Stephane Eranian wrote: > Peter, > > On Wed, Jan 5, 2011 at 2:01 PM, Stephane Eranian wrote: > > On Wed, Jan 5, 2011 at 12:23 PM, Peter Zijlstra > > wrote: > >> On Mon, 2011-01-03 at 18:20 +0200, Stephane Eranian wrote: >

Re: [perfmon2] [PATCH 4/5] perf_events: add cgroup support (v7)

2011-01-05 Thread Peter Zijlstra
On Mon, 2011-01-03 at 18:20 +0200, Stephane Eranian wrote: > +#ifdef CONFIG_CGROUP_PERF > +/* > + * perf_cgroup_info keeps track of time_enabled for a cgroup. > + * This is a per-cpu dynamically allocated data structure. > + */ > +struct perf_cgroup_info { > + u64 time; > + u64 timestamp;

Re: [perfmon2] [PATCH 0/5] perf_events: add support for per-cpu per-cgroup monitoring (v7)

2011-01-05 Thread Peter Zijlstra
I've applied 1,2 and 3. I'll wait for a fixed 4 and a rebased 5 (won't apply to current -tip).

Re: [perfmon2] [PATCH 4/5] perf_events: add cgroup support (v6)

2010-12-01 Thread Peter Zijlstra
On Tue, 2010-11-30 at 19:20 +0200, Stephane Eranian wrote: > diff --git a/init/Kconfig b/init/Kconfig > index b28dd23..10d408e 100644 > --- a/init/Kconfig > +++ b/init/Kconfig > @@ -1066,6 +1066,17 @@ config PERF_COUNTERS > > Say N if unsure. > > +config PERF_CGROUPS > + bool "Enab

Re: [perfmon2] [PATCH 1/2] perf_events: add support for per-cpu per-cgroup monitoring (v5)

2010-11-26 Thread Peter Zijlstra
On Fri, 2010-11-26 at 16:28 +0800, Li Zefan wrote: > More information: > > Another feature recently added (eventfd-based notifications) also uses fd > to identify a cgroup. Ok, anyway I guess this all means the cgroup people are fine with this ;-)

Re: [perfmon2] [PATCH 1/2] perf_events: add support for per-cpu per-cgroup monitoring (v5)

2010-11-26 Thread Peter Zijlstra
On Fri, 2010-11-26 at 08:26 +0530, Balbir Singh wrote: > * l...@cn.fujitsu.com [2010-11-26 09:50:24]: > > > 19:28, Peter Zijlstra wrote: > > > On Thu, 2010-11-18 at 12:40 +0200, Stephane Eranian wrote: > > >> This kernel patch adds the ability to filter mon

Re: [perfmon2] [PATCH 1/2] perf_events: add support for per-cpu per-cgroup monitoring (v5)

2010-11-26 Thread Peter Zijlstra
On Thu, 2010-11-25 at 22:32 +0100, Stephane Eranian wrote: > On Thu, Nov 25, 2010 at 4:02 PM, Peter Zijlstra wrote: > > On Thu, 2010-11-25 at 15:51 +0100, Stephane Eranian wrote: > >> > >> > >> On Thu, Nov 25, 2010 at 12:20 PM, Peter Zijlstra > >> wr

Re: [perfmon2] [PATCH 1/2] perf_events: add support for per-cpu per-cgroup monitoring (v5)

2010-11-25 Thread Peter Zijlstra
On Thu, 2010-11-25 at 15:51 +0100, Stephane Eranian wrote: > > > On Thu, Nov 25, 2010 at 12:20 PM, Peter Zijlstra wrote: > On Thu, 2010-11-18 at 12:40 +0200, Stephane Eranian wrote: > > @@ -919,6 +945,10 @@ static inline void > perf_event_task_sched_in(str

Re: [perfmon2] [PATCH 1/2] perf_events: add support for per-cpu per-cgroup monitoring (v5)

2010-11-25 Thread Peter Zijlstra
On Thu, 2010-11-18 at 12:40 +0200, Stephane Eranian wrote: > This kernel patch adds the ability to filter monitoring based on > container groups (cgroups). This is for use in per-cpu mode only. > > The cgroup to monitor is passed as a file descriptor in the pid > argument to the syscall. The f
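
The calling convention described above, with the cgroup file descriptor passed in the pid argument, might look roughly like this from user space. This is a sketch that assumes the flag name PERF_FLAG_PID_CGROUP used in the later v8 posting and in mainline headers; the v5 patch under review may have differed in detail, and the wrapper name is hypothetical.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: open a cycle counter on one CPU, restricted to
 * the tasks of one cgroup. cgroup_fd is a descriptor opened on the
 * cgroup in the cgroup filesystem; it takes the place of the pid
 * argument and is flagged as such with PERF_FLAG_PID_CGROUP.
 * Returns the event fd, or -1 with errno set.
 */
int open_cgroup_counter(int cgroup_fd, int cpu)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_CPU_CYCLES;

    /* per-cpu mode only: cpu must name a real CPU, group_fd is unused */
    return (int)syscall(SYS_perf_event_open, &attr, cgroup_fd, cpu, -1,
                        (unsigned long)PERF_FLAG_PID_CGROUP);
}
```

The caller can close the cgroup descriptor once the event has been created, as the v8 changelog further up this listing does.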

Re: [perfmon2] [PATCH 1/2] perf_events: add support for per-cpu per-cgroup monitoring (v5)

2010-11-25 Thread Peter Zijlstra
On Thu, 2010-11-18 at 12:40 +0200, Stephane Eranian wrote: > @@ -919,6 +945,10 @@ static inline void perf_event_task_sched_in(struct > task_struct *task) > static inline > void perf_event_task_sched_out(struct task_struct *task, struct task_struct > *next) > { > +#ifdef CONFIG_CGROUPS > +

Re: [perfmon2] [PATCH] perf: add support for per-event sampling period or frequency in perf record

2010-11-23 Thread Peter Zijlstra
On Tue, 2010-11-23 at 12:33 +0100, Stephane Eranian wrote: > > Forgot to say, that the reason I did it thru -c and -F was to maintain > backward compatibility. Right, we could keep the existing parameters to mean default values for those events that didn't specify anything.
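
The fallback suggested in this reply, where the existing -c/-F values act as defaults for events that specify nothing, can be sketched as below. The structure and function names are hypothetical and do not come from the perf tool sources.

```c
#include <stdint.h>

/* Hypothetical per-event option: 0 means "not specified on the command line". */
struct event_opts {
    uint64_t period;
};

/*
 * Pick the sampling period for one event: its own value if it has one,
 * otherwise the global default (what -c would supply).
 */
uint64_t effective_period(const struct event_opts *ev, uint64_t default_period)
{
    return ev->period ? ev->period : default_period;
}
```

This keeps existing command lines working unchanged, since an event without its own setting behaves exactly as before.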

Re: [perfmon2] [PATCH] perf: add support for per-event sampling period or frequency in perf record

2010-11-23 Thread Peter Zijlstra
On Tue, 2010-11-23 at 11:45 +0200, Stephane Eranian wrote: > This patch allows specifying a per event sampling period or frequency. > Up until now, the same sampling period or frequency was applied to all > the events specified on the command line of perf record. A sampling > period depends on the

Re: [perfmon2] [PATCH] perf_events: fix time tracking in samples

2010-10-20 Thread Peter Zijlstra
On Wed, 2010-10-20 at 14:42 +0200, Stephane Eranian wrote: > It may be better to try another approach just for PERF_SAMPLE_READ > with its own version of ctx->time. What about if on event_sched_in() you > were snapshotting ctx->time. Then in the perf_output_read_event(), you'd > have to compute th

Re: [perfmon2] [PATCH] perf_events: fix time tracking in samples

2010-10-20 Thread Peter Zijlstra
On Tue, 2010-10-19 at 21:03 +0200, Stephane Eranian wrote: > >> Ok, I missed that. But I don't understand why you need the lock to > >> update the time. The lower-level clock is lockless if I recall. Can't you > >> use an atomic ops in update_context_time()? > > > > atomic ops would slow down thos

Re: [perfmon2] [PATCH] perf_events: fix time tracking in samples

2010-10-20 Thread Peter Zijlstra
On Tue, 2010-10-19 at 21:03 +0200, Stephane Eranian wrote: > >> Ok, I missed that. But I don't understand why you need the lock to > >> update the time. The lower-level clock is lockless if I recall. Can't you > >> use an atomic ops in update_context_time()? > > > > atomic ops would slow down thos

Re: [perfmon2] [PATCH] perf_events: fix the fix for transaction recovery in group_sched_in()

2010-10-20 Thread Peter Zijlstra
On Tue, 2010-10-19 at 23:45 +0200, Stephane Eranian wrote: > This patch fixes the group_sched_in() fix added by commit 8e5fc1a. > Although the patch solved the issue with time_running, time_enabled > for all events in a group, it had one flaw in case the group could > never be scheduled. It would c

Re: [perfmon2] [PATCH] perf_events: fix time tracking in samples

2010-10-19 Thread Peter Zijlstra
On Tue, 2010-10-19 at 19:01 +0200, Stephane Eranian wrote: > On Tue, Oct 19, 2010 at 6:52 PM, Peter Zijlstra wrote: > > On Tue, 2010-10-19 at 18:47 +0200, Stephane Eranian wrote: > >> This patch corrects time tracking in samples. Without this patch > >> both time_enab

Re: [perfmon2] [PATCH] perf_events: fix time tracking in samples

2010-10-19 Thread Peter Zijlstra
On Tue, 2010-10-19 at 18:47 +0200, Stephane Eranian wrote: > This patch corrects time tracking in samples. Without this patch > both time_enabled and time_running may be reported as zero when > user asks for PERF_SAMPLE_READ. > > You use PERF_SAMPLE_READ when you want to sample the values of > oth

Re: [perfmon2] [PATCH] perf_events: fix transaction recovery in group_sched_in()

2010-10-15 Thread Peter Zijlstra
On Fri, 2010-10-15 at 19:34 +0200, Stephane Eranian wrote: > > > Yes, makes sense.. I'm a bit hesitant to slap a -stable tag on it due to > > its size,.. Ingo, Paulus? > > > I was worried about the size too but I could not figure out another > smaller way of doing this. An alternative would be to

Re: [perfmon2] [PATCH] perf_events: fix transaction recovery in group_sched_in()

2010-10-15 Thread Peter Zijlstra
On Fri, 2010-10-15 at 16:54 +0200, Stephane Eranian wrote: > The group_sched_in() function uses a transactional approach to schedule > a group of events. In a group, either all events can be scheduled or > none are. To schedule each event in, the function calls event_sched_in(). > In case of error,
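
The all-or-nothing behaviour described above can be illustrated with a small user-space simulation. This is only a sketch of the general pattern (schedule each event in turn, undo the ones already scheduled on failure); every type and function name is a hypothetical stand-in, and it does not reproduce the kernel's time accounting that the patch is concerned with.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PMU and an event. */
struct pmu_sim   { int free_counters; };
struct event_sim { bool scheduled; };

/* Try to put one event on the simulated PMU. */
static bool event_sched_in_sim(struct pmu_sim *pmu, struct event_sim *ev)
{
    if (pmu->free_counters == 0)
        return false;
    pmu->free_counters--;
    ev->scheduled = true;
    return true;
}

/* Undo event_sched_in_sim() for one event. */
static void event_sched_out_sim(struct pmu_sim *pmu, struct event_sim *ev)
{
    ev->scheduled = false;
    pmu->free_counters++;
}

/* Schedule the whole group, or leave the PMU exactly as it was. */
bool group_sched_in_sim(struct pmu_sim *pmu, struct event_sim *group, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!event_sched_in_sim(pmu, &group[i]))
            goto rollback;
    }
    return true;

rollback:
    /* events 0..i-1 went in; take them back out in reverse order */
    while (i-- > 0)
        event_sched_out_sim(pmu, &group[i]);
    return false;
}
```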

Re: [perfmon2] [PATCH] perf_events: fix bogus AMD64 generic TLB events

2010-10-15 Thread Peter Zijlstra
On Fri, 2010-10-15 at 15:15 +0200, Stephane Eranian wrote: > PERF_COUNT_HW_CACHE_DTLB:READ:MISS had a bogus umask value of 0 which > counts nothing. Needed to be 0x7 (to count all possibilities). > > PERF_COUNT_HW_CACHE_ITLB:READ:MISS had a bogus umask value of 0 which > counts nothing. Needed to

Re: [perfmon2] [PATCH] perf_events: fix bogus context time tracking

2010-10-15 Thread Peter Zijlstra
On Fri, 2010-10-15 at 15:26 +0200, Stephane Eranian wrote: > You can only call update_context_time() when the context > is active, i.e., the thread it is attached to is still running. > > However, perf_event_read() can be called even when the context > is inactive, e.g., user read() the counters.

Re: [perfmon2] [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring (v3)

2010-09-25 Thread Peter Zijlstra
On Wed, 2010-09-22 at 12:26 +0200, Stephane Eranian wrote: > Ok, early testing shows that this seems to be working fine with the > pid approach. Of course it is less convenient than just opening a file > descriptor in cgroup_fs. There is more bookkeeping involved, incl. > cleanup the child on exit

Re: [perfmon2] [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring (v3)

2010-09-22 Thread Peter Zijlstra
On Wed, 2010-09-22 at 09:53 +0530, Balbir Singh wrote: > Yes, a task can belong to multiple subsystems, hence multiple cgroups. > Ideally we'd want to use pid + subsystem Apparently we create a perf subsystem, and we only care about that. So pid will uniquely identify a cgroup, since for each subs

Re: [perfmon2] [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring (v3)

2010-09-22 Thread Peter Zijlstra
On Wed, 2010-09-22 at 10:04 +0530, Balbir Singh wrote: > That understanding is correct, but the whole approach sounds more > complex due to several subsystems involved, the expectation is that > we'll move perf to all the correct cgroups for each subsystem. Well, we'll only move a completely dorm

Re: [perfmon2] [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring (v3)

2010-09-21 Thread Peter Zijlstra
On Tue, 2010-09-21 at 18:17 +0200, Stephane Eranian wrote: > On Tue, Sep 21, 2010 at 4:03 PM, Peter Zijlstra wrote: > > On Tue, 2010-09-21 at 15:38 +0200, Stephane Eranian wrote: > >> > Hmm, indeed. One thing we can do about that is move perf into the > >> > cgr

Re: [perfmon2] [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring (v3)

2010-09-21 Thread Peter Zijlstra
On Tue, 2010-09-21 at 15:38 +0200, Stephane Eranian wrote: > > Hmm, indeed. One thing we can do about that is move perf into the > > cgroup, create the counter (disabled) using self to identify the cgroup, > > move perf back to where it came from, and enable the counter. > > > Yes, that's another p

Re: [perfmon2] [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring (v3)

2010-09-21 Thread Peter Zijlstra
On Tue, 2010-09-21 at 13:48 +0200, Stephane Eranian wrote: > The main issue I see with this is that it relies on having at least one > task in the cgroup when you start the measurement. That is certainly > not always the case. Hmm, indeed. One thing we can do about that is move perf into the cgro

Re: [perfmon2] [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring (v3)

2010-09-21 Thread Peter Zijlstra
On Tue, 2010-09-21 at 11:38 +0200, Peter Zijlstra wrote: > On Thu, 2010-09-09 at 15:05 +0200, Stephane Eranian wrote: > > The cgroup to monitor is designated by passing a file descriptor opened > > on a new per-cgroup file in the cgroup filesystem (perf_event.perf). The >

Re: [perfmon2] [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring (v3)

2010-09-21 Thread Peter Zijlstra
On Thu, 2010-09-09 at 15:05 +0200, Stephane Eranian wrote: > The cgroup to monitor is designated by passing a file descriptor opened > on a new per-cgroup file in the cgroup filesystem (perf_event.perf). The > option must be activated by setting perf_event_attr.cgroup=1 and passing > a valid file d

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 21:35 +0200, Andi Kleen wrote: > Stephane Eranian writes: > > > The DS, BTS, and PEBS memory regions were allocated using kzalloc(), i.e., > > requesting contiguous physical memory. There is no such restriction on > > DS, PEBS and BTS buffers. Using kzalloc() could lead to e

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 21:34 +0200, Peter Zijlstra wrote: > On Mon, 2010-09-13 at 15:31 -0400, Mathieu Desnoyers wrote: > > > Ok, so can we play the same trick you're playing with the sampling > > > buffer, i.e., you use alloc_pages_node() for one page at a time, and >

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 15:31 -0400, Mathieu Desnoyers wrote: > > Ok, so can we play the same trick you're playing with the sampling > > buffer, i.e., you use alloc_pages_node() for one page at a time, and > > then you stitch them on demand via SW? > > Well, a thought is striking me: it sounds like

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 20:49 +0200, Stephane Eranian wrote: > On Mon, Sep 13, 2010 at 8:42 PM, Peter Zijlstra wrote: > > On Mon, 2010-09-13 at 20:40 +0200, Stephane Eranian wrote: > >> Ok, so can we play the same trick you're playing with the sampling > >> buffer

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 20:40 +0200, Stephane Eranian wrote: > Ok, so can we play the same trick you're playing with the sampling > buffer, i.e., you use alloc_pages_node() for one page at a time, and > then you stitch them on demand via SW? Not for BTS, it wants a linear range, getting the vmalloc

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 19:24 +0200, Stephane Eranian wrote: > Based on this comment, I assume that the only reason the allocation > of the sampling buffer in perf_buffer_alloc() is immune to this is because > you are allocating each page individually (order 0). Right? Right, and I software stitch
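
The "software stitch" mentioned here means treating individually allocated pages as one logical buffer by translating each offset into a page index and an offset within that page. A user-space sketch of the idea, assuming a 4096-byte page and using hypothetical names rather than the kernel's perf buffer code:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SIM_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* A logical buffer built from separately allocated (order-0) pages. */
struct stitched_buf {
    unsigned char **pages;
    size_t nr_pages;
};

void stitched_free(struct stitched_buf *b)
{
    for (size_t i = 0; i < b->nr_pages; i++)
        free(b->pages[i]);
    free(b->pages);
    b->pages = NULL;
    b->nr_pages = 0;
}

/* Allocate nr_pages (> 0) pages one at a time; no contiguity is needed. */
int stitched_alloc(struct stitched_buf *b, size_t nr_pages)
{
    b->pages = calloc(nr_pages, sizeof(*b->pages));
    if (!b->pages)
        return -1;
    b->nr_pages = nr_pages;
    for (size_t i = 0; i < nr_pages; i++) {
        b->pages[i] = calloc(1, SIM_PAGE_SIZE);
        if (!b->pages[i]) {
            stitched_free(b);
            return -1;
        }
    }
    return 0;
}

/* Copy len bytes to a logical offset, splitting at page boundaries. */
void stitched_write(struct stitched_buf *b, size_t offset,
                    const void *src, size_t len)
{
    const unsigned char *p = src;
    size_t total = b->nr_pages * SIM_PAGE_SIZE;

    while (len) {
        size_t pos = offset % total;               /* wrap like a ring */
        size_t page = pos / SIM_PAGE_SIZE;
        size_t in_page = pos % SIM_PAGE_SIZE;
        size_t chunk = SIM_PAGE_SIZE - in_page;

        if (chunk > len)
            chunk = len;
        memcpy(b->pages[page] + in_page, p, chunk);
        p += chunk;
        offset += chunk;
        len -= chunk;
    }
}
```

The cost is the extra index arithmetic on every access, which is why it only works for consumers that go through software; hardware that writes the buffer itself cannot follow the stitching.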

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 17:55 +0200, Stephane Eranian wrote: > > Ok, so you're saying there is no allocator that will give non-contiguous > physical memory WITHOUT requiring a page fault to populate the pte. > > On the other hand, with vmalloc_node() the pte are populated when > you first touch the

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 17:31 +0200, Stephane Eranian wrote: > On Mon, Sep 13, 2010 at 5:24 PM, Peter Zijlstra wrote: > > On Mon, 2010-09-13 at 17:20 +0200, Stephane Eranian wrote: > > > >> That is the case we the sizes you have chosen today. For DS, we > >> c

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 17:20 +0200, Stephane Eranian wrote: > That is the case with the sizes you have chosen today. For DS, we > could round up to one page for now. Markus chose the BTS size, for PEBS a single page was plenty since we do single event things (although we could do multiple for attr.p

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 17:13 +0200, Stephane Eranian wrote: > > For now I think you can not do this. vmalloc'ed memory can't be safely > > accessed from NMIs in x86 because that might fault. And faults from NMIs > > are not supported. They cause very bad things: return from fault calls > > iret whic

Re: [perfmon2] [PATCH] perf_events: improve DS/BTS/PEBS buffer allocation

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 16:55 +0200, Stephane Eranian wrote: > The DS, BTS, and PEBS memory regions were allocated using kzalloc(), i.e., > requesting contiguous physical memory. There is no such restriction on > DS, PEBS and BTS buffers. Using kzalloc() could lead to error in case > no contiguous ph

Re: [perfmon2] [PATCH] perf_events: fix NULL point in free_event()

2010-09-13 Thread Peter Zijlstra
On Mon, 2010-09-13 at 16:36 +0200, Stephane Eranian wrote: > Without the following patch, perf top as non-root and > paranoid cpu set causes a NULL pointer dereference in > free_event() because event->ctx is NULL. Yeah, it's already on its way.. Cyrill also reported this.

Re: [perfmon2] [PATCH percpu#for-next] percpu: clear memory allocated with the km allocator

2010-09-10 Thread Peter Zijlstra
On Fri, 2010-09-10 at 10:52 +0200, Tejun Heo wrote: > Percpu allocator should clear memory before returning it but the km > allocator forgot to do it. Fix it. > > Signed-off-by: Tejun Heo > Spotted-by: Peter Zijlstra (fwiw, -tip uses Reported-by) Acked-by: Peter Zijlstra >

Re: [perfmon2] [RFC PATCH 1/2] perf_events: add support for per-cpu per-cgroup monitoring (v3)

2010-09-10 Thread Peter Zijlstra
On Thu, 2010-09-09 at 23:41 +0200, Stephane Eranian wrote: > > alloc_percpu() is zalloc_percpu() in fact, memory is already cleared. > > > I remember thinking about this and trying to trace to the code down > to figure this out. But it is rather complicated. If alloc_percpu() always > clears the me

Re: [perfmon2] [RFC PATCH 1/2] perf_events: add support for per-cpu per-cgroup monitoring (v2)

2010-09-08 Thread Peter Zijlstra
On Wed, 2010-09-08 at 15:56 +0200, stephane eranian wrote: > On Wed, Sep 8, 2010 at 3:52 PM, Peter Zijlstra wrote: > > On Wed, 2010-09-08 at 15:30 +0200, Stephane Eranian wrote: > >> + } times[NR_CPUS] cacheline_aligned_in_smp; > > > > That's fail

Re: [perfmon2] [RFC PATCH 1/2] perf_events: add support for per-cpu per-cgroup monitoring (v2)

2010-09-08 Thread Peter Zijlstra
On Wed, 2010-09-08 at 15:30 +0200, Stephane Eranian wrote: > + } times[NR_CPUS] cacheline_aligned_in_smp; That's fail! NR_CPUS can be like 4k for distro configs. Use proper per-cpu allocations.
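
The objection is about footprint: an array sized by the compile-time maximum pays for every possible CPU, while a per-cpu allocation pays only for the CPUs that exist. The sketch below is a user-space analogy, not kernel code. It uses sysconf() where the kernel would use its per-cpu allocator, takes the 4096 figure from the "like 4k" remark, borrows the two u64 fields from the perf_cgroup_info structure quoted in the v7 review, and leaves out the cache-line alignment for simplicity.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_CPUS_SIM 4096   /* stand-in for a distro-sized NR_CPUS */

/* Per-CPU time bookkeeping, fields as in the quoted perf_cgroup_info. */
struct cpu_times {
    uint64_t time;
    uint64_t timestamp;
};

/* Bytes consumed by a statically sized times[NR_CPUS] array. */
size_t static_footprint(void)
{
    return sizeof(struct cpu_times) * MAX_CPUS_SIM;
}

/*
 * Allocate one zeroed slot per configured CPU instead.
 * Stores the CPU count in *nr_cpus_out; returns NULL on failure.
 */
struct cpu_times *alloc_times(long *nr_cpus_out)
{
    long n = sysconf(_SC_NPROCESSORS_CONF);

    if (n < 1)
        n = 1;
    *nr_cpus_out = n;
    return calloc((size_t)n, sizeof(struct cpu_times));
}
```

With these assumptions the static array costs 64 KiB per instance even on a machine with a handful of CPUs, and that cost is paid again for every cgroup.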

Re: [perfmon2] [BUG] perf_events: NMI watchdog event cannot be throttled

2010-08-19 Thread Peter Zijlstra
On Wed, 2010-08-18 at 22:26 +0200, Stephane Eranian wrote: > Hi, > > I ran into some issue with the NMI watchdog not firing in a deadlock > situation. After some debugging I found the source of the problem. > > The NMI watchdog is currently subject, like any other events, to interrupt > throttli

Re: [perfmon2] [RFC] perf/perf_events: misleading number of samples due to mmap()

2010-06-16 Thread Peter Zijlstra
On Wed, 2010-06-16 at 18:41 +0200, stephane eranian wrote: > On Wed, Jun 16, 2010 at 4:52 PM, Peter Zijlstra wrote: > > On Wed, 2010-06-16 at 16:40 +0200, Stephane Eranian wrote: > >> This leads me to another point. For per-thread sampling, why > >> do we need to rec

Re: [perfmon2] [RFC] perf/perf_events: misleading number of samples due to mmap()

2010-06-16 Thread Peter Zijlstra
On Wed, 2010-06-16 at 16:40 +0200, Stephane Eranian wrote: > The reason is that perf reports an estimate based on the > number of bytes written to the buffer divided by the minimal > sample size of 24 bytes. Right, we should change that based on the PERF_SAMPLE flags used. It will remain an estim
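
The estimate under discussion is a plain division of bytes written by an assumed sample size. A sketch of both variants follows: the fixed 24-byte minimum quoted above, and a version that takes a per-configuration sample size as the reply suggests. How that size would be derived from the PERF_SAMPLE flags is not shown in the preview, so it is left as a parameter; the function name is hypothetical.

```c
#include <stdint.h>

#define MIN_SAMPLE_SIZE 24   /* minimal sample size quoted in the thread */

/*
 * Estimate the number of samples from the bytes written to the buffer.
 * sample_size is the expected size of one sample for the flags in use;
 * anything below the minimum falls back to the minimum.
 */
uint64_t estimate_samples(uint64_t bytes_written, uint64_t sample_size)
{
    if (sample_size < MIN_SAMPLE_SIZE)
        sample_size = MIN_SAMPLE_SIZE;
    return bytes_written / sample_size;
}
```

Either way the result stays an estimate, because the buffer also carries records that are not samples, such as the mmap events the thread title refers to.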

Re: [perfmon2] [RFC] perf/perf_events: misleading number of samples due to mmap()

2010-06-16 Thread Peter Zijlstra
On Wed, 2010-06-16 at 16:40 +0200, Stephane Eranian wrote: > This leads me to another point. For per-thread sampling, why > do we need to record mmap() events happening *outside* of > the process? I can understand the exception of kernel modules. How does that happen? The per-thread events should

Re: [perfmon2] [PATCH] perf_events: fix event scheduling issues introduced by transactional API (take 2)

2010-05-25 Thread Peter Zijlstra
On Tue, 2010-05-25 at 18:10 +0200, Stephane Eranian wrote: > > Index: linux-2.6/kernel/perf_event.c > > === > > --- linux-2.6.orig/kernel/perf_event.c > > +++ linux-2.6/kernel/perf_event.c > > @@ -668,15 +668,9 @@ group_sched_in(struct

Re: [perfmon2] [PATCH] perf_events: fix event scheduling issues introduced by transactional API (take 2)

2010-05-25 Thread Peter Zijlstra
On Tue, 2010-05-25 at 17:32 +0200, Peter Zijlstra wrote: > > Because you always call cancel_txn() even when commit() > > succeeds. I don't really understand why. I think it could be > > avoided by clearing the group_flag in commit_txn() if it > > succeeds. It would al

Re: [perfmon2] [PATCH] perf_events: fix event scheduling issues introduced by transactional API (take 2)

2010-05-25 Thread Peter Zijlstra
On Tue, 2010-05-25 at 17:02 +0200, Stephane Eranian wrote: > Ok, the patch looks good except it needs: > > static int x86_pmu_commit_txn(const struct pmu *pmu) > { > .. > /* > * copy new assignment, now we know it is possible > * will be used by hw_perf_enable(

Re: [perfmon2] [PATCH] perf_events: fix event scheduling issues introduced by transactional API (take 2)

2010-05-25 Thread Peter Zijlstra
On Tue, 2010-05-25 at 15:39 +0200, stephane eranian wrote: > On Tue, May 25, 2010 at 3:35 PM, Peter Zijlstra wrote: > > On Tue, 2010-05-25 at 15:20 +0200, Stephane Eranian wrote: > > > >> With this patch, you can now overcommit the PMU even with pinned > >> sys

Re: [perfmon2] [PATCH] perf_events: fix event scheduling issues introduced by transactional API (take 2)

2010-05-25 Thread Peter Zijlstra
On Tue, 2010-05-25 at 15:20 +0200, Stephane Eranian wrote: > With this patch, you can now overcommit the PMU even with pinned > system-wide events present and still get valid counts. Does this patch differ from the one you sent earlier?

Re: [perfmon2] [PATCH] perf: fix cmpxchg warning in perf_event_amd.c

2010-05-18 Thread Peter Zijlstra
On Tue, 2010-05-18 at 16:43 -0400, Jason Baron wrote: > Hi, > > I'm getting the following warnings: > > In file included from arch/x86/kernel/cpu/perf_event.c:1343: > arch/x86/kernel/cpu/perf_event_amd.c: In function > ‘amd_put_event_constraints’: > arch/x86/kernel/cpu/perf_event_amd.c:167: warni

Re: [perfmon2] [RFC] perf: perf record sets inherit by default

2010-05-17 Thread Peter Zijlstra
ces. > > > In that case, don't you think you should also ensure that the buffer is > allocated on the NUMA node of the designated per-thread-per-cpu? > I don't think it is the case today. Yeah, something like the below ought to do I guess.. Almost-Signed-off-by: Pete

Re: [perfmon2] [PATCH] perf_events: fix errors path in perf_output_begin()

2010-05-17 Thread Peter Zijlstra
On Mon, 2010-05-17 at 14:04 +0200, Stephane Eranian wrote: > > > So you want to preserve this state for when you munmap() and mmap() > > again? The only user of data->lost is writing the PERF_RECORD_LOST > > event, which only ever happens when you have pages, so counting it when > > there's no pag

Re: [perfmon2] [PATCH] perf_events: fix errors path in perf_output_begin()

2010-05-17 Thread Peter Zijlstra
On Mon, 2010-05-17 at 13:56 +0200, Stephane Eranian wrote: > >> + if (!data->nr_pages) { > >> + atomic_inc(&data->lost); > >> + goto out; > >> + } > Well, nr_pages = 0 means all you have is the sampling buffer > header page. You cannot save any sample, so you actu

Re: [perfmon2] [RFC] perf: perf record sets inherit by default

2010-05-11 Thread Peter Zijlstra
On Tue, 2010-05-11 at 12:50 -0300, Arnaldo Carvalho de Melo wrote: > Nope, see above. Ah, -p/-t might make sense to default to inherit off indeed.

Re: [perfmon2] [RFC] perf: perf record sets inherit by default

2010-05-11 Thread Peter Zijlstra
On Tue, 2010-05-11 at 12:00 -0300, Arnaldo Carvalho de Melo wrote: > > Humm, since for -C and -a using -i doesn't make sense, I guess it should > be off by default and only be auto-activated if we don't specify any > option, i.e. when using it like: > > perf record ./hackbench > > What do you th

Re: [perfmon2] [RFC] perf: perf record sets inherit by default

2010-05-11 Thread Peter Zijlstra
On Tue, 2010-05-11 at 16:04 +0200, Stephane Eranian wrote: > Hi, > > > I am confused by the inheritance cmd line option of perf record: > > $ perf record -h > usage: perf record [<options>] [<command>] > or: perf record [<options>] -- <command> [<options>] > > -e, --event <event> event selector. use 'perf list' to list > available eve

Re: [perfmon2] [BUG] perf_event: crash in perf_output_begin

2010-05-11 Thread Peter Zijlstra
On Mon, 2010-05-10 at 22:03 +0200, Stephane Eranian wrote: > On Mon, May 10, 2010 at 5:52 PM, Peter Zijlstra wrote: > > On Mon, 2010-05-10 at 17:33 +0200, Stephane Eranian wrote: > >> Hi, > >> > >> While testing 2.6.34-rc7 I ran into the following issue when

Re: [perfmon2] [BUG] perf_event: crash in perf_output_begin

2010-05-10 Thread Peter Zijlstra
On Mon, 2010-05-10 at 17:33 +0200, Stephane Eranian wrote: > Hi, > > While testing 2.6.34-rc7 I ran into the following issue when > using BTS sampling on Intel Core. It seems like something > is not terminated properly. I am sampling BTS per-thread > on a test program, then hit CTRL-C, one second

Re: [perfmon2] [PATCH] perf_events: fix bogus warn_on(_once) in perf_prepare_sample()

2010-04-08 Thread Peter Zijlstra
On Thu, 2010-04-08 at 23:29 +0200, Stephane Eranian wrote: > On Thu, Apr 8, 2010 at 11:15 PM, Peter Zijlstra wrote: > > On Thu, 2010-04-08 at 23:08 +0200, Stephane Eranian wrote: > >> On Thu, Apr 8, 2010 at 10:55 PM, Peter Zijlstra > >> wrote: > >> > On T

Re: [perfmon2] [PATCH] perf_events: fix bogus warn_on(_once) in perf_prepare_sample()

2010-04-08 Thread Peter Zijlstra
On Thu, 2010-04-08 at 23:18 +0200, Stephane Eranian wrote: > I am not sure I understand what you mean by buffered PEBS. Are you talking > about using PEBS buffer bigger than one entry? Yep. > If so, how can you do: > - the LBR based fixups for multiple samples (on PMU interrupt) Not, so it wou

Re: [perfmon2] [PATCH] perf_events: fix bogus warn_on(_once) in perf_prepare_sample()

2010-04-08 Thread Peter Zijlstra
On Thu, 2010-04-08 at 23:14 +0200, Stephane Eranian wrote: > On Thu, Apr 8, 2010 at 11:11 PM, Peter Zijlstra wrote: > > On Thu, 2010-04-08 at 23:08 +0200, Stephane Eranian wrote: > >> > >> Are you suggesting you add some padding the PEBS raw sample you > >>

Re: [perfmon2] [PATCH] perf_events: fix bogus warn_on(_once) in perf_prepare_sample()

2010-04-08 Thread Peter Zijlstra
On Thu, 2010-04-08 at 23:08 +0200, Stephane Eranian wrote: > On Thu, Apr 8, 2010 at 10:55 PM, Peter Zijlstra wrote: > > On Thu, 2010-04-08 at 22:45 +0200, Stephane Eranian wrote: > >> There is a warn_on_once() check for PERF_SAMPLE_RAW which trips > >> when

Re: [perfmon2] [PATCH] perf_events: fix bogus warn_on(_once) in perf_prepare_sample()

2010-04-08 Thread Peter Zijlstra
On Thu, 2010-04-08 at 23:08 +0200, Stephane Eranian wrote: > > > There's various things that do indeed rely on the perf buffer to always > > be u64 aligned, so this warning isn't bogus at all. > > > I assume this has to do with the wrap-around detection. Nah, mostly dealing with architectures tha

Re: [perfmon2] [PATCH] perf_events: fix bogus warn_on(_once) in perf_prepare_sample()

2010-04-08 Thread Peter Zijlstra
On Thu, 2010-04-08 at 23:08 +0200, Stephane Eranian wrote: > > Are you suggesting you add some padding to the PEBS raw sample you > return as PERF_SAMPLE_RAW? Then you need to define what RAW > actually means? Seems here, it would mean more than what the > HW returns. Well, RAW doesn't mean anythin

Re: [perfmon2] [PATCH] perf_events: fix bogus warn_on(_once) in perf_prepare_sample()

2010-04-08 Thread Peter Zijlstra
On Thu, 2010-04-08 at 22:55 +0200, Peter Zijlstra wrote: > On Thu, 2010-04-08 at 22:45 +0200, Stephane Eranian wrote: > > There is a warn_on_once() check for PERF_SAMPLE_RAW which trips > > when using PEBS on both Core and Nehalem. Core PEBS sample size is 144 > >

Re: [perfmon2] [PATCH] perf_events: fix bogus warn_on(_once) in perf_prepare_sample()

2010-04-08 Thread Peter Zijlstra
On Thu, 2010-04-08 at 22:45 +0200, Stephane Eranian wrote: > There is a warn_on_once() check for PERF_SAMPLE_RAW which trips > when using PEBS on both Core and Nehalem. Core PEBS sample size is 144 > bytes and 176 bytes for Nehalem. Both are multiples of 8, but the size > fi
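
The alignment problem described above can be illustrated with a small sketch. It assumes the raw payload is preceded by a 32-bit size field, as in the PERF_SAMPLE_RAW record layout ({ u32 size; char data[size]; }); the helper names are hypothetical and this is not the kernel's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes occupied by a raw record, i.e. the
 * assumed u32 size header followed by the payload. */
static size_t raw_record_size(size_t payload)
{
    return sizeof(uint32_t) + payload;
}

/* True when size is a multiple of sizeof(u64). */
static int is_u64_aligned(size_t size)
{
    return (size & (sizeof(uint64_t) - 1)) == 0;
}

/* Hypothetical helper: payload length after padding so that
 * header + payload ends on a u64 boundary. */
static size_t padded_payload(size_t payload)
{
    size_t total = raw_record_size(payload);
    size_t aligned = (total + 7) & ~(size_t)7;

    return aligned - sizeof(uint32_t);
}
```

Under that assumption, a 144-byte Core PEBS record occupies 148 bytes and a 176-byte Nehalem record occupies 180 bytes, neither of which is u64 aligned, even though the payloads themselves are multiples of 8. Padding each payload by 4 bytes would restore alignment.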

Re: [perfmon2] [PATCH] perf_events: add PERF_SAMPLE_BRANCH_STACK

2010-04-07 Thread Peter Zijlstra
On Wed, 2010-04-07 at 18:48 +0200, Stephane Eranian wrote: > Then, why didn't you extend perf to leverage your patch? > Because I couldn't come up with a sensible use case.

Re: [perfmon2] [PATCH] perf_events: add PERF_SAMPLE_BRANCH_STACK

2010-04-07 Thread Peter Zijlstra
On Wed, 2010-04-07 at 14:45 +0200, Stephane Eranian wrote: > LBR is configured by default to record ALL taken branches. On some > processors, it is possible to filter the type of branches. This will > be supported in a subsequent patch. > > On other processors, the

Re: [perfmon2] [PATCH] perf_events: fix remapped count support

2010-03-23 Thread Peter Zijlstra
On Tue, 2010-03-23 at 18:25 +0200, Stephane Eranian wrote: > This patch fixes the remapped counter support such that it now works > on X86 processors. (could you please not add all this whitespace in front? and make sure it's no wider than 70 chars) Also, I wouldn't say it fixes

Re: [perfmon2] [PATCH] perf_events: fix bug in AMD per-cpu initialization

2010-03-23 Thread Peter Zijlstra
On Tue, 2010-03-23 at 16:12 +0100, Stephane Eranian wrote: > On Tue, Mar 23, 2010 at 4:07 PM, Peter Zijlstra wrote: > > On Tue, 2010-03-23 at 15:55 +0100, Stephane Eranian wrote: > >> What's the point of CPU_ONLINE vs. CPU_STARTING if you're saying the > >>

Re: [perfmon2] [PATCH] perf_events: fix bug in AMD per-cpu initialization

2010-03-23 Thread Peter Zijlstra
On Tue, 2010-03-23 at 15:55 +0100, Stephane Eranian wrote: > What's the point of CPU_ONLINE vs. CPU_STARTING if you're saying the > former is never right? Why not move CPU_ONLINE to the right place and > drop CPU_STARTING? It's right for a lot of things, just not for perf, we need to be ready and d

Re: [perfmon2] [PATCH] perf_events: fix bug in AMD per-cpu initialization

2010-03-23 Thread Peter Zijlstra
On Thu, 2010-03-18 at 01:33 +0100, Stephane Eranian wrote: > On Thu, Mar 18, 2010 at 12:47 AM, Peter Zijlstra wrote: > > On Wed, 2010-03-17 at 10:40 +0200, Stephane Eranian wrote: > >> On AMD processors, we need to allocate a data structure per > >> North

Re: [perfmon2] [PATCH] perf_events: fix ordering bug in perf_output_sample()

2010-03-18 Thread Peter Zijlstra
On Thu, 2010-03-18 at 22:29 +0100, Stephane Eranian wrote: > On Thu, Mar 18, 2010 at 7:33 PM, Peter Zijlstra wrote: > > On Thu, 2010-03-18 at 14:42 +0200, Stephane Eranian wrote: > >> In order to parse a sample correctly based on the information > >> req

Re: [perfmon2] [PATCH] perf_events: fix ordering bug in perf_output_sample()

2010-03-18 Thread Peter Zijlstra
On Thu, 2010-03-18 at 14:42 +0200, Stephane Eranian wrote: > In order to parse a sample correctly based on the information > requested via sample_type, the kernel needs to save each component > in a known order. There is no type value saved with each component. > The current
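
Why the ordering matters can be shown with a small parser sketch. It assumes every requested component occupies exactly one 64-bit word and is written in ascending bit order of sample_type; the S_* flag names and field_offset() are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical flags standing in for sample_type bits. */
enum {
    S_IP   = 1u << 0,
    S_TID  = 1u << 1,
    S_TIME = 1u << 2,
    S_ADDR = 1u << 3,
};

/* Return the offset (in u64 words) of the component `want` inside a
 * sample body, or -1 if it was not requested. Since no type tag is
 * stored with each component, the offset can only be derived from
 * sample_type and the agreed ordering. */
static int field_offset(uint64_t sample_type, uint64_t want)
{
    int off = 0;

    for (uint64_t bit = S_IP; bit <= S_ADDR; bit <<= 1) {
        if (!(sample_type & bit))
            continue;        /* component absent, takes no space */
        if (bit == want)
            return off;
        off++;
    }
    return -1;
}
```

If the writer emitted components in any other order, a reader using this scheme would silently pick up the wrong word, since nothing in the record marks which value is which.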

Re: [perfmon2] [PATCH] perf_events: fix bug in AMD per-cpu initialization

2010-03-17 Thread Peter Zijlstra
On Wed, 2010-03-17 at 10:40 +0200, Stephane Eranian wrote: > On AMD processors, we need to allocate a data structure per Northbridge > to handle certain events. > > On CPU initialization, we need to query the Northbridge id and check > whether the structure is already alloc

Re: [perfmon2] [PATCH] perf_events: remove bogus Intel Core constraint on ITLB_MISS_RETIRED

2010-03-11 Thread Peter Zijlstra
On Thu, 2010-03-11 at 12:59 -0800, eran...@google.com wrote: > Contrary to what Vol3b section 30.4.3 leads one to believe, there is > no constraint on ITLB_MISS_RETIRED on Intel Core-based CPUs, so > remove it. Is that from Intel, and will they clarify the text in the next version of the document? > S

Re: [perfmon2] [PATCH] perf_events: fix X86 bogus counts when multiplexing

2010-03-11 Thread Peter Zijlstra
On Thu, 2010-03-11 at 09:32 +0100, Peter Zijlstra wrote: > On Wed, 2010-03-10 at 22:17 -0800, eran...@google.com wrote: > > This patch fixes a bug in 2.6.33 X86 event scheduling whereby > > all counts are bogus as soon as events need to be multiplexed > > because the

Re: [perfmon2] [PATCH] perf_events: fix X86 bogus counts when multiplexing

2010-03-11 Thread Peter Zijlstra
On Wed, 2010-03-10 at 22:17 -0800, eran...@google.com wrote: > This patch fixes a bug in 2.6.33 X86 event scheduling whereby > all counts are bogus as soon as events need to be multiplexed > because the PMU is overcommitted. > > The code in hw_perf_enable() was causing multiplexed events > to accu

Re: [perfmon2] [PATCH] perf_events: improve task_sched_in()

2010-03-11 Thread Peter Zijlstra
e scheduling flexible > events > and we go through hw_perf_enable() twice. By encapsulating the whole sequence > into perf_disable()/perf_enable(), we ensure hw_perf_enable() is going to be > invoked only once because of the refcount protection. Agreed, this makes perfect sense. Acked-by: P
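
The refcount protection mentioned above can be sketched as a nesting counter. This is a simplified model, not the kernel implementation: pmu_disable(), pmu_enable() and the counters are hypothetical names, and the hardware re-enable is represented by a call counter.

```c
/* Hypothetical model of nested disable/enable with a refcount. */
static int disable_count;    /* current nesting depth */
static int hw_enable_calls;  /* times the hardware enable path ran */

static void pmu_disable(void)
{
    disable_count++;
}

static void pmu_enable(void)
{
    /* Only the outermost enable reaches the hardware. */
    if (--disable_count == 0)
        hw_enable_calls++;
}

/* Two scheduling steps, each with its own disable/enable pair. */
static int sched_in_unwrapped(void)
{
    hw_enable_calls = 0;
    pmu_disable(); pmu_enable();   /* first group of events */
    pmu_disable(); pmu_enable();   /* second group of events */
    return hw_enable_calls;
}

/* The same two steps wrapped in one outer disable/enable pair. */
static int sched_in_wrapped(void)
{
    hw_enable_calls = 0;
    pmu_disable();
    pmu_disable(); pmu_enable();   /* first group of events */
    pmu_disable(); pmu_enable();   /* second group of events */
    pmu_enable();
    return hw_enable_calls;
}
```

In this model the unwrapped sequence reaches the hardware enable path twice, while the wrapped sequence reaches it once, when the outermost enable drops the count to zero.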

Re: [perfmon2] [PATCH] perf_events: add sampling period randomization support

2010-03-02 Thread Peter Zijlstra
On Tue, 2010-03-02 at 11:53 +0100, Robert Richter wrote: > > Only adding the random value will lead to longer sample periods on > average. To compensate for this you could calculate something like: > > event->hw.sample_period = event->attr.sample_period + (new_seed & > mask) - (mask >> 1);
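
The compensation in the quoted formula can be tried out in isolation. The sketch below assumes mask has the form 2^n - 1 and that the base period is larger than mask >> 1, so the unsigned subtraction cannot wrap; randomized_period() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/* Hypothetical helper: add a random offset in [0, mask] and subtract
 * half the mask, so the result is spread around `base` instead of
 * always being at least `base`. */
static uint64_t randomized_period(uint64_t base, uint64_t rnd, uint64_t mask)
{
    return base + (rnd & mask) - (mask >> 1);
}
```

With base = 1000 and mask = 0xff, the result ranges from 873 (random bits all zero) to 1128 (random bits all one), so the average stays close to the requested period rather than drifting upward by roughly mask / 2.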

Re: [perfmon2] [PATCH] perf_events: add sampling period randomization support

2010-03-02 Thread Peter Zijlstra
On Mon, 2010-03-01 at 22:07 -0800, eran...@google.com wrote: > This patch adds support for randomizing the sampling period. > Randomization is very useful to mitigate the bias that exists > with sampling. The random number generator does not need to > be sophisticated. This patch uses the builtin r
