Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
On Tue, Sep 11, 2018 at 01:29:32PM +, Song Liu wrote: SNIP > >>> > >>> jirka > >> > >> I am not sure I am following. The pmu is disabled when we call > >> event_pmu_add(). Why do we need to read before calling pmu->add()? > >> And this is the first added dup event for this master, so we don't > >> need to worry about others. > >> > >> Does this make sense? > > > > I was just thinking since the pmu is disable we could > > we don't need to read the event on 2 places.. it's almost > > identic code > > How about something like: > > > +/* PMU sharing aware version of event->pmu->add() */ > +static int event_pmu_add(struct perf_event *event, > + struct perf_event_context *ctx) > +{ > + struct perf_event_dup *dup; > + int ret; > + > + /* no sharing, just do event->pmu->add() */ > + if (event->dup_id == -1) > + return event->pmu->add(event, PERF_EF_START); > + > + dup = >dup_events[event->dup_id]; > + > + if (dup->active_event_count = 0) { > + /* try add master */ > + ret = event->pmu->add(dup->master, PERF_EF_START); > + if (ret) > + return ret; > + } > + > + dup->active_event_count++; > + event->pmu->read(dup->master); > + event->dup_base_count = dup_read_count(dup); > + event->dup_base_child_count = dup_read_child_count(dup); > + > + return 0; > +} yep, seems ok jirka
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
On Tue, Sep 11, 2018 at 01:29:32PM +, Song Liu wrote: SNIP > >>> > >>> jirka > >> > >> I am not sure I am following. The pmu is disabled when we call > >> event_pmu_add(). Why do we need to read before calling pmu->add()? > >> And this is the first added dup event for this master, so we don't > >> need to worry about others. > >> > >> Does this make sense? > > > > I was just thinking since the pmu is disable we could > > we don't need to read the event on 2 places.. it's almost > > identic code > > How about something like: > > > +/* PMU sharing aware version of event->pmu->add() */ > +static int event_pmu_add(struct perf_event *event, > + struct perf_event_context *ctx) > +{ > + struct perf_event_dup *dup; > + int ret; > + > + /* no sharing, just do event->pmu->add() */ > + if (event->dup_id == -1) > + return event->pmu->add(event, PERF_EF_START); > + > + dup = >dup_events[event->dup_id]; > + > + if (dup->active_event_count = 0) { > + /* try add master */ > + ret = event->pmu->add(dup->master, PERF_EF_START); > + if (ret) > + return ret; > + } > + > + dup->active_event_count++; > + event->pmu->read(dup->master); > + event->dup_base_count = dup_read_count(dup); > + event->dup_base_child_count = dup_read_child_count(dup); > + > + return 0; > +} yep, seems ok jirka
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
> On Sep 10, 2018, at 1:15 AM, Jiri Olsa wrote: > > On Thu, Aug 30, 2018 at 06:51:07PM +, Song Liu wrote: >> >> >>> On Aug 30, 2018, at 8:18 AM, Jiri Olsa wrote: >>> >>> On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: >>> >>> SNIP >>> @@ -6100,7 +6333,7 @@ static void perf_output_read_group(struct perf_output_handle *handle, if ((sub != event) && (sub->state == PERF_EVENT_STATE_ACTIVE)) - sub->pmu->read(sub); + event_pmu_read(sub); values[n++] = perf_event_count(sub); if (read_format & PERF_FORMAT_ID) @@ -9109,7 +9342,7 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer) if (event->state != PERF_EVENT_STATE_ACTIVE) return HRTIMER_NORESTART; - event->pmu->read(event); + event_pmu_read(event); perf_sample_data_init(, 0, event->hw.last_period); regs = get_irq_regs(); @@ -10504,6 +10737,14 @@ SYSCALL_DEFINE5(perf_event_open, goto err_cred; } + if (perf_event_can_share(event)) { + event->tmp_master = perf_event_alloc(>attr, cpu, + task, NULL, NULL, + NULL, NULL, -1); >>> >>> can't get around this.. I understand the need, but AFAICS you allocate >>> the whole 'struct perf_event', just because there's count field in it >>> otherwise the 'struct hw_perf_event' should be enough to carry all that's >>> needed to read hw event >>> >>> would it be better to move the count to 'struct hw_perf_event' and use >>> that instead? assuming I'm not missing anything.. >>> >>> jirka >> >> I am trying to make the master event function the same as a real event, >> while keep dup events as followers. This avoids "switching master" in >> earlier versions (and Tejun's RFC). > > yep, I understand.. still, it seems too much to allocate > the whole 'struct perf_even't just to get separated 'count' > variable In theory, we only need separated counters. However, in practice, there are other variables we need to handle for a switch_master operation. For example, we need make sure event->state is always set properly. So this optimization is not easy to implement. How about we optimize it after this patch gets in? Thanks, Song
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
> On Sep 10, 2018, at 1:15 AM, Jiri Olsa wrote: > > On Thu, Aug 30, 2018 at 06:51:07PM +, Song Liu wrote: >> >> >>> On Aug 30, 2018, at 8:18 AM, Jiri Olsa wrote: >>> >>> On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: >>> >>> SNIP >>> @@ -6100,7 +6333,7 @@ static void perf_output_read_group(struct perf_output_handle *handle, if ((sub != event) && (sub->state == PERF_EVENT_STATE_ACTIVE)) - sub->pmu->read(sub); + event_pmu_read(sub); values[n++] = perf_event_count(sub); if (read_format & PERF_FORMAT_ID) @@ -9109,7 +9342,7 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer) if (event->state != PERF_EVENT_STATE_ACTIVE) return HRTIMER_NORESTART; - event->pmu->read(event); + event_pmu_read(event); perf_sample_data_init(, 0, event->hw.last_period); regs = get_irq_regs(); @@ -10504,6 +10737,14 @@ SYSCALL_DEFINE5(perf_event_open, goto err_cred; } + if (perf_event_can_share(event)) { + event->tmp_master = perf_event_alloc(>attr, cpu, + task, NULL, NULL, + NULL, NULL, -1); >>> >>> can't get around this.. I understand the need, but AFAICS you allocate >>> the whole 'struct perf_event', just because there's count field in it >>> otherwise the 'struct hw_perf_event' should be enough to carry all that's >>> needed to read hw event >>> >>> would it be better to move the count to 'struct hw_perf_event' and use >>> that instead? assuming I'm not missing anything.. >>> >>> jirka >> >> I am trying to make the master event function the same as a real event, >> while keep dup events as followers. This avoids "switching master" in >> earlier versions (and Tejun's RFC). > > yep, I understand.. still, it seems too much to allocate > the whole 'struct perf_even't just to get separated 'count' > variable In theory, we only need separated counters. However, in practice, there are other variables we need to handle for a switch_master operation. For example, we need make sure event->state is always set properly. So this optimization is not easy to implement. How about we optimize it after this patch gets in? Thanks, Song
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
> On Sep 10, 2018, at 1:13 AM, Jiri Olsa wrote: > > On Thu, Aug 30, 2018 at 06:35:37PM +, Song Liu wrote: >> >> >>> On Aug 30, 2018, at 8:13 AM, Jiri Olsa wrote: >>> >>> On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: >>> >>> SNIP >>> + perf_event_remove_dup(event, ctx); /* * We can have double detach due to exit/hot-unplug + close. */ @@ -1982,6 +2123,92 @@ event_filter_match(struct perf_event *event) perf_cgroup_match(event) && pmu_filter_match(event); } +/* PMU sharing aware version of event->pmu->add() */ +static int event_pmu_add(struct perf_event *event, + struct perf_event_context *ctx) +{ + struct perf_event_dup *dup; + int ret; + + /* no sharing, just do event->pmu->add() */ + if (event->dup_id == -1) + return event->pmu->add(event, PERF_EF_START); + + dup = >dup_events[event->dup_id]; + + if (dup->active_event_count) { + /* already enabled */ + dup->active_event_count++; + dup->master->pmu->read(dup->master); + event->dup_base_count = dup_read_count(dup); + event->dup_base_child_count = dup_read_child_count(dup); + return 0; + } + + /* try add master */ + ret = event->pmu->add(dup->master, PERF_EF_START); + + if (!ret) { + dup->active_event_count = 1; + event->pmu->read(dup->master); + event->dup_base_count = dup_read_count(dup); + event->dup_base_child_count = dup_read_child_count(dup); >>> >>> should you read the base before calling pmu->add ? >>> should be same for any dup event not just master >>> >>> jirka >> >> I am not sure I am following. The pmu is disabled when we call >> event_pmu_add(). Why do we need to read before calling pmu->add()? >> And this is the first added dup event for this master, so we don't >> need to worry about others. >> >> Does this make sense? > > I was just thinking since the pmu is disable we could > we don't need to read the event on 2 places.. it's almost > identic code How about something like: +/* PMU sharing aware version of event->pmu->add() */ +static int event_pmu_add(struct perf_event *event, +struct perf_event_context *ctx) +{ + struct perf_event_dup *dup; + int ret; + + /* no sharing, just do event->pmu->add() */ + if (event->dup_id == -1) + return event->pmu->add(event, PERF_EF_START); + + dup = >dup_events[event->dup_id]; + + if (dup->active_event_count = 0) { + /* try add master */ + ret = event->pmu->add(dup->master, PERF_EF_START); + if (ret) + return ret; + } + + dup->active_event_count++; + event->pmu->read(dup->master); + event->dup_base_count = dup_read_count(dup); + event->dup_base_child_count = dup_read_child_count(dup); + + return 0; +}
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
> On Sep 10, 2018, at 1:13 AM, Jiri Olsa wrote: > > On Thu, Aug 30, 2018 at 06:35:37PM +, Song Liu wrote: >> >> >>> On Aug 30, 2018, at 8:13 AM, Jiri Olsa wrote: >>> >>> On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: >>> >>> SNIP >>> + perf_event_remove_dup(event, ctx); /* * We can have double detach due to exit/hot-unplug + close. */ @@ -1982,6 +2123,92 @@ event_filter_match(struct perf_event *event) perf_cgroup_match(event) && pmu_filter_match(event); } +/* PMU sharing aware version of event->pmu->add() */ +static int event_pmu_add(struct perf_event *event, + struct perf_event_context *ctx) +{ + struct perf_event_dup *dup; + int ret; + + /* no sharing, just do event->pmu->add() */ + if (event->dup_id == -1) + return event->pmu->add(event, PERF_EF_START); + + dup = >dup_events[event->dup_id]; + + if (dup->active_event_count) { + /* already enabled */ + dup->active_event_count++; + dup->master->pmu->read(dup->master); + event->dup_base_count = dup_read_count(dup); + event->dup_base_child_count = dup_read_child_count(dup); + return 0; + } + + /* try add master */ + ret = event->pmu->add(dup->master, PERF_EF_START); + + if (!ret) { + dup->active_event_count = 1; + event->pmu->read(dup->master); + event->dup_base_count = dup_read_count(dup); + event->dup_base_child_count = dup_read_child_count(dup); >>> >>> should you read the base before calling pmu->add ? >>> should be same for any dup event not just master >>> >>> jirka >> >> I am not sure I am following. The pmu is disabled when we call >> event_pmu_add(). Why do we need to read before calling pmu->add()? >> And this is the first added dup event for this master, so we don't >> need to worry about others. >> >> Does this make sense? > > I was just thinking since the pmu is disable we could > we don't need to read the event on 2 places.. it's almost > identic code How about something like: +/* PMU sharing aware version of event->pmu->add() */ +static int event_pmu_add(struct perf_event *event, +struct perf_event_context *ctx) +{ + struct perf_event_dup *dup; + int ret; + + /* no sharing, just do event->pmu->add() */ + if (event->dup_id == -1) + return event->pmu->add(event, PERF_EF_START); + + dup = >dup_events[event->dup_id]; + + if (dup->active_event_count = 0) { + /* try add master */ + ret = event->pmu->add(dup->master, PERF_EF_START); + if (ret) + return ret; + } + + dup->active_event_count++; + event->pmu->read(dup->master); + event->dup_base_count = dup_read_count(dup); + event->dup_base_child_count = dup_read_child_count(dup); + + return 0; +}
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
On Thu, Aug 30, 2018 at 06:51:07PM +, Song Liu wrote: > > > > On Aug 30, 2018, at 8:18 AM, Jiri Olsa wrote: > > > > On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: > > > > SNIP > > > >> @@ -6100,7 +6333,7 @@ static void perf_output_read_group(struct > >> perf_output_handle *handle, > >> > >>if ((sub != event) && > >>(sub->state == PERF_EVENT_STATE_ACTIVE)) > >> - sub->pmu->read(sub); > >> + event_pmu_read(sub); > >> > >>values[n++] = perf_event_count(sub); > >>if (read_format & PERF_FORMAT_ID) > >> @@ -9109,7 +9342,7 @@ static enum hrtimer_restart > >> perf_swevent_hrtimer(struct hrtimer *hrtimer) > >>if (event->state != PERF_EVENT_STATE_ACTIVE) > >>return HRTIMER_NORESTART; > >> > >> - event->pmu->read(event); > >> + event_pmu_read(event); > >> > >>perf_sample_data_init(, 0, event->hw.last_period); > >>regs = get_irq_regs(); > >> @@ -10504,6 +10737,14 @@ SYSCALL_DEFINE5(perf_event_open, > >>goto err_cred; > >>} > >> > >> + if (perf_event_can_share(event)) { > >> + event->tmp_master = perf_event_alloc(>attr, cpu, > >> + task, NULL, NULL, > >> + NULL, NULL, -1); > > > > can't get around this.. I understand the need, but AFAICS you allocate > > the whole 'struct perf_event', just because there's count field in it > > otherwise the 'struct hw_perf_event' should be enough to carry all that's > > needed to read hw event > > > > would it be better to move the count to 'struct hw_perf_event' and use > > that instead? assuming I'm not missing anything.. > > > > jirka > > I am trying to make the master event function the same as a real event, > while keep dup events as followers. This avoids "switching master" in > earlier versions (and Tejun's RFC). yep, I understand.. still, it seems too much to allocate the whole 'struct perf_even't just to get separated 'count' variable jirka
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
On Thu, Aug 30, 2018 at 06:51:07PM +, Song Liu wrote: > > > > On Aug 30, 2018, at 8:18 AM, Jiri Olsa wrote: > > > > On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: > > > > SNIP > > > >> @@ -6100,7 +6333,7 @@ static void perf_output_read_group(struct > >> perf_output_handle *handle, > >> > >>if ((sub != event) && > >>(sub->state == PERF_EVENT_STATE_ACTIVE)) > >> - sub->pmu->read(sub); > >> + event_pmu_read(sub); > >> > >>values[n++] = perf_event_count(sub); > >>if (read_format & PERF_FORMAT_ID) > >> @@ -9109,7 +9342,7 @@ static enum hrtimer_restart > >> perf_swevent_hrtimer(struct hrtimer *hrtimer) > >>if (event->state != PERF_EVENT_STATE_ACTIVE) > >>return HRTIMER_NORESTART; > >> > >> - event->pmu->read(event); > >> + event_pmu_read(event); > >> > >>perf_sample_data_init(, 0, event->hw.last_period); > >>regs = get_irq_regs(); > >> @@ -10504,6 +10737,14 @@ SYSCALL_DEFINE5(perf_event_open, > >>goto err_cred; > >>} > >> > >> + if (perf_event_can_share(event)) { > >> + event->tmp_master = perf_event_alloc(>attr, cpu, > >> + task, NULL, NULL, > >> + NULL, NULL, -1); > > > > can't get around this.. I understand the need, but AFAICS you allocate > > the whole 'struct perf_event', just because there's count field in it > > otherwise the 'struct hw_perf_event' should be enough to carry all that's > > needed to read hw event > > > > would it be better to move the count to 'struct hw_perf_event' and use > > that instead? assuming I'm not missing anything.. > > > > jirka > > I am trying to make the master event function the same as a real event, > while keep dup events as followers. This avoids "switching master" in > earlier versions (and Tejun's RFC). yep, I understand.. still, it seems too much to allocate the whole 'struct perf_even't just to get separated 'count' variable jirka
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
On Thu, Aug 30, 2018 at 06:35:37PM +, Song Liu wrote: > > > > On Aug 30, 2018, at 8:13 AM, Jiri Olsa wrote: > > > > On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: > > > > SNIP > > > >> > >> + perf_event_remove_dup(event, ctx); > >>/* > >> * We can have double detach due to exit/hot-unplug + close. > >> */ > >> @@ -1982,6 +2123,92 @@ event_filter_match(struct perf_event *event) > >> perf_cgroup_match(event) && pmu_filter_match(event); > >> } > >> > >> +/* PMU sharing aware version of event->pmu->add() */ > >> +static int event_pmu_add(struct perf_event *event, > >> + struct perf_event_context *ctx) > >> +{ > >> + struct perf_event_dup *dup; > >> + int ret; > >> + > >> + /* no sharing, just do event->pmu->add() */ > >> + if (event->dup_id == -1) > >> + return event->pmu->add(event, PERF_EF_START); > >> + > >> + dup = >dup_events[event->dup_id]; > >> + > >> + if (dup->active_event_count) { > >> + /* already enabled */ > >> + dup->active_event_count++; > >> + dup->master->pmu->read(dup->master); > >> + event->dup_base_count = dup_read_count(dup); > >> + event->dup_base_child_count = dup_read_child_count(dup); > >> + return 0; > >> + } > >> + > >> + /* try add master */ > >> + ret = event->pmu->add(dup->master, PERF_EF_START); > >> + > >> + if (!ret) { > >> + dup->active_event_count = 1; > >> + event->pmu->read(dup->master); > >> + event->dup_base_count = dup_read_count(dup); > >> + event->dup_base_child_count = dup_read_child_count(dup); > > > > should you read the base before calling pmu->add ? > > should be same for any dup event not just master > > > > jirka > > I am not sure I am following. The pmu is disabled when we call > event_pmu_add(). Why do we need to read before calling pmu->add()? > And this is the first added dup event for this master, so we don't > need to worry about others. > > Does this make sense? I was just thinking since the pmu is disable we could we don't need to read the event on 2 places.. it's almost identic code jirka
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
On Thu, Aug 30, 2018 at 06:35:37PM +, Song Liu wrote: > > > > On Aug 30, 2018, at 8:13 AM, Jiri Olsa wrote: > > > > On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: > > > > SNIP > > > >> > >> + perf_event_remove_dup(event, ctx); > >>/* > >> * We can have double detach due to exit/hot-unplug + close. > >> */ > >> @@ -1982,6 +2123,92 @@ event_filter_match(struct perf_event *event) > >> perf_cgroup_match(event) && pmu_filter_match(event); > >> } > >> > >> +/* PMU sharing aware version of event->pmu->add() */ > >> +static int event_pmu_add(struct perf_event *event, > >> + struct perf_event_context *ctx) > >> +{ > >> + struct perf_event_dup *dup; > >> + int ret; > >> + > >> + /* no sharing, just do event->pmu->add() */ > >> + if (event->dup_id == -1) > >> + return event->pmu->add(event, PERF_EF_START); > >> + > >> + dup = >dup_events[event->dup_id]; > >> + > >> + if (dup->active_event_count) { > >> + /* already enabled */ > >> + dup->active_event_count++; > >> + dup->master->pmu->read(dup->master); > >> + event->dup_base_count = dup_read_count(dup); > >> + event->dup_base_child_count = dup_read_child_count(dup); > >> + return 0; > >> + } > >> + > >> + /* try add master */ > >> + ret = event->pmu->add(dup->master, PERF_EF_START); > >> + > >> + if (!ret) { > >> + dup->active_event_count = 1; > >> + event->pmu->read(dup->master); > >> + event->dup_base_count = dup_read_count(dup); > >> + event->dup_base_child_count = dup_read_child_count(dup); > > > > should you read the base before calling pmu->add ? > > should be same for any dup event not just master > > > > jirka > > I am not sure I am following. The pmu is disabled when we call > event_pmu_add(). Why do we need to read before calling pmu->add()? > And this is the first added dup event for this master, so we don't > need to worry about others. > > Does this make sense? I was just thinking since the pmu is disable we could we don't need to read the event on 2 places.. it's almost identic code jirka
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
> On Aug 30, 2018, at 8:18 AM, Jiri Olsa wrote: > > On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: > > SNIP > >> @@ -6100,7 +6333,7 @@ static void perf_output_read_group(struct >> perf_output_handle *handle, >> >> if ((sub != event) && >> (sub->state == PERF_EVENT_STATE_ACTIVE)) >> -sub->pmu->read(sub); >> +event_pmu_read(sub); >> >> values[n++] = perf_event_count(sub); >> if (read_format & PERF_FORMAT_ID) >> @@ -9109,7 +9342,7 @@ static enum hrtimer_restart >> perf_swevent_hrtimer(struct hrtimer *hrtimer) >> if (event->state != PERF_EVENT_STATE_ACTIVE) >> return HRTIMER_NORESTART; >> >> -event->pmu->read(event); >> +event_pmu_read(event); >> >> perf_sample_data_init(, 0, event->hw.last_period); >> regs = get_irq_regs(); >> @@ -10504,6 +10737,14 @@ SYSCALL_DEFINE5(perf_event_open, >> goto err_cred; >> } >> >> +if (perf_event_can_share(event)) { >> +event->tmp_master = perf_event_alloc(>attr, cpu, >> + task, NULL, NULL, >> + NULL, NULL, -1); > > can't get around this.. I understand the need, but AFAICS you allocate > the whole 'struct perf_event', just because there's count field in it > otherwise the 'struct hw_perf_event' should be enough to carry all that's > needed to read hw event > > would it be better to move the count to 'struct hw_perf_event' and use > that instead? assuming I'm not missing anything.. > > jirka I am trying to make the master event function the same as a real event, while keep dup events as followers. This avoids "switching master" in earlier versions (and Tejun's RFC). I also read your version that does it at hardware level, and found it simplifies some parts of the change. I picked current approach mostly because this approach keeps all logic about PMU sharing in one place, and the rest of the perf subsystem can stay as-is. If this approach doesn't work out, I will probably try the hardware level approach. Thanks, Song
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
> On Aug 30, 2018, at 8:18 AM, Jiri Olsa wrote: > > On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: > > SNIP > >> @@ -6100,7 +6333,7 @@ static void perf_output_read_group(struct >> perf_output_handle *handle, >> >> if ((sub != event) && >> (sub->state == PERF_EVENT_STATE_ACTIVE)) >> -sub->pmu->read(sub); >> +event_pmu_read(sub); >> >> values[n++] = perf_event_count(sub); >> if (read_format & PERF_FORMAT_ID) >> @@ -9109,7 +9342,7 @@ static enum hrtimer_restart >> perf_swevent_hrtimer(struct hrtimer *hrtimer) >> if (event->state != PERF_EVENT_STATE_ACTIVE) >> return HRTIMER_NORESTART; >> >> -event->pmu->read(event); >> +event_pmu_read(event); >> >> perf_sample_data_init(, 0, event->hw.last_period); >> regs = get_irq_regs(); >> @@ -10504,6 +10737,14 @@ SYSCALL_DEFINE5(perf_event_open, >> goto err_cred; >> } >> >> +if (perf_event_can_share(event)) { >> +event->tmp_master = perf_event_alloc(>attr, cpu, >> + task, NULL, NULL, >> + NULL, NULL, -1); > > can't get around this.. I understand the need, but AFAICS you allocate > the whole 'struct perf_event', just because there's count field in it > otherwise the 'struct hw_perf_event' should be enough to carry all that's > needed to read hw event > > would it be better to move the count to 'struct hw_perf_event' and use > that instead? assuming I'm not missing anything.. > > jirka I am trying to make the master event function the same as a real event, while keep dup events as followers. This avoids "switching master" in earlier versions (and Tejun's RFC). I also read your version that does it at hardware level, and found it simplifies some parts of the change. I picked current approach mostly because this approach keeps all logic about PMU sharing in one place, and the rest of the perf subsystem can stay as-is. If this approach doesn't work out, I will probably try the hardware level approach. Thanks, Song
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
> On Aug 30, 2018, at 8:13 AM, Jiri Olsa wrote: > > On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: > > SNIP > >> >> +perf_event_remove_dup(event, ctx); >> /* >> * We can have double detach due to exit/hot-unplug + close. >> */ >> @@ -1982,6 +2123,92 @@ event_filter_match(struct perf_event *event) >> perf_cgroup_match(event) && pmu_filter_match(event); >> } >> >> +/* PMU sharing aware version of event->pmu->add() */ >> +static int event_pmu_add(struct perf_event *event, >> + struct perf_event_context *ctx) >> +{ >> +struct perf_event_dup *dup; >> +int ret; >> + >> +/* no sharing, just do event->pmu->add() */ >> +if (event->dup_id == -1) >> +return event->pmu->add(event, PERF_EF_START); >> + >> +dup = >dup_events[event->dup_id]; >> + >> +if (dup->active_event_count) { >> +/* already enabled */ >> +dup->active_event_count++; >> +dup->master->pmu->read(dup->master); >> +event->dup_base_count = dup_read_count(dup); >> +event->dup_base_child_count = dup_read_child_count(dup); >> +return 0; >> +} >> + >> +/* try add master */ >> +ret = event->pmu->add(dup->master, PERF_EF_START); >> + >> +if (!ret) { >> +dup->active_event_count = 1; >> +event->pmu->read(dup->master); >> +event->dup_base_count = dup_read_count(dup); >> +event->dup_base_child_count = dup_read_child_count(dup); > > should you read the base before calling pmu->add ? > should be same for any dup event not just master > > jirka I am not sure I am following. The pmu is disabled when we call event_pmu_add(). Why do we need to read before calling pmu->add()? And this is the first added dup event for this master, so we don't need to worry about others. Does this make sense? Thanks for the review! Song
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
> On Aug 30, 2018, at 8:13 AM, Jiri Olsa wrote: > > On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: > > SNIP > >> >> +perf_event_remove_dup(event, ctx); >> /* >> * We can have double detach due to exit/hot-unplug + close. >> */ >> @@ -1982,6 +2123,92 @@ event_filter_match(struct perf_event *event) >> perf_cgroup_match(event) && pmu_filter_match(event); >> } >> >> +/* PMU sharing aware version of event->pmu->add() */ >> +static int event_pmu_add(struct perf_event *event, >> + struct perf_event_context *ctx) >> +{ >> +struct perf_event_dup *dup; >> +int ret; >> + >> +/* no sharing, just do event->pmu->add() */ >> +if (event->dup_id == -1) >> +return event->pmu->add(event, PERF_EF_START); >> + >> +dup = >dup_events[event->dup_id]; >> + >> +if (dup->active_event_count) { >> +/* already enabled */ >> +dup->active_event_count++; >> +dup->master->pmu->read(dup->master); >> +event->dup_base_count = dup_read_count(dup); >> +event->dup_base_child_count = dup_read_child_count(dup); >> +return 0; >> +} >> + >> +/* try add master */ >> +ret = event->pmu->add(dup->master, PERF_EF_START); >> + >> +if (!ret) { >> +dup->active_event_count = 1; >> +event->pmu->read(dup->master); >> +event->dup_base_count = dup_read_count(dup); >> +event->dup_base_child_count = dup_read_child_count(dup); > > should you read the base before calling pmu->add ? > should be same for any dup event not just master > > jirka I am not sure I am following. The pmu is disabled when we call event_pmu_add(). Why do we need to read before calling pmu->add()? And this is the first added dup event for this master, so we don't need to worry about others. Does this make sense? Thanks for the review! Song
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: SNIP > @@ -6100,7 +6333,7 @@ static void perf_output_read_group(struct > perf_output_handle *handle, > > if ((sub != event) && > (sub->state == PERF_EVENT_STATE_ACTIVE)) > - sub->pmu->read(sub); > + event_pmu_read(sub); > > values[n++] = perf_event_count(sub); > if (read_format & PERF_FORMAT_ID) > @@ -9109,7 +9342,7 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct > hrtimer *hrtimer) > if (event->state != PERF_EVENT_STATE_ACTIVE) > return HRTIMER_NORESTART; > > - event->pmu->read(event); > + event_pmu_read(event); > > perf_sample_data_init(, 0, event->hw.last_period); > regs = get_irq_regs(); > @@ -10504,6 +10737,14 @@ SYSCALL_DEFINE5(perf_event_open, > goto err_cred; > } > > + if (perf_event_can_share(event)) { > + event->tmp_master = perf_event_alloc(>attr, cpu, > + task, NULL, NULL, > + NULL, NULL, -1); can't get around this.. I understand the need, but AFAICS you allocate the whole 'struct perf_event', just because there's count field in it otherwise the 'struct hw_perf_event' should be enough to carry all that's needed to read hw event would it be better to move the count to 'struct hw_perf_event' and use that instead? assuming I'm not missing anything.. jirka
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: SNIP > @@ -6100,7 +6333,7 @@ static void perf_output_read_group(struct > perf_output_handle *handle, > > if ((sub != event) && > (sub->state == PERF_EVENT_STATE_ACTIVE)) > - sub->pmu->read(sub); > + event_pmu_read(sub); > > values[n++] = perf_event_count(sub); > if (read_format & PERF_FORMAT_ID) > @@ -9109,7 +9342,7 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct > hrtimer *hrtimer) > if (event->state != PERF_EVENT_STATE_ACTIVE) > return HRTIMER_NORESTART; > > - event->pmu->read(event); > + event_pmu_read(event); > > perf_sample_data_init(, 0, event->hw.last_period); > regs = get_irq_regs(); > @@ -10504,6 +10737,14 @@ SYSCALL_DEFINE5(perf_event_open, > goto err_cred; > } > > + if (perf_event_can_share(event)) { > + event->tmp_master = perf_event_alloc(>attr, cpu, > + task, NULL, NULL, > + NULL, NULL, -1); can't get around this.. I understand the need, but AFAICS you allocate the whole 'struct perf_event', just because there's count field in it otherwise the 'struct hw_perf_event' should be enough to carry all that's needed to read hw event would it be better to move the count to 'struct hw_perf_event' and use that instead? assuming I'm not missing anything.. jirka
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: SNIP > > + perf_event_remove_dup(event, ctx); > /* >* We can have double detach due to exit/hot-unplug + close. >*/ > @@ -1982,6 +2123,92 @@ event_filter_match(struct perf_event *event) > perf_cgroup_match(event) && pmu_filter_match(event); > } > > +/* PMU sharing aware version of event->pmu->add() */ > +static int event_pmu_add(struct perf_event *event, > + struct perf_event_context *ctx) > +{ > + struct perf_event_dup *dup; > + int ret; > + > + /* no sharing, just do event->pmu->add() */ > + if (event->dup_id == -1) > + return event->pmu->add(event, PERF_EF_START); > + > + dup = >dup_events[event->dup_id]; > + > + if (dup->active_event_count) { > + /* already enabled */ > + dup->active_event_count++; > + dup->master->pmu->read(dup->master); > + event->dup_base_count = dup_read_count(dup); > + event->dup_base_child_count = dup_read_child_count(dup); > + return 0; > + } > + > + /* try add master */ > + ret = event->pmu->add(dup->master, PERF_EF_START); > + > + if (!ret) { > + dup->active_event_count = 1; > + event->pmu->read(dup->master); > + event->dup_base_count = dup_read_count(dup); > + event->dup_base_child_count = dup_read_child_count(dup); should you read the base before calling pmu->add ? should be same for any dup event not just master jirka
Re: [PATCH v2 1/1] perf: Sharing PMU counters across compatible events
On Wed, Aug 15, 2018 at 10:03:13AM -0700, Song Liu wrote: SNIP > > + perf_event_remove_dup(event, ctx); > /* >* We can have double detach due to exit/hot-unplug + close. >*/ > @@ -1982,6 +2123,92 @@ event_filter_match(struct perf_event *event) > perf_cgroup_match(event) && pmu_filter_match(event); > } > > +/* PMU sharing aware version of event->pmu->add() */ > +static int event_pmu_add(struct perf_event *event, > + struct perf_event_context *ctx) > +{ > + struct perf_event_dup *dup; > + int ret; > + > + /* no sharing, just do event->pmu->add() */ > + if (event->dup_id == -1) > + return event->pmu->add(event, PERF_EF_START); > + > + dup = >dup_events[event->dup_id]; > + > + if (dup->active_event_count) { > + /* already enabled */ > + dup->active_event_count++; > + dup->master->pmu->read(dup->master); > + event->dup_base_count = dup_read_count(dup); > + event->dup_base_child_count = dup_read_child_count(dup); > + return 0; > + } > + > + /* try add master */ > + ret = event->pmu->add(dup->master, PERF_EF_START); > + > + if (!ret) { > + dup->active_event_count = 1; > + event->pmu->read(dup->master); > + event->dup_base_count = dup_read_count(dup); > + event->dup_base_child_count = dup_read_child_count(dup); should you read the base before calling pmu->add ? should be same for any dup event not just master jirka