On Mon, Sep 30, 2019 at 04:07:55PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 30, 2019 at 03:06:15PM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 16, 2019 at 06:41:21AM -0700, [email protected] wrote:
> 
> > > +static bool is_first_topdown_event_in_group(struct perf_event *event)
> > > +{
> > > + struct perf_event *first = NULL;
> > > +
> > > + if (is_topdown_event(event->group_leader))
> > > +         first = event->group_leader;
> > > + else {
> > > +         for_each_sibling_event(first, event->group_leader)
> > > +                 if (is_topdown_event(first))
> > > +                         break;
> > > + }
> > > +
> > > + if (event == first)
> > > +         return true;
> > > +
> > > + return false;
> > > +}
> > 
> > > +static u64 icl_update_topdown_event(struct perf_event *event)
> > > +{
> > > + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> > > + struct perf_event *other;
> > > + u64 slots, metrics;
> > > + int idx;
> > > +
> > > + /*
> > > +  * Only need to update all events for the first
> > > +  * slots/metrics event in a group
> > > +  */
> > > + if (event && !is_first_topdown_event_in_group(event))
> > > +         return 0;
> > 
> > This is pretty crap and approaches O(n^2); let me think if there's
> > anything saner to do here.
> 
> This is also really complicated in the case where we do
> perf_remove_from_context() in the 'wrong' order.
> 
> In that case we get detached events that are not up-to-date (and never
> will be). It doesn't look like that matters, but it is weird.

So we either get called from the PMI, or from read(). In the PMI there is
the perf_output_read_group() path, and that too appears broken vs the
above: it assumes perf_event_count() is up-to-date after calling
pmu->read(), which isn't true.

Now, I'm thinking that is already broken vs TXN_READ, so we should fix
that with something like the below (needs to be tested on
Power-hv-24x7).

---
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6272,10 +6272,22 @@ static void perf_output_read_group(struc
        if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
                values[n++] = running;
 
+       if (leader->nr_siblings > 1)
+               leader->pmu->start_txn(leader->pmu, PERF_PMU_TXN_READ);
+
        if ((leader != event) &&
            (leader->state == PERF_EVENT_STATE_ACTIVE))
                leader->pmu->read(leader);
 
+       for_each_sibling_event(sub, leader) {
+               if ((sub != event) &&
+                   (sub->state == PERF_EVENT_STATE_ACTIVE))
+                       sub->pmu->read(sub);
+       }
+
+       if (leader->nr_siblings > 1)
+               leader->pmu->commit_txn(leader->pmu);
+
        values[n++] = perf_event_count(leader);
        if (read_format & PERF_FORMAT_ID)
                values[n++] = primary_event_id(leader);
@@ -6285,10 +6297,6 @@ static void perf_output_read_group(struc
        for_each_sibling_event(sub, leader) {
                n = 0;
 
-               if ((sub != event) &&
-                   (sub->state == PERF_EVENT_STATE_ACTIVE))
-                       sub->pmu->read(sub);
-
                values[n++] = perf_event_count(sub);
                if (read_format & PERF_FORMAT_ID)
                        values[n++] = primary_event_id(sub);
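
To illustrate why the reads are hoisted in front of the first
perf_event_count(), here is a minimal userspace sketch. It assumes a PMU
whose read() only queues a request while a READ transaction is open and
whose commit is what actually updates the counts. Every mock_* name is
hypothetical and only loosely mirrors the kernel structures; the
interleaved variant is an illustration, not the existing kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct perf_event and struct pmu. */
struct mock_event {
	uint64_t hw_value;	/* value the "hardware" currently holds */
	uint64_t count;		/* what perf_event_count() would report */
	bool queued;		/* read requested inside a READ transaction */
};

struct mock_pmu {
	struct mock_event *events;
	int nr_events;
	bool in_read_txn;
};

static void mock_start_txn(struct mock_pmu *pmu)
{
	pmu->in_read_txn = true;
}

/* Inside a READ transaction, read() only queues the request. */
static void mock_read(struct mock_pmu *pmu, struct mock_event *ev)
{
	if (pmu->in_read_txn)
		ev->queued = true;
	else
		ev->count = ev->hw_value;
}

/* The commit is where queued reads actually update the counts. */
static int mock_commit_txn(struct mock_pmu *pmu)
{
	assert(pmu->in_read_txn);
	for (int i = 0; i < pmu->nr_events; i++) {
		struct mock_event *ev = &pmu->events[i];

		if (ev->queued) {
			ev->count = ev->hw_value;
			ev->queued = false;
		}
	}
	pmu->in_read_txn = false;
	return 0;
}

/* Read and consume in the same loop: the counts are still stale. */
static uint64_t mock_read_group_interleaved(struct mock_pmu *pmu)
{
	uint64_t sum = 0;

	mock_start_txn(pmu);
	for (int i = 0; i < pmu->nr_events; i++) {
		mock_read(pmu, &pmu->events[i]);
		sum += pmu->events[i].count;
	}
	mock_commit_txn(pmu);
	return sum;
}

/* Read everything, commit, and only then consume the counts. */
static uint64_t mock_read_group_two_pass(struct mock_pmu *pmu)
{
	uint64_t sum = 0;

	mock_start_txn(pmu);
	for (int i = 0; i < pmu->nr_events; i++)
		mock_read(pmu, &pmu->events[i]);
	mock_commit_txn(pmu);

	for (int i = 0; i < pmu->nr_events; i++)
		sum += pmu->events[i].count;
	return sum;
}
```

With three events whose hardware values are 10, 20 and 30 and whose
counts start at zero, the interleaved variant sums to 0 and the two-pass
variant sums to 60.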


After that, I think we can simply do something like:

icl_update_topdown_event(..)
{
        int idx = event->hw.idx;

        if (is_metric_idx(idx))
                return;

        // must be FIXED_SLOTS

        /* do the thing and update SLOTS and METRIC together */
}
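
A rough userspace sketch of that shape, assuming the per-CPU state can
look events up by counter index and that one hardware read yields SLOTS
and all metrics at once. The mock_* types, the index layout and the
snapshot fields are hypothetical, and the real metric arithmetic is left
out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index layout: one SLOTS counter followed by the metrics. */
enum {
	MOCK_IDX_SLOTS = 0,
	MOCK_IDX_METRIC_BASE = 1,
	MOCK_NR_METRICS = 4,
};

struct mock_event {
	int idx;
	uint64_t count;
};

/* Hypothetical stand-in for the per-CPU event state. */
struct mock_cpu {
	struct mock_event *events[MOCK_IDX_METRIC_BASE + MOCK_NR_METRICS];
	uint64_t hw_slots;			/* snapshot of SLOTS */
	uint64_t hw_metric[MOCK_NR_METRICS];	/* snapshot of the metrics */
	unsigned int nr_hw_reads;
};

static bool mock_is_metric_idx(int idx)
{
	return idx >= MOCK_IDX_METRIC_BASE &&
	       idx < MOCK_IDX_METRIC_BASE + MOCK_NR_METRICS;
}

/* Returns true when the hardware was actually read. */
static bool mock_update_topdown_event(struct mock_cpu *cpu,
				      struct mock_event *event)
{
	/* Metric events are refreshed as a side effect of the SLOTS update. */
	if (mock_is_metric_idx(event->idx))
		return false;

	/* must be SLOTS */
	assert(event->idx == MOCK_IDX_SLOTS);

	/* One hardware read updates SLOTS and every active metric together. */
	cpu->nr_hw_reads++;
	event->count = cpu->hw_slots;
	for (int i = 0; i < MOCK_NR_METRICS; i++) {
		struct mock_event *ev = cpu->events[MOCK_IDX_METRIC_BASE + i];

		if (ev != NULL)
			ev->count = cpu->hw_metric[i];
	}
	return true;
}
```

In this sketch a metric event is only brought up to date when the SLOTS
event is also read, so the shape relies on the group read above visiting
every member before any count is consumed.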

Hmmm?
