On Mon, Sep 08, 2014 at 01:39:58PM +0200, Peter Zijlstra wrote:
> On Mon, Sep 08, 2014 at 12:01:22PM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 08, 2014 at 11:48:55AM +0200, Peter Zijlstra wrote:
> > 
> > > > The thing is; I don't understand those reasons. That commit log doesn't
> > > > explain.
> > > 
> > > Ah wait, I finally see. I think we want to fix that exit path, not
> > > disallow the cloning.
> > > 
> > > The thing is, by not allowing this optimization simple things like eg.
> > > pipe-test stay very expensive.
> > 
> > So it's 179033b3e064 ("perf: Add PERF_EVENT_STATE_EXIT state for events
> > with exited task") that introduces the problem. Before that things would
> > work correctly afaict.
> > 
> > The exit would remove from the context but leave the event in existence.
> > Both the fd and the inherited events would have references to it, only
> > once those are gone do we destroy the actual event.
> 
> I have another 'problem' with 179033b3e064. What if you 'want' to
> continue monitoring after the initial task died? Eg. if you want to
> monitor crap that unconditionally daemonizes.

Right, I did not think of that. I need to check more, but it
seems like just checking for children should be enough.
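
As a rough userspace model of that check (not kernel code: model_event,
nr_children and the MODEL_STATE_* names are made up for illustration, and
the child_mutex locking from the patch below is left out), hup would only
be reported once the event is in the exit state and no inherited children
remain:

```c
/*
 * Userspace model only. The names mirror the patch but are simplified,
 * hypothetical stand-ins, not the kernel definitions.
 */
#include <assert.h>
#include <stdbool.h>

enum model_state {
	MODEL_STATE_ACTIVE,
	MODEL_STATE_ERROR,
	MODEL_STATE_EXIT,
};

struct model_event {
	enum model_state state;
	int nr_children;	/* stands in for the event's child_list */
};

/* Report hup only for an exited event with no children left. */
static bool model_is_event_hup(const struct model_event *event)
{
	if (event->state != MODEL_STATE_EXIT)
		return false;

	return event->nr_children == 0;
}

/* Walk through the three cases discussed in the thread. */
static void model_selfcheck(void)
{
	struct model_event ev = { MODEL_STATE_ACTIVE, 0 };

	/* Task still running: never hup. */
	assert(!model_is_event_hup(&ev));

	/* Initial task exited but daemonized children remain: keep monitoring. */
	ev.state = MODEL_STATE_EXIT;
	ev.nr_children = 2;
	assert(!model_is_event_hup(&ev));

	/* Last child gone: now it is a hup. */
	ev.nr_children = 0;
	assert(model_is_event_hup(&ev));
}
```

The middle case is the daemonizing one mentioned above: the initial task
is gone, but read and poll would keep working until the children go away.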

jirka


---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index bf482ccbdbe1..341d0b47ca14 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3568,6 +3568,19 @@ static int perf_event_read_one(struct perf_event *event,
        return n * sizeof(u64);
 }
 
+static bool is_event_hup(struct perf_event *event)
+{
+       bool no_children;
+
+       if (event->state != PERF_EVENT_STATE_EXIT)
+               return false;
+
+       mutex_lock(&event->child_mutex);
+       no_children = list_empty(&event->child_list);
+       mutex_unlock(&event->child_mutex);
+       return no_children;
+}
+
 /*
  * Read the performance event - simple non blocking version for now
  */
@@ -3582,8 +3595,7 @@ perf_read_hw(struct perf_event *event, char __user *buf, size_t count)
         * error state (i.e. because it was pinned but it couldn't be
         * scheduled on to the CPU at some point).
         */
-       if ((event->state == PERF_EVENT_STATE_ERROR) ||
-           (event->state == PERF_EVENT_STATE_EXIT))
+       if ((event->state == PERF_EVENT_STATE_ERROR) || is_event_hup(event))
                return 0;
 
        if (count < event->read_size)
@@ -3614,7 +3626,7 @@ static unsigned int perf_poll(struct file *file, poll_table *wait)
 
        poll_wait(file, &event->waitq, wait);
 
-       if (event->state == PERF_EVENT_STATE_EXIT)
+       if (is_event_hup(event))
                return events;
 
        /*