I'm trying to use the perf API to trace a single process, including any child threads that are created after I initially set up the events. (My end goal is to integrate perf-events into google-perftools; my work is open source and available here: https://github.com/justinsb/fathomdb-google-perftools-extensions)
The strategy I've been using is to enumerate all existing threads at the first call by reading /proc/<pid>/task. I then set up my performance events using sys_perf_event_open, one event per thread tid. I mmap each of those fds and then poll() them. This works great.

If a new thread starts, I do get the PERF_RECORD_FORK event, so I can then add additional monitors for the new thread. However, the poll is only signaled when a page is full, by which time that thread could be long gone (or I could have missed all sorts of interesting events on it).

I would think that PERF_RECORD_FORK should cause an immediate signal, but I don't see how to do that without having _every_ event signal my thread (which would be very high overhead, I think). I guess I could schedule a second "dummy" perf event monitor on the same process, with immediate signalling but with a low enough frequency that it wouldn't fire except for process events.

I can't use the "inherit" flag (which would otherwise be ideal), because then I can't mmap the result (the kernel deliberately won't allow it; it's the first test in perf_mmap() in kernel/events/core.c).

I feel like I'm doing something wrong. What is the suggested approach here?

Thanks,
Justin
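
For concreteness, here is a minimal sketch of the two pieces discussed above: listing the thread IDs of a process from /proc/<pid>/task, and filling in a perf_event_attr for a "dummy" event whose only job is to wake poll() as soon as a fork/exit record arrives. The helper names (list_tids, init_fork_watch_attr, open_fork_watch) are hypothetical and not taken from my actual code. The sketch assumes a kernel and headers new enough to provide PERF_COUNT_SW_DUMMY (Linux 3.12 or later, as far as I know), and relies on the perf_event_open(2) man page's statement that wakeup_events counts only sample records, so a byte watermark of 1 is used instead. I have not verified that this solves the latency problem in practice.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dirent.h>
#include <linux/perf_event.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc ships no wrapper for perf_event_open, so go through syscall(). */
int sys_perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                        int group_fd, unsigned long flags)
{
    return (int)syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Hypothetical helper: store up to `max` thread IDs of `pid` in `tids`,
 * read from /proc/<pid>/task. Returns the number stored, or -1 if the
 * directory cannot be opened. */
int list_tids(pid_t pid, pid_t *tids, int max)
{
    char path[64];
    struct dirent *de;
    DIR *dir;
    int n = 0;

    assert(tids != NULL && max > 0);
    snprintf(path, sizeof path, "/proc/%d/task", (int)pid);
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    while (n < max && (de = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(de->d_name, &end, 10);
        /* Keep purely numeric entries only; this skips "." and "..". */
        if (end != de->d_name && *end == '\0')
            tids[n++] = (pid_t)tid;
    }
    closedir(dir);
    return n;
}

/* Hypothetical helper: describe an event that counts nothing itself but
 * carries PERF_RECORD_FORK / PERF_RECORD_EXIT records and requests a
 * wakeup as soon as any bytes are written to its ring buffer. */
void init_fork_watch_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof *attr);
    attr->size = sizeof *attr;
    attr->type = PERF_TYPE_SOFTWARE;
    attr->config = PERF_COUNT_SW_DUMMY; /* assumption: Linux >= 3.12 */
    attr->task = 1;                     /* emit fork/exit records */
    attr->watermark = 1;                /* interpret wakeup_* as bytes */
    attr->wakeup_watermark = 1;         /* wake poll() on any record */
}

/* Open the watcher on one thread, on any CPU, without a group leader.
 * Returns an fd, or -1 with errno set (e.g. EACCES under a restrictive
 * perf_event_paranoid setting). */
int open_fork_watch(pid_t tid)
{
    struct perf_event_attr attr;
    init_fork_watch_attr(&attr);
    return sys_perf_event_open(&attr, tid, -1, -1, 0);
}
```

The intended use would be to call list_tids() once, call open_fork_watch() for each tid alongside the real sampling event, mmap the returned fd, and add it to the same poll() set. The sampling events could then keep their page-sized wakeups while the watcher fd signals promptly for new threads.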
