* Borislav Petkov <[email protected]> wrote:

> From: Borislav Petkov <[email protected]>
> 
> Yeah,
> 
> here's a refresh of the persistent events deal, accessing those is much
> cleaner now. Here's how:
> 
> So kernel code initializes and enables the event at its convenience
> (during boot, whenever) and userspace goes and says:
> 
>       sys_perf_event_open(pattr,...)
> 
> with pattr.persistent = 1. Userspace gets the persistent buffer file
> descriptor to read from. Without that, we get a normal perf file
> descriptor for the duration of the tracing.
> 
> This saves all the diddling of trying to hand down file descriptors
> through debugfs or whatever. Instead, current perf code simply can use
> it.
> 
> This is still RFC but things are starting to fall into place slowly. As 
> always, any and all comments/suggestions are welcome.
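
For reference, here is a minimal userspace sketch of the open call described
above. It is an illustration under assumptions, not code from the patch set:
the 'persistent' bit exists only in this RFC and is not in mainline headers,
so it is left as a comment. The software clock event, the pid/cpu choice and
the helper names init_attr() and open_sketch_event() are placeholders picked
for the example.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc has no perf_event_open() wrapper, so go through syscall(2). */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Fill in a placeholder attr; a real user would name the kernel's event. */
static void init_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;	/* placeholder event type */
	attr->config = PERF_COUNT_SW_CPU_CLOCK;	/* placeholder event */
	attr->disabled = 1;

	/*
	 * ASSUMPTION: the RFC adds a 'persistent' bit to perf_event_attr.
	 * It is not in mainline headers, hence commented out:
	 *
	 *	attr->persistent = 1;
	 */
}

/* Returns a perf fd on success, -1 with errno set on failure. */
int open_sketch_event(void)
{
	struct perf_event_attr attr;

	init_attr(&attr);

	/* pid = -1, cpu = 0: all tasks on CPU 0 (needs privileges). */
	return (int)perf_event_open(&attr, -1, 0, -1, 0);
}
```

With the RFC applied and the bit set, the returned descriptor would refer to
the buffer the kernel already set up; without it, this is an ordinary perf
descriptor that lives only as long as the tracing session.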

That definitely looks interesting and desirable. It would be nice to have 
more generic/flexible semantics by using the VFS for tracing context 
discovery.

That would allow 'stateful tracing', and not just in a kernel-initiated 
fashion: we could basically do ftrace-like tracing into persistent, 
VFS-named buffers.

The question is, how are the individual buffers identified when and after 
they have been created? An option would be to use cgroups for that - 
cgroups already has its own VFS and syscall interfaces. But maybe some 
other, explicit interface is needed (eventfs).

All the use cases we talked about in the past would work fine that way:

 - the MCE events would show up as an already created set of buffers, 
   discoverable via the VFS interface.

 - user-space could generate more 'tracing/profiling contexts' at runtime.

 - a boot tracer would activate via a boot option, and it would create a 
   tracing context - visible via the VFS interface.

 - a modern RAS daemon could replace mcelog.

If you make that work, with a new perf tool side as well that allows the 
creation of a tracing context (and a separate extraction as well), via a 
modified 'perf trace' or a new subcommand, that would be a major, 
upstream-worthy perf feature IMO which would go way beyond the RAS use case 
...

Such a feature would become a popular instrumentation tool pretty quickly.

Thanks,
        
        Ingo
