Hi Stephane,

Thanks for the nice writeup. Some comments.

- Are there actually 2 syscalls each for the _pmrs functions?
- For attach_session, I think target needs to be an unsigned long, not
  an int.
- I need delete_eventset functionality. Dynamic tools set up and tear
  down eventsets, multiplexing, sampling, etc. How can I do it with the
  new code?

Thanks,

Phil

stephane eranian wrote:
> Hello,
> 
> As you may know, I have been working on redesigning the perfmon2 system call
> API to address some issues raised by the LKML review a couple of months back.
> 
> I have made some good progress on this and I'd like to share the current
> design with you.
> 
> There were three major goals in the redesign:
> 
>   - minimize the number of system calls. Perfmon2 v2.x has 12 syscalls.
>     This was considered quite a lot for a single subsystem.
>
>   - make sure we could extend the syscall API without necessarily adding
>     new syscalls. That means having flexibility in each syscall and also
>     in the structures passed in syscalls.
>
>   - make sure it would be possible to use existing v2.0 (IA-64) and also
>     v2.x (x>0) applications with, at worst, a recompilation. Unlike v2.0
>     (IA-64 single syscall API), the compatibility would have to be
>     provided by user level code, e.g., libpfm. Obviously statically
>     linked applications would have to be recompiled.
> 
> I am happy to report that I have successfully fulfilled those three
> goals. Here is what the new API looks like now:
> 
>    - down to 8 system calls, all with different names from v2.81
>    - several data structures, pfarg_*, have been removed
>    - full compatibility with v2.81 except for one system call
>      (pfm_delete_evtsets)
>    - all system calls are extensible via a flags parameter
>    - all structures have reserved fields
>    - the context abstraction has been replaced with the notion of a session.
> 
> Here are the details:
> 
>   I) session creation
> 
>   With v2.81:
>      int pfm_create_context(pfarg_ctx_t *ctx, char *smpl_name,
>                             void *smpl_arg, size_t smpl_size);
>
>   With v3.0:
>      int pfm_create_session(int flags);
>      int pfm_create_session(int flags, char *smpl_name,
>                             void *smpl_arg, size_t smpl_size);
> 
>   New flags:
>      PFM_FL_SMPL_FMT : a sampling format is used and 3 additional
>                        parameters are passed
>
>   The pfarg_ctx_t structure has been abandoned. The flags parameter is
>   used very much like for the open(2) syscall to indicate that additional
>   (optional) parameters are passed.
>
>   All old flags are preserved.
>
>   The call still returns the file descriptor uniquely identifying the
>   session.
>
>   Just like a context, a session can monitor either a thread or a CPU.
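To make the open(2)-style convention above concrete, here is a minimal user-space sketch in C. It is only a mock, not the kernel or libpfm code: the numeric value of PFM_FL_SMPL_FMT, the mock_session record, the "demo-fmt" name and the returned descriptor are all assumptions made for illustration. The point is that one variadic entry point can serve both prototypes listed for v3.0.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical value: the post names PFM_FL_SMPL_FMT but not its encoding. */
#define PFM_FL_SMPL_FMT 0x1

/* Hypothetical record of what the last call received, for inspection. */
struct mock_session {
    int    has_fmt;        /* 1 if a sampling format was supplied */
    char   fmt_name[32];   /* copy of smpl_name */
    size_t fmt_arg_size;   /* smpl_size as passed by the caller */
};

struct mock_session last_session;

/*
 * User-space mock of the v3.0 prototype, not the kernel implementation.
 * As with open(2) and O_CREAT, the trailing arguments are only read when
 * the flag announcing them is set.
 */
int pfm_create_session(int flags, ...)
{
    memset(&last_session, 0, sizeof(last_session));

    if (flags & PFM_FL_SMPL_FMT) {
        va_list ap;

        va_start(ap, flags);
        const char *smpl_name = va_arg(ap, char *);
        void *smpl_arg = va_arg(ap, void *);
        size_t smpl_size = va_arg(ap, size_t);
        va_end(ap);

        (void)smpl_arg;    /* a real implementation would read smpl_size bytes */
        last_session.has_fmt = 1;
        strncpy(last_session.fmt_name, smpl_name,
                sizeof(last_session.fmt_name) - 1);
        last_session.fmt_arg_size = smpl_size;
    }
    return 3;              /* stand-in for the session file descriptor */
}

/* Usage: both call shapes from the post go through the same entry point. */
void demo_create_session(void)
{
    int fmt_arg = 0;

    assert(pfm_create_session(0) == 3);
    assert(last_session.has_fmt == 0);

    assert(pfm_create_session(PFM_FL_SMPL_FMT, "demo-fmt",
                              (void *)&fmt_arg, sizeof(fmt_arg)) == 3);
    assert(last_session.has_fmt == 1);
}
```

One consequence of this style is that the compiler cannot check the optional arguments, so a caller that sets the flag but omits them gets undefined behavior, exactly as with a missing mode argument to open(2).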
> 
>  II) programming the registers
> 
>   With v2.81:
>        int pfm_write_pmcs(int fd, pfarg_pmc_t *pmcs, int n);
>        int pfm_write_pmds(int fd, pfarg_pmd_t *pmds, int n);
>        int pfm_read_pmds(int fd, pfarg_pmd_t *pmds, int n);
>
>   With v3.0:
>        int pfm_write_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n);
>        int pfm_write_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n,
>                           pfarg_pmd_attr_t *pmas);
>
>        int pfm_read_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n);
>        int pfm_read_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n,
>                          pfarg_pmd_attr_t *pmas);
> 
>   New structures:
> 
>     typedef struct {
>         u16 reg_num;
>         u16 reg_set;
>         u32 reg_flags;
>         u64 reg_value;
>     } pfarg_pmr_t;
> 
>     typedef struct {
>         u64 reg_long_reset;
>         u64 reg_short_reset;
>         u64 reg_random_mask;
>         u64 reg_smpl_pmds[PFM_PMD_BV];
>         u64 reg_reset_pmds[PFM_PMD_BV];
>         u64 reg_ovfl_swcnt;
>         u64 reg_smpl_eventid;
>         u64 reg_last_value;
>         u64 reg_reserved[8];
>     } pfarg_pmd_attr_t;
> 
>   New flags:
>      PFM_RWFL_PMD      : pmrs contains PMD register descriptions
>      PFM_RWFL_PMC      : pmrs contains PMC register descriptions
>      PFM_RWFL_PMD_ATTR : an additional vector is passed in pmas
> 
>   We now use only 2 system calls to read and write the PMU registers. This
>   is possible because we are sharing the same register description data
>   structure, pfarg_pmr_t. The key attributes of each register are
>   encapsulated into this structure. Additional PMD attributes related to
>   sampling and multiplexing are off-loaded into another structure,
>   pfarg_pmd_attr_t. This structure is optional and is only looked at by
>   the kernel if the PFM_RWFL_PMD_ATTR flag is passed.
>
>   For all counting applications, using pfarg_pmr_t is enough. The nice
>   side effect of this split is that the cost of reading and writing PMD
>   registers is now reduced because we have less data to copy in and out
>   of the kernel.
>
>   Contrary to what some people suggested, I have not merged the notions
>   of PMD and PMC registers. I think it is cleaner to keep them separate.
>   It also makes it much easier to provide backward compatibility with
>   v2.81.
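The "less data to copy" argument above can be checked with a short C sketch. The two layouts are taken from the post, with the kernel u16/u32/u64 types spelled as <stdint.h> types. Three things are assumptions: PFM_PMD_BV is set to 4 because the post does not give its value, the second structure is named pfarg_pmd_attr_t after the type used in the prototypes, and pmas is taken to hold one entry per register. pmr_copy_bytes() is a hypothetical helper, not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical length: the post uses PFM_PMD_BV without giving its value. */
#define PFM_PMD_BV 4

/* Register description shared by PMC and PMD accesses. */
typedef struct {
    uint16_t reg_num;
    uint16_t reg_set;
    uint32_t reg_flags;
    uint64_t reg_value;
} pfarg_pmr_t;

/* Optional sampling and multiplexing attributes for PMD registers. */
typedef struct {
    uint64_t reg_long_reset;
    uint64_t reg_short_reset;
    uint64_t reg_random_mask;
    uint64_t reg_smpl_pmds[PFM_PMD_BV];
    uint64_t reg_reset_pmds[PFM_PMD_BV];
    uint64_t reg_ovfl_swcnt;
    uint64_t reg_smpl_eventid;
    uint64_t reg_last_value;
    uint64_t reg_reserved[8];
} pfarg_pmd_attr_t;

/*
 * Hypothetical helper: bytes crossing the user/kernel boundary for n
 * registers, assuming one attribute entry per register when the
 * attribute vector is passed.
 */
size_t pmr_copy_bytes(size_t n, int with_attrs)
{
    size_t bytes = n * sizeof(pfarg_pmr_t);

    if (with_attrs)
        bytes += n * sizeof(pfarg_pmd_attr_t);
    return bytes;
}

/* Usage: a counting tool touching 4 registers copies only the small part. */
void demo_copy_cost(void)
{
    assert(sizeof(pfarg_pmr_t) == 16);
    assert(pmr_copy_bytes(4, 0) < pmr_copy_bytes(4, 1));
}
```

On common ABIs pfarg_pmr_t is 16 bytes, so a pure counting tool moves 16 bytes per register; the attribute structure adds its own size per register only when PFM_RWFL_PMD_ATTR is used.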
> 
> III) attaching and detaching
> 
>    With v2.81:
>       int pfm_load_context(int fd, pfarg_load_t *load);
>       int pfm_unload_context(int fd);
> 
>    With v3.0:
>       int pfm_attach_session(int fd, int flags, int target);
>       int pfm_detach_session(int fd, int flags);
> 
>    The pfarg_load_t structure has been abandoned. The information about
>    what to attach to is passed as a parameter to the syscall in "target".
>    It can be either a thread id or a CPU id.
>
>    There are currently no flags defined for either call.
>
>    Note that we have lost the ability to specify which event set is to be
>    activated first. There was no actual use of this option anyway.
> 
>  IV) starting and stopping
> 
>     With v2.81:
>        int pfm_start(int fd, pfarg_start_t *st);
>        int pfm_stop(int fd);
>        int pfm_restart(int fd);
> 
>     With v3.0:
>        int pfm_start_session(int fd, int flags);
>        int pfm_stop_session(int fd, int flags);
> 
>       New flags:
>       PFM_STFL_RESTART: resume monitoring after an overflow notification
>
>       The pfarg_start_t structure has been abandoned.
>
>       The pfm_restart() syscall has been merged into pfm_start_session()
>       by using the PFM_STFL_RESTART flag. It is not possible to just use
>       pfm_start_session() and internally determine what to do because
>       this is dependent on the sampling format.
>
>       We have lost the ability to specify on which event set to start. I
>       don't think this option was ever used.
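As a rough illustration of the start/restart merge described above, the following C sketch mocks pfm_start_session() in user space and layers a v2.81-style pfm_restart() on top of it. It is not the kernel code. The value of PFM_STFL_RESTART, the last_action bookkeeping, and the choice to reject unknown flag bits with EINVAL are assumptions; the post only says that the flag selects the restart behavior.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical value: the post names the flag but not its encoding. */
#define PFM_STFL_RESTART 0x1
#define PFM_STFL_KNOWN   (PFM_STFL_RESTART)

/* Which path the mock took on the last successful call. */
enum mock_action { MOCK_NONE, MOCK_START, MOCK_RESTART };
enum mock_action last_action = MOCK_NONE;

/* User-space mock of the v3.0 call, not the kernel implementation. */
int pfm_start_session(int fd, int flags)
{
    (void)fd;

    /* Assumption: bits outside the known set are rejected, which keeps
     * them free for later extensions. */
    if (flags & ~PFM_STFL_KNOWN) {
        errno = EINVAL;
        return -1;
    }

    /* The caller, not the kernel, says whether this is a restart. */
    last_action = (flags & PFM_STFL_RESTART) ? MOCK_RESTART : MOCK_START;
    return 0;
}

/* v2.81-style pfm_restart() expressed with the v3.0 call. */
int pfm_restart(int fd)
{
    return pfm_start_session(fd, PFM_STFL_RESTART);
}

/* Usage: plain start, restart through the wrapper, then a bad flag. */
void demo_start(void)
{
    assert(pfm_start_session(3, 0) == 0 && last_action == MOCK_START);
    assert(pfm_restart(3) == 0 && last_action == MOCK_RESTART);
    assert(pfm_start_session(3, 0x80) == -1 && errno == EINVAL);
}
```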
> 
>    V) event set and multiplexing
> 
>       With v2.81:
>          int pfm_create_evtsets(int fd, pfarg_setdesc_t *s, int n);
>          int pfm_getinfo_evtsets(int fd, pfarg_setinfo_t *s, int n);
>          int pfm_delete_evtsets(int fd, pfarg_setdesc_t *s, int n);
> 
>      With v3.0:
>          int pfm_create_sets(int fd, int flags, pfarg_setdesc_t *s, int n);
>          int pfm_getinfo_evtsets(int fd, int flags, pfarg_setinfo_t *s,
>                                  int n);
>
>      We have kept the same data structures and simply added a flags
>      parameter to provide for extensibility of the calls.
>
>      We have removed pfm_delete_evtsets() because it was not used by many
>      applications. We could add it back later if there is a good reason
>      for it, something stronger than saying it needs to be there for
>      symmetry.
> 
>   VI) libpfm
> 
>      The libpfm library will support v2.81 and v3.0 transparently. The
>      examples have all been rewritten to the new API. The v2.81 examples
>      are still there for testing and comparison purposes. They compile
>      with no modifications.
>
>      The perfmon.h header has been heavily restructured to accommodate
>      v3.0, v2.81 and older v2.x releases as used by Cray and SiCortex, or
>      even older v2.0 IA-64 programs.
>
>      When using the v2.81 API, the library is able to adapt based on the
>      host kernel perfmon version. If the host kernel is using v2.81, then
>      the calls are simply passed through. If instead the kernel uses
>      v3.0, then each v2.81 call is mapped onto its v3.0 equivalent
>      whenever possible. If the syscall has no equivalent, e.g.,
>      pfm_delete_evtsets(), then an error is returned. The errno value is
>      updated correctly when using the v2.81 <-> v3.0 glue layer.
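The glue layer described above could look roughly like the following C sketch. It is not the libpfm source. The host_abi variable stands in for whatever version detection the library performs, the sys_* functions are mock kernel entry points that only count calls, and ENOSYS is an assumed error code for a call with no v3.0 equivalent (the post only says an error is returned and errno is set).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Opaque here: the post does not show the layout of pfarg_setdesc_t. */
typedef struct pfarg_setdesc pfarg_setdesc_t;

/* Hypothetical: how libpfm detects the host version is not described. */
enum pfm_abi { PFM_ABI_V2_81, PFM_ABI_V3_0 };
enum pfm_abi host_abi = PFM_ABI_V3_0;

/* Mock kernel entry points that only count how often they are reached. */
int calls_v281, calls_v30;

int sys_v281_unload_context(int fd)
{
    (void)fd;
    calls_v281++;
    return 0;
}

int sys_v281_delete_evtsets(int fd, pfarg_setdesc_t *s, int n)
{
    (void)fd; (void)s; (void)n;
    calls_v281++;
    return 0;
}

int sys_v30_detach_session(int fd, int flags)
{
    (void)fd; (void)flags;
    calls_v30++;
    return 0;
}

/* v2.81 call with a v3.0 equivalent: pass through or translate. */
int pfm_unload_context(int fd)
{
    if (host_abi == PFM_ABI_V2_81)
        return sys_v281_unload_context(fd);
    return sys_v30_detach_session(fd, 0);   /* no flags are defined yet */
}

/* v2.81 call without a v3.0 equivalent: fail and set errno. */
int pfm_delete_evtsets(int fd, pfarg_setdesc_t *s, int n)
{
    if (host_abi == PFM_ABI_V2_81)
        return sys_v281_delete_evtsets(fd, s, n);
    errno = ENOSYS;                         /* assumed error code */
    return -1;
}

/* Usage: the same v2.81 calls behave differently per host kernel. */
void demo_glue(void)
{
    calls_v281 = calls_v30 = 0;

    host_abi = PFM_ABI_V3_0;
    assert(pfm_unload_context(3) == 0 && calls_v30 == 1);
    assert(pfm_delete_evtsets(3, NULL, 0) == -1 && errno == ENOSYS);

    host_abi = PFM_ABI_V2_81;
    assert(pfm_unload_context(3) == 0 && calls_v281 == 1);
    assert(pfm_delete_evtsets(3, NULL, 0) == 0 && calls_v281 == 2);
}
```

A tool that tears down event sets dynamically would have to check for this error on a v3.0 kernel, which is the situation raised in the reply at the top of this message.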
> 
>  VII) conclusions
> 
>      There are no other modifications to other parts of the API.
> 
>      I think this final modification addresses ALL outstanding issues
>      raised by LKML.
>
>      I have presented the new API for a fully featured perfmon, with
>      sampling, system-wide and multiplexing. This is not what is going to
>      go into the mainline kernel initially. The plan is to use this as
>      the development code and pull bits and pieces into the minimal
>      kernel patch that I maintain separately and which is what I intend
>      to post to LKML. The reason I modified the fully featured version is
>      to gauge all the changes needed and their potential connections.
>
>      I will soon create branches in both the GIT tree and libpfm CVS so
>      that you will be able to experiment with this API. The v3.0 API has
>      been added to all architectures currently supported. They all
>      compile. I was not able to test many of them for lack of hardware.
> 
>      I welcome any comments you may have about this new API.
> 
> S.Eranian
> 
> Thanks.
> 
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> perfmon2-devel mailing list
> perfmon2-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel


