Hello,

As you may know, I have been working on redesigning the perfmon2 system call
API to address some issues raised by the LKML review a couple of months back.

I have made some good progress on this and I'd like to share the current design
with you.

There were three major goals in the redesign:

  - minimize the number of system calls. Perfmon2 v2.x has 12 syscalls. This was
    considered quite a lot for a single subsystem.

  - make sure we could extend the syscall API without necessarily adding new
    syscalls. That means having flexibility in each syscall and also in the
    structures passed in syscalls.

  - make sure it would be possible to use existing v2.0 (IA-64) and also v2.x
    (x>0) applications with, at worst, a recompilation. Unlike v2.0 (IA-64
    single syscall API), the compatibility would have to be provided by user
    level code, e.g., libpfm. Obviously statically linked applications would
    have to be recompiled.

I am happy to report that I have successfully fulfilled those three goals.
Here is what the new API looks like now:

   - down to 8 system calls, all with different names from v2.81
   - several data structures, pfarg_*, have been removed
   - full compatibility with v2.81 except for one system call
     (pfm_delete_evtsets)
   - all system calls are extensible via a flags parameter
   - all structures have reserved fields
   - the context abstraction has been replaced with the notion of a session.

Here are the details:

  I) session creation

  With v2.81:
     int pfm_create_context(pfarg_ctx_t *ctx, char *smpl_name, void *smpl_arg,
                            size_t smpl_size);

  With v3.0:
     int pfm_create_session(int flags);
     int pfm_create_session(int flags, char *smpl_name, void *smpl_arg,
                            size_t smpl_size);

     New flags:
          PFM_FL_SMPL_FMT : a sampling format is used and the 3 additional
                            parameters are passed

   The pfarg_ctx_t structure has been abandoned. The flags parameter is used
   very much like for the open(2) syscall to indicate that additional
   (optional) parameters are passed.

   All old flags are preserved.

   The call still returns the file descriptor uniquely identifying the session.

   Just like a context, a session can monitor either a thread or a CPU.
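
   To make the open(2) analogy concrete, here is a minimal user-space sketch of
   how a flags-driven optional argument list can be consumed. It is not the
   kernel or libpfm code: mock_pfm_create_session(), struct mock_session, the
   returned descriptor value and the bit value chosen for PFM_FL_SMPL_FMT are
   all made up for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical value: the real bit assignment is not given in this mail. */
#define PFM_FL_SMPL_FMT 0x1

/* What the mock "kernel" saw; stands in for real session state. */
struct mock_session {
    int    flags;
    char  *smpl_name;   /* NULL when no sampling format was requested */
    void  *smpl_arg;
    size_t smpl_size;
};

static struct mock_session last_session;

/*
 * User-space mock of pfm_create_session(). The three optional arguments are
 * read only when PFM_FL_SMPL_FMT is set, the same way open(2) reads its mode
 * argument only when O_CREAT is set.
 */
int mock_pfm_create_session(int flags, ...)
{
    memset(&last_session, 0, sizeof(last_session));
    last_session.flags = flags;

    if (flags & PFM_FL_SMPL_FMT) {
        va_list ap;

        va_start(ap, flags);
        last_session.smpl_name = va_arg(ap, char *);
        last_session.smpl_arg  = va_arg(ap, void *);
        last_session.smpl_size = va_arg(ap, size_t);
        va_end(ap);

        /* A sampling format was announced, so a name must be present. */
        assert(last_session.smpl_name != NULL);
    }
    return 3; /* placeholder for the session file descriptor */
}
```

   A counting-only caller would use mock_pfm_create_session(0), while a
   sampling caller passes PFM_FL_SMPL_FMT followed by the format name, its
   argument and the argument size (as a size_t, since the callee reads it with
   that type).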

 II) programming the registers

  With v2.81:
       int pfm_write_pmcs(int fd, pfarg_pmc_t *pmcs, int n);
       int pfm_write_pmds(int fd, pfarg_pmd_t *pmds, int n);
       int pfm_read_pmds(int fd, pfarg_pmd_t *pmds, int n);

  With v3.0:
       int pfm_write_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n);
       int pfm_write_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n,
                          pfarg_pmd_attr_t *pmas);

       int pfm_read_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n);
       int pfm_read_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n,
                         pfarg_pmd_attr_t *pmas);

  New structures:

    typedef struct {
        u16 reg_num;
        u16 reg_set;
        u32 reg_flags;
        u64 reg_value;
    } pfarg_pmr_t;

    typedef struct {
        u64 reg_long_reset;
        u64 reg_short_reset;
        u64 reg_random_mask;
        u64 reg_smpl_pmds[PFM_PMD_BV];
        u64 reg_reset_pmds[PFM_PMD_BV];
        u64 reg_ovfl_swcnt;
        u64 reg_smpl_eventid;
        u64 reg_last_value;
        u64 reg_reserved[8];
    } pfarg_pmd_attr_t;

  New flags:
     PFM_RWFL_PMD  : pmrs contains PMD register descriptions
     PFM_RWFL_PMC  : pmrs contains PMC register descriptions
     PFM_RWFL_PMD_ATTR: an additional vector is passed in pmas

  We now use only 2 system calls to read and write the PMU registers. This is
  possible because we are sharing the same register description data structure,
  pfarg_pmr_t. The key attributes of each register are encapsulated into this
  structure. Additional PMD attributes related to sampling and multiplexing are
  off-loaded into another structure, pfarg_pmd_attr_t. This structure is
  optional and is only looked at by the kernel if the PFM_RWFL_PMD_ATTR flag is
  passed.

  For all counting applications, using pfarg_pmr_t is enough. The nice side
  effect of this split is that the cost of reading and writing PMD registers is
  now reduced because we have less data to copy in and out of the kernel.

  Contrary to what some people suggested, I have not merged the notions of PMD
  and PMC registers. I think it is cleaner to separate them out. It also makes
  it much easier to provide backward compatibility with v2.81.
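
  To put a rough number on the "less data to copy" point, here is a small
  sketch that computes how many bytes a read or write of n registers would
  move with and without the attribute vector. The flag bit values and
  PFM_PMD_BV = 4 are assumptions (neither is given in this mail), the kernel
  u16/u32/u64 types are spelled with their <stdint.h> equivalents, and
  pmr_copy_bytes() is a hypothetical helper, not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values: neither the flag bits nor PFM_PMD_BV are given here. */
#define PFM_RWFL_PMD      0x1
#define PFM_RWFL_PMC      0x2
#define PFM_RWFL_PMD_ATTR 0x4
#define PFM_PMD_BV        4

/* Compact register description shared by PMC and PMD accesses. */
typedef struct {
    uint16_t reg_num;
    uint16_t reg_set;
    uint32_t reg_flags;
    uint64_t reg_value;
} pfarg_pmr_t;

/* Optional sampling and multiplexing attributes for PMD registers. */
typedef struct {
    uint64_t reg_long_reset;
    uint64_t reg_short_reset;
    uint64_t reg_random_mask;
    uint64_t reg_smpl_pmds[PFM_PMD_BV];
    uint64_t reg_reset_pmds[PFM_PMD_BV];
    uint64_t reg_ovfl_swcnt;
    uint64_t reg_smpl_eventid;
    uint64_t reg_last_value;
    uint64_t reg_reserved[8];
} pfarg_pmd_attr_t;

/* Bytes a read or write call would have to copy for n registers. */
size_t pmr_copy_bytes(int flags, int n)
{
    size_t bytes;

    assert(n >= 0);
    bytes = (size_t)n * sizeof(pfarg_pmr_t);

    /* The attribute vector is only looked at when the flag is passed. */
    if (flags & PFM_RWFL_PMD_ATTR)
        bytes += (size_t)n * sizeof(pfarg_pmd_attr_t);

    return bytes;
}
```

  With these assumed values a pfarg_pmr_t is 16 bytes and a pfarg_pmd_attr_t
  is 176 bytes, so a counting application that never sets PFM_RWFL_PMD_ATTR
  copies only a small fraction of what a sampling application does per
  register.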

III) attaching and detaching

   With v2.81:
      int pfm_load_context(int fd, pfarg_load_t *load);
      int pfm_unload_context(int fd);

   With v3.0:
      int pfm_attach_session(int fd, int flags, int target);
      int pfm_detach_session(int fd, int flags);

   The pfarg_load_t structure has been abandoned. The information about what to
   attach to is passed as a parameter to the syscall in "target". It can either
   be a thread id or a CPU id.

   There are currently no flags defined for either call.

   Note that we have lost the ability to specify which event set is to be
   activated first. There was no actual use of this option anyway.

 IV) starting and stopping

    With v2.81:
       int pfm_start(int fd, pfarg_start_t *st);
       int pfm_stop(int fd);
       int pfm_restart(int fd);

    With v3.0:
       int pfm_start_session(int fd, int flags);
       int pfm_stop_session(int fd, int flags);

      New flags:
      PFM_STFL_RESTART: resume monitoring after an overflow notification

      The pfarg_start_t structure has been abandoned.

      The pfm_restart() syscall has been merged into pfm_start_session() by
      using the PFM_STFL_RESTART flag. It is not possible to just use
      pfm_start_session() and internally determine what to do because this is
      dependent on the sampling format.

      We have lost the ability to specify on which event set to start. I don't
      think this option was ever used.
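
      As an illustration of one entry point covering both start and restart,
      here is a toy user-space state machine. Everything in it is an
      assumption made for the sketch: the mock_* names, the three states, the
      bit value of PFM_STFL_RESTART, and the choice of EINVAL when a restart
      is requested without a pending overflow notification. The real behavior
      depends on the sampling format, as explained above.

```c
#include <assert.h>
#include <errno.h>

#define PFM_STFL_RESTART 0x1   /* hypothetical bit value */

/* Toy session states; MOCK_MASKED means "waiting after an overflow". */
enum mock_state { MOCK_STOPPED, MOCK_RUNNING, MOCK_MASKED };

static enum mock_state state = MOCK_STOPPED;

/* Simulates monitoring being masked when an overflow notification is sent. */
void mock_overflow_notification(void)
{
    if (state == MOCK_RUNNING)
        state = MOCK_MASKED;
}

/* One call for both cases: the flag tells the mock which one is meant. */
int mock_pfm_start_session(int fd, int flags)
{
    assert(fd >= 0);

    if (flags & PFM_STFL_RESTART) {
        /* Resuming only makes sense after an overflow notification. */
        if (state != MOCK_MASKED) {
            errno = EINVAL;   /* error code chosen for the sketch */
            return -1;
        }
        state = MOCK_RUNNING;
        return 0;
    }

    state = MOCK_RUNNING;     /* plain start */
    return 0;
}

int mock_pfm_stop_session(int fd, int flags)
{
    assert(fd >= 0);
    (void)flags;              /* no stop flags are defined */
    state = MOCK_STOPPED;
    return 0;
}
```

      A sampling loop would call the start function once with flags set to 0,
      then call it again with PFM_STFL_RESTART each time it has finished
      processing an overflow notification.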

   V) event set and multiplexing

      With v2.81:
         int pfm_create_evtsets(int fd, pfarg_setdesc_t *s, int n);
         int pfm_getinfo_evtsets(int fd, pfarg_setinfo_t *s, int n);
         int pfm_delete_evtsets(int fd, pfarg_setdesc_t *s, int n);

     With v3.0:
         int pfm_create_sets(int fd, int flags, pfarg_setdesc_t *s, int n);
         int pfm_getinfo_evtsets(int fd, int flags, pfarg_setinfo_t *s, int n);

     We have kept the same data structures and simply added a flags parameter
     to provide for extensibility of the calls.

     We have removed pfm_delete_evtsets() because it was not used by a lot of
     applications. We could add it back later if there is a good reason for it,
     something stronger than saying it needs to be there for symmetry.

  VI) libpfm

     The libpfm library will support v2.81 and v3.0 transparently. The examples
     have all been rewritten to the new API. The v2.81 examples are still there
     for testing and comparison purposes. They compile with no modifications.

     The perfmon.h header has been heavily restructured to accommodate v3.0,
     v2.81 and older v2.x releases as used by Cray and SiCortex or even older
     v2.0 IA-64 programs.

     When using the v2.81 API, the library is able to adapt based on the host
     kernel perfmon version. If the host kernel is using v2.81 then the calls
     are simply passed through. If instead the kernel uses v3.0, then each
     v2.81 call is mapped onto its v3.0 equivalent whenever possible. If the
     syscall has no equivalent, e.g., pfm_delete_evtsets(), then an error is
     returned. The errno value is updated correctly when using the
     v2.81 <-> v3.0 glue layer.
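
     The glue layer idea can be sketched as follows. This is not the libpfm
     source: the compat_* wrappers, the host variable, the stand-in functions
     and the bit value of PFM_STFL_RESTART are hypothetical, and ENOSYS is
     only my choice for the sketch since the actual errno value is not stated
     here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PFM_STFL_RESTART 0x1   /* hypothetical bit value */

/* Hypothetical: would be probed once from the host kernel at init time. */
enum host_api { HOST_V2_81, HOST_V3_0 };
static enum host_api host = HOST_V3_0;

/* Stand-ins for the real syscalls; they only record what was requested. */
static int v3_start_flags = -1;
static int v2_calls = 0;

static int v3_pfm_start_session(int fd, int flags)
{
    assert(fd >= 0);
    v3_start_flags = flags;
    return 0;
}

static int v2_passthrough(int fd)
{
    assert(fd >= 0);
    v2_calls++;
    return 0;
}

/* v2.81 pfm_start(): the start set in st cannot be expressed in v3.0. */
int compat_pfm_start(int fd, void *st)
{
    (void)st;
    if (host == HOST_V2_81)
        return v2_passthrough(fd);
    return v3_pfm_start_session(fd, 0);
}

/* v2.81 pfm_restart() maps onto the start call plus the restart flag. */
int compat_pfm_restart(int fd)
{
    if (host == HOST_V2_81)
        return v2_passthrough(fd);
    return v3_pfm_start_session(fd, PFM_STFL_RESTART);
}

/* No v3.0 equivalent: fail and set errno so callers can detect it. */
int compat_pfm_delete_evtsets(int fd, void *sets, int n)
{
    (void)sets;
    (void)n;
    if (host == HOST_V2_81)
        return v2_passthrough(fd);
    errno = ENOSYS;            /* error code chosen for the sketch */
    return -1;
}
```

     The point of the design is that an unmodified v2.81 program keeps calling
     the old entry points, and only the wrapper decides whether to pass the
     call through, translate it, or fail it.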

 VII) conclusions

     There are no other modifications to other parts of the API.

     I think this latest modification addresses ALL outstanding issues raised
     by LKML.

     I have presented the new API for a fully featured perfmon, with sampling,
     system-wide and multiplexing. This is not what is going to go into the
     mainline kernel initially. The plan is to use this as the development code
     and pull bits and pieces into the minimal kernel patch that I maintain
     separately and which is what I intend to post to LKML. The reason I
     modified the fully featured version is to gauge all the changes needed and
     their potential connections.

     I will soon create branches in both the GIT tree and the libpfm CVS so
     that you will be able to experiment with this API. The v3.0 API has been
     added to all architectures currently supported. They all compile. I was
     not able to test many of them for lack of hardware.

     I welcome any comments you may have about this new API.

S.Eranian

Thanks.

_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
