Stephane, could you apply your latest patch set to a perfmon3 development branch in your git repository at kernel.org? This would make review and testing easier.
Thanks,

-Robert

On 23.09.08 23:32:24, stephane eranian wrote:
> Hello,
>
> As you may know, I have been working on redesigning the perfmon2 system
> call API to address some issues raised by the LKML review a couple of
> months back.
>
> I have made some good progress on this and I'd like to share the current
> design with you.
>
> There were three major goals in the redesign:
>
> - minimize the number of system calls. Perfmon2 v2.x has 12 syscalls.
>   This was considered quite a lot for a single subsystem.
>
> - make sure we could extend the syscall API without necessarily adding
>   new syscalls. That means having flexibility in each syscall and also
>   in the structures passed in syscalls.
>
> - make sure it would be possible to use existing v2.0 (IA-64) and also
>   v2.x (x>0) applications with, at worst, a recompilation. Unlike v2.0
>   (IA-64 single syscall API), the compatibility would have to be
>   provided by user level code, e.g., libpfm. Obviously statically
>   linked applications would have to be recompiled.
>
> I am happy to report that I have successfully fulfilled those three
> goals. Here is what the new API looks like now:
>
> - down to 8 system calls, all with different names from v2.81
> - several data structures, pfarg_*, have been removed
> - full compatibility with v2.81 except for one system call
>   (pfm_delete_evtsets)
> - all system calls are extensible via a flags parameter
> - all structures have reserved fields
> - the context abstraction has been replaced with the notion of a session.
>
> Here are the details:
>
> I) session creation
>
> With v2.81:
>     int pfm_create_context(pfarg_ctx_t *ctx, char *smpl_name,
>                            void *smpl_arg, size_t smpl_size);
>
> With v3.0:
>     int pfm_create_session(int flags);
>     int pfm_create_session(int flags, char *smpl_name,
>                            void *smpl_arg, size_t smpl_size);
>
> New flags:
>     PFM_FL_SMPL_FMT : indicates that a sampling format is used and
>                       that 3 additional parameters are passed
>
> The pfarg_ctx_t structure has been abandoned. The flags parameter is
> used very much like for the open(2) syscall to indicate that additional
> (optional) parameters are passed.
>
> All old flags are preserved.
>
> The call still returns the file descriptor uniquely identifying the
> session.
>
> Just like a context, a session can monitor either a thread or a CPU.
>
> II) programming the registers
>
> With v2.81:
>     int pfm_write_pmcs(int fd, pfarg_pmc_t *pmcs, int n);
>     int pfm_write_pmds(int fd, pfarg_pmd_t *pmds, int n);
>     int pfm_read_pmds(int fd, pfarg_pmd_t *pmds, int n);
>
> With v3.0:
>     int pfm_write_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n);
>     int pfm_write_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n,
>                        pfarg_pmd_attr_t *pmas);
>
>     int pfm_read_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n);
>     int pfm_read_pmrs(int fd, int flags, pfarg_pmr_t *pmrs, int n,
>                       pfarg_pmd_attr_t *pmas);
>
> New structures:
>
>     typedef struct {
>         u16 reg_num;
>         u16 reg_set;
>         u32 reg_flags;
>         u64 reg_value;
>     } pfarg_pmr_t;
>
>     typedef struct {
>         u64 reg_long_reset;
>         u64 reg_short_reset;
>         u64 reg_random_mask;
>         u64 reg_smpl_pmds[PFM_PMD_BV];
>         u64 reg_reset_pmds[PFM_PMD_BV];
>         u64 reg_ovfl_swcnt;
>         u64 reg_smpl_eventid;
>         u64 reg_last_value;
>         u64 reg_reserved[8];
>     } pfarg_pmd_attr_t;
>
> New flags:
>     PFM_RWFL_PMD     : pmrs contains PMD register descriptions
>     PFM_RWFL_PMC     : pmrs contains PMC register descriptions
>     PFM_RWFL_PMD_ATTR: an additional vector is passed in pmas
>
> We now use only 2 system calls to read and write the PMU registers.
> This is possible because we are sharing the same register description
> data structure, pfarg_pmr_t. The key attributes of each register are
> encapsulated into this structure. Additional PMD attributes related to
> sampling and multiplexing are off-loaded into another structure,
> pfarg_pmd_attr_t. This structure is optional and is only looked at by
> the kernel if the PFM_RWFL_PMD_ATTR flag is passed.
>
> For all counting applications, using pfarg_pmr_t is enough. The nice
> side effect of this split is that the cost of reading and writing PMD
> registers is now reduced because we have less data to copy in and out
> of the kernel.
>
> Contrary to what some people suggested, I have not merged the notions
> of PMD and PMC registers. I think it is cleaner to separate them out.
> It also makes it much easier to provide backward compatibility with
> v2.81.
>
> III) attaching and detaching
>
> With v2.81:
>     int pfm_load_context(int fd, pfarg_load_t *load);
>     int pfm_unload_context(int fd);
>
> With v3.0:
>     int pfm_attach_session(int fd, int flags, int target);
>     int pfm_detach_session(int fd, int flags);
>
> The pfarg_load_t structure has been abandoned. The information about
> what to attach to is passed as a parameter to the syscall in "target".
> It can either be a thread id or a CPU id.
>
> There are currently no flags defined for either call.
>
> Note that we have lost the ability to specify which event set is to be
> activated first. There was no actual use of this option anyway.
>
> IV) starting and stopping
>
> With v2.81:
>     int pfm_start(int fd, pfarg_start_t *st);
>     int pfm_stop(int fd);
>     int pfm_restart(int fd);
>
> With v3.0:
>     int pfm_start_session(int fd, int flags);
>     int pfm_stop_session(int fd, int flags);
>
> New flags:
>     PFM_STFL_RESTART: resume monitoring after an overflow notification
>
> The pfarg_start_t structure has been abandoned.
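Going back to section II for a moment: to check my reading of the optional pmas argument, here is a minimal user-space sketch of the open(2)-style convention, where the extra vector is only fetched when PFM_RWFL_PMD_ATTR is set. This is not the kernel or libpfm code. demo_write_pmrs(), demo_pmd_attr_t (a trimmed stand-in for pfarg_pmd_attr_t), the flag values and the mock register arrays are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values are assumptions; the post names the flags, not their values. */
#define PFM_RWFL_PMD      0x1
#define PFM_RWFL_PMC      0x2
#define PFM_RWFL_PMD_ATTR 0x4

/* Register description as given in the post (u16/u32/u64 mapped to stdint). */
typedef struct {
    uint16_t reg_num;
    uint16_t reg_set;
    uint32_t reg_flags;
    uint64_t reg_value;
} pfarg_pmr_t;

/* Hypothetical, trimmed stand-in for pfarg_pmd_attr_t. */
typedef struct {
    uint64_t reg_long_reset;
    uint64_t reg_short_reset;
} demo_pmd_attr_t;

/* Mock register files standing in for the PMU (hypothetical). */
#define DEMO_NUM_REGS 8
static uint64_t demo_pmc[DEMO_NUM_REGS];
static uint64_t demo_pmd[DEMO_NUM_REGS];
static uint64_t demo_long_reset[DEMO_NUM_REGS];

/* Returns 0 on success, -1 with errno set on error. */
int demo_write_pmrs(int fd, int flags, const pfarg_pmr_t *pmrs, int n, ...)
{
    const demo_pmd_attr_t *pmas = NULL;
    int is_pmd = (flags & PFM_RWFL_PMD) != 0;
    int is_pmc = (flags & PFM_RWFL_PMC) != 0;
    int i;

    (void)fd; /* there is no real session behind this sketch */

    /* Exactly one register class per call. */
    if (is_pmd == is_pmc || n < 0 || (n > 0 && pmrs == NULL)) {
        errno = EINVAL;
        return -1;
    }

    /* The optional vector is read only when the flag announces it. */
    if (flags & PFM_RWFL_PMD_ATTR) {
        va_list ap;

        if (!is_pmd) {          /* attributes only make sense for PMDs */
            errno = EINVAL;
            return -1;
        }
        va_start(ap, n);
        pmas = va_arg(ap, demo_pmd_attr_t *);
        va_end(ap);
        if (pmas == NULL) {
            errno = EINVAL;
            return -1;
        }
    }

    for (i = 0; i < n; i++) {
        uint16_t r = pmrs[i].reg_num;

        if (r >= DEMO_NUM_REGS) {
            errno = EINVAL;
            return -1;
        }
        if (is_pmc) {
            demo_pmc[r] = pmrs[i].reg_value;
        } else {
            demo_pmd[r] = pmrs[i].reg_value;
            if (pmas != NULL)
                demo_long_reset[r] = pmas[i].reg_long_reset;
        }
    }
    return 0;
}
```

A counting tool would call demo_write_pmrs(fd, PFM_RWFL_PMD, pmds, n) and never touch the attribute structure, while a sampling tool would add PFM_RWFL_PMD_ATTR and pass the attribute vector as a fifth argument. If the real calls work differently, please correct me.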
>
> The pfm_restart() syscall has been merged with pfm_start() by using the
> PFM_STFL_RESTART flag. It is not possible to just use
> pfm_start_session() and internally determine what to do because this is
> dependent on the sampling format.
>
> We have lost the ability to specify on which event set to start. I
> don't think this option was ever used.
>
> V) event set and multiplexing
>
> With v2.81:
>     int pfm_create_evtsets(int fd, pfarg_setdesc_t *s, int n);
>     int pfm_getinfo_evtsets(int fd, pfarg_setinfo_t *s, int n);
>     int pfm_delete_evtsets(int fd, pfarg_setdesc_t *s, int n);
>
> With v3.0:
>     int pfm_create_sets(int fd, int flags, pfarg_setdesc_t *s, int n);
>     int pfm_getinfo_evtsets(int fd, int flags, pfarg_setinfo_t *s, int n);
>
> We have kept the same data structures and simply added a flags
> parameter to provide for extensibility of the calls.
>
> We have removed pfm_delete_evtsets() because it was not used by a lot
> of applications. We could add it back later if there is a good reason
> for it, something stronger than saying it needs to be there for
> symmetry.
>
> VI) libpfm
>
> The libpfm library will support v2.81 and v3.0 transparently. The
> examples have all been rewritten to the new API. The v2.81 examples are
> still there for testing and comparison purposes. They compile with no
> modifications.
>
> The perfmon.h header has been heavily restructured to accommodate v3.0,
> v2.81 and older v2.x releases as used by Cray and SiCortex or even
> older v2.0 IA-64 programs.
>
> When using the v2.81 API, the library is able to adapt based on the
> host kernel perfmon version. If the host kernel is using v2.81 then the
> calls are simply passed through. If instead, the kernel uses v3.0, then
> each v2.81 call is mapped onto its equivalent v3.0 call whenever
> possible. If the syscall has no equivalent, e.g., pfm_delete_evtsets(),
> then an error is returned. The errno value is updated correctly when
> using the v2.81 <-> v3.0 glue layer.
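For reviewers trying to picture the glue layer, this is roughly how I imagine the dispatch, as a self-contained sketch with mock entry points instead of real syscalls. Everything named demo_*, mock_* or compat_* is hypothetical, the value of PFM_STFL_RESTART is assumed, and ENOSYS is my guess for the "no equivalent" case: the post only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PFM_STFL_RESTART 0x1 /* assumed value; the post only names the flag */

/* Which API the host kernel speaks (hypothetical detection result). */
enum demo_host_api { DEMO_HOST_V2_81, DEMO_HOST_V3_0 };
static enum demo_host_api demo_host = DEMO_HOST_V3_0;

/* Bookkeeping so the effect of each mock "syscall" can be inspected. */
static int demo_last_api;        /* 2 or 3, set by the mocks below */
static int demo_last_flags = -1; /* flags seen by the last v3.0 mock */

/* Mock v2.81 entry points. */
static int mock_v2_restart(int fd) { (void)fd; demo_last_api = 2; return 0; }
static int mock_v2_stop(int fd)    { (void)fd; demo_last_api = 2; return 0; }
static int mock_v2_delete_evtsets(int fd, void *s, int n)
{
    (void)fd; (void)s; (void)n;
    demo_last_api = 2;
    return 0;
}

/* Mock v3.0 entry points. */
static int mock_v3_start_session(int fd, int flags)
{
    (void)fd;
    demo_last_api = 3;
    demo_last_flags = flags;
    return 0;
}
static int mock_v3_stop_session(int fd, int flags)
{
    (void)fd;
    demo_last_api = 3;
    demo_last_flags = flags;
    return 0;
}

/* v2.81 pfm_restart(): pass through, or map onto start + RESTART flag. */
int compat_pfm_restart(int fd)
{
    if (demo_host == DEMO_HOST_V2_81)
        return mock_v2_restart(fd);
    return mock_v3_start_session(fd, PFM_STFL_RESTART);
}

/* v2.81 pfm_stop(): pass through, or map onto stop with no flags. */
int compat_pfm_stop(int fd)
{
    if (demo_host == DEMO_HOST_V2_81)
        return mock_v2_stop(fd);
    return mock_v3_stop_session(fd, 0);
}

/* v2.81 pfm_delete_evtsets(): no v3.0 equivalent, so fail with errno set. */
int compat_pfm_delete_evtsets(int fd, void *sets, int n)
{
    if (demo_host == DEMO_HOST_V2_81)
        return mock_v2_delete_evtsets(fd, sets, n);
    errno = ENOSYS; /* assumption: the actual errno value is not stated */
    return -1;
}
```

With demo_host set to DEMO_HOST_V3_0, compat_pfm_restart(fd) ends up in the start call with PFM_STFL_RESTART, and compat_pfm_delete_evtsets() fails with errno set. With DEMO_HOST_V2_81 every call is passed through unchanged.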
>
> VII) conclusions
>
> There are no other modifications to other parts of the API.
>
> I think this ultimate modification addresses ALL outstanding issues
> raised by LKML.
>
> I have presented the new API for a fully featured perfmon, with
> sampling, system-wide monitoring and multiplexing. This is not what is
> going to go into the mainline kernel initially. The plan is to use this
> as the development code and pull bits and pieces into the minimal
> kernel patch that I maintain separately and which is what I intend to
> post to LKML. The reason I modified the fully featured version is to
> gauge all the changes needed and their potential connections.
>
> I will soon create branches in both the GIT tree and libpfm CVS so that
> you will be able to experiment with this API. The v3.0 API has been
> added to all architectures currently supported. They all compile. I was
> not able to test many of them for lack of hardware.
>
> I welcome any comments you may have about this new API.
>
> S.Eranian
>
> Thanks.
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> perfmon2-devel mailing list
> perfmon2-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
>

-- 
Advanced Micro Devices, Inc.
Operating System Research Center
email: [EMAIL PROTECTED]