Hello Frank,

thanks. That is somewhat reassuring. I also ran some experiments and consulted the PAPI thorn's source code (its clocks file). It appears to always accumulate counter values over all threads when reading out PAPI counters, so things do in fact work as hoped: the flop counter counts all flops in an MPI rank, not just those on thread zero. My threads were bound to kernel-level threads (and to cores, for that matter), since I ran my tests on Blue Waters.
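To illustrate what "accumulated over all threads" means here, a small stand-alone sketch follows. It is not the PAPI thorn's code and makes no PAPI calls. The function names (count_flops_on_thread, rank_total) and the workload sizes are hypothetical, and the per-thread counter is simply simulated by counting the operations each thread performs. The point is only that the value reported for a rank is the sum of the per-thread counts, not the count on thread zero.

```python
from concurrent.futures import ThreadPoolExecutor


def count_flops_on_thread(n_elements):
    """Simulated per-thread counter (hypothetical stand-in, no PAPI involved).

    Performs n_elements multiply-add steps and returns the number of
    floating point operations this thread executed.
    """
    acc = 0.0
    flops = 0
    for i in range(n_elements):
        acc = acc + 0.5 * i  # one multiply and one add
        flops += 2
    return flops


def rank_total(per_thread_work):
    """Run one worker per thread and accumulate the per-thread counts."""
    with ThreadPoolExecutor(max_workers=len(per_thread_work)) as pool:
        per_thread = list(pool.map(count_flops_on_thread, per_thread_work))
    # The value reported for the whole rank is the sum over all threads.
    return per_thread, sum(per_thread)


# Four "threads" with different (made-up) amounts of work.
per_thread, total = rank_total([1000, 2000, 3000, 4000])
print("per-thread counts:", per_thread)
print("rank total:", total)
```

Reading only per_thread[0] would miss the work done on the other three threads, which is the case my original question was worried about.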
Yours,
Roland

> On Thu, Jan 12, 2017 at 11:43:43AM -0600, Roland Haas wrote:
> >does anyone know if the floating point event counts reported by PAPI
> >are summed over all threads inside of a MPI rank? Or is it only the
> >count on thread 0?
>
> From the documentation the answer seems to be "it depends":
>
> In order to support threaded operation, the operating system must save and
> restore the counter hardware upon context switches among different threads or
> processes. However, OpenMP hides the concept of user and kernel level threads
> from the user. As a result, unless the user explicitly takes action to bind
> their thread to a kernel thread (sometimes called a Light Weight Process or
> LWP), the counts returned by PAPI will not necessarily be accurate.
>
> To address this situation, PAPI treats every platform as if it is running on
> top of kernel threads.
>
> Unbound, user level threads that call PAPI will function properly, but will
> most likely return unreliable or inaccurate event counts.
>
> Fortunately, in the batch environments of the HPC community, there is no
> significant advantage to user level threads and thus kernel level threads are
> the default.
>
> Frank

--
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://keys.gnupg.net.
