Roland

PAPI can work either on a whole process or on a single thread. I
believe Cactus switches PAPI to threaded mode by default. Threaded
mode works only for operating system threads (e.g. OpenMP), not for
user-level threads (e.g. FunHPC). I don't recall the details, but the
PAPI API documentation should describe this.
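
To make the distinction concrete: if the counters are per thread, a
per-rank total has to be formed by adding up what each thread read.
The sketch below shows only that bookkeeping. It makes no PAPI calls,
and the helper name sum_thread_counts and the counts array are
hypothetical, not something Cactus or PAPI provides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PAPI or Cactus): add up the event
   counts that each thread of one MPI rank read from its own counter.
   counts[i] is the value reported by thread i. */
long long sum_thread_counts(const long long *counts, size_t nthreads)
{
    assert(counts != NULL && nthreads > 0);
    long long total = 0;
    for (size_t i = 0; i < nthreads; ++i)
        total += counts[i];
    return total;
}
```

If only thread 0 were reported, the per-rank value would be counts[0]
instead of this sum, so the total over all ranks would come out too
small by roughly a factor of the thread count.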

In Cactus, we initially run some tests (probably a DGEMM) to check
whether PAPI's numbers are consistent with what we expect. The output
of these tests might help answer your question.
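
A check of this kind could look roughly like the sketch below. This is
not the actual Cactus test. It only illustrates the idea: a naive
n x n matrix multiply performs n^3 multiplications and n^3 additions,
so a counter reading can be compared against 2*n^3. All function names
and the tolerance parameter are assumptions for illustration, and the
measured value would have to come from PAPI separately.

```c
#include <assert.h>
#include <stddef.h>

/* Naive C = A * B for n x n row-major matrices. */
void naive_dgemm(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Expected operation count of naive_dgemm: n^3 multiplies + n^3 adds. */
long long dgemm_expected_flops(size_t n)
{
    long long m = (long long)n;
    return 2 * m * m * m;
}

/* Return 1 if measured lies within rel_tol (relative) of expected,
   0 otherwise. measured is assumed to come from a hardware counter. */
int count_is_consistent(long long measured, long long expected,
                        double rel_tol)
{
    assert(expected > 0 && rel_tol >= 0.0);
    double diff = (double)measured - (double)expected;
    if (diff < 0.0)
        diff = -diff;
    return diff <= rel_tol * (double)expected;
}
```

Running the multiply on all threads and comparing the reported count
with dgemm_expected_flops times the number of threads would show
whether the counts are summed. Hardware may count some operations
differently (fused multiply-add, for instance), so the tolerance and
the expected value may need adjusting.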

I think that handling multi-threading correctly requires the operating
system to cooperate. On an HPC system, the kernel might have been
modified in a way that causes problems. This is just a wild guess, though.

-erik


On Thu, Jan 12, 2017 at 12:43 PM, Roland Haas <rh...@illinois.edu> wrote:
> Hello all,
>
> does anyone know if the floating point event counts reported by PAPI
> are summed over all threads inside of an MPI rank? Or is it only the
> count on thread 0?
>
> I would hope for the former but suspect the latter.
>
> That is, if I were to run the same job on ncores cores,
> once with nranks MPI ranks and nthreads threads
> per rank and once with ncores MPI ranks and 1 thread per rank, would
> the sum over all *reported* event counts of all ranks (roughly,
> neglecting ghost zones etc) agree?
>
> Yours,
> Roland
>
> --
> My email is as private as my paper mail. I therefore support encrypting
> and signing email messages. Get my PGP key from http://keys.gnupg.net.
>
> _______________________________________________
> Users mailing list
> Users@cactuscode.org
> http://cactuscode.org/mailman/listinfo/users
>



-- 
Erik Schnetter <schnet...@cct.lsu.edu>
http://www.perimeterinstitute.ca/personal/eschnetter/
