Dear Matthieu, Rolf,
Thank you!
But normally CUDA device selection is based on the MPI process index. So
the CUDA context must be created at a point where the MPI index is not yet
available. What is the best practice for process<->GPU mapping in this
case? Or can I select any device prior to MPI_Init and later change to
To add to this: yes, we recommend that a CUDA context exist prior to the
call to MPI_Init. During MPI_Init the library attempts to register some
internal buffers with the CUDA library, and that registration requires a
CUDA context to exist already. Note that
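
One possible way to pick a device before MPI_Init, sketched here under stated assumptions: the job is started by the Open MPI launcher, which exports the node-local rank in the environment variable OMPI_COMM_WORLD_LOCAL_RANK, so the rank can be read without calling any MPI function. The helper names local_rank_from_env and device_for_rank, the round-robin mapping, and the WITH_CUDA_AND_MPI build macro are illustrative and not part of Open MPI or CUDA.

```c
#include <limits.h>
#include <stdlib.h>

/* Node-local rank as exported by the Open MPI launcher (assumption: the
 * job was started with mpirun/orterun). Returns -1 if the variable is
 * missing or is not a non-negative integer. */
static int local_rank_from_env(void)
{
    const char *s = getenv("OMPI_COMM_WORLD_LOCAL_RANK");
    if (s == NULL || *s == '\0')
        return -1;
    char *end = NULL;
    long v = strtol(s, &end, 10);
    if (*end != '\0' || v < 0 || v > INT_MAX)
        return -1;
    return (int)v;
}

/* Hypothetical helper: round-robin mapping of local ranks onto the
 * devices of one node. Falls back to device 0 when the rank is unknown,
 * and returns -1 when there is no device at all. */
static int device_for_rank(int local_rank, int device_count)
{
    if (device_count <= 0)
        return -1;
    if (local_rank < 0)
        return 0;
    return local_rank % device_count;
}

#ifdef WITH_CUDA_AND_MPI   /* illustrative macro: build with nvcc + MPI */
#include <cuda_runtime.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int ndev = 0;
    if (cudaGetDeviceCount(&ndev) != cudaSuccess || ndev == 0)
        return 1;

    /* Select the device, then force context creation before MPI_Init. */
    cudaSetDevice(device_for_rank(local_rank_from_env(), ndev));
    cudaFree(0);

    MPI_Init(&argc, &argv);
    /* ... application code ... */
    MPI_Finalize();
    return 0;
}
#endif
```

With this ordering the context already exists when MPI_Init runs, and each process on a node lands on a device derived from its local rank rather than its global MPI index. Other launchers may export the local rank under a different variable name, so the getenv call would need adjusting there.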
Dear colleagues,
For the GPU Winter School, powered by the Moscow State University cluster
"Lomonosov", OpenMPI 1.7 was built to test and popularize the CUDA
capabilities of MPI. There is one strange warning I cannot understand:
the OpenMPI runtime suggests initializing CUDA prior to MPI_Init. Sorry,
but how