Hello,

I recently figured out that when running a multi-GPU MPI application (one MPI 
process per GPU) on a machine using Intel Omni-Path, you need to do the GPU 
binding before MPI initialization, according to the Intel 
documentation<https://www.intel.com/content/dam/support/us/en/documents/network-and-i-o/fabric-products/Intel_PSM2_PG_H76473_v13_0.pdf>.
If this seems correct to you, could you update your "Running CUDA-aware" web 
page accordingly? This would help people know the correct order.
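
To illustrate the order I mean, here is a minimal sketch, not a definitive 
implementation. Since the rank is not available from MPI before MPI_Init, it 
assumes the launcher exports a node-local rank in an environment variable; 
which variable is set depends on the launcher, so the list below is an 
assumption to adapt. The helper names (local_rank_from_env, device_for_rank) 
and the USE_CUDA_MPI build macro are hypothetical, chosen only for this example.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef USE_CUDA_MPI /* hypothetical macro: define it when CUDA and MPI are available */
#include <cuda_runtime.h>
#include <mpi.h>
#endif

/* Read the node-local rank from the environment, since MPI cannot be
 * queried before MPI_Init. Returns -1 if no known variable is set.
 * Assumption: the launcher exports one of these variables. */
int local_rank_from_env(void)
{
    static const char *const vars[] = {
        "OMPI_COMM_WORLD_LOCAL_RANK", /* Open MPI */
        "MV2_COMM_WORLD_LOCAL_RANK",  /* MVAPICH2 */
        "MPI_LOCALRANKID",            /* Intel MPI */
        "SLURM_LOCALID",              /* Slurm srun */
    };
    for (size_t i = 0; i < sizeof vars / sizeof vars[0]; ++i) {
        const char *value = getenv(vars[i]);
        if (value != NULL && *value != '\0') {
            char *end;
            long rank = strtol(value, &end, 10);
            if (*end == '\0' && rank >= 0)
                return (int)rank;
        }
    }
    return -1;
}

/* Map a local rank to a device index (round-robin), or -1 on bad input. */
int device_for_rank(int local_rank, int device_count)
{
    if (local_rank < 0 || device_count <= 0)
        return -1;
    return local_rank % device_count;
}

#ifdef USE_CUDA_MPI
int main(int argc, char **argv)
{
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess)
        device_count = 0;

    /* Step 1: bind this process to its GPU BEFORE MPI_Init. */
    int device = device_for_rank(local_rank_from_env(), device_count);
    if (device < 0 || cudaSetDevice(device) != cudaSuccess) {
        fprintf(stderr, "could not bind a GPU before MPI_Init\n");
        return EXIT_FAILURE;
    }

    /* Step 2: only now initialize MPI. */
    MPI_Init(&argc, &argv);

    /* ... application code using CUDA-aware MPI ... */

    MPI_Finalize();
    return EXIT_SUCCESS;
}
#endif
```

The device selection is kept in plain functions so that it can be checked 
without a GPU; the only point of the sketch is that cudaSetDevice is called 
before MPI_Init.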

Sincerely,
Thomas
