On Nov 13, 2018, at 8:52 PM, Weicheng Xue <weic...@vt.edu> wrote:
>
> I am a student whose research work includes using MPI and OpenACC to
> accelerate our in-house research CFD code on multiple GPUs. I am having a big
> issue related to the "progression of operations in MPI" and am thinking your
> inputs can be very helpful.

Someone asked me about an Open MPI + OpenACC issue this past week at the Supercomputing trade show. I'm not sure if anyone in the Open MPI development community is testing with Open MPI + OpenACC. I don't know much about it -- I would *hope* that it "just works", but I don't know that for sure.

> I am now testing the performance of overlapping communication and
> computation for a code. Communication exists between hosts (CPUs) and
> computations are done on devices (GPUs). However, in my case, the actual
> communication always starts when the computations finish. Therefore, even
> though I wrote my code in an overlapping way, there is no overlapping because
> of the OpenMPI not supporting asynchronous progression. I found that MPI
> often does progress (i.e. actually send or receive the data) only if I am
> blocking in a call to MPI_Wait (Then no overlapping occurs at all). My
> purpose is to use overlapping to hide communication latency and thus improve
> the performance of my code. Is there a way you can suggest to me? Thank you
> very much!

Nearly all transports in Open MPI support asynchronous progress -- but only some of them offer hardware- and/or OS-assisted asynchronous progress (which is probably what you are assuming). Specifically: I'm quibbling with your choice of wording, but the end effect you are observing is likely a) correct, and b) dependent upon the network transport that you are using.

> I am now using PGI/17.5 compiler and openmpi/2.0.0. A 100 Gbps
> EDR-Infiniband is used for MPI traffic. If I use "ompi_info", then info.
> about the thread support is "Thread support: posix (MPI_THREAD_MULTIPLE: yes,
> OPAL support: yes, OMPI progress: no, ORTE progress: yes, Event lib: yes)".

That's a little surprising -- IB should be one of the transports that actually supports asynchronous progress. Are you using UCX for the IB transport?
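
To make the distinction above concrete, here is a toy model in Python. It is not Open MPI code and does not reflect how any particular transport is implemented. It only illustrates what happens when a transfer can advance solely while the application is inside the library, which is the behavior described in the question. The names `ToyRequest`, `progress`, `wait`, and `run` are hypothetical. In a real MPI program, the analogous polling call would be `MPI_Test` on the outstanding request, and whether that helps in this case depends on the transport.

```python
class ToyRequest:
    """Hypothetical stand-in for a nonblocking transfer (not an MPI API)."""

    def __init__(self, total_chunks):
        self.remaining = total_chunks
        self.log = []  # label of the call during which each chunk moved

    def progress(self, label):
        """Model one entry into the library: move at most one chunk."""
        if self.remaining > 0:
            self.remaining -= 1
            self.log.append(label)
        return self.remaining == 0

    def wait(self):
        """Block until the transfer is complete, like a wait call would."""
        while not self.progress("wait"):
            pass


def run(compute_steps, chunks, poll):
    """Start a transfer, compute in slices, then wait for completion."""
    req = ToyRequest(chunks)
    for step in range(compute_steps):
        # ... one slice of computation would run here ...
        if poll:
            # Entering the library between slices lets the transfer advance.
            req.progress(f"compute-step-{step}")
    req.wait()
    return req.log


if __name__ == "__main__":
    print("no polling:  ", run(compute_steps=4, chunks=3, poll=False))
    print("with polling:", run(compute_steps=4, chunks=3, poll=True))
```

Without polling, every chunk in this model moves inside the final wait, so nothing overlaps with the computation. With polling, the chunks move between compute slices and the final wait has nothing left to do. A transport with hardware- or OS-assisted progress would not need the polling calls at all.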

--
Jeff Squyres
jsquy...@cisco.com