On Thu, 17 Aug 2017, Renato Poli wrote:
>> That's MPI busy-waiting.  In real use cases, where you have at least
>> one core for every MPI rank and you're trying to run as fast as
>> possible, I guess it's the lowest-latency thing to do, but I wish I
>> knew how to turn it off and make mpich2 or openmpi use blocking
>> waits instead.  Sometimes I have one processor stopped by gdb while
>> the others are just wasting CPU; sometimes I'd like to use N cores
>> to run N*2 MPI ranks for debugging purposes, and I'd like them not
>> to step on each other's toes...
>
> Can these processes simply exit?  In my case I simply do not need
> them anymore.
Probably not.  MPI_Finalize is a collective operation too.  So if
processor N exits without finalizing, then processor 0 will hang in
MPI_Finalize waiting to hear from it; and if processor N does try to
finalize first, then it won't be able to exit until processor 0 also
hits MPI_Finalize.  And you can't let processor 0 finalize early and
then finish working alone, because the MPI standard says "The number
of processes running after this routine is called is undefined; it is
best not to perform much more than a 'return rc' after calling
MPI_Finalize".

I wonder if MPI implementations do busy-waiting in MPI_Finalize.  On
the one hand, you'd think this is the one routine that doesn't need
sub-millisecond latency, so they could forgo the busy-wait without
hurting anybody.  On the other hand, they probably implement it in
terms of other MPI communication routines which do busy-wait for low
latency in other use cases.
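To make the constraint concrete, here's a minimal sketch in plain MPI
(not libmesh code) of why a rank can't just bail out early:

#include <mpi.h>

int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if (rank != 0)
    {
      /* This rank is out of work, but calling exit(0) here would be
       * an error: rank 0 would block (or abort) in MPI_Finalize
       * waiting to hear from us. */
    }
  else
    {
      /* ... rank 0 keeps computing ... */
    }

  /* Every rank must reach this collective call, and per the standard
   * we shouldn't do much more than return afterward. */
  MPI_Finalize();
  return 0;
}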
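As for the blocking-wait wish quoted above: I don't know of a portable
switch (Open MPI has an MCA parameter, mpi_yield_when_idle if I
remember right, that at least yields the processor between polls), but
one user-level workaround is to replace MPI_Wait with an MPI_Test loop
that sleeps between polls, trading latency for idle CPU.  A sketch,
with a made-up helper name and an arbitrary 1 ms poll interval:

#include <mpi.h>
#include <unistd.h>

/* Hypothetical helper: wait on a request without spinning at 100%
 * CPU.  MPI_Test polls once and returns immediately; if the request
 * isn't complete yet, sleep briefly before polling again. */
static void polite_wait(MPI_Request *req, MPI_Status *status)
{
  int done = 0;
  MPI_Test(req, &done, status);
  while (!done)
    {
      usleep(1000); /* ~1 ms; raise it to burn less CPU, lower it
                     * for better latency */
      MPI_Test(req, &done, status);
    }
}

---
Roy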