Samuel,

I am a developer of Fujitsu MPI. Thanks for using the K computer. For official support, please consult the K helpdesk, as Gilles said. The helpdesk may have information based on past inquiries. If not, the inquiry will be forwarded to our team.
As others have said, the Fujitsu MPI used on K is based on an old version of Open MPI (v1.6.3 with bug fixes). We have no plan to update it to a newer version because the system software is in a maintenance phase.

At first glance, I also suspect the cost of the multiple allreduce operations.

Takahiro Kawashima,
MPI development team,
Fujitsu

> Some of my collaborators have had issues with one of my benchmarks at high
> concurrency (82K MPI procs) on the K machine in Japan. I believe K uses
> OpenMPI and the issue has been tracked to time in MPI_Comm_dup/Comm_split
> increasing quadratically with process concurrency. At 82K processes, each
> call to dup/split is taking 15s to complete. These high times restrict
> comm_split/dup to be used statically (at the beginning) and not dynamically
> in an application.
>
> I had a similar issue a few years ago on ANL/Mira/MPICH where they called
> qsort to split the ranks. Although qsort/quicksort has ideal computational
> complexity of O(PlogP) [P is the number of MPI ranks], it can have worst
> case complexity of O(P^2)... at 82K, P/logP is a 5000x slowdown.
>
> Can you confirm whether qsort (or the like) is (still) used in these routines
> in OpenMPI? It seems mergesort (worst case complexity of PlogP) would be a
> more scalable approach. I have not observed this issue on the Cray MPICH
> implementation and the Mira MPICH issues have since been resolved.

_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel
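
For illustration only, the mergesort suggestion in the quoted message could be sketched in C as below. This is not the code used in Open MPI, Fujitsu MPI, or MPICH; the `split_entry` record and the `merge_sort_entries` function are hypothetical names made up for this sketch. It assumes each rank contributes a (color, key, parent rank) triple and orders the triples by color, then key, then parent rank, using a bottom-up merge sort whose worst case is O(P log P).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one entry per rank taking part in the split. */
typedef struct {
    int color;  /* color argument passed to MPI_Comm_split */
    int key;    /* key argument passed to MPI_Comm_split */
    int rank;   /* rank in the parent communicator, used as tie-breaker */
} split_entry;

/* Returns nonzero if a sorts before or equal to b (color, key, rank). */
static int entry_less_or_equal(const split_entry *a, const split_entry *b)
{
    if (a->color != b->color) return a->color < b->color;
    if (a->key != b->key) return a->key < b->key;
    return a->rank <= b->rank;
}

/* Merge the sorted runs [lo, mid) and [mid, hi) of src into dst. */
static void merge_runs(const split_entry *src, split_entry *dst,
                       size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        /* Taking the left element on ties keeps the sort stable. */
        dst[k++] = entry_less_or_equal(&src[i], &src[j]) ? src[i++] : src[j++];
    }
    while (i < mid) dst[k++] = src[i++];
    while (j < hi) dst[k++] = src[j++];
}

/* Bottom-up merge sort. Returns 0 on success, -1 if allocation fails. */
int merge_sort_entries(split_entry *entries, size_t n)
{
    assert(entries != NULL || n == 0);
    if (n < 2) return 0;

    split_entry *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) return -1;

    split_entry *src = entries, *dst = tmp;
    for (size_t width = 1; width < n; width *= 2) {
        for (size_t lo = 0; lo < n; lo += 2 * width) {
            size_t mid = lo + width < n ? lo + width : n;
            size_t hi = lo + 2 * width < n ? lo + 2 * width : n;
            merge_runs(src, dst, lo, mid, hi);
        }
        /* The freshly merged data becomes the input of the next pass. */
        split_entry *swap = src;
        src = dst;
        dst = swap;
    }

    /* Copy back if the final pass left the result in the scratch buffer. */
    if (src != entries) memcpy(entries, src, n * sizeof *entries);
    free(tmp);
    return 0;
}
```

Unlike a quicksort with an unlucky pivot choice, this sort does the same number of passes regardless of the input order, at the cost of one scratch buffer of P entries.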