Adam -

There are a couple of theoretical limits on how many requests you can have outstanding (at some point, you will run the host out of memory). However, hitting those limits should cause a failure when posting the MPI_Isend or MPI_Irecv, not during MPI_Waitall. 2.1.0 is pretty old; the first step in further debugging is to upgrade to one of the recent releases (3.1.3 or 4.0.0) and verify that the bug still exists.
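One way to confirm where the failure actually originates is to switch the communicator's error handler from the default MPI_ERRORS_ARE_FATAL to MPI_ERRORS_RETURN and check return codes at post time. This is a minimal sketch (not code from this thread, and it needs an MPI install to compile and run under mpirun):

```c
/* Sketch: with MPI_ERRORS_RETURN set, a resource-related failure
 * surfaces as a nonzero return code from the MPI_Isend/MPI_Irecv
 * that hit the limit, rather than later inside MPI_Waitall. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    char buf[100];
    MPI_Request req;
    int rc = MPI_Irecv(buf, sizeof buf, MPI_CHAR, MPI_ANY_SOURCE, 0,
                       MPI_COMM_WORLD, &req);
    if (rc != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(rc, msg, &len);
        fprintf(stderr, "MPI_Irecv failed: %s\n", msg);
        MPI_Abort(MPI_COMM_WORLD, rc);
    }

    MPI_Cancel(&req);   /* nothing will match this receive in the sketch */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    MPI_Finalize();
    return 0;
}
```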
Brian

> On Dec 16, 2018, at 6:52 AM, Adam Sylvester <op8...@gmail.com> wrote:
>
> I'm running OpenMPI 2.1.0 on RHEL 7 using TCP communication. For the
> specific run that's crashing on me, I'm running with 17 ranks (on 17
> different physical machines). I've got a stage in my application where ranks
> need to transfer chunks of data where the size of each chunk is trivial (on
> the order of 100 MB) compared to the overall imagery. However, the chunks
> are spread out across many buffers in a way that makes the indexing
> complicated (and the memory is not all within a single buffer)... the
> simplest way to express the data movement in code is a large number of
> MPI_Isend() and MPI_Irecv() calls followed, of course, by an eventual
> MPI_Waitall(). This works fine in many cases, but I've now run into a case
> where the chunks are imbalanced such that a few ranks have a total of ~450
> MPI_Request objects (I do a single MPI_Waitall() with all requests at once)
> and the remaining ranks have < 10 MPI_Requests. In this scenario, I get a
> seg fault inside PMPI_Waitall().
>
> Is there an implementation limit on how many asynchronous requests are
> allowed? Is there a way to query it, either via a #define value or a
> runtime call? I probably won't go this route, but is there a configure
> option to increase it when initially compiling OpenMPI?
>
> I've done a fair amount of debugging and am pretty confident this is where
> the error is occurring, as opposed to indexing out of bounds somewhere, but
> if there is no such limit in OpenMPI, that would be useful to know too.
>
> Thanks.
> -Adam
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
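If memory pressure from outstanding requests turns out to be the issue, a common workaround (not something prescribed in this thread) is to complete the nonblocking operations in bounded batches rather than one MPI_Waitall over all ~450 requests. A hypothetical sketch, with MAX_OUTSTANDING and the recv_chunks helper being assumed names:

```c
/* Sketch: post nonblocking receives in bounded batches so the number
 * of outstanding MPI_Request objects (and the buffers behind them)
 * stays capped, instead of waiting on all requests at once. */
#include <mpi.h>

#define MAX_OUTSTANDING 64   /* assumed cap; tune for the host */

/* Receive one chunk per request from srcs[i] into bufs[i],
 * draining the request array whenever the cap is reached. */
static void recv_chunks(char **bufs, const int *counts,
                        const int *srcs, int nchunks)
{
    MPI_Request reqs[MAX_OUTSTANDING];
    int posted = 0;

    for (int i = 0; i < nchunks; i++) {
        MPI_Irecv(bufs[i], counts[i], MPI_CHAR, srcs[i], /*tag=*/i,
                  MPI_COMM_WORLD, &reqs[posted++]);
        if (posted == MAX_OUTSTANDING) {
            MPI_Waitall(posted, reqs, MPI_STATUSES_IGNORE);
            posted = 0;
        }
    }
    if (posted > 0)
        MPI_Waitall(posted, reqs, MPI_STATUSES_IGNORE);
}
```

The trade-off is less overlap between communication and computation, but the per-rank request count never exceeds the cap regardless of how imbalanced the chunk distribution is.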