Hi Gilles and Josh, I think my reply applies to both of your answers, which I thank you for.
On 21/10/15 08:31 PM, Gilles Gouaillardet wrote:
Eric, #2 maybe not ... a tree-based approach has O(log(n)) scaling (compared to O(n) scaling with your linear method), so at scale, MPI_Igather will hopefully scale better (especially if you are transmitting small messages).
I see. Now, please don't blame me for not reading the MPI standard, but: is it, or should it be, guaranteed by the standard? If not, isn't it repetitive work for all MPI users to re-implement a (debugged) tree-based approach in all of their code?
In other words, if everybody knows that to scale well you have to program a tree-based approach for the communication, why isn't it in the standard?
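Just to make sure we are talking about the same thing, here is a minimal sketch (buffer sizes, tags and names are only illustrative, not our actual code) of our linear Isend/Irecv gather next to MPI_Igather:

#include <mpi.h>
#include <stdlib.h>

/* Hand-rolled linear gather: the root posts one Irecv per peer (O(n)
   requests), whereas MPI_Igather lets the library use its own, typically
   tree-based, algorithm. */
static void linear_igather(double *sendval, double *recvbuf,
                           int rank, int size, MPI_Request *reqs)
{
    if (rank == 0) {
        recvbuf[0] = *sendval;
        for (int src = 1; src < size; ++src)
            MPI_Irecv(&recvbuf[src], 1, MPI_DOUBLE, src, 0,
                      MPI_COMM_WORLD, &reqs[src - 1]);
    } else {
        MPI_Isend(sendval, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &reqs[0]);
    }
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double myval = (double)rank;
    double *recvbuf = malloc(size * sizeof(double));
    MPI_Request *reqs = malloc(size * sizeof(MPI_Request));

    /* Variant 1: our linear approach (one request per peer on the root). */
    int nreq = (rank == 0) ? size - 1 : 1;
    linear_igather(&myval, recvbuf, rank, size, reqs);
    MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);

    /* Variant 2: the non-blocking collective; the algorithm is up to the
       library. */
    MPI_Request req;
    MPI_Igather(&myval, 1, MPI_DOUBLE, recvbuf, 1, MPI_DOUBLE,
                0, MPI_COMM_WORLD, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    free(recvbuf);
    free(reqs);
    MPI_Finalize();
    return 0;
}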
#3 is a difficult question ... First, keep in mind there is currently no progress thread in Open MPI. That means messages can be received only when MPI_Wait* or MPI_Test* is invoked. You might hope messages are received while doing some
OK! So it may be different with another MPI implementation...?
computation (overlap of computation and communication), but unfortunately that does not happen most of the time. Linear gather does not scale well (see my previous comment), plus you (or Open MPI) might malloc some space "under the hood", so MPI_Igather will hopefully scale better.
That is something I was asking myself about... will I over-allocate memory with all our MPI_Isend/Irecv calls...? A simple test with the code I sent in the first mail shows a small extra use of memory, but I haven't gone very far with the testing yet...
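To make the progress point concrete for myself, here is a minimal sketch (do_some_element_work() is just a hypothetical stand-in for our finite element work) of polling MPI_Test inside the computation loop so the transfer actually advances in the absence of a progress thread:

#include <mpi.h>

/* Hypothetical stand-in for a chunk of finite element work. */
static double do_some_element_work(int i) { return (double)i * 0.5; }

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double sendval = 42.0, recvval = 0.0, acc = 0.0;
    MPI_Request req = MPI_REQUEST_NULL;

    /* Post one non-blocking transfer between ranks 0 and 1 (needs >= 2
       ranks; otherwise nothing is posted). */
    if (size >= 2 && rank == 0)
        MPI_Irecv(&recvval, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
    else if (size >= 2 && rank == 1)
        MPI_Isend(&sendval, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);

    /* Without a progress thread, calling MPI_Test inside the computation
       loop is what lets the message advance. */
    int done = (req == MPI_REQUEST_NULL);
    for (int i = 0; i < 1000; ++i) {
        acc += do_some_element_work(i);            /* application work    */
        if (!done)
            MPI_Test(&req, &done, MPI_STATUS_IGNORE);
    }
    if (!done)
        MPI_Wait(&req, MPI_STATUS_IGNORE);         /* guarantee completion */

    (void)acc;
    MPI_Finalize();
    return 0;
}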
Is there any hard reason why you are using non-blocking collectives?
No. Our home-made non-blocking collective is just an initial design that is still used in our code, but I want enough good reasons before changing it (or not) to non-blocking or blocking collective calls...
If your application is known to be highly asynchronous, some messages might arrive (way) later than others, and the computation is quite expensive, then your approach might be a good fit. If your application is pretty synchronous, and the cost of the computation that might overlap with communication is not significant, your approach might have little benefit and poor scalability, so MPI_Gather (not MPI_Igather, since you might have no computation that could overlap with communication) might be a better choice.
Good question. It is a finite element code. Some of the work may be highly asynchronous, but other parts are not...
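If it turns out we have no useful computation to overlap, I understand the simplest form would be the plain blocking collective, roughly like this sketch (the local value is just illustrative):

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double myval = (double)rank;   /* stand-in for a locally computed value */
    double *recvbuf = (rank == 0) ? malloc(size * sizeof(double)) : NULL;

    /* Blocking gather: nothing to poll or wait on, and the library still
       picks its own (typically tree-based) algorithm internally. */
    MPI_Gather(&myval, 1, MPI_DOUBLE, recvbuf, 1, MPI_DOUBLE,
               0, MPI_COMM_WORLD);

    free(recvbuf);
    MPI_Finalize();
    return 0;
}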
Do you have any suggestions for good further reading on this matter?
Thanks, Eric