Re: [OMPI devel] Memory performance with Bcast

2019-03-21 Thread marcin.krotkiewski
Thanks, George! So the function you mentioned is used when I turn off HCOLL and use Open MPI's tuned coll instead. That helps a lot. Another thing that makes me wonder is that in my case the data is sent to the targets asynchronously, or rather, it is by nature a 'put' operation, and the target

Re: [OMPI devel] Memory performance with Bcast

2019-03-21 Thread Joshua Ladd
Marcin, HPC-X implements the MPI BCAST operation by leveraging hardware multicast capabilities. Starting with HPC-X v2.3, we introduced a new multicast-based algorithm for large messages as well. Hardware multicast scales as O(1) modulo switch hops. It is the most efficient way to broadcast a messa

Re: [OMPI devel] Memory performance with Bcast

2019-03-21 Thread George Bosilca
Marcin, I am not sure I understand your question. A bcast is a collective operation that must be posted by all participants. Regardless of the level at which the bcast is serviced, if some of the participants have not posted their participation to the collective, only partial progress can be made. G