Thanks, George! So, the function you mentioned is used when I turn off
HCOLL and use Open MPI's tuned coll instead. That helps a lot. Another
thing that makes me wonder is that in my case the data is sent to the
targets asynchronously, or rather, it is a 'put' operation in nature,
and the target
Marcin,
HPC-X implements the MPI_Bcast operation by leveraging hardware multicast
capabilities. Starting with HPC-X v2.3, we introduced a new multicast-based
algorithm for large messages as well. Hardware multicast scales as O(1)
modulo switch hops. It is the most efficient way to broadcast a messa
Marcin,
I am not sure I understand your question. A bcast is a collective operation
that must be posted by all participants. Regardless of the level at which
the bcast is serviced, if some of the participants have not posted their
participation in the collective, only partial progress can be made.
G
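
To illustrate the point about partial progress, here is a toy model in plain Python. It does not call MPI. It assumes a binomial-tree broadcast rooted at rank 0, which is only one possible algorithm and not necessarily what Open MPI's tuned coll or HCOLL would select, and it assumes that a rank can forward data only after it has posted the bcast itself. The names binomial_parent and simulate_bcast are hypothetical and exist only for this sketch.

```python
def binomial_parent(rank):
    """Parent of `rank` (rank >= 1) in a binomial tree rooted at 0.

    The parent is obtained by clearing the highest set bit of the rank.
    """
    return rank & ~(1 << (rank.bit_length() - 1))


def simulate_bcast(num_ranks, posted):
    """Return the set of ranks that end up holding the broadcast data.

    `posted` is the set of ranks that have entered the bcast call.
    Assumption of this toy model: a rank obtains the data only if it
    has posted and its parent already holds the data.
    """
    have_data = set()
    # A parent always has a smaller rank than its children, so a single
    # pass in ascending order visits every parent before its children.
    for rank in range(num_ranks):
        if rank not in posted:
            continue
        if rank == 0 or binomial_parent(rank) in have_data:
            have_data.add(rank)
    return have_data


if __name__ == "__main__":
    everyone = set(range(8))
    print(sorted(simulate_bcast(8, everyone)))        # all ranks posted
    print(sorted(simulate_bcast(8, everyone - {1})))  # rank 1 is late
```

In this model, with eight ranks and rank 1 late, ranks 3, 5 and 7 are stalled behind rank 1 even though they have posted, while ranks 0, 2, 4 and 6 complete. That is the "partial progress" case: the collective advances only along the parts of the tree whose participants have all arrived.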