> Eric,
>
> In the 1.3 and some of the latest 1.2.X versions tuned is the default
> component for collectives. However, the tuned currently in the trunk
> are optimized for high performance networks (such as IB or MX), and
> they do not deliver the best performance on slower devices such as
> Ethernet.

I forgot to mention that the version I am running is 1.2.7. Since I am running Ethernet I know I can't expect miracles, but I was at least wondering whether I could expect some performance gain from using Allgather instead of Send/Recv, even given that context.

> In order to play with the different implementations of allgather you
> should set the following MCA parameters, either in
> $(HOME)/.openmpi/mca-params.conf or on the command line:
> 1) coll_tuned_use_dynamic_rules to one in order to enable fine-grained
> selection of the algorithms

The description wasn't too clear about its usage, thanks.

> 2) coll_tuned_allgather_algorithm to a value between 0 and 6 (read the
> output corresponding to this algorithm from 'ompi_info --param coll
> tuned' once you enabled the dynamic rules).

Since `ompi_info --param coll tuned|grep coll_tuned_allgather_algorithm` returns nothing, I'll assume it's not part of 1.2.7. I'll dig into the code to see what my options are; otherwise I'll be forced to install 1.3 ;)

>
> This will allow you to select a specific algorithm for the allgather.
> You can further tune it by playing with the fanout (in case of tree
> topologies), and with the segment size (for the pipelined ones).

Thanks!

>
> george.
>
>
> On Oct 3, 2008, at 8:48 AM, Eric Thibodeau wrote:
>
>> Hello all,
>>
>> I am currently profiling a simple case where I replace multiple S/R
>> calls with Allgather calls and it would _seem_ the simple S/R
>> calls are faster. Now, *before* I come to any conclusion on this,
>> one of the pieces I am missing is more details on how/if/when the
>> tuned coll MCA is selected. In other words, can I assume the tuned
>> versions are used by default? I skimmed through the well documented
>> source code but before I can even start to analyze the replacement's
>> impact (in a small cluster), I need to know how and when the tuned
>> coll MCA is used/selected.
>>
>> Thanks,
>>
>> Eric

Eric
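
P.S. For reference, here is roughly how George's two settings might look in $(HOME)/.openmpi/mca-params.conf. This is only a sketch: the value 3 is an arbitrary placeholder from the 0 to 6 range he mentions, and since coll_tuned_allgather_algorithm does not show up in my 1.2.7 ompi_info output, I am assuming the second line only has an effect on a version that actually provides that parameter.

```
# $(HOME)/.openmpi/mca-params.conf
# Enable fine-grained selection of the tuned collective algorithms.
coll_tuned_use_dynamic_rules = 1

# Select one allgather algorithm. 3 is an arbitrary placeholder in the
# 0-6 range; 'ompi_info --param coll tuned' should list what each value
# means once dynamic rules are enabled. Assumed unavailable in 1.2.7.
coll_tuned_allgather_algorithm = 3
```

The same settings should also be accepted on the command line, along the lines of `mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_allgather_algorithm 3 -np 4 ./my_app`, where the process count and `./my_app` are placeholders for the actual job.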