Hi,
Thanks for the reply.
This all sounds logical. My current leading theory is also that it has to do
with the memcpy implementation, in conjunction with all the other factors
(architecture, CPU and its details, copy size, etc.).
I tested on additional/different architectures, and there the pattern was
The idea here is that the dynamic rules are defined by an entire set of
parameters, and that we want a quick way to allow OMPI to ignore them all.
If we follow your suggestion and remove coll_tuned_use_dynamic_rules, then
turning dynamic rules on or off requires a lot of changes to the MCA file
(ins