I don't need your branch because 1) we are not doing any communication in this VecAssembly, and I added the IGNORE stuff just to make sure, and 2) I am now sure that we are catching load imbalance from another part of the code. I can now see that VecAssemblyBegin's timers do not report this type of load imbalance because the operation starts with a synchronization. This is what I suspected, but I was wondering why the timers did not report the imbalance.
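
To make the timer point concrete, here is a toy model in Python rather than PETSc code. It assumes the timed region either starts when each rank arrives or only after all ranks have synchronized. The names timed_durations, imbalance_ratio, arrivals and cost are made up for illustration, and the numbers are invented, not measurements.

```python
def timed_durations(arrival_times, op_cost, timer_after_sync):
    """Per-rank time charged to an operation that begins with a synchronization.

    arrival_times: time at which each rank reaches the operation (hypothetical).
    op_cost: work done after the synchronization, assumed equal on all ranks.
    timer_after_sync: if True, the timer starts once every rank has arrived;
                      if False, it starts when the rank itself arrives.
    """
    sync_done = max(arrival_times)   # nobody proceeds until the slowest rank arrives
    finish = sync_done + op_cost     # all ranks finish together in this model
    durations = []
    for arrival in arrival_times:
        start = sync_done if timer_after_sync else arrival
        durations.append(finish - start)
    return durations


def imbalance_ratio(durations):
    """Max/min ratio of the per-rank times, as a simple imbalance measure."""
    return max(durations) / min(durations)


# Three ranks arrive early; one is late because of imbalance elsewhere.
arrivals = [1.0, 1.0, 1.0, 4.0]
cost = 0.5

inside = timed_durations(arrivals, cost, timer_after_sync=True)
outside = timed_durations(arrivals, cost, timer_after_sync=False)

print("timer after sync: ", inside, "ratio", imbalance_ratio(inside))
print("timer before sync:", outside, "ratio", imbalance_ratio(outside))
```

In this model the timer that starts after the synchronization charges every rank the same time, so the ratio is 1 and the imbalance is invisible. The timer that starts on arrival charges the waiting to the early ranks.
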
On Fri, May 29, 2015 at 11:12 AM, Jed Brown <[email protected]> wrote:

> Mark Adams <[email protected]> writes:
>
> > Thanks, this code does not have any communication in VecAssembly so I added
> > the INGNORE stuff and added a barrier before this section of code. I am
> > suspecting that VecAssembly is catching load imbalance but not reporting it
> > for some reason.
>
> If you put a barrier before, it starts at about the same time on all
> processes. The operation contains synchronization followed by
> (scalable) point-to-point messaging. The slowness is almost certainly
> caused by the non-scalable algorithm -- I've seen it with other
> applications. So try my branch.
