Paul Kapinos <kapi...@itc.rwth-aachen.de> writes:

> Hi Dave,
>
>
> On 03/06/17 18:09, Dave Love wrote:
>> I've been looking at a new version of an application (cp2k, for what
>> it's worth) which is calling mpi_alloc_mem/mpi_free_mem, and I don't
>
> Welcome to the club! :o)
> In our measurements we see some 70% of time in 'mpi_free_mem'... and 15x
> performance loss if using Open MPI vs. Intel MPI. So it goes.
>
> https://www.mail-archive.com/users@lists.open-mpi.org//msg30593.html

Ah, that didn't match my search terms.

Did cp2k's own profile not show the site of the slowdown (MP_Mem, if I
recall correctly)?  Maybe it's a different issue, especially if IMPI
surprisingly wins so much over IB -- even if it isn't subject to the
same pathology and is using better collective algorithms.  For a
previous version of cp2k, my all-free software build was reported faster
than an all-Intel build on a similar system with faster processors.

OPA performance would be interesting if you could report it, say, for a
reasonably large cp2k quickstep run, especially if IB+libfabric results
were available on the same system.  (The two people I know who were
measuring OPA were NDA'd when I last knew.)

>> think it did so in the previous version I looked at.  I found on an
>> IB-based system it's spending about half its time in those allocation
>> routines (according to its own profiling) -- a tad surprising.
>>
>> It turns out that's due to some pathological interaction with openib,
>> and just having openib loaded.  It shows up on a single-node run iff I
>> don't suppress the openib btl, and doesn't for multi-node PSM runs iff I
>> suppress openib (on a mixed Mellanox/Infinipath system).
>
> we're lucky - our issue is on Intel OmniPath (OPA) network (and we
> will junk IB hardware in near future, I think) - so we disabled the IB
> transport fallback,
> --mca btl ^tcp,openib

That's what I did, but could still run with IB under OMPI 1.10 using the
ofi mtl.
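
For anyone wanting to script a with/without comparison, here is a minimal
sketch of building the two command lines.  Only the "--mca btl ^tcp,openib"
part comes from this thread; the helper name, the process count, and the
cp2k binary and input file names are assumptions for illustration, not a
tested recipe.

```python
import shlex


def mpirun_command(app_argv, nprocs, excluded_btls=()):
    """Build an mpirun argv, optionally excluding some BTL components.

    Hypothetical helper for illustration; it only assembles the argument
    list and does not launch anything.
    """
    cmd = ["mpirun", "-np", str(nprocs)]
    if excluded_btls:
        # A leading '^' asks Open MPI to exclude the listed components
        # rather than select them.
        cmd += ["--mca", "btl", "^" + ",".join(excluded_btls)]
    return cmd + list(app_argv)


# Assumed binary and input names; substitute whatever your build uses.
app = ["cp2k.popt", "-i", "H2O-64.inp"]

baseline = mpirun_command(app, 16)
suppressed = mpirun_command(app, 16, excluded_btls=("tcp", "openib"))

# shlex.join quotes the '^' argument so the line can be pasted into a shell.
print(shlex.join(baseline))
print(shlex.join(suppressed))
```

Running both variants against the application's own profile should show
whether the time in the allocation routines follows the openib btl being
loaded.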

> For single-node jobs this will also help on plain IB nodes,
> likely. (you can disable IB if you do not use it)

Yes, I guess I wasn't clear.

I'd still like to know the basic reason for this, and whether it's
OMPI-specific, if someone can say.
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
