Re: [OMPI users] growing memory use from MPI application

2019-06-21 Thread Noam Bernstein via users
> On Jun 21, 2019, at 9:57 PM, Carlson, Timothy S wrote:
> 
> Switch back to stock OFED?   

Well, the CentOS-included OFED has a memory leak (at least when using UCX).  I 
haven't tried the upstream OFED stack yet.

> 
> Make sure all your cards are patched to the latest firmware.   

That's a good idea.  I'll try that.  If only SuperMicro didn't make it so 
difficult to find the correct firmware.

Noam


||
|U.S. NAVAL|
|_RESEARCH_|
LABORATORY
Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628  F +1 202 404 7546
https://www.nrl.navy.mil 
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-21 Thread Noam Bernstein via users
Perhaps I spoke too soon.  Now, with the Mellanox OFED stack, we occasionally 
get the following failure on exit:
[compute-4-20:68008:0:68008] Caught signal 11 (Segmentation fault: address not 
mapped to object at address 0x10)
0 0x0002a3c5 opal_free_list_destruct()  opal_free_list.c:0
1 0x1e89 mca_rcache_grdma_finalize()  rcache_grdma_module.c:0
2 0x000cbfdf mca_rcache_base_module_destroy()  ???:0
3 0xdfef device_destruct()  btl_openib_component.c:0
4 0x9c61 mca_btl_openib_finalize()  ???:0
5 0x000796f3 mca_btl_base_close()  btl_base_frame.c:0
6 0x00062c99 mca_base_framework_close()  ???:0
7 0x00062c99 mca_base_framework_close()  ???:0
8 0x00052a2a ompi_mpi_finalize()  ???:0
9 0x00046449 mpi_finalize__()  ???:0
It appears to be non-deterministic, as far as my users can tell.  

I have no idea how to even begin debugging this, but it started when we 
switched from the CentOS OFED stuff to the Mellanox version (which, 
incidentally, seems to be failing to even recognize our oldest FDR IB cards).  
If anyone has any suggestions, I'd appreciate it.
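
In case it helps anyone suggest something, my plan for the next occurrence is to 
grab a core file from the failing rank and look at the full frames, roughly like 
this (core file naming depends on the node's kernel.core_pattern, so this is only 
a sketch):

  ulimit -c unlimited          # in the job script, before mpirun
  # after a crash, on the node that produced the core:
  gdb /path/to/the/application core.68008 -ex bt -ex 'info registers' -ex quit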

Noam
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Noam Bernstein via users
> On Jun 20, 2019, at 1:38 PM, Nathan Hjelm via users wrote:
> 
> THAT is a good idea. When using Omnipath we see an issue with stale files in 
> /dev/shm if the application exits abnormally. I don't know if UCX uses that 
> space as well.

No stale shm files.  Running echo 3 > /proc/sys/vm/drop_caches doesn't do anything 
either.  But waiting a couple of minutes does cause the output of "free" to 
drop back down to the normal idle level.  So not really a leak.


Anyway, thanks to everyone who gave suggestions.  For the moment I'm going to 
hope that the Mellanox OFED package will continue to work.  I've combined it 
with the SDSC Mellanox OFED roll, and it's a reasonably clean process, although 
I will have to repeat it (perhaps automated some day) for each kernel version.
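
For the record, the per-kernel step I have in mind is just re-running the MLNX_OFED 
installer against the new kernel, something along these lines (exact options depend 
on the MLNX_OFED version, so check its documentation before trusting this):

  ./mlnxofedinstall --add-kernel-support
  dracut -f      # regenerate the initramfs for the new kernel
  reboot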

Noam

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Jeff Squyres (jsquyres) via users
On Jun 20, 2019, at 1:34 PM, Noam Bernstein  wrote:
> 
> Aha - using Mellanox’s OFED packaging seems to have essentially (if not 100%) 
> fixed the issue.  There still appears to be some small leak, but it’s on the 
> order of 1 GB, not tens of GB, and it doesn’t grow continuously.  And on later 
> runs of the same code it doesn’t grow any further, so whatever the kernel 
> memory is being used for and not released, it can at least be reused for 
> the same purpose.
> 
> Thanks for the nudge to check out this option.  Do you happen to know how the 
> installer handles kernel updates?  Is it at all automated, or do I need to 
> rerun the installer to build new kernel modules each time?

I'm afraid I don't know anything about the Mellanox OFED installer; you'll need 
to check their documentation / check with them.

-- 
Jeff Squyres
jsquy...@cisco.com

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Nathan Hjelm via users

THAT is a good idea. When using Omnipath we see an issue with stale files in 
/dev/shm if the application exits abnormally. I don't know if UCX uses that 
space as well.


-Nathan

On June 20, 2019 at 11:05 AM, Joseph Schuchart via users wrote:


Noam,

Another idea: check for stale files in /dev/shm/ (or a subdirectory that
looks like it belongs to UCX/OpenMPI) and SysV shared memory using `ipcs
-m`.

Joseph

On 6/20/19 3:31 PM, Noam Bernstein via users wrote:





On Jun 20, 2019, at 4:44 AM, Charles A Taylor <chas...@ufl.edu> wrote:


This looks a lot like a problem I had with OpenMPI 3.1.2.  I thought the fix
landed in 4.0.0, but you might want to check the code to be sure there wasn’t a
regression in 4.1.x.  Most of our codes are still running 3.1.2, so I haven’t
built anything beyond 4.0.0, which definitely included the fix.


Unfortunately, 4.0.0 behaves the same.


One thing that I’m wondering if anyone familiar with the internals can
explain is how you get a memory leak that isn’t freed when the program
ends.  Doesn’t that suggest that it’s something lower level, like maybe
a kernel issue?


Noam



||
|U.S. NAVAL|
|_RESEARCH_|
LABORATORY


Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628  F +1 202 404 7546
https://www.nrl.navy.mil




___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users



Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Noam Bernstein via users
> On Jun 20, 2019, at 10:42 AM, Noam Bernstein via users wrote:
> 
> I haven’t yet tried the latest OFED or Mellanox low level stuff.  That’s next 
> on my list, but slightly more involved to do, so I’ve been avoiding it.
> 

Aha - using Mellanox’s OFED packaging seems to have essentially (if not 100%) fixed 
the issue.  There still appears to be some small leak, but it’s on the order of 1 GB, 
not tens of GB, and it doesn’t grow continuously.  And on later runs of the 
same code it doesn’t grow any further, so whatever the kernel memory is being 
used for and not released, it can at least be reused for the same purpose.

Thanks for the nudge to check out this option.  Do you happen to know how the 
installer handles kernel updates?  Is it at all automated, or do I need to 
rerun the installer to build new kernel modules each time?

thanks,
Noam

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Joseph Schuchart via users

Noam,

Another idea: check for stale files in /dev/shm/ (or a subdirectory that 
looks like it belongs to UCX/OpenMPI) and SysV shared memory using `ipcs 
-m`.
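
Something like this is what I had in mind (the ipcrm line is only for segments 
you are sure belong to dead jobs):

  ls -l /dev/shm/
  ipcs -m
  # ipcrm -m <shmid>    # remove a leftover segment by id, if appropriate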


Joseph

On 6/20/19 3:31 PM, Noam Bernstein via users wrote:



On Jun 20, 2019, at 4:44 AM, Charles A Taylor <chas...@ufl.edu> wrote:


This looks a lot like a problem I had with OpenMPI 3.1.2.  I thought the fix 
landed in 4.0.0, but you might want to check the code to be sure there wasn’t a 
regression in 4.1.x.  Most of our codes are still running 3.1.2, so I haven’t 
built anything beyond 4.0.0, which definitely included the fix.


Unfortunately, 4.0.0 behaves the same.

One thing that I’m wondering if anyone familiar with the internals can 
explain is how you get a memory leak that isn’t freed when the program 
ends.  Doesn’t that suggest that it’s something lower level, like maybe 
a kernel issue?


Noam


||
|U.S. NAVAL|
|_RESEARCH_|
LABORATORY

Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628  F +1 202 404 7546
https://www.nrl.navy.mil




___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread John Hearns via users
Errr...  have you dropped caches?   echo 3 > /proc/sys/vm/drop_caches
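
i.e. as root, comparing free before and after (this only drops clean page, dentry 
and inode caches, so it is safe, if a bit blunt):

  free -g
  sync
  echo 3 > /proc/sys/vm/drop_caches
  free -g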


On Thu, 20 Jun 2019 at 15:59, Yann Jobic via users wrote:

> Hi,
>
> On 6/20/19 3:31 PM, Noam Bernstein via users wrote:
> >
> >
> >> On Jun 20, 2019, at 4:44 AM, Charles A Taylor <chas...@ufl.edu> wrote:
> >>
> >> This looks a lot like a problem I had with OpenMPI 3.1.2.  I thought the fix
> >> landed in 4.0.0, but you might want to check the code to be sure there wasn’t a
> >> regression in 4.1.x.  Most of our codes are still running 3.1.2, so I haven’t
> >> built anything beyond 4.0.0, which definitely included the fix.
> >
> > Unfortunately, 4.0.0 behaves the same.
> >
> > One thing that I’m wondering if anyone familiar with the internals can
> > explain is how you get a memory leak that isn’t freed when the program
> > ends.  Doesn’t that suggest that it’s something lower level, like maybe
> > a kernel issue?
>
> Maybe it's only some data in the cache memory, which is tagged as "used"
> but which the kernel can reclaim if needed. Have you tried to use the whole
> memory again with your code?  It should work.
>
> Yann
>
> >
> > Noam
> >
> > 
> > ||
> > |U.S. NAVAL|
> > |_RESEARCH_|
> > LABORATORY
> >
> > Noam Bernstein, Ph.D.
> > Center for Materials Physics and Technology
> > U.S. Naval Research Laboratory
> > T +1 202 404 8628  F +1 202 404 7546
> > https://www.nrl.navy.mil
> >
> >
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Yann Jobic via users

Hi,

On 6/20/19 3:31 PM, Noam Bernstein via users wrote:



On Jun 20, 2019, at 4:44 AM, Charles A Taylor <chas...@ufl.edu> wrote:


This looks a lot like a problem I had with OpenMPI 3.1.2.  I thought the fix 
landed in 4.0.0, but you might want to check the code to be sure there wasn’t a 
regression in 4.1.x.  Most of our codes are still running 3.1.2, so I haven’t 
built anything beyond 4.0.0, which definitely included the fix.


Unfortunately, 4.0.0 behaves the same.

One thing that I’m wondering if anyone familiar with the internals can 
explain is how you get a memory leak that isn’t freed when the program 
ends.  Doesn’t that suggest that it’s something lower level, like maybe 
a kernel issue?


Maybe it's only some data in the cache memory, which is tagged as "used" 
but which the kernel can reclaim if needed. Have you tried to use the whole 
memory again with your code?  It should work.
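
For example, if stress-ng is installed on the node, something like this should 
force the kernel to hand that memory back (the size is just a guess below your 
96 GB, adjust to taste):

  stress-ng --vm 1 --vm-bytes 80G --vm-keep --timeout 60s
  free -g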


Yann



Noam


||
|U.S. NAVAL|
|_RESEARCH_|
LABORATORY

Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628  F +1 202 404 7546
https://www.nrl.navy.mil




___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Noam Bernstein via users
> On Jun 20, 2019, at 9:40 AM, Jeff Squyres (jsquyres) wrote:
> 
> On Jun 20, 2019, at 9:31 AM, Noam Bernstein via users wrote:
>> 
>> One thing that I’m wondering if anyone familiar with the internals can 
>> explain is how you get a memory leak that isn’t freed when the program 
>> ends.  Doesn’t that suggest that it’s something lower level, like maybe a 
>> kernel issue?
> 
> If "top" doesn't show processes eating up the memory, and killing processes 
> (e.g., MPI processes) doesn't give you memory back, then it's likely that 
> something in the kernel is leaking memory.

That’s definitely what’s happening.  “free” is reporting a lot of memory used, 
but summing the per-process values from ps gives a much smaller total.
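
For reference, this is the quick comparison I'm doing (nothing clever, just 
summing RSS over every process and holding it next to free):

  free -m
  ps --no-headers -eo rss | awk '{s+=$1} END {printf "total RSS: %.0f MB\n", s/1024}'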

> 
> Have you tried the latest version of UCX -- including their kernel drivers -- 
> from Mellanox (vs. inbox/CentOS)?
> 

I’ve tried the latest ucx from the ucx web site, 1.5.1, which doesn’t change 
the behavior.

I haven’t yet tried the latest OFED or Mellanox low level stuff.  That’s next 
on my list, but slightly more involved to do, so I’ve been avoiding it.

thanks,
Noam
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread John Hearns via users
The kernel using memory is why I suggested running slabtop, to see the
kernel slab allocations.
Clearly I was barking up the wrong tree there...
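
For completeness, the non-interactive form I meant, sorted by cache size so the 
biggest slab caches float to the top:

  slabtop -o -s c | head -n 15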

On Thu, 20 Jun 2019 at 14:41, Jeff Squyres (jsquyres) via users <
users@lists.open-mpi.org> wrote:

> On Jun 20, 2019, at 9:31 AM, Noam Bernstein via users <
> users@lists.open-mpi.org> wrote:
> >
> > One thing that I’m wondering if anyone familiar with the internals can
> explain is how you get a memory leak that isn’t freed when the program
> ends.  Doesn’t that suggest that it’s something lower level, like maybe a
> kernel issue?
>
> If "top" doesn't show processes eating up the memory, and killing
> processes (e.g., MPI processes) doesn't give you memory back, then it's
> likely that something in the kernel is leaking memory.
>
> Have you tried the latest version of UCX -- including their kernel drivers
> -- from Mellanox (vs. inbox/CentOS)?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Jeff Squyres (jsquyres) via users
On Jun 20, 2019, at 9:31 AM, Noam Bernstein via users wrote:
> 
> One thing that I’m wondering if anyone familiar with the internals can 
> explain is how you get a memory leak that isn’t freed when the program ends. 
> Doesn’t that suggest that it’s something lower level, like maybe a kernel 
> issue?

If "top" doesn't show processes eating up the memory, and killing processes 
(e.g., MPI processes) doesn't give you memory back, then it's likely that 
something in the kernel is leaking memory.

Have you tried the latest version of UCX -- including their kernel drivers -- 
from Mellanox (vs. inbox/CentOS)?
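
A quick way to see exactly which stack a node is running (the ofed_info command 
only exists when Mellanox OFED is installed, so its absence is itself an answer):

  ofed_info -s                          # MLNX_OFED version, if present
  rpm -q ucx libibverbs libibumad       # what the inbox/CentOS packages report
  ucx_info -v                           # version of the UCX actually on the path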

-- 
Jeff Squyres
jsquy...@cisco.com

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Noam Bernstein via users


> On Jun 20, 2019, at 4:44 AM, Charles A Taylor  wrote:
> 
> This looks a lot like a problem I had with OpenMPI 3.1.2.  I thought the fix 
> landed in 4.0.0, but you might want to check the code to be sure there wasn’t 
> a regression in 4.1.x.  Most of our codes are still running 3.1.2, so I haven’t 
> built anything beyond 4.0.0, which definitely included the fix.

Unfortunately, 4.0.0 behaves the same.  

One thing that I’m wondering if anyone familiar with the internals can explain 
is how you get a memory leak that isn’t freed when the program ends.  Doesn’t 
that suggest that it’s something lower level, like maybe a kernel issue?

Noam


||
|U.S. NAVAL|
|_RESEARCH_|
LABORATORY
Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628  F +1 202 404 7546
https://www.nrl.navy.mil 
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-20 Thread Charles A Taylor via users
This looks a lot like a problem I had with OpenMPI 3.1.2.  I thought the fix 
landed in 4.0.0, but you might want to check the code to be sure there wasn’t a 
regression in 4.1.x.  Most of our codes are still running 3.1.2, so I haven’t 
built anything beyond 4.0.0, which definitely included the fix.

See…

- Apply patch for memory leak associated with UCX PML.
- https://github.com/openucx/ucx/issues/2921
- https://github.com/open-mpi/ompi/pull/5878
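
One way to check whether your build could even contain the fix is to see which 
UCX libraries the UCX PML component actually resolves at run time (the component 
path below is taken from Noam's backtrace; adjust it for your install):

  ldd /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/openmpi/mca_pml_ucx.so | grep -i 'libuc[pts]'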

Charles Taylor
UF Research Computing


> On Jun 19, 2019, at 2:26 PM, Noam Bernstein via users wrote:
> 
>> On Jun 19, 2019, at 2:00 PM, John Hearns via users wrote:
>> 
>> Noam, it may be a stupid question. Could you try running slabtop as the 
>> program executes?
> 
> The top SIZE usage is this line
>    OBJS  ACTIVE  USE  OBJ SIZE   SLABS  OBJ/SLAB  CACHE SIZE  NAME
> 5937540 5937540 100%     0.09K  141370        42     565480K  kmalloc-96
> which seems to be growing continuously. However, it’s much smaller than the 
> drop in free memory.  It gets to around 1 GB after tens of seconds (500 MB 
> here), but the overall free memory is dropping by about 1 GB / second, so 
> tens of GB over the same time.
> 
>> 
>> Also, 'watch cat /proc/meminfo' is a good diagnostic
> 
> Other than MemFree dropping, I don’t see much. Here’s a diff, 10 seconds 
> apart:
> 2,3c2,3
> < MemFree:54229400 kB
> < MemAvailable:   54271804 kB
> ---
> > MemFree:45010772 kB
> > MemAvailable:   45054200 kB
> 19c19
> < AnonPages:  22063260 kB
> ---
> > AnonPages:  22526300 kB
> 22,24c22,24
> < Slab: 851380 kB
> < SReclaimable:  87100 kB
> < SUnreclaim:   764280 kB
> ---
> > Slab:1068208 kB
> > SReclaimable:  89148 kB
> > SUnreclaim:   979060 kB
> 31c31
> < Committed_AS:   34976896 kB
> ---
> > Committed_AS:   34977680 kB
> 
> MemFree has dropped by 9 GB, but as far as I can tell nothing else has 
> increased by anything near as much, so I don’t know where the memory is going.
> 
>   Noam
> 
> 
> 
> ||
> |U.S. NAVAL|
> |_RESEARCH_|
> LABORATORY
> 
> Noam Bernstein, Ph.D.
> Center for Materials Physics and Technology
> U.S. Naval Research Laboratory
> T +1 202 404 8628  F +1 202 404 7546
> https://www.nrl.navy.mil 
> 

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-19 Thread Noam Bernstein via users
> On Jun 19, 2019, at 5:05 PM, Joshua Ladd  wrote:
> 
> Hi, Noam
> 
> Can you try your original command line with the following addition:
> 
> mpirun --mca pml ucx --mca btl ^vader,tcp,openib --mca osc ucx
> 
> I think we're seeing some conflict between UCX PML and UCT OSC. 

I did this, although meanwhile I also did a clean compile (to add some 
debugging statements) and switched from running on 1 node (36 cores) to 2 
nodes. The problem is slightly different, but still similar.  Now the memory 
doesn’t continue to expand until it runs out.  Instead, one node (the head 
node?) is using 55 GB, while the other is using only 23 GB.  The latter value 
(23 GB) is consistent with the usage from ps or top (36 * 640 MB/proc).  When I 
kill the job, the node that used to use 55 GB goes down to 34 GB (with nothing 
running), and the other is down to about 1 GB. 
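
For the numbers above I'm just polling each node by hand, along these lines (the 
hostnames are placeholders for the two nodes in the job):

  for h in compute-4-20 compute-4-21; do echo "== $h"; ssh $h free -g; done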

Noam


||
|U.S. NAVAL|
|_RESEARCH_|
LABORATORY
Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628  F +1 202 404 7546
https://www.nrl.navy.mil 
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-19 Thread Joshua Ladd via users
Hi, Noam

Can you try your original command line with the following addition:

mpirun --mca pml ucx --mca btl ^vader,tcp,openib --mca osc ucx

I think we're seeing some conflict between UCX PML and UCT OSC.

Josh

On Wed, Jun 19, 2019 at 4:36 PM Noam Bernstein via users <
users@lists.open-mpi.org> wrote:

> On Jun 19, 2019, at 2:44 PM, George Bosilca  wrote:
>
> To completely disable UCX you need to disable the UCX MTL and not only the
> BTL. I would use "--mca pml ob1 --mca btl ^ucx --mca btl_openib_allow_ib 1".
>
>
> Thanks for the pointer.  Disabling ucx this way _does_ seem to fix the
> memory issue.  That’s a very helpful workaround, if nothing else.
>
> Using ucx 1.5.1 downloaded from the ucx web site at runtime (just by
> inserting it into LD_LIBRARY_PATH, without recompiling openmpi) doesn’t
> seem to fix the problem.
>
>
> As you have a gdb session on the processes you can try to break on some of
> the memory allocations function (malloc, realloc, calloc).
>
>
> Good idea.  I set breakpoints on all 3 of those, then did “c” 3 times.
> Does this mean anything to anyone?  I’m investigating the upstream calls
> (not included below) that generate these calls to mpi_bcast, but given that
> it works on other types of nodes, I doubt those are problematic.
>
> #0  0x2b9e5303e160 in malloc () from /lib64/libc.so.6
> #1  0x2b9e651f358a in ucs_rcache_create_region
> (region_p=0x7fff82806da0, arg=0x7fff82806d9c, prot=3, length=131072,
> address=0x2b9e76102070, rcache=0xb341a50) at sys/rcache.c:500
> #2  ucs_rcache_get (rcache=0xb341a50, address=0x2b9e76102070,
> length=131072, prot=prot@entry=3, arg=arg@entry=0x7fff82806d9c,
> region_p=region_p@entry=0x7fff82806da0) at sys/rcache.c:612
> #3  0x2b9e64f7a3d4 in uct_ib_mem_rcache_reg (uct_md=,
> address=, length=, flags=96,
> memh_p=0xbc409b0) at ib/base/ib_md.c:990
> #4  0x2b9e64d245e2 in ucp_mem_rereg_mds (context=,
> reg_md_map=4, address=address@entry=0x2b9e76102070, length=<optimized out>, uct_flags=uct_flags@entry=96,
> alloc_md=alloc_md@entry=0x0, mem_type=mem_type@entry
> =UCT_MD_MEM_TYPE_HOST, alloc_md_memh_p=alloc_md_memh_p@entry=0x0,
> uct_memh=uct_memh@entry=0xbc409b0, md_map_p=md_map_p@entry=0xbc409a8)
> at core/ucp_mm.c:100
> #5  0x2b9e64d260f0 in ucp_request_memory_reg (context=0xb340800,
> md_map=4, buffer=0x2b9e76102070, length=131072, datatype=128,
> state=state@entry=0xbc409a0, mem_type=UCT_MD_MEM_TYPE_HOST,
> req_dbg=req_dbg@entry=0xbc40940, uct_flags=,
> uct_flags@entry=0) at core/ucp_request.c:218
> #6  0x2b9e64d3716b in ucp_request_send_buffer_reg (md_map=<optimized out>, req=0xbc40940)
> at 
> /home_tin/bernadm/configuration/330_OFED/ucx-1.5.1/src/ucp/core/ucp_request.inl:343
> #7  ucp_tag_send_start_rndv (sreq=sreq@entry=0xbc40940) at tag/rndv.c:153
> #8  0x2b9e64d3abb9 in ucp_tag_send_req (enable_zcopy=1,
> proto=0x2b9e64f569c0 , cb=0x2b9e64467350
> , rndv_am_thresh=,
> rndv_rma_thresh=, msg_config=0xb3ea278, dt_count=8192,
> req=) at tag/tag_send.c:78
> #9  ucp_tag_send_nb (ep=, buffer=,
> count=8192, datatype=, tag=,
> cb=0x2b9e64467350 ) at tag/tag_send.c:203
> #10 0x2b9e64465fa6 in mca_pml_ucx_isend ()
> from /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/openmpi/mca_pml_ucx.so
> #11 0x2b9e52211900 in ompi_coll_base_bcast_intra_generic ()
> from /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
> #12 0x2b9e52211d4b in ompi_coll_base_bcast_intra_pipeline ()
> from /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
> #13 0x2b9e673bc384 in ompi_coll_tuned_bcast_intra_dec_fixed ()
> from /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/openmpi/mca_coll_tuned.so
> #14 0x2b9e521dbb79 in PMPI_Bcast () from
> /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
> #15 0x2b9e51f623df in pmpi_bcast__ () from
> /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi_mpifh.so.40
>
>
> #0  0x2b9e5303e160 in malloc () from /lib64/libc.so.6
> #1  0x2b9e651ed684 in ucs_pgt_dir_alloc (pgtable=0xb341ab8) at
> datastruct/pgtable.c:69
> #2  ucs_pgtable_insert_page (region=0xc6919d0, order=12,
> address=47959585718272, pgtable=0xb341ab8) at datastruct/pgtable.c:299
> #3  ucs_pgtable_insert (pgtable=pgtable@entry=0xb341ab8,
> region=region@entry=0xc6919d0) at datastruct/pgtable.c:403
> #4  0x2b9e651f35bc in ucs_rcache_create_region
> (region_p=0x7fff82806da0, arg=0x7fff82806d9c, prot=3, length=131072,
> address=0x2b9e76102070, rcache=0xb341a50) at sys/rcache.c:511
> #5  ucs_rcache_get (rcache=0xb341a50, address=0x2b9e76102070,
> length=131072, prot=prot@entry=3, arg=arg@entry=0x7fff82806d9c,
> region_p=region_p@entry=0x7fff82806da0) at sys/rcache.c:612
> #6  0x2b9e64f7a3d4 in uct_ib_mem_rcache_reg (uct_md=,
> address=, length=, flags=96,
> memh_p=0xbc409b0) at ib/base/ib_md.c:990
> #7  0x2b9e64d245e2 in ucp_mem_rereg_mds (context=,
> reg_md_map=4, address=address@entry=0x2b9e76102070, length=<optimized out>, uct_flags=uct_flags@entry=96,
> alloc_md=alloc_md@entry=0x0, mem_type=mem_type@entry
> 

Re: [OMPI users] growing memory use from MPI application

2019-06-19 Thread Noam Bernstein via users
> On Jun 19, 2019, at 2:44 PM, George Bosilca  wrote:
> 
> To completely disable UCX you need to disable the UCX MTL and not only the 
> BTL. I would use "--mca pml ob1 --mca btl ^ucx --mca btl_openib_allow_ib 1".

Thanks for the pointer.  Disabling ucx this way _does_ seem to fix the memory 
issue.  That’s a very helpful workaround, if nothing else.
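
So for anyone else bitten by this, the working (non-UCX) invocation is essentially 
George's suggestion verbatim; the executable name is of course a placeholder:

  mpirun --mca pml ob1 --mca btl ^ucx --mca btl_openib_allow_ib 1 ./my_app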

Using ucx 1.5.1 downloaded from the ucx web site at runtime (just by inserting 
it into LD_LIBRARY_PATH, without recompiling openmpi) doesn’t seem to fix the 
problem.
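
It's probably worth double-checking that the swapped-in 1.5.1 libraries are the 
ones the ranks actually map; something like this, with a real rank pid substituted 
for the placeholder, should show it:

  grep 'libuc[pts]' /proc/<pid>/maps | awk '{print $6}' | sort -u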

> 
> As you have a gdb session on the processes you can try to break on some of 
> the memory allocations function (malloc, realloc, calloc).

Good idea.  I set breakpoints on all 3 of those, then did “c” 3 times.  Does 
this mean anything to anyone?  I’m investigating the upstream calls (not 
included below) that generate these calls to mpi_bcast, but given that it works 
on other types of nodes, I doubt those are problematic.
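
For the record, the non-interactive equivalent of what I did is roughly this (pid 
is a placeholder; each continue runs until the next allocation call is hit):

  gdb -p <pid> \
      -ex 'break malloc' -ex 'break calloc' -ex 'break realloc' \
      -ex continue -ex bt \
      -ex continue -ex bt \
      -ex continue -ex bt \
      -ex detach -ex quit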

#0  0x2b9e5303e160 in malloc () from /lib64/libc.so.6
#1  0x2b9e651f358a in ucs_rcache_create_region (region_p=0x7fff82806da0, 
arg=0x7fff82806d9c, prot=3, length=131072, address=0x2b9e76102070, 
rcache=0xb341a50) at sys/rcache.c:500
#2  ucs_rcache_get (rcache=0xb341a50, address=0x2b9e76102070, length=131072, 
prot=prot@entry=3, arg=arg@entry=0x7fff82806d9c, 
region_p=region_p@entry=0x7fff82806da0) at sys/rcache.c:612
#3  0x2b9e64f7a3d4 in uct_ib_mem_rcache_reg (uct_md=, 
address=, length=, flags=96, memh_p=0xbc409b0) at 
ib/base/ib_md.c:990
#4  0x2b9e64d245e2 in ucp_mem_rereg_mds (context=, 
reg_md_map=4, address=address@entry=0x2b9e76102070, length=, 
uct_flags=uct_flags@entry=96, 
alloc_md=alloc_md@entry=0x0, mem_type=mem_type@entry=UCT_MD_MEM_TYPE_HOST, 
alloc_md_memh_p=alloc_md_memh_p@entry=0x0, uct_memh=uct_memh@entry=0xbc409b0, 
md_map_p=md_map_p@entry=0xbc409a8)
at core/ucp_mm.c:100
#5  0x2b9e64d260f0 in ucp_request_memory_reg (context=0xb340800, md_map=4, 
buffer=0x2b9e76102070, length=131072, datatype=128, 
state=state@entry=0xbc409a0, mem_type=UCT_MD_MEM_TYPE_HOST, 
req_dbg=req_dbg@entry=0xbc40940, uct_flags=, 
uct_flags@entry=0) at core/ucp_request.c:218
#6  0x2b9e64d3716b in ucp_request_send_buffer_reg (md_map=, 
req=0xbc40940) at 
/home_tin/bernadm/configuration/330_OFED/ucx-1.5.1/src/ucp/core/ucp_request.inl:343
#7  ucp_tag_send_start_rndv (sreq=sreq@entry=0xbc40940) at tag/rndv.c:153
#8  0x2b9e64d3abb9 in ucp_tag_send_req (enable_zcopy=1, 
proto=0x2b9e64f569c0 , cb=0x2b9e64467350 
, rndv_am_thresh=, 
rndv_rma_thresh=, msg_config=0xb3ea278, dt_count=8192, 
req=) at tag/tag_send.c:78
#9  ucp_tag_send_nb (ep=, buffer=, count=8192, 
datatype=, tag=, cb=0x2b9e64467350 
) at tag/tag_send.c:203
#10 0x2b9e64465fa6 in mca_pml_ucx_isend () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/openmpi/mca_pml_ucx.so
#11 0x2b9e52211900 in ompi_coll_base_bcast_intra_generic () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
#12 0x2b9e52211d4b in ompi_coll_base_bcast_intra_pipeline () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
#13 0x2b9e673bc384 in ompi_coll_tuned_bcast_intra_dec_fixed () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/openmpi/mca_coll_tuned.so
#14 0x2b9e521dbb79 in PMPI_Bcast () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
#15 0x2b9e51f623df in pmpi_bcast__ () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi_mpifh.so.40


#0  0x2b9e5303e160 in malloc () from /lib64/libc.so.6
#1  0x2b9e651ed684 in ucs_pgt_dir_alloc (pgtable=0xb341ab8) at 
datastruct/pgtable.c:69
#2  ucs_pgtable_insert_page (region=0xc6919d0, order=12, 
address=47959585718272, pgtable=0xb341ab8) at datastruct/pgtable.c:299
#3  ucs_pgtable_insert (pgtable=pgtable@entry=0xb341ab8, 
region=region@entry=0xc6919d0) at datastruct/pgtable.c:403
#4  0x2b9e651f35bc in ucs_rcache_create_region (region_p=0x7fff82806da0, 
arg=0x7fff82806d9c, prot=3, length=131072, address=0x2b9e76102070, 
rcache=0xb341a50) at sys/rcache.c:511
#5  ucs_rcache_get (rcache=0xb341a50, address=0x2b9e76102070, length=131072, 
prot=prot@entry=3, arg=arg@entry=0x7fff82806d9c, 
region_p=region_p@entry=0x7fff82806da0) at sys/rcache.c:612
#6  0x2b9e64f7a3d4 in uct_ib_mem_rcache_reg (uct_md=, 
address=, length=, flags=96, memh_p=0xbc409b0) at 
ib/base/ib_md.c:990
#7  0x2b9e64d245e2 in ucp_mem_rereg_mds (context=, 
reg_md_map=4, address=address@entry=0x2b9e76102070, length=, 
uct_flags=uct_flags@entry=96, 
alloc_md=alloc_md@entry=0x0, mem_type=mem_type@entry=UCT_MD_MEM_TYPE_HOST, 
alloc_md_memh_p=alloc_md_memh_p@entry=0x0, uct_memh=uct_memh@entry=0xbc409b0, 
md_map_p=md_map_p@entry=0xbc409a8)
at core/ucp_mm.c:100
#8  0x2b9e64d260f0 in ucp_request_memory_reg (context=0xb340800, md_map=4, 
buffer=0x2b9e76102070, length=131072, datatype=128, 
state=state@entry=0xbc409a0, mem_type=UCT_MD_MEM_TYPE_HOST, 
req_dbg=req_dbg@entry=0xbc40940, uct_flags=, 
uct_flags@entry=0) at 

Re: [OMPI users] growing memory use from MPI application

2019-06-19 Thread George Bosilca via users
To completely disable UCX you need to disable the UCX MTL and not only the
BTL. I would use "--mca pml ob1 --mca btl ^ucx --mca btl_openib_allow_ib 1".

As you have a gdb session on the processes you can try to break on some of
the memory allocations function (malloc, realloc, calloc).

  George.


On Wed, Jun 19, 2019 at 2:37 PM Noam Bernstein via users <
users@lists.open-mpi.org> wrote:

> I tried to disable ucx (successfully, I think - I replaced the “--mca pml
> ucx --mca btl ^vader,tcp,openib” with “--mca btl_openib_allow_ib 1”, and
> attaching gdb to a running process shows no ucx-related routines active).
> It still has the same fast growing (1 GB/s) memory usage problem.
>
>
>   Noam
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-19 Thread Noam Bernstein via users
I tried to disable ucx (successfully, I think - I replaced the “--mca pml ucx 
--mca btl ^vader,tcp,openib” with “--mca btl_openib_allow_ib 1”, and attaching 
gdb to a running process shows no ucx-related routines active).  It still has 
the same fast-growing (1 GB/s) memory usage problem.


Noam
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-19 Thread Noam Bernstein via users
> On Jun 19, 2019, at 2:00 PM, John Hearns via users wrote:
> 
> Noam, it may be a stupid question. Could you try running slabtop as the 
> program executes?

The top SIZE usage is this line
   OBJS  ACTIVE  USE  OBJ SIZE   SLABS  OBJ/SLAB  CACHE SIZE  NAME
5937540 5937540 100%     0.09K  141370        42     565480K  kmalloc-96
which seems to be growing continuously. However, it’s much smaller than the 
drop in free memory.  It gets to around 1 GB after tens of seconds (500 MB 
here), but the overall free memory is dropping by about 1 GB / second, so tens 
of GB over the same time.

> 
> Also, 'watch cat /proc/meminfo' is a good diagnostic

Other than MemFree dropping, I don’t see much. Here’s a diff, 10 seconds apart:
2,3c2,3
< MemFree:54229400 kB
< MemAvailable:   54271804 kB
---
> MemFree:45010772 kB
> MemAvailable:   45054200 kB
19c19
< AnonPages:  22063260 kB
---
> AnonPages:  22526300 kB
22,24c22,24
< Slab: 851380 kB
< SReclaimable:  87100 kB
< SUnreclaim:   764280 kB
---
> Slab:1068208 kB
> SReclaimable:  89148 kB
> SUnreclaim:   979060 kB
31c31
< Committed_AS:   34976896 kB
---
> Committed_AS:   34977680 kB

MemFree has dropped by 9 GB, but as far as I can tell nothing else has 
increased by anything near as much, so I don’t know where the memory is going.
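
In case anyone wants to watch the same fields live, a filtered version of John's 
'watch cat /proc/meminfo' suggestion is enough:

  watch -n 10 'grep -E "MemFree|AnonPages|Slab|SReclaimable|SUnreclaim|Committed_AS" /proc/meminfo'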

Noam


||
|U.S. NAVAL|
|_RESEARCH_|
LABORATORY
Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628  F +1 202 404 7546
https://www.nrl.navy.mil 
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] growing memory use from MPI application

2019-06-19 Thread John Hearns via users
Noam, it may be a stupid question. Could you try running slabtop as
the program executes?

Also, 'watch cat /proc/meminfo' is a good diagnostic

On Wed, 19 Jun 2019 at 18:32, Noam Bernstein via users <
users@lists.open-mpi.org> wrote:

> Hi - we’re having a weird problem with OpenMPI on our newish infiniband
> EDR (mlx5) nodes.  We're running CentOS 7.6, with all the infiniband and
> ucx libraries as provided by CentOS, i.e.
>
> ucx-1.4.0-1.el7.x86_64
> libibverbs-utils-17.2-3.el7.x86_64
> libibverbs-17.2-3.el7.x86_64
> libibumad-17.2-3.el7.x86_64
>
> kernel is
>
> 3.10.0-957.21.2.el7.x86_64
>
> I’ve compiled my own OpenMPI, version 4.0.1 (--with-verbs --with-ofi
> --with-ucx).
>
> The job is started with
>
> mpirun --mca pml ucx --mca btl ^vader,tcp,openib
>
> as recommended for ucx.
>
> We have some jobs (one particular code, some but not all sets of input
> parameters) that appear to take an increasing amount of memory (in MPI?)
> until the node crashes.  The total memory used by all processes (reported
> by ps or top) is not increasing, but “free” reports less and less available
> memory.  Within a couple of minutes it uses all of the 96GB on each of the
> nodes. When the job is killed the processes go away, but the memory usage
> (as reported by “free”) stays the same, e.g.:
>
>               total        used        free      shared  buff/cache   available
> Mem:       98423956    88750140     7021688        2184     2652128     6793020
> Swap:      65535996      365312    65170684
>
> As far as I can tell I have to reboot to get the memory back.
>
> If I attach to a running process with “gdb -p”, I see stack traces that
> look like these two examples (starting from the first mpi-related call):
>
>
> #0  0x2b22a95134a3 in pthread_spin_lock () from /lib64/libpthread.so.0
> #1  0x2b22be73a3e8 in mlx5_poll_cq_v1 () from
> /usr/lib64/libibverbs/libmlx5-rdmav17.so
> #2  0x2b22bcb267de in uct_ud_verbs_iface_progress () from
> /lib64/libuct.so.0
> #3  0x2b22bc8d28b2 in ucp_worker_progress () from /lib64/libucp.so.0
> #4  0x2b22b7cd14e7 in mca_pml_ucx_progress ()
> from /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/openmpi/mca_pml_ucx.so
> #5  0x2b22ab6064fc in opal_progress () from
> /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libopen-pal.so.40
> #6  0x2b22a9f51dc5 in ompi_request_default_wait ()
> from /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
> #7  0x2b22a9fa355c in ompi_coll_base_allreduce_intra_ring ()
> from /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
> #8  0x2b22a9f65cb3 in PMPI_Allreduce () from
> /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
> #9  0x2b22a9cedf9b in pmpi_allreduce__ ()
> from /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi_mpifh.so.40
>
>
> #0  0x2ae0518de69d in write () from /lib64/libpthread.so.0
> #1  0x2ae064458d7f in ibv_cmd_reg_mr () from /usr/lib64/libibverbs.so.1
> #2  0x2ae066b9221b in mlx5_reg_mr () from
> /usr/lib64/libibverbs/libmlx5-rdmav17.so
> #3  0x2ae064461f08 in ibv_reg_mr () from /usr/lib64/libibverbs.so.1
> #4  0x2ae064f6e312 in uct_ib_md_reg_mr.isra.11.constprop () from
> /lib64/libuct.so.0
> #5  0x2ae064f6e4f2 in uct_ib_rcache_mem_reg_cb () from
> /lib64/libuct.so.0
> #6  0x2ae0651aec0f in ucs_rcache_get () from /lib64/libucs.so.0
> #7  0x2ae064f6d6a4 in uct_ib_mem_rcache_reg () from /lib64/libuct.so.0
> #8  0x2ae064d1fa58 in ucp_mem_rereg_mds () from /lib64/libucp.so.0
> #9  0x2ae064d21438 in ucp_request_memory_reg () from /lib64/libucp.so.0
> #10 0x2ae064d21663 in ucp_request_send_start () from /lib64/libucp.so.0
> #11 0x2ae064d335dd in ucp_tag_send_nb () from /lib64/libucp.so.0
> #12 0x2ae06420a5e6 in mca_pml_ucx_start ()
> from /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/openmpi/mca_pml_ucx.so
> #13 0x2ae05236fc06 in ompi_coll_base_alltoall_intra_basic_linear ()
> from /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
> #14 0x2ae05232f347 in PMPI_Alltoall () from
> /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
> #15 0x2ae0520b704c in pmpi_alltoall__ () from
> /share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi_mpifh.so.40
>
> This doesn’t seem to happen on our older nodes (which have FDR mlx4
> interfaces).
>
> I don’t really have a mental model for OpenMPI's memory use, so I don’t
> know what component I should investigate: OpenMPI itself? ucx?  OFED?
> Something else?  If anyone has any suggestions for what to try, and/or what
> other information would be useful, I’d appreciate it.
>
> thanks,
> Noam
>
>
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] growing memory use from MPI application

2019-06-19 Thread Noam Bernstein via users
Hi - we’re having a weird problem with OpenMPI on our newish infiniband EDR 
(mlx5) nodes.  We're running CentOS 7.6, with all the infiniband and ucx 
libraries as provided by CentOS, i.e.
ucx-1.4.0-1.el7.x86_64
libibverbs-utils-17.2-3.el7.x86_64
libibverbs-17.2-3.el7.x86_64
libibumad-17.2-3.el7.x86_64
kernel is 
3.10.0-957.21.2.el7.x86_64
I’ve compiled my own OpenMPI, version 4.0.1 (--with-verbs --with-ofi --with-ucx).

The job is started with
mpirun --mca pml ucx --mca btl ^vader,tcp,openib
as recommended for ucx.  

We have some jobs (one particular code, some but not all sets of input 
parameters) that appear to take an increasing amount of memory (in MPI?) until 
the node crashes.  The total memory used by all processes (reported by ps or 
top) is not increasing, but “free” reports less and less available memory.  
Within a couple of minutes it uses all of the 96GB on each of the nodes. When 
the job is killed the processes go away, but the memory usage (as reported by 
“free”) stays the same, e.g.:
              total        used        free      shared  buff/cache   available
Mem:       98423956    88750140     7021688        2184     2652128     6793020
Swap:      65535996      365312    65170684
As far as I can tell I have to reboot to get the memory back.
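
For what it's worth, this is roughly how I'm tracking the divergence between what 
free reports and what the processes themselves account for (the log file name is 
arbitrary):

  while true; do
      date
      free -m | sed -n '2p'
      ps --no-headers -eo rss | awk '{s+=$1} END {printf "sum of RSS: %.0f MB\n", s/1024}'
      sleep 10
  done >> mem_watch.log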

If I attach to a running process with “gdb -p”, I see stack traces that look 
like these two examples (starting from the first mpi-related call):


#0  0x2b22a95134a3 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x2b22be73a3e8 in mlx5_poll_cq_v1 () from 
/usr/lib64/libibverbs/libmlx5-rdmav17.so
#2  0x2b22bcb267de in uct_ud_verbs_iface_progress () from /lib64/libuct.so.0
#3  0x2b22bc8d28b2 in ucp_worker_progress () from /lib64/libucp.so.0
#4  0x2b22b7cd14e7 in mca_pml_ucx_progress () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/openmpi/mca_pml_ucx.so
#5  0x2b22ab6064fc in opal_progress () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libopen-pal.so.40
#6  0x2b22a9f51dc5 in ompi_request_default_wait () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
#7  0x2b22a9fa355c in ompi_coll_base_allreduce_intra_ring () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
#8  0x2b22a9f65cb3 in PMPI_Allreduce () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
#9  0x2b22a9cedf9b in pmpi_allreduce__ () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi_mpifh.so.40


#0  0x2ae0518de69d in write () from /lib64/libpthread.so.0
#1  0x2ae064458d7f in ibv_cmd_reg_mr () from /usr/lib64/libibverbs.so.1
#2  0x2ae066b9221b in mlx5_reg_mr () from 
/usr/lib64/libibverbs/libmlx5-rdmav17.so
#3  0x2ae064461f08 in ibv_reg_mr () from /usr/lib64/libibverbs.so.1
#4  0x2ae064f6e312 in uct_ib_md_reg_mr.isra.11.constprop () from 
/lib64/libuct.so.0
#5  0x2ae064f6e4f2 in uct_ib_rcache_mem_reg_cb () from /lib64/libuct.so.0
#6  0x2ae0651aec0f in ucs_rcache_get () from /lib64/libucs.so.0
#7  0x2ae064f6d6a4 in uct_ib_mem_rcache_reg () from /lib64/libuct.so.0
#8  0x2ae064d1fa58 in ucp_mem_rereg_mds () from /lib64/libucp.so.0
#9  0x2ae064d21438 in ucp_request_memory_reg () from /lib64/libucp.so.0
#10 0x2ae064d21663 in ucp_request_send_start () from /lib64/libucp.so.0
#11 0x2ae064d335dd in ucp_tag_send_nb () from /lib64/libucp.so.0
#12 0x2ae06420a5e6 in mca_pml_ucx_start () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/openmpi/mca_pml_ucx.so
#13 0x2ae05236fc06 in ompi_coll_base_alltoall_intra_basic_linear () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
#14 0x2ae05232f347 in PMPI_Alltoall () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi.so.40
#15 0x2ae0520b704c in pmpi_alltoall__ () from 
/share/apps/mpi/openmpi/4.0.1/ib/gnu/lib/libmpi_mpifh.so.40

This doesn’t seem to happen on our older nodes (which have FDR mlx4 
interfaces). 

I don’t really have a mental model for OpenMPI's memory use, so I don’t know 
what component I should investigate: OpenMPI itself? ucx?  OFED? Something 
else?  If anyone has any suggestions for what to try, and/or what other 
information would be useful, I’d appreciate it.

thanks,
Noam

 ___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users