> On Jun 20, 2019, at 9:40 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
> On Jun 20, 2019, at 9:31 AM, Noam Bernstein via users 
> <users@lists.open-mpi.org> wrote:
>> One thing that I’m wondering if anyone familiar with the internals can 
>> explain is how you get a memory leak that isn’t freed when the program 
>> ends?  Doesn’t that suggest that it’s something lower level, like maybe a 
>> kernel issue?
> If "top" doesn't show processes eating up the memory, and killing processes 
> (e.g., MPI processes) doesn't give you memory back, then it's likely that 
> something in the kernel is leaking memory.

That’s definitely what’s happening.  "free" is reporting a lot of memory used, 
but the sum of the per-process values from ps is much lower.
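
For anyone who wants to repeat that comparison, here is a minimal sketch in
Python, assuming a Linux /proc filesystem.  The function names are my own, and
"used" here only approximates what free prints, since the exact formula differs
between procps versions.  Summed RSS counts shared pages once per process, so
it overestimates; if it is still far below "used", the gap is not in user
processes.  SUnreclaim (kernel slab memory that cannot be reclaimed) is printed
as one possible hint, not as a diagnosis.

```python
import os


def parse_kb_fields(text):
    """Parse 'Name:   1234 kB' style lines into {name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip lines without a colon or whose first value is not a plain integer.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def used_kb(meminfo):
    """Approximate 'used' memory: total minus free, buffers, and page cache."""
    return (meminfo["MemTotal"] - meminfo["MemFree"]
            - meminfo.get("Buffers", 0) - meminfo.get("Cached", 0))


def rss_kb(status_text):
    """VmRSS in kB from /proc/<pid>/status text; 0 if absent (kernel threads)."""
    return parse_kb_fields(status_text).get("VmRSS", 0)


def total_rss_kb(proc="/proc"):
    """Sum VmRSS over every process currently visible under /proc."""
    total = 0
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "status")) as f:
                total += rss_kb(f.read())
        except OSError:
            pass  # the process exited while we were scanning
    return total


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        meminfo = parse_kb_fields(f.read())
    used = used_kb(meminfo)
    rss = total_rss_kb()
    print(f"used (approx.):      {used} kB")
    print(f"sum of process RSS:  {rss} kB")
    print(f"unaccounted:         {used - rss} kB")
    print(f"SUnreclaim (slab):   {meminfo.get('SUnreclaim', 0)} kB")
```

Run it as root so that every process is readable.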

> Have you tried the latest version of UCX -- including their kernel drivers -- 
> from Mellanox (vs. inbox/CentOS)?

I’ve tried the latest UCX release from the UCX web site, 1.5.1, which doesn’t 
change the behavior.

I haven’t yet tried the latest OFED or Mellanox low-level stuff.  That’s next 
on my list, but slightly more involved to do, so I’ve been avoiding it.
