THAT is a good idea. With Omni-Path we see stale files left behind in
/dev/shm when the application exits abnormally. I don't know whether UCX
uses that space as well.
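
For anyone who wants to script that check, here is a minimal sketch, assuming a Linux node where /dev/shm is a tmpfs mount. The function name `find_old_shm_files` and the one-hour threshold are made up for illustration. It only lists regular files owned by a given user that have not been modified for a while. It does not know which file names UCX, Open MPI, or the Omni-Path stack use, and age alone does not prove that no running process still has the file mapped, so treat the output as candidates to inspect rather than a list to delete.

```python
import os
import stat
import time


def find_old_shm_files(shm_dir="/dev/shm", uid=None, min_age_seconds=3600.0, now=None):
    """Return sorted (path, size_bytes, age_seconds) tuples for old regular files.

    "Old" is only a heuristic for "possibly stale": the file belongs to `uid`
    and its mtime is at least `min_age_seconds` in the past.
    """
    uid = os.getuid() if uid is None else uid
    now = time.time() if now is None else now
    candidates = []
    # os.walk silently skips directories it cannot read and yields nothing
    # if shm_dir does not exist.
    for dirpath, _dirnames, filenames in os.walk(shm_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; ignore it
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and other special files
            if st.st_uid != uid:
                continue  # only report files owned by the requested user
            age = now - st.st_mtime
            if age >= min_age_seconds:
                candidates.append((path, st.st_size, age))
    return sorted(candidates)
```

Running `find_old_shm_files()` on a compute node after a job has crashed prints nothing by itself; loop over the returned tuples to see each path with its size and age. Since files in a tmpfs count against the node's memory until they are unlinked, a long list here would be consistent with memory that stays in use after the program has ended.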


On June 20, 2019 at 11:05 AM, Joseph Schuchart via users wrote:


Another idea: check for stale files in /dev/shm/ (or a subdirectory that
looks like it belongs to UCX/OpenMPI) and SysV shared memory using `ipcs


On 6/20/19 3:31 PM, Noam Bernstein via users wrote:

On Jun 20, 2019, at 4:44 AM, Charles A Taylor wrote:

This looks a lot like a problem I had with OpenMPI 3.1.2.  I thought
the fix landed in 4.0.0, but you might want to check the code to be
sure there wasn’t a regression in 4.1.x.  Most of our codes are still
running 3.1.2, so I haven’t built anything beyond 4.0.0, which
definitely included the fix.

Unfortunately, 4.0.0 behaves the same.

One thing I’m wondering, if anyone familiar with the internals can
explain: how do you get a memory leak that isn’t freed when the program
ends?  Doesn’t that suggest that it’s something lower level, like maybe
a kernel issue?



Noam Bernstein, Ph.D.
Center for Materials Physics and Technology
U.S. Naval Research Laboratory
T +1 202 404 8628  F +1 202 404 7546

users mailing list
