Ok, that would be great -- thanks.
Recompiling Open MPI with --enable-debug will turn on several debugging/sanity
checks inside Open MPI, and it will also enable debugging symbols. Hence, if
you can get a failure with a debug Open MPI build, it might give you a core
file that can be used to get a more detailed stack trace, poke around and see
if there's a NULL pointer somewhere, etc.
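
One practical note on the core file: on most Unix-like systems a crash only
leaves one if the core-size limit is non-zero. Below is a minimal sketch of a
launcher wrapper that raises the soft limit before starting the job. The
function names (enable_core_dumps, run_with_core_dumps) are made up for
illustration, and it assumes the MPI processes inherit limits from the process
that launches them. That is typically true for ranks on the local node; ranks
started remotely (ssh, a resource manager) may need the limit set some other
way.

```python
import resource
import subprocess
import sys


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit.

    Returns the (soft, hard) limits in effect afterwards.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may always raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def run_with_core_dumps(cmd):
    """Run cmd (a list of arguments) with core dumps enabled; return its exit code."""
    enable_core_dumps()
    # Child processes inherit the resource limits of this process.
    returncode = subprocess.run(cmd).returncode
    if returncode < 0:
        # subprocess reports death by signal N as a return code of -N.
        print(f"command died from signal {-returncode}", file=sys.stderr)
    return returncode
```

It would be called with the usual launch command as a list, for example
run_with_core_dumps(["mpirun", "-np", "4", "./my_app"]), where ./my_app is a
placeholder for the real executable. If the hard limit itself is 0, no core
file will be written and the limit has to be raised by an administrator.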
> On Jul 11, 2018, at 11:03 AM, Noam Bernstein <noam.bernst...@nrl.navy.mil>
>> On Jul 11, 2018, at 9:58 AM, Noam Bernstein <noam.bernst...@nrl.navy.mil>
>>> On Jul 10, 2018, at 5:15 PM, Noam Bernstein <noam.bernst...@nrl.navy.mil>
>>> What are useful steps I can do to debug? Recompile with --enable-debug?
>>> Are there any other versions that are worth trying? I don’t recall this
>>> error happening before we switched to 3.1.0.
>> It appears that the problem is there with OpenMPI 3.1.1, but not 2.1.3. Of
>> course I can’t be 100% sure, since it’s non-deterministic, but 3 runs died
>> after 0-3 iterations with 3.1.1, and did 3 runs with 10 iterations each with
> After more extensive testing it’s clear that it still happens with 2.1.3, but
> much less frequently. I’m going to try to get more detailed info with
> version 3.1.1, where it’s easier to reproduce.
> Noam Bernstein, Ph.D.
> Center for Materials Physics and Technology
> U.S. Naval Research Laboratory
> T +1 202 404 8628 F +1 202 404 7546
> users mailing list