On 04/08/2018 22:41, Dima Kogan wrote:
> Answering some of my own questions inline...
>
> Dima Kogan <[email protected]> writes:
>
>> 2. Is the MPI implementation significant? Would mpich behave
>>    potentially differently here from openmpi?
>
> I rebuilt sundials with mpich instead of openmpi, and those tests now
> work just fine: nothing locks up. There might be other issues involved
> here, but I'd like to fully figure out what's going on before proposing
> any such change.
>
> I also poked around with a debugger a little bit: the lockup is inside
> some MPI calls. Anybody have experience here? I can imagine this is an
> openmpi bug, but the last time I touched MPI anything was about 20 years
> ago, so any info from more-recently-experienced people would be welcome.

There have been a number of lockup issues with openmpi recently,
including thread lockups in the underlying pmix libraries.
There is a new 3.1.2rc2 release that I'm testing. Can you give me a
reproducible test case to test against? I think this release may have
the necessary fix.
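
For the test case, something along these lines might be enough to tell a
lockup apart from an ordinary failure. It is only a sketch: the helper name
run_with_timeout, the timeout value, and the example command line are
assumptions for illustration, not anything taken from the sundials test
suite.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and classify the outcome.

    Returns "ok" if the command exits with status 0, "failed" if it exits
    with a non-zero status, and "hung" if it is still running after
    timeout_s seconds (subprocess.run kills the direct child in that case).
    """
    try:
        proc = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "ok" if proc.returncode == 0 else "failed"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to run.
    # The 60-second limit is an arbitrary choice for this sketch.
    result = run_with_timeout(sys.argv[1:], 60)
    print(result)
    sys.exit(0 if result == "ok" else 1)
```

Saved as, say, hangcheck.py (a hypothetical name), it could be invoked as
"python3 hangcheck.py mpirun -np 4 ./some_test", where ./some_test stands in
for whichever test binary locks up. Only the launcher itself is killed on
timeout, so stray MPI ranks may need cleaning up by hand.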
regards
Alastair
--
Alastair McKinstry, <[email protected]>, <[email protected]>,
https://diaspora.sceal.ie/u/amckinstry
Misentropy: doubting that the Universe is becoming more disordered.