Re: [OMPI users] High errorcode message

2021-02-01 Thread Arturo Fernandez via users
The app is not calling MPI_ABORT directly. I dug a little deeper into it but didn't find anything interesting. It just doesn't find the subdirectory for output purposes (the internal error variable is 0) and simply crashes when returning from the subroutine. It was just me not setting things up

Re: [OMPI users] High errorcode message

2021-01-31 Thread Jeff Squyres (jsquyres) via users
Is your app calling MPI_Abort directly? There's a 2nd argument to MPI_ABORT that should be passed to the output message. If it's not, we should investigate that. Or is your app aborting in some other, indirect method? If so, perhaps somehow that 2nd argument is getting dropped somewhere

Re: [OMPI users] High errorcode message

2021-01-30 Thread Arturo Fernandez via users
Hi Jeff. Sorry for the delay. It took a while but I was finally able to track down the point where the app breaks down. The problem seems to originate in an output subroutine, not because any MPI communication is malfunctioning. My guess is that MPI_Abort needs to produce some error message. Why

Re: [OMPI users] High errorcode message

2021-01-29 Thread Jeff Squyres (jsquyres) via users
It's somewhat hard to say without more information. What is your app doing when it calls abort? On Jan 29, 2021, at 8:49 PM, Arturo Fernandez via users <users@lists.open-mpi.org> wrote: Hello, My system is running CentOS8 & OpenMPI v4.1.0. Most stuff is working fine but one app is

[OMPI users] High errorcode message

2021-01-29 Thread Arturo Fernandez via users
Hello, My system is running CentOS8 & OpenMPI v4.1.0. Most stuff is working fine but one app is aborting with: MPI_ABORT was invoked on rank 7 in communicator MPI_COMM_WORLD with errorcode 1734831948. The other 23 MPI ranks also abort. I'm a bit confused by the high error code. Does it mean
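
One way to look at an unusually large errorcode like this is to check whether its bytes spell something. The sketch below is only an illustration and not something anyone in the thread ran: the helper name decode_errorcode is made up, and it assumes a 32-bit integer on a little-endian machine (typical for x86-64, though the thread does not state the architecture).

```python
import struct


def decode_errorcode(code: int) -> str:
    """Show the 4 bytes of a 32-bit int, in little-endian memory order, as text.

    Hypothetical helper for illustration; non-printable bytes become '.'.
    """
    raw = struct.pack("<i", code)  # assumption: 32-bit signed, little-endian
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# The errorcode reported in the original message.
print(hex(1734831948))               # 0x67676f4c
print(decode_errorcode(1734831948))  # Logg
```

Here the four bytes are all printable ASCII, which might hint that the value came from character data or an uninitialized variable rather than a deliberately chosen code. That is a guess based on the number alone; nothing in the thread confirms it.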