Re: [OMPI users] Timeout in MPI_Bcast/MPI_Barrier?

2021-01-11 Thread George Bosilca via users
MPI_ERR_PROC_FAILED is not yet a valid error in MPI. It is coming from ULFM, an extension to MPI that is not yet in the OMPI master. Daniel, what version of Open MPI are you using? Are you sure you are not mixing multiple versions due to PATH/LD_LIBRARY_PATH? George. On Mon, Jan 11,

Re: [OMPI users] Timeout in MPI_Bcast/MPI_Barrier?

2021-01-11 Thread Gilles Gouaillardet via users
Daniel, the test works in my environment (1 node, 32 GB memory) with all the mentioned parameters. Did you check the memory usage on your nodes and make sure the OOM killer did not shoot any process? Cheers, Gilles On Tue, Jan 12, 2021 at 1:48 AM Daniel Torres via users wrote: > > Hi. > >

Re: [OMPI users] PRRTE DVM: how to specify rankfile per prun invocation?

2021-01-11 Thread Josh Hursey via users
Thank you for the bug report. I filed a bug against PRRTE so this doesn't get lost; you can follow it below: https://github.com/openpmix/prrte/issues/720 Making rankfile a per-job instead of a per-DVM option might require some internal plumbing work. So I'm not sure how quickly this will be

Re: [OMPI users] Timeout in MPI_Bcast/MPI_Barrier?

2021-01-11 Thread Daniel Torres via users
Hi. Thanks for responding. I have taken the most important parts of my code and created a test that reproduces the behavior I described previously. I have attached the compressed file "test.tar.gz" to this e-mail. Inside it, you can find: 1.- The .c source code "test.c", which I compiled

Re: [OMPI users] [ORTE] Connecting back to parent - Forcing tcp port

2021-01-11 Thread Vincent via users
On 07/01/2021 19:51, Josh Hursey via users wrote: I posted a fix for the static ports issue (currently on the v4.1.x branch): https://github.com/open-mpi/ompi/pull/8339 If you have time, do you want to give it a try and confirm that it fixes your issue? Hello Josh, Definitely yes! It does