Re: [OMPI users] MPI_INIT failed 4.0.1

2019-04-19 Thread Mahmood Naderan
Thanks for the hint.

Regards,
Mahmood




On Thu, Apr 18, 2019 at 2:47 AM Reuti wrote:

> Hi,
>
> On 17.04.2019 at 11:07, Mahmood Naderan wrote:
>
> > Hi,
> > After a successful installation of v4 in a custom location, I see some
> > errors that the default installation (v2) does not produce.
>
> Did you also recompile your application with this version of Open MPI?
>
> -- Reuti

Re: [OMPI users] MPI_INIT failed 4.0.1

2019-04-17 Thread Reuti
Hi,

On 17.04.2019 at 11:07, Mahmood Naderan wrote:

> Hi,
> After a successful installation of v4 in a custom location, I see some errors
> that the default installation (v2) does not produce.

Did you also recompile your application with this version of Open MPI?

-- Reuti
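
One way to act on that hint, sketched below: check which libmpi the pw.x
binary actually resolves to, and if it still points at the old v2
installation, rebuild it against the 4.0.1 wrappers. The install prefix is
the one shown in the post; the lib subdirectory and the configure/make lines
are assumptions (the usual Quantum ESPRESSO build sequence), not commands
taken from this thread.

  # See which MPI library pw.x currently links against
  # (use the full path to pw.x if it is not in $PATH)
  ldd $(which pw.x) | grep -i mpi

  # Put the 4.0.1 installation first for both build and run time
  # (lib path assumes the default layout under the prefix above)
  export PATH=/share/apps/softwares/openmpi-4.0.1/bin:$PATH
  export LD_LIBRARY_PATH=/share/apps/softwares/openmpi-4.0.1/lib:$LD_LIBRARY_PATH

  # Rebuild the application so it links against Open MPI 4.0.1
  # (illustrative Quantum ESPRESSO build steps)
  ./configure && make pw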


> $ /share/apps/softwares/openmpi-4.0.1/bin/mpirun --version
> mpirun (Open MPI) 4.0.1
> 
> Report bugs to http://www.open-mpi.org/community/help/
> $ /share/apps/softwares/openmpi-4.0.1/bin/mpirun -np 4 pw.x -i mos2.rlx.in
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
> 
>   ompi_mpi_init: ompi_rte_init failed
>   --> Returned "(null)" (-43) instead of "Success" (0)
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
> 
>   ompi_mpi_init: ompi_rte_init failed
>   --> Returned "(null)" (-43) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
> [rocks7.jupiterclusterscu.com:18531] Local abort before MPI_INIT completed 
> completed successfully, but am not able to aggregate error messages, and not 
> able to guarantee that all other processes were killed!
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
> [rocks7.jupiterclusterscu.com:18532] Local abort before MPI_INIT completed 
> completed successfully, but am not able to aggregate error messages, and not 
> able to guarantee that all other processes were killed!
> --------------------------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
> 
>   ompi_mpi_init: ompi_rte_init failed
>   --> Returned "(null)" (-43) instead of "Success" (0)
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
> 
>   ompi_mpi_init: ompi_rte_init failed
>   --> Returned "(null)" (-43) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
> [rocks7.jupiterclusterscu.com:18530] Local abort before MPI_INIT completed 
> completed successfully, but am not able to aggregate error messages, and not 
> able to guarantee that all other processes were killed!
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
> [rocks7.jupiterclusterscu.com:18533] Local abort before MPI_INIT completed 
> completed successfully, but am not able to aggregate error messages, and not 
> able to guarantee that all other processes were killed!
> --------------------------------------------------------------------------
> mpirun detected that one or more