Martin,

This is a connectivity issue reported by the btl/tcp component.

You can try restricting the TCP BTL to an IP subnet known to work
(and not blocked by a firewall) between the two hosts, for example

mpirun --mca btl_tcp_if_include 192.168.0.0/24 ...
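
The 192.168.0.0/24 subnet above is only an example; use whatever subnet
your two hosts actually share. If needed, the runtime's out-of-band
connections can be restricted to the same subnet as well, something along
the lines of

mpirun --mca btl_tcp_if_include 192.168.0.0/24 --mca oob_tcp_if_include 192.168.0.0/24 ...

(untested on Cygwin on my side, so take it as a suggestion).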

If the error persists, you can run

mpirun --mca btl_base_verbose 20 ...

and then compress and post the logs so we can have a look.
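
For example, assuming your original command line, something like

mpirun --mca btl_base_verbose 20 -np 1 -hostfile ./hostfile ./spawner.exe 8 2>&1 | tee spawn_verbose.log

will capture the whole output in spawn_verbose.log (the file name is just
an example).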


Cheers,

Gilles

On Thu, Feb 4, 2021 at 9:33 PM Martín Morales via users
<users@lists.open-mpi.org> wrote:
>
> Hi Marco,
>
> Yes, I have a problem spawning to a “worker” host (on localhost it works).
> There are just two machines: “master” and “worker”. I’m using Windows 10 on
> both, with the same Cygwin and packages. Some details are pasted below.
>
> Thanks for your help. Regards,
>
> Martín
>
> ----
>
> Running:
>
> mpirun -np 1 -hostfile ./hostfile ./spawner.exe 8
>
> hostfile:
>
> master slots=5
> worker slots=5
>
> Error:
>
> At least one pair of MPI processes are unable to reach each other for
> MPI communications.  This means that no Open MPI device has indicated
> that it can be used to communicate between these processes.  This is
> an error; Open MPI requires that all MPI processes be able to reach
> each other.  This error can sometimes be the result of forgetting to
> specify the "self" BTL.
>
> Process 1 ([[31598,1],0]) is on host: DESKTOP-C0G4680
> Process 2 ([[31598,2],2]) is on host: worker
> BTLs attempted: self tcp
>
> Your MPI job is now going to abort; sorry.
> --------------------------------------------------------------------------
> [DESKTOP-C0G4680:02828] [[31598,1],0] ORTE_ERROR_LOG: Unreachable in file /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c at line 493
> [DESKTOP-C0G4680:02828] *** An error occurred in MPI_Comm_spawn
> [DESKTOP-C0G4680:02828] *** reported by process [2070806529,0]
> [DESKTOP-C0G4680:02828] *** on communicator MPI_COMM_SELF
> [DESKTOP-C0G4680:02828] *** MPI_ERR_INTERN: internal error
> [DESKTOP-C0G4680:02828] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [DESKTOP-C0G4680:02828] ***    and potentially your MPI job)
>
> USER_SSH@DESKTOP-C0G4680 ~
> $ [WinDev2012Eval:00120] [[31598,2],2] ORTE_ERROR_LOG: Unreachable in file /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c at line 493
> [WinDev2012Eval:00121] [[31598,2],3] ORTE_ERROR_LOG: Unreachable in file /pub/devel/openmpi/v4.0/openmpi-4.0.5-1.x86_64/src/openmpi-4.0.5/ompi/dpm/dpm.c at line 493
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> ompi_dpm_dyn_init() failed
> --> Returned "Unreachable" (-12) instead of "Success" (0)
> --------------------------------------------------------------------------
> [WinDev2012Eval:00121] *** An error occurred in MPI_Init
> [WinDev2012Eval:00121] *** reported by process [15289389101093879810,12884901891]
> [WinDev2012Eval:00121] *** on a NULL communicator
> [WinDev2012Eval:00121] *** Unknown error
> [WinDev2012Eval:00121] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [WinDev2012Eval:00121] ***    and potentially your MPI job)
> [DESKTOP-C0G4680:02831] 2 more processes have sent help message help-mca-bml-r2.txt / unreachable proc
> [DESKTOP-C0G4680:02831] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> [DESKTOP-C0G4680:02831] 1 more process has sent help message help-mpi-runtime.txt / mpi_init:startup:internal-failure
> [DESKTOP-C0G4680:02831] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal unknown handle
>
> Spawner source:
>
> #include "mpi.h"
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
>
> int main(int argc, char ** argv){
>     int processesToRun;
>     MPI_Comm intercomm;
>     MPI_Info info;
>
>     if(argc < 2){
>         printf("Processes number needed!\n");
>         return 0;
>     }
>     processesToRun = atoi(argv[1]);
>
>     MPI_Init( NULL, NULL );
>     printf("Spawning from parent:...\n");
>     MPI_Comm_spawn( "./spawned.exe", MPI_ARGV_NULL, processesToRun,
>                     MPI_INFO_NULL, 0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
>
>     MPI_Finalize();
>     return 0;
> }
>
> Spawned source:
>
> #include "mpi.h"
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char ** argv){
>     int hostName_len, rank, size;
>     MPI_Comm parentcomm;
>     char hostName[200];
>
>     MPI_Init( NULL, NULL );
>     MPI_Comm_get_parent( &parentcomm );
>     MPI_Get_processor_name(hostName, &hostName_len);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>
>     if (parentcomm != MPI_COMM_NULL) {
>         printf("I'm the spawned h: %s  r/s: %i/%i\n", hostName, rank, size );
>     }
>
>     MPI_Finalize();
>     return 0;
> }
>
>
> From: Marco Atzeri via users
> Sent: Wednesday, February 3, 2021 17:58
> To: users@lists.open-mpi.org
> Cc: Marco Atzeri
> Subject: Re: [OMPI users] OMPI 4.1 in Cygwin packages?
>
> On 03.02.2021 21:35, Martín Morales via users wrote:
> > Hello,
> >
> > I would like to know if any OMPI 4.1.* is going to be available in the
> > Cygwin packages.
> >
> > Thanks and regards,
> >
> > Martín
> >
>
> Hi Martin,
> anything in it that is absolutely needed short term?
>
> Any problem with the current 4.0.5 package?
>
> The build is usually very time consuming,
> and I am busy with other Cygwin stuff.
>
> Regards
> Marco
>
>
