Setting aside the known issue with MPI_Comm_spawn in v4.0.4, how are you
planning to forward stdin without using "mpirun"? Something has to collect
the terminal's stdin and distribute it to the stdin of the spawned processes.
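
One workaround, if you really cannot use "mpirun", is to not rely on stdin
inheritance at all: only the parent has the terminal's stdin, so it can read
the input itself and broadcast it to the children over the merged
communicator. A rough, untested sketch along those lines (the routine name
broadcast_stdin is made up; iCommBigWorld and iIDMpi are the merged
communicator and rank from your code, where the parent ends up as rank 0):

===========================================================
SUBROUTINE broadcast_stdin(iCommBigWorld, iIDMpi)
    USE mpi
    IMPLICIT NONE

    INTEGER, INTENT(IN) :: iCommBigWorld, iIDMpi

    CHARACTER(LEN=1024) :: line
    INTEGER             :: ierror, ios, nchars

    DO
        IF (iIDMpi == 0) THEN
            ! Only the original parent actually owns the terminal's stdin.
            READ (*, '(A)', IOSTAT=ios) line
            IF (ios /= 0) THEN
                nchars = -1          ! end of input: tell everyone to stop
            ELSE
                nchars = LEN_TRIM(line)
            END IF
        END IF

        ! Everyone learns the length first, then receives the payload.
        CALL MPI_BCAST(nchars, 1, MPI_INTEGER, 0, iCommBigWorld, ierror)
        IF (nchars < 0) EXIT
        CALL MPI_BCAST(line, LEN(line), MPI_CHARACTER, 0, iCommBigWorld, ierror)

        ! ... every rank can now parse "line" as if it had come from its
        ! own stdin ...
    END DO
END SUBROUTINE broadcast_stdin
===========================================================

Calling something like this right after the MPI_INTERCOMM_MERGE in your
ADmn_createSpawn would sidestep the stdin question entirely.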

> On Aug 12, 2020, at 9:20 AM, Alvaro Payero Pinto via users 
> <users@lists.open-mpi.org> wrote:
> 
> Hi,
> 
> I’m using Open MPI 4.0.4. The Fortran side has been compiled with the Intel
> Fortran suite v17.0.6 and the C/C++ side with the GNU suite v4.8.5, due to
> requirements I cannot modify.
> 
> I am trying to parallelise a Fortran application by dynamically creating
> processes on the fly with the “MPI_Comm_spawn” subroutine. The application
> starts with only one parent process and reads a file from standard input,
> but it looks like the children do not inherit access to that file. I would
> like all child processes to inherit the parent’s standard input. I’m aware
> that the “-stdin all” argument of the “mpirun” binary might do this, but I
> am trying to execute the binary without mpirun unless strictly necessary.
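> 
> For reference, the mpirun invocation I am trying to avoid would be
> something like the following (with -n 1 for the single parent process):
> 
>     mpirun -n 1 -stdin all ./Binaryname.bin < inputfilename.dat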
>  
> So far I have tried passing a non-null “MPI_Info” to “MPI_Comm_spawn” with
> the key “ompi_stdin_target” and the value “all”, but it does not work. I
> have also tried other values (none, 0, 1, -1, etc.) without success.
> 
> Here is the subroutine provoking the error at the MPI_Comm_spawn call:
> 
> ===========================================================
> SUBROUTINE ADmn_createSpawn(iNumberChilds, iCommBigWorld, iIDMpi, &
>                             iNumberProcess)
>     IMPLICIT NONE
> 
>     !ID of the communicator that contains all the processes
>     INTEGER :: iCommBigWorld
>     !Number of child processes
>     INTEGER :: iNumberChilds
>     INTEGER :: iNumberProcess
>     !Id number of the current process
>     INTEGER :: iIDMpi
> 
>     CHARACTER(LEN=1)            :: arguments(1)
>     INTEGER                     :: bigWorld, iC, iInic, iFinal
>     INTEGER                     :: ierror
>     INTEGER                     :: iIDFamiliar = 0
>     CHARACTER(LEN=128)          :: command
>     INTEGER                     :: iInfoMPI
>     CHARACTER(LEN=*), PARAMETER :: key = " ompi_stdin_target ", valueMPI = "all"
>     LOGICAL                     :: FLAG
> 
>     CALL GET_COMMAND_ARGUMENT(0, command)
>     CALL MPI_COMM_GET_PARENT(iParent, ierror)
> 
>     IF (iParent .EQ. MPI_COMM_NULL) THEN
>         arguments(1) = ''
>         iIDFamiliar = 0
> 
>         CALL MPI_INFO_CREATE(iInfoMPI, ierror)
>         CALL MPI_INFO_SET(iInfoMPI, key, valueMPI, ierror)
> 
>         CALL MPI_COMM_SPAWN(command, arguments, iNumberChilds, iInfoMPI, 0, &
>                             MPI_COMM_WORLD, iChild, iSpawn_error, ierror)
> 
>         CALL MPI_INTERCOMM_MERGE(iChild, .FALSE., iCommBigWorld, ierror)
>     ELSE
>         CALL MPI_COMM_RANK(MPI_COMM_WORLD, iIDFamiliar, ierror)
> 
>         iIDFamiliar = iIDFamiliar + 1
> 
>         CALL MPI_INTERCOMM_MERGE(iParent, .TRUE., iCommBigWorld, ierror)
>     END IF
> 
>     CALL MPI_COMM_RANK(iCommBigWorld, iIDMpi, ierror)
>     CALL MPI_COMM_SIZE(iCommBigWorld, intasks, ierror)
>     iProcessIDInternal = iIDMpi
>     iNumberProcess = intasks
> 
> END SUBROUTINE ADmn_createSpawn
> ===========================================================
> 
> The binary is executed as:
> 
>     Binaryname.bin < inputfilename.dat
> 
> And here is the segmentation fault produced when passing the MPI_Info
> variable to MPI_Comm_spawn:
> 
> ===========================================================
> [sles12sp3-srv:10384] *** Process received signal ***
> [sles12sp3-srv:10384] Signal: Segmentation fault (11)
> [sles12sp3-srv:10384] Signal code: Address not mapped (1)
> [sles12sp3-srv:10384] Failing at address: 0xfffffffe
> [sles12sp3-srv:10384] [ 0] /lib64/libpthread.so.0(+0x10c10)[0x7fc6a8dd5c10]
> [sles12sp3-srv:10384] [ 1] 
> /usr/local/lib64/libopen-rte.so.40(pmix_server_spawn_fn+0x1052)[0x7fc6aa283232]
> [sles12sp3-srv:10384] [ 2] 
> /usr/local/lib64/openmpi/mca_pmix_pmix3x.so(+0x46210)[0x7fc6a602b210]
> [sles12sp3-srv:10384] [ 3] 
> /usr/local/lib64/openmpi/mca_pmix_pmix3x.so(pmix_server_spawn+0x7c6)[0x7fc6a60a5ab6]
> [sles12sp3-srv:10384] [ 4] 
> /usr/local/lib64/openmpi/mca_pmix_pmix3x.so(+0xb1a2f)[0x7fc6a6096a2f]
> [sles12sp3-srv:10384] [ 5] 
> /usr/local/lib64/openmpi/mca_pmix_pmix3x.so(pmix_server_message_handler+0x41)[0x7fc6a6097511]
> [sles12sp3-srv:10384] [ 6] 
> /usr/local/lib64/openmpi/mca_pmix_pmix3x.so(OPAL_MCA_PMIX3X_pmix_ptl_base_process_msg+0x1bf)[0x7fc6a610481f]
> [sles12sp3-srv:10384] [ 7] 
> /usr/local/lib64/libopen-pal.so.40(opal_libevent2022_event_base_loop+0x8fc)[0x7fc6a9facd6c]
> [sles12sp3-srv:10384] [ 8] 
> /usr/local/lib64/openmpi/mca_pmix_pmix3x.so(+0xcf7ce)[0x7fc6a60b47ce]
> [sles12sp3-srv:10384] [ 9] /lib64/libpthread.so.0(+0x8724)[0x7fc6a8dcd724]
> [sles12sp3-srv:10384] [10] /lib64/libc.so.6(clone+0x6d)[0x7fc6a8b0ce8d]
> [sles12sp3-srv:10384] *** End of error message ***
> 
> ===========================================================
>  
> Do you have any idea what might be happening?
> 
> Thank you in advance!
> 
> Best regards,
> Álvaro

