Hi Matias,

I read putenv() instead of setenv(), so the code is correct as it is. Sorry about the noise.
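For the record, that overwrite flag is exactly what makes the user-level workaround stick. A minimal standalone illustration of the POSIX setenv() behaviour (nothing Open MPI specific):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* simulate the user exporting the variable before mpirun */
    setenv("PSM2_DEVICES", "self,shm,hfi", 1);

    /* what mtl/psm2 does: overwrite flag 0 leaves an existing value alone */
    setenv("PSM2_DEVICES", "self,shm", 0);
    printf("%s\n", getenv("PSM2_DEVICES"));    /* prints self,shm,hfi */

    /* with overwrite flag 1, the user's value would have been clobbered */
    setenv("PSM2_DEVICES", "self,shm", 1);
    printf("%s\n", getenv("PSM2_DEVICES"));    /* prints self,shm */

    return 0;
}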
As far as I am concerned, I'd rather add a FAQ entry on the open-mpi.org web site. You can simply issue a PR against the https://github.com/open-mpi/ompi-www.git repo.

Cheers,

Gilles

On Fri, Sep 30, 2016 at 10:53 AM, Cabral, Matias A <matias.a.cab...@intel.com> wrote:
> Hey Gilles,
>
> Quick answer on the first part until I read a little more about
> num_max_procs :O
>
> Passing 0 as the third parameter of setenv() means: do not override the
> variable if it is already found in the environment. So the workaround does
> work today. Moreover, I would like to know if there is a place in some
> OMPI wiki to document this behavior.
>
> Thanks,
>
> _MAC
>
> From: devel [mailto:devel-boun...@lists.open-mpi.org] On Behalf Of Gilles Gouaillardet
> Sent: Thursday, September 29, 2016 6:14 PM
> To: Open MPI Developers <devel@lists.open-mpi.org>
> Subject: [OMPI devel] mtl/psm2 and $PSM2_DEVICES
>
> This is a follow-up of
> https://mail-archive.com/users@lists.open-mpi.org/msg30055.html
>
> Thanks Matias for the lengthy explanation.
>
> Currently, PSM2_DEVICES is overwritten, so I do not think setting it
> before invoking mpirun will help.
>
> Also, in this specific case:
> - the user is running within a SLURM allocation with 2 nodes
> - the user specified a host file with 2 distinct nodes
>
> My first impression is that mtl/psm2 could/should handle this properly
> (well, only one of the two conditions has to be met) and *not* set
> export PSM2_DEVICES="self,shm"
>
> The patch below:
> - does not overwrite PSM2_DEVICES
> - does not set PSM2_DEVICES when num_max_procs > num_total_procs
>
> This is suboptimal, but I could not find a way to get the number of
> orteds. IIRC, MPI_Comm_spawn can have an orted dynamically spawned by
> passing a host in the MPI_Info. If this host is not part of the hostfile
> (nor the RM allocation?), then PSM2_DEVICES must be set manually by the
> user.
>
> Ralph,
>
> Is there a way to get the number of orteds?
> - if I mpirun -np 1 --host n0,n1 ... orte_process_info.num_nodes is 1
>   (I wish I could get 2)
> - if running in singleton mode, orte_process_info.num_max_procs is 0
>   (is this a bug or a feature?)
>
> Cheers,
>
> Gilles
>
> diff --git a/ompi/mca/mtl/psm2/mtl_psm2_component.c b/ompi/mca/mtl/psm2/mtl_psm2_component.c
> index 26bccd2..52b906b 100644
> --- a/ompi/mca/mtl/psm2/mtl_psm2_component.c
> +++ b/ompi/mca/mtl/psm2/mtl_psm2_component.c
> @@ -14,6 +14,8 @@
>   * Copyright (c) 2012-2015 Los Alamos National Security, LLC.
>   *                         All rights reserved.
>   * Copyright (c) 2013-2016 Intel, Inc. All rights reserved
> + * Copyright (c) 2016      Research Organization for Information Science
> + *                         and Technology (RIST). All rights reserved.
>   * $COPYRIGHT$
>   *
>   * Additional copyrights may follow
> @@ -170,6 +172,13 @@ get_num_total_procs(int *out_ntp)
>  }
>
>  static int
> +get_num_max_procs(int *out_nmp)
> +{
> +    *out_nmp = (int)ompi_process_info.max_procs;
> +    return OMPI_SUCCESS;
> +}
> +
> +static int
>  get_num_local_procs(int *out_nlp)
>  {
>      /* num_local_peers does not include us in
> @@ -201,7 +210,7 @@ ompi_mtl_psm2_component_init(bool enable_progress_threads,
>      int verno_major = PSM2_VERNO_MAJOR;
>      int verno_minor = PSM2_VERNO_MINOR;
>      int local_rank = -1, num_local_procs = 0;
> -    int num_total_procs = 0;
> +    int num_total_procs = 0, num_max_procs = 0;
>
>      /* Compute the total number of processes on this host and our local rank
>       * on that node.  We need to provide PSM2 with these values so it can
> @@ -221,6 +230,11 @@ ompi_mtl_psm2_component_init(bool enable_progress_threads,
>                      "Cannot continue.\n");
>          return NULL;
>      }
> +    if (OMPI_SUCCESS != get_num_max_procs(&num_max_procs)) {
> +        opal_output(0, "Cannot determine max number of processes. "
> +                    "Cannot continue.\n");
> +        return NULL;
> +    }
>
>      err = psm2_error_register_handler(NULL /* no ep */,
>                                        PSM2_ERRHANDLER_NOP);
> @@ -230,8 +244,10 @@ ompi_mtl_psm2_component_init(bool enable_progress_threads,
>          return NULL;
>      }
>
> -    if (num_local_procs == num_total_procs) {
> -        setenv("PSM2_DEVICES", "self,shm", 0);
> +    if ((num_local_procs == num_total_procs) && (num_max_procs <= num_total_procs)) {
> +        if (NULL == getenv("PSM2_DEVICES")) {
> +            setenv("PSM2_DEVICES", "self,shm", 0);
> +        }
>      }
>
>      err = psm2_init(&verno_major, &verno_minor);
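As an aside, the MPI_Info-driven spawn mentioned above boils down to something like the sketch below. The "host" key is the standard MPI_Comm_spawn info key; "./worker" and the node name "n2" are made-up placeholders:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm intercomm;
    MPI_Info info;

    MPI_Init(&argc, &argv);

    /* ask the runtime to place the child on a specific host; if that host
     * is outside the hostfile / RM allocation, an orted has to be spawned
     * there dynamically, which is the case discussed above */
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", "n2");

    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 1, info, 0,
                   MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}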
On 9/30/2016 12:38 AM, Cabral, Matias A wrote:
> Hi Gilles et al.,
>
> You are right, ptl.c is in the PSM2 code. As Ralph mentions, dynamic
> process support was/is not working in OMPI when using PSM2 because of an
> issue related to the transport keys. This was fixed in PR #1602
> (https://github.com/open-mpi/ompi/pull/1602) and should be included in
> v2.0.2. HOWEVER, this is not the error Juraj is seeing. The root of the
> assertion is that the PSM/PSM2 MTLs check where the "original" processes
> are running and, if they detect that all are local to the node, they will
> ONLY initialize the shared memory device (variable PSM2_DEVICES="self,shm").
> This is to avoid "reserving" HW resources in the HFI card that wouldn't be
> used unless you later spawn ranks on other nodes. Therefore, to allow
> dynamic processes to be spawned on other nodes, you need to tell PSM2 to
> instruct the HW to initialize all the devices by making the environment
> variable PSM2_DEVICES="self,shm,hfi" available before running the job.
>
> Note that while setting PSM2_DEVICES (*) will solve the assertion below,
> you will most likely still see the transport key issue if PR #1602 is not
> included.
>
> Thanks,
>
> _MAC
>
> (*)
> PSM2_DEVICES -> Omni-Path
> PSM_DEVICES -> TrueScale
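To make that workaround concrete: a minimal sketch of a manager that keeps the hfi device available for its own PSM2 endpoint (assuming, per the component code above, that mtl/psm2 reads the environment at MPI_Init time):

#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    /* must run before MPI_Init(), which is where mtl/psm2 consults the
     * environment; overwrite flag 0 means an explicit user export wins */
    setenv("PSM2_DEVICES", "self,shm,hfi", 0);

    MPI_Init(&argc, &argv);
    /* ... MPI_Comm_spawn() onto other nodes is now possible ... */
    MPI_Finalize();
    return 0;
}

Whether the variable propagates to the spawned ranks depends on the launcher, so exporting PSM2_DEVICES=self,shm,hfi in the shell before running the job, as Matias suggests, remains the safer route.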
> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of r...@open-mpi.org
> Sent: Thursday, September 29, 2016 7:12 AM
> To: Open MPI Users <us...@lists.open-mpi.org>
> Subject: Re: [OMPI users] MPI_Comm_spawn
>
> Ah, that may be why it wouldn't show up in the OMPI code base itself. If
> that is the case here, then no - OMPI v2.0.1 does not support comm_spawn
> for PSM. It is fixed in the upcoming 2.0.2.
>
> On Sep 29, 2016, at 6:58 AM, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
>
> Ralph,
>
> My guess is that ptl.c comes from the PSM lib ...
>
> Cheers,
>
> Gilles
>
> On Thursday, September 29, 2016, r...@open-mpi.org <r...@open-mpi.org> wrote:
>
> Spawn definitely does not work with srun. I don't recognize the name of
> the file that segfaulted - what is "ptl.c"? Is that in your manager
> program?
>
> On Sep 29, 2016, at 6:06 AM, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
>
> Hi,
>
> I do not expect spawn to work with direct launch (e.g. srun).
>
> Do you have PSM (e.g. InfiniPath) hardware? That could be linked to the
> failure.
>
> Can you please try
>
> mpirun --mca pml ob1 --mca btl tcp,sm,self -np 1 --hostfile my_hosts ./manager 1
>
> and see if it helps?
>
> Note, if you have the possibility, I suggest you first try that without
> SLURM, and then within a SLURM job.
>
> Cheers,
>
> Gilles
>
> On Thursday, September 29, 2016, juraj2...@gmail.com <juraj2...@gmail.com> wrote:
>
> Hello,
>
> I am using MPI_Comm_spawn to dynamically create new processes from a
> single manager process. Everything works fine when all the processes are
> running on the same node, but imposing the restriction to run only a
> single process per node does not work. Below are the errors produced
> during a multinode interactive session and a multinode sbatch job.
>
> The system I am using is: Linux version 3.10.0-229.el7.x86_64
> (buil...@kbuilder.dev.centos.org) (gcc version 4.8.2 20140120 (Red Hat
> 4.8.2-16) (GCC))
> I am using Open MPI 2.0.1
> Slurm is version 15.08.9
>
> What is preventing my jobs from spawning on multiple nodes? Does Slurm
> require some additional configuration to allow it? Is it an issue on the
> MPI side? Does it need to be compiled with some special flag (I have
> compiled it with --enable-mpi-fortran=all --with-pmi)?
>
> The code I am launching is here: https://github.com/goghino/dynamicMPI
>
> The manager tries to launch one new process (./manager 1). This is the
> error produced when requesting each process to be located on a different
> node (interactive session):
>
> $ salloc -N 2
> $ cat my_hosts
> icsnode37
> icsnode38
> $ mpirun -np 1 -npernode 1 --hostfile my_hosts ./manager 1
> [manager]I'm running MPI 3.1
> [manager]Runing on node icsnode37
> icsnode37.12614Assertion failure at ptl.c:183: epaddr == ((void *)0)
> icsnode38.32443Assertion failure at ptl.c:183: epaddr == ((void *)0)
> [icsnode37:12614] *** Process received signal ***
> [icsnode37:12614] Signal: Aborted (6)
> [icsnode37:12614] Signal code: (-6)
> [icsnode38:32443] *** Process received signal ***
> [icsnode38:32443] Signal: Aborted (6)
> [icsnode38:32443] Signal code: (-6)
>
> The same example as above via sbatch job submission:
>
> $ cat job.sbatch
> #!/bin/bash
>
> #SBATCH --nodes=2
> #SBATCH --ntasks-per-node=1
>
> module load openmpi/2.0.1
> srun -n 1 -N 1 ./manager 1
>
> $ cat output.o
> [manager]I'm running MPI 3.1
> [manager]Runing on node icsnode39
> srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
> [icsnode39:9692] *** An error occurred in MPI_Comm_spawn
> [icsnode39:9692] *** reported by process [1007812608,0]
> [icsnode39:9692] *** on communicator MPI_COMM_SELF
> [icsnode39:9692] *** MPI_ERR_SPAWN: could not spawn processes
> [icsnode39:9692] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [icsnode39:9692] *** and potentially your MPI job)
> In: PMI_Abort(50, N/A)
> slurmstepd: *** STEP 15378.0 ON icsnode39 CANCELLED AT 2016-09-26T16:48:20 ***
> srun: error: icsnode39: task 0: Exited with exit code 50
>
> Thanks for any feedback!
>
> Best regards,
>
> Juraj
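For readers of the archives, a minimal manager along the lines Juraj describes (a hypothetical sketch, not the code from the linked repo; "./worker" is a placeholder):

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm intercomm;
    int nworkers = (argc > 1) ? atoi(argv[1]) : 1;

    MPI_Init(&argc, &argv);

    /* each spawned worker is a brand new MPI process; with mtl/psm2 this
     * is the call that aborts unless the hfi device was initialized */
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, nworkers, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

    printf("[manager] spawned %d worker(s)\n", nworkers);

    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}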
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel