Hi Matias,

I misread putenv() where the code actually uses setenv(), so the existing code
is correct as it is. Sorry about the noise.
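
For the record, here is a minimal standalone sketch (plain libc calls, not the
actual mtl/psm2 code) of the semantics I misread:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* the user exports PSM2_DEVICES before invoking mpirun */
    setenv("PSM2_DEVICES", "self,shm,hfi", 1);

    /* what mtl/psm2 actually does: the overwrite flag is 0, so the
     * pre-existing value is left untouched */
    setenv("PSM2_DEVICES", "self,shm", 0);
    printf("%s\n", getenv("PSM2_DEVICES"));   /* prints self,shm,hfi */

    /* what putenv() (or setenv() with overwrite=1) would do instead:
     * unconditionally replace the value */
    setenv("PSM2_DEVICES", "self,shm", 1);
    printf("%s\n", getenv("PSM2_DEVICES"));   /* prints self,shm */

    return 0;
}

So a user who exports PSM2_DEVICES before invoking mpirun keeps his value with
the current code.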

As far as I am concerned, I'd rather add a FAQ entry on the open-mpi.org web site.
You can simply issue a PR against the https://github.com/open-mpi/ompi-www.git repo.

Cheers,

Gilles

On Fri, Sep 30, 2016 at 10:53 AM, Cabral, Matias A
<matias.a.cab...@intel.com> wrote:
> Hey Gilles,
>
>
>
> Quick answer on the first part until I read a little more about
> num_max_procs :O
>
> Passing 0 as the third parameter of setenv() means: do not overwrite the
> variable if it is already found in the environment. So the workaround does
> work today. Moreover, I would like to know if there is a place in some OMPI
> wiki to document this behavior.
>
>
>
> Thanks,
>
>
>
> _MAC
>
>
>
> From: devel [mailto:devel-boun...@lists.open-mpi.org] On Behalf Of Gilles
> Gouaillardet
> Sent: Thursday, September 29, 2016 6:14 PM
> To: Open MPI Developers <devel@lists.open-mpi.org>
> Subject: [OMPI devel] mtl/psm2 and $PSM2_DEVICES
>
>
>
> This is a follow-up of
> https://mail-archive.com/users@lists.open-mpi.org/msg30055.html
>
>
>
> Thanks Matias for the lengthy explanation.
>
>
>
> Currently, PSM2_DEVICES is overwritten, so I do not think setting it before
> invoking mpirun will help.
>
>
>
> Also, in this specific case:
>
> - the user is running within a SLURM allocation with 2 nodes
>
> - the user specified a host file with 2 distinct nodes
>
>
>
> My first impression is that mtl/psm2 could/should handle this properly (well,
> only one of these conditions has to be met) and *not* set
>
> export PSM2_DEVICES="self,shm"
>
>
> The patch below:
> - does not overwrite PSM2_DEVICES
> - does not set PSM2_DEVICES when num_max_procs > num_total_procs
> This is suboptimal, but I could not find a way to get the number of orteds.
> IIRC, MPI_Comm_spawn can have an orted dynamically spawned by passing a host
> in the MPI_Info.
> If this host is not part of the hostfile (nor the RM allocation?), then
> PSM2_DEVICES must be set manually by the user.
>
>
> Ralph,
>
> Is there a way to get the number of orteds?
> - if I mpirun -np 1 --host n0,n1 ..., orte_process_info.num_nodes is 1 (I
> wish I could get 2)
> - if running in singleton mode, orte_process_info.num_max_procs is 0 (is
> this a bug or a feature?)
>
> Cheers,
>
> Gilles
>
>
> diff --git a/ompi/mca/mtl/psm2/mtl_psm2_component.c b/ompi/mca/mtl/psm2/mtl_psm2_component.c
> index 26bccd2..52b906b 100644
> --- a/ompi/mca/mtl/psm2/mtl_psm2_component.c
> +++ b/ompi/mca/mtl/psm2/mtl_psm2_component.c
> @@ -14,6 +14,8 @@
>   * Copyright (c) 2012-2015 Los Alamos National Security, LLC.
>   *                         All rights reserved.
>   * Copyright (c) 2013-2016 Intel, Inc. All rights reserved
> + * Copyright (c) 2016      Research Organization for Information Science
> + *                         and Technology (RIST). All rights reserved.
>   * $COPYRIGHT$
>   *
>   * Additional copyrights may follow
> @@ -170,6 +172,13 @@ get_num_total_procs(int *out_ntp)
>  }
>
>  static int
> +get_num_max_procs(int *out_nmp)
> +{
> +  *out_nmp = (int)ompi_process_info.max_procs;
> +  return OMPI_SUCCESS;
> +}
> +
> +static int
>  get_num_local_procs(int *out_nlp)
>  {
>      /* num_local_peers does not include us in
> @@ -201,7 +210,7 @@ ompi_mtl_psm2_component_init(bool enable_progress_threads,
>      int        verno_major = PSM2_VERNO_MAJOR;
>      int verno_minor = PSM2_VERNO_MINOR;
>      int local_rank = -1, num_local_procs = 0;
> -    int num_total_procs = 0;
> +    int num_total_procs = 0, num_max_procs = 0;
>
>      /* Compute the total number of processes on this host and our local rank
>       * on that node. We need to provide PSM2 with these values so it can
> @@ -221,6 +230,11 @@ ompi_mtl_psm2_component_init(bool enable_progress_threads,
>                      "Cannot continue.\n");
>          return NULL;
>      }
> +    if (OMPI_SUCCESS != get_num_max_procs(&num_max_procs)) {
> +        opal_output(0, "Cannot determine max number of processes. "
> +                    "Cannot continue.\n");
> +        return NULL;
> +    }
>
>      err = psm2_error_register_handler(NULL /* no ep */,
>                                      PSM2_ERRHANDLER_NOP);
> @@ -230,8 +244,10 @@ ompi_mtl_psm2_component_init(bool enable_progress_threads,
>         return NULL;
>      }
>
> -    if (num_local_procs == num_total_procs) {
> -      setenv("PSM2_DEVICES", "self,shm", 0);
> +    if ((num_local_procs == num_total_procs) && (num_max_procs <= num_total_procs)) {
> +        if (NULL == getenv("PSM2_DEVICES")) {
> +            setenv("PSM2_DEVICES", "self,shm", 0);
> +        }
>      }
>
>      err = psm2_init(&verno_major, &verno_minor);
>
>
>
>
>
>
>
> On 9/30/2016 12:38 AM, Cabral, Matias A wrote:
>
> Hi Gilles et al.,
>
>
>
> You are right, ptl.c is in the PSM2 code. As Ralph mentions, dynamic process
> support was/is not working in OMPI when using PSM2 because of an issue
> related to the transport keys. This was fixed in PR #1602
> (https://github.com/open-mpi/ompi/pull/1602) and should be included in
> v2.0.2. HOWEVER, this is not the error Juraj is seeing. The root cause of the
> assertion is that the PSM/PSM2 MTLs check where the “original” processes are
> running and, if they detect that all of them are local to the node, they will
> ONLY initialize the shared memory device (variable PSM2_DEVICES="self,shm").
> This is to avoid “reserving” HW resources in the HFI card that wouldn’t be
> used unless you later spawn ranks on other nodes. Therefore, to allow dynamic
> processes to be spawned on other nodes, you need to tell PSM2 to instruct the
> HW to initialize all the devices by making the environment variable
> PSM2_DEVICES="self,shm,hfi" available before running the job.
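>
> For example (just an illustration, reusing Juraj's launch line; whether you
> also need mpirun's -x option to forward the variable to the remote nodes
> depends on your setup):
>
> $ export PSM2_DEVICES="self,shm,hfi"
> $ mpirun -x PSM2_DEVICES -np 1 -npernode 1 --hostfile my_hosts ./manager 1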
>
> Note that while setting PSM2_DEVICES (*) will solve the assertion below, you
> will most likely still see the transport key issue if PR #1602 is not
> included.
>
>
>
> Thanks,
>
>
>
> _MAC
>
>
>
> (*)
>
> PSM2_DEVICES -> Omni-Path
>
> PSM_DEVICES  -> TrueScale
>
>
>
> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of
> r...@open-mpi.org
> Sent: Thursday, September 29, 2016 7:12 AM
> To: Open MPI Users <us...@lists.open-mpi.org>
> Subject: Re: [OMPI users] MPI_Comm_spawn
>
>
>
> Ah, that may be why it wouldn’t show up in the OMPI code base itself. If
> that is the case here, then no - OMPI v2.0.1 does not support comm_spawn for
> PSM. It is fixed in the upcoming v2.0.2.
>
>
>
> On Sep 29, 2016, at 6:58 AM, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
>
>
>
> Ralph,
>
>
>
> My guess is that ptl.c comes from PSM lib ...
>
>
>
> Cheers,
>
>
>
> Gilles
>
> On Thursday, September 29, 2016, r...@open-mpi.org <r...@open-mpi.org> wrote:
>
> Spawn definitely does not work with srun. I don’t recognize the name of the
> file that segfaulted - what is “ptl.c”? Is that in your manager program?
>
>
>
>
>
> On Sep 29, 2016, at 6:06 AM, Gilles Gouaillardet
> <gilles.gouaillar...@gmail.com> wrote:
>
>
>
> Hi,
>
>
>
> I do not expect spawn to work with direct launch (e.g. srun).
>
>
>
> Do you have PSM (e.g. InfiniPath) hardware? That could be linked to the
> failure.
>
>
>
> Can you please try
>
>
>
> mpirun --mca pml ob1 --mca btl tcp,sm,self -np 1 --hostfile my_hosts
> ./manager 1
>
>
>
> and see if it helps?
>
>
>
> Note: if you have the possibility, I suggest you first try that without
> Slurm, and then within a Slurm job.
>
>
>
> Cheers,
>
>
>
> Gilles
>
> On Thursday, September 29, 2016, juraj2...@gmail.com <juraj2...@gmail.com>
> wrote:
>
> Hello,
>
>
>
> I am using MPI_Comm_spawn to dynamically create new processes from a single
> manager process. Everything works fine when all the processes are running on
> the same node. But imposing a restriction to run only a single process per
> node does not work. Below are the errors produced during a multinode
> interactive session and a multinode sbatch job.
>
>
>
> The system I am using is: Linux version 3.10.0-229.el7.x86_64
> (buil...@kbuilder.dev.centos.org) (gcc version 4.8.2 20140120 (Red Hat
> 4.8.2-16) (GCC) )
>
> I am using Open MPI 2.0.1
>
> Slurm is version 15.08.9
>
>
>
> What is preventing my jobs from spawning on multiple nodes? Does Slurm require
> some additional configuration to allow it? Is it an issue on the MPI side:
> does it need to be compiled with some special flag (I have compiled it with
> --enable-mpi-fortran=all --with-pmi)?
>
>
>
> The code I am launching is here: https://github.com/goghino/dynamicMPI
>
>
>
> The manager tries to launch one new process (./manager 1); this is the error
> produced when requesting each process to be located on a different node
> (interactive session):
>
> $ salloc -N 2
>
> $ cat my_hosts
>
> icsnode37
>
> icsnode38
>
> $ mpirun -np 1 -npernode 1 --hostfile my_hosts ./manager 1
>
> [manager]I'm running MPI 3.1
>
> [manager]Runing on node icsnode37
>
> icsnode37.12614Assertion failure at ptl.c:183: epaddr == ((void *)0)
>
> icsnode38.32443Assertion failure at ptl.c:183: epaddr == ((void *)0)
>
> [icsnode37:12614] *** Process received signal ***
>
> [icsnode37:12614] Signal: Aborted (6)
>
> [icsnode37:12614] Signal code:  (-6)
>
> [icsnode38:32443] *** Process received signal ***
>
> [icsnode38:32443] Signal: Aborted (6)
>
> [icsnode38:32443] Signal code:  (-6)
>
>
>
> The same example as above via sbatch job submission:
>
> $ cat job.sbatch
>
> #!/bin/bash
>
>
>
> #SBATCH --nodes=2
>
> #SBATCH --ntasks-per-node=1
>
>
>
> module load openmpi/2.0.1
>
> srun -n 1 -N 1 ./manager 1
>
>
>
> $ cat output.o
>
> [manager]I'm running MPI 3.1
>
> [manager]Runing on node icsnode39
>
> srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
>
> [icsnode39:9692] *** An error occurred in MPI_Comm_spawn
>
> [icsnode39:9692] *** reported by process [1007812608,0]
>
> [icsnode39:9692] *** on communicator MPI_COMM_SELF
>
> [icsnode39:9692] *** MPI_ERR_SPAWN: could not spawn processes
>
> [icsnode39:9692] *** MPI_ERRORS_ARE_FATAL (processes in this communicator
> will now abort,
>
> [icsnode39:9692] ***    and potentially your MPI job)
>
> In: PMI_Abort(50, N/A)
>
> slurmstepd: *** STEP 15378.0 ON icsnode39 CANCELLED AT 2016-09-26T16:48:20
> ***
>
> srun: error: icsnode39: task 0: Exited with exit code 50
>
>
>
> Thanks for any feedback!
>
>
>
> Best regards,
>
> Juraj
>
> _______________________________________________
> users mailing list
> us...@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
>
>
>
>
>
>
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
