Hi Nadia

That sounds like a bug in your SLURM config file - SLURM certainly doesn't 
propagate "hostname" by default as that would definitely mess things up for 
more than OMPI.

Are you sure that SLURM is propagating the environment (something I have never 
seen before)? Or is OMPI mistakenly picking it up and propagating it?

On Jan 22, 2010, at 7:25 AM, Nadia Derbey wrote:

> Hi,
> 
> I'm wondering whether the HOSTNAME environment variable shouldn't be
> handled as a "special case" when the orted daemons launch the remote
> jobs. This particularly applies to batch schedulers where the caller's
> environment is copied to the remote job: we are inheriting a $HOSTNAME
> which is the name of the host mpirun was called from.
> 
> I tried to run the following small test (see getenv.c in attachment - it
> essentially gets the hostname twice, once through $HOSTNAME and once through
> gethostname(2)):
> 
> ------------
> [derbeyn@pichu0 ~]$ hostname
> pichu0
> [derbeyn@pichu0 ~]$ salloc -N 2 -p pichu mpirun ./getenv
> salloc: Granted job allocation 358789
> Processor 0 of 2 on $HOSTNAME pichu0: Hello World
> Processor 0 of 2 on host pichu93: Hello World
> Processor 1 of 2 on $HOSTNAME pichu0: Hello World
> Processor 1 of 2 on host pichu94: Hello World
> salloc: Relinquishing job allocation 358789
> ------------
> 
> Shouldn't we be getting the same value when using getenv("HOSTNAME") and
> gethostname()?
> Applying the following small patch, we actually do.
> 
> Regards,
> Nadia
> 
> --------------
> 
> Do not propagate the HOSTNAME environment variable on remote hosts
> 
> diff -r 4ab256be2a17 orte/orted/orted_main.c
> --- a/orte/orted/orted_main.c   Wed Jan 20 16:45:07 2010 +0100
> +++ b/orte/orted/orted_main.c   Fri Jan 22 14:54:02 2010 +0100
> @@ -299,12 +299,17 @@ int orte_daemon(int argc, char *argv[])
>      */
>     orte_launch_environ = opal_argv_copy(environ);
> 
> +    /*
> +     * Set HOSTNAME to the actual hostname in order to avoid propagating
> +     * the caller's HOSTNAME.
> +     */
> +    gethostname(hostname, 100);
> +    opal_setenv("HOSTNAME", hostname, true, &orte_launch_environ);
> 
>     /* if orte_daemon_debug is set, let someone know we are alive right
>      * away just in case we have a problem along the way
>      */
>     if (orted_globals.debug) {
> -        gethostname(hostname, 100);
>         fprintf(stderr, "Daemon was launched on %s - beginning to initialize\n", hostname);
>     }
> 
> <getenv.c>
