Is there anything that could/should be done for the release to ease future user pain? (in terms of either code or documentation changes?)


-Brad


On Fri, 16 Sep 2016, Michael Ferguson wrote:

Hi -

(For the archives). I was able to help Hui and got 3 different ways of
launching Chapel programs working on that Infiniband cluster:

1) export CHPL_LAUNCHER=slurm-gasnetrun_ibv
  export CHPL_LAUNCHER_WALLTIME=00:15:00

  export SLURM_PARTITION=debug
  make
  chpl program.chpl
  ./a.out -nl 3

2) export CHPL_LAUNCHER=gasnetrun_ibv
  export GASNET_IBV_SPAWNER=S
  make
  chpl program.chpl
  salloc -N number-of-locales
    # in the salloc shell:
    export GASNET_SSH_SERVERS=`scontrol show hostnames`
    ./a.out -nl 3

3) export CHPL_LAUNCHER=gasnetrun_ibv
  export GASNET_IBV_SPAWNER=S
  make
  chpl program.chpl
  sbatch job.sh

  where job.sh is an sbatch script that sets GASNET_SSH_SERVERS
  (among other things). The job.sh file contains:

#!/bin/bash
#SBATCH -t 0:10:0
#SBATCH --nodes=3
#SBATCH --exclusive
#SBATCH --partition=debug
#SBATCH --output=/path-to-job-output

export GASNET_SSH_SERVERS=`scontrol show hostnames`
export GASNET_IBV_SPAWNER=ssh
export GASNET_PHYSMEM_MAX=1G # Limit GASNet's IBV conduit probing

export GASNET_SSH_OPTIONS="-o LogLevel=Error" # keep the login banner out of the output

cd some-directory

./a.out -nl 3
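
Options 2 and 3 both build GASNET_SSH_SERVERS from `scontrol show hostnames`, which prints one host name per line. A minimal sketch of flattening that output into a single space-separated line follows. The helper name hosts_to_list and the host names are made up for illustration, and whether the ssh spawner on a given system needs the list flattened is an assumption, not something verified in this thread.

```shell
#!/bin/bash
# Hypothetical helper (not part of Chapel, GASNet, or Slurm): read one
# host name per line on stdin and print them as one space-separated line.
hosts_to_list() {
    local host list=""
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r host || [ -n "$host" ]; do
        [ -n "$host" ] || continue          # skip blank lines
        list="${list:+$list }$host"         # add a separator only between hosts
    done
    printf '%s\n' "$list"
}

# Inside an salloc shell or an sbatch script one might then write:
#   export GASNET_SSH_SERVERS="$(scontrol show hostnames | hosts_to_list)"

# Stand-alone demonstration with made-up host names:
printf 'node01\nnode02\nnode03\n' | hosts_to_list
```

Quoting the command substitution keeps the whole list in one variable regardless of which shell runs the script.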



Note:

* GASNET_CSPAWN_CMD does not work with GASNet's ibv launcher.

* It appears to be necessary to run GASNet's ibv launcher
  (simply running the _real executables in sbatch or srun
   isn't sufficient).
* Setting GASNET_PHYSMEM_MAX and possibly GASNET_PHYSMEM_NOPROBE
  is important for job launches to take a reasonable amount of time.
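
The settings mentioned in these notes are easy to forget in a new job script. Here is a minimal sketch of a pre-flight check, assuming the variables exported in job.sh are the ones that matter on a given cluster. check_launch_env is a hypothetical helper, not part of Chapel or GASNet. It relies on printenv, so it only counts variables that are actually exported.

```shell
#!/bin/bash
# Hypothetical pre-flight check (illustrative only, not a Chapel or GASNet tool).
# Reports every named variable that is not exported or is empty, and
# returns non-zero if at least one is missing.
check_launch_env() {
    local name missing=0
    for name in "$@"; do
        # printenv only sees exported variables, which is what the launcher inherits.
        if [ -z "$(printenv "$name")" ]; then
            echo "missing or not exported: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible use just before the launch line in job.sh:
#   check_launch_env GASNET_IBV_SPAWNER GASNET_SSH_SERVERS GASNET_PHYSMEM_MAX || exit 1
#   ./a.out -nl 3
```

Failing early in the job script is cheaper than waiting for the job to be scheduled and abort on the compute nodes.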

Cheers,

-michael





On 9/10/16, 12:17 AM, "Hui Zhang" <wayne.huizh...@gmail.com> wrote:

Hello, Greg


I tried two ways:
1. with a batch script
CHPL_COMM=gasnet
CHPL_LAUNCHER=slurm-gasnetrun_ibv
CHPL_COMM_SUBSTRATE=ibv
GASNET_ROUTE_OUTPUT=0
GASNET_VERBOSEENV=1
GASNET_SSH_OPTIONS="-o LogLevel=Error" #disable login banner



GASNET_SPAWNFN=C
GASNET_CSPAWN_CMD='srun -N%N %C'


cmd:
$CHPL_HOME/test/release/examples/hello6-taskpar-dist_real -nl 4 \
  --tasksPerLocale=6 -v


2. interactively:
same environment variables, except that I didn't set GASNET_SPAWNFN and
used srun explicitly:


salloc -N 4 -t 00:15:00 -p debug
srun $CHPL_HOME/test/release/examples/hello6-taskpar-dist_real -nl 4 \
  --tasksPerLocale=6 -v


Both give me the same error:


*** FATAL ERROR: Requested spawner "(not set)" is unknown or not
supported in this build
WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
gasneti_backtrace_init
*** FATAL ERROR: Requested spawner "(not set)" is unknown or not
supported in this build
WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
gasneti_backtrace_init
*** FATAL ERROR: Requested spawner "(not set)" is unknown or not
supported in this build
WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
gasneti_backtrace_init
*** FATAL ERROR: Requested spawner "(not set)" is unknown or not
supported in this build
WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
gasneti_backtrace_init
srun: error: compute-b28-47: task 0: Aborted (core dumped)
srun: error: compute-b28-49: task 2: Aborted (core dumped)
srun: error: compute-b28-48: task 1: Aborted (core dumped)
srun: error: compute-b28-50: task 3: Aborted (core dumped)


Thanks




On Fri, Sep 9, 2016 at 7:20 PM, Greg Titus
<g...@cray.com> wrote:

Hello Hui --

I've somewhat lost track of your environment settings.  What do you have
CHPL_LAUNCHER and CHPL_COMM_SUBSTRATE set to now, and also what are the
settings of all of your GASNet-specific env vars, such as GASNET_SPAWNFN
and the like?

thanks,
greg



On Fri, 9 Sep 2016, Hui Zhang wrote:


Hello, team
Following up on the previous issue: it turned out that libibverbs.so.1
was missing on the machine. After adding it, I hit exactly the same error
as the one reported in an old thread on the mailing list:
https://sourceforge.net/p/chapel/mailman/message/34769706/

** FATAL ERROR: Requested spawner "(not set)" is unknown or not supported
in this build
WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
gasneti_backtrace_init

srun: error: node01: task 0: Aborted
srun: error: node03: task 2: Aborted
srun: error: node02: task 1: Aborted

But I don't see a solution provided there. Has any fix for this problem
been tried?

Thanks


On Wed, Sep 7, 2016 at 11:22 PM, Hui Zhang <wayne.huizh...@gmail.com>
wrote:
Update:
I tried Chapel 1.11 and master; both give me the same result
(no output at all). Executing with -v prints a single line:
expect .chpl-expect-# (some number that varies from run to run)


On Wed, Sep 7, 2016 at 2:30 PM, Hui Zhang <wayne.huizh...@gmail.com>
wrote:
Hello, team

I succeeded in running Chapel multi-locale on an InfiniBand
cluster with the default GASNet setting. Here's my script for
using GASNet with Slurm:

export GASNET_SSH_OPTIONS="-o LogLevel=Error" # keep the login banner out of the output
export GASNET_SPAWNFN=C
export GASNET_CSPAWN_CMD='srun -N%N %C'

./hello6-taskpar-dist -nl 4     (using _real won't work, any
idea why?)


It works, but the output suggests using ibv-conduit instead of
udp-conduit for better performance, so I did:
1) export CHPL_COMM=gasnet
     export CHPL_LAUNCHER=slurm-gasnetrun_ibv
     export CHPL_COMM_SUBSTRATE=ibv
2) cd $CHPL_HOME && make
It reports the same error as
https://sourceforge.net/p/chapel/mailman/chapel-developers/thread/VI1PR06MB1181608c2323f3f4c6d95cf0d2...@vi1pr06mb1181.eurprd06.prod.outlook.com/
and it builds with the patch provided by Michael.

However, when I recompiled hello6, then used the same script
to execute it, the job completed normally but it did not
output anything. If I use -v in the command, it only printed
out:
expect .chpl-expect-12045

Am I missing something?
Thanks

--
Best regards


Hui Zhang





------------------------------------------------------------------------------
_______________________________________________
Chapel-developers mailing list
Chapel-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/chapel-developers