There are 2 typos in the solution:

On Fri, Sep 16, 2016 at 3:46 PM, Michael Ferguson <mfergu...@cray.com>
wrote:

> Hi -
>
> (For the archives). I was able to help Hui and got 3 different ways of
> launching Chapel programs working on that Infiniband cluster:
>
> 1) export CHPL_LAUNCHER=slurm-gasnetrun_ibv
>    export CHPL_LAUNCHER_WALLTIME=00:15:00
>
>    export SLURM_PARTITION=debug
>    make
>    chpl program.chpl
>    ./a.out -nl 3
>
> 2) export CHPL_LAUNCHER=gasnetrun_ibv
>    export GASNET_IBV_SPAWNER=S
>

Typo: this should be GASNET_IBV_SPAWNER=ssh

>    make
>    chpl program.chpl
>    salloc -N number-of-locales
>      # in the salloc shell:
>      export GASNET_SSH_SERVERS=`scontrol show hostnames`
>      ./a.out -nl 3
>
> 3) export CHPL_LAUNCHER=gasnetrun_ibv
>    export GASNET_IBV_SPAWNER=S
>

Typo: this should be GASNET_IBV_SPAWNER=ssh

>    make
>    chpl program.chpl
>    sbatch job.sh
>
>    where job.sh is an sbatch script that, among other things, sets
>    GASNET_SSH_SERVERS from `scontrol show hostnames`.
>
>    job.sh file contains:
>
> #!/bin/bash
> #SBATCH -t 0:10:0
> #SBATCH --nodes=3
> #SBATCH --exclusive
> #SBATCH --partition=debug
> #SBATCH --output=/path-to-job-output
>
> export GASNET_SSH_SERVERS=`scontrol show hostnames`
> export GASNET_IBV_SPAWNER=ssh
> export GASNET_PHYSMEM_MAX=1G # Limit GASNet's IBV conduit probing
>
> export GASNET_SSH_OPTIONS="-o LogLevel=Error" # keep the login banner out of the output
>
> cd some-directory
>
> ./a.out -nl 3
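
The job.sh above hard-codes the locale count (3) in both --nodes and -nl.
Here is a minimal sketch of deriving the values from the allocation instead.
It assumes Slurm sets SLURM_JOB_NUM_NODES inside the job and that
GASNET_SSH_SERVERS accepts a space-separated host list (the thread only shows
the raw backtick form working, so treat the joining step as optional).
join_hosts and locale_count are hypothetical helper names, not anything
shipped with Chapel or GASNet:

```shell
# Hypothetical helpers for job.sh; not part of Chapel, GASNet, or Slurm.

# Read one hostname per line on stdin and print them on a single line,
# separated by spaces.
join_hosts() {
    paste -s -d ' ' -
}

# Print the number of nodes in the current Slurm allocation,
# falling back to 1 when run outside a job.
locale_count() {
    echo "${SLURM_JOB_NUM_NODES:-1}"
}

# Intended use inside job.sh (commented out so the helpers can be sourced anywhere):
#   export GASNET_SSH_SERVERS="$(scontrol show hostnames | join_hosts)"
#   ./a.out -nl "$(locale_count)"
```

With this, changing #SBATCH --nodes is enough; the -nl argument follows it.
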
>
>
>
> Note:
>
>  * GASNET_CSPAWN_CMD does not work with GASNet's ibv launcher.
>
>  * It appears to be necessary to run GASNet's ibv launcher
>    (simply running the _real executables in sbatch or srun
>     isn't sufficient).
>  * Setting GASNET_PHYSMEM_MAX and possibly GASNET_PHYSMEM_NOPROBE
>    is important for job launches to take a reasonable amount of time
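
A small preflight check can catch a missing setting before the program is
launched. The sketch below is only an illustration: check_gasnet_env is a
hypothetical helper, and the list of variables is my assumption based on the
job.sh quoted above, not an official requirement of the ibv launcher:

```shell
# Hypothetical preflight helper; not part of Chapel or GASNet.
# Prints "ok" when every listed variable is set and non-empty.
# Otherwise prints "missing:" followed by the names and returns 1.
check_gasnet_env() {
    missing=""
    for name in GASNET_IBV_SPAWNER GASNET_SSH_SERVERS GASNET_PHYSMEM_MAX; do
        # Indirect variable lookup that also works in plain POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

Calling "check_gasnet_env || exit 1" just before "./a.out -nl 3" in job.sh
would stop the job early with the names of the unset variables.
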
>
> Cheers,
>
> -michael
>
>
>
>
>
> On 9/10/16, 12:17 AM, "Hui Zhang" <wayne.huizh...@gmail.com> wrote:
>
> >Hello, Greg
> >
> >
> >I tried it two ways:
> >1. use batch script
> >CHPL_COMM=gasnet
> >CHPL_LAUNCHER=slurm-gasnetrun_ibv
> >CHPL_COMM_SUBSTRATE=ibv
> >GASNET_ROUTE_OUTPUT=0
> >GASNET_VERBOSEENV=1
> >GASNET_SSH_OPTIONS="-o LogLevel=Error" #disable login banner
> >
> >
> >
> >GASNET_SPAWNFN=C
> >GASNET_CSPAWN_CMD='srun -N%N %C'
> >
> >
> >cmd:
> >$CHPL_HOME/test/release/examples/hello6-taskpar-dist_real -nl 4 --tasksPerLocale=6 -v
> >
> >
> >2. use an interactive session:
> >Same environment variables, except that I didn't set GASNET_SPAWNFN and
> >used srun explicitly:
> >
> >
> >salloc -N 4 -t 00:15:00 -p debug
> >srun $CHPL_HOME/test/release/examples/hello6-taskpar-dist_real -nl 4 --tasksPerLocale=6 -v
> >
> >
> >Both give me the same error:
> >
> >
> >*** FATAL ERROR: Requested spawner "(not set)" is unknown or not
> >supported in this build
> >WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
> >gasneti_backtrace_init
> >*** FATAL ERROR: Requested spawner "(not set)" is unknown or not
> >supported in this build
> >WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
> >gasneti_backtrace_init
> >*** FATAL ERROR: Requested spawner "(not set)" is unknown or not
> >supported in this build
> >WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
> >gasneti_backtrace_init
> >*** FATAL ERROR: Requested spawner "(not set)" is unknown or not
> >supported in this build
> >WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
> >gasneti_backtrace_init
> >srun: error: compute-b28-47: task 0: Aborted (core dumped)
> >srun: error: compute-b28-49: task 2: Aborted (core dumped)
> >srun: error: compute-b28-48: task 1: Aborted (core dumped)
> >srun: error: compute-b28-50: task 3: Aborted (core dumped)
> >
> >
> >Thanks
> >
> >
> >
> >
> >On Fri, Sep 9, 2016 at 7:20 PM, Greg Titus
> ><g...@cray.com> wrote:
> >
> >Hello Hui --
> >
> >I've somewhat lost track of your environment settings.  What do you have
> >CHPL_LAUNCHER and CHPL_COMM_SUBSTRATE set to now, and also what are the
> >settings of all of your GASNet-specific env vars, such as GASNET_SPAWNFN
> >and the like?
> >
> >thanks,
> >greg
> >
> >
> >
> >On Fri, 9 Sep 2016, Hui Zhang wrote:
> >
> >
> >Hello, team
> >Following up on the previous issue: I found that it was caused by
> >libibverbs.so.1 missing on the machine. After adding it, I hit exactly
> >the same error as in an old thread on the mailing list:
> >https://sourceforge.net/p/chapel/mailman/message/34769706/
> >
> >** FATAL ERROR: Requested spawner "(not set)" is unknown or not supported
> >in this build
> >WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
> >gasneti_backtrace_init
> >
> >srun: error: node01: task 0: Aborted
> >srun: error: node03: task 2: Aborted
> >srun: error: node02: task 1: Aborted
> >
> >But I don't see a solution there. Is there a known fix for this
> >problem?
> >
> >Thanks
> >
> >
> >On Wed, Sep 7, 2016 at 11:22 PM, Hui Zhang <wayne.huizh...@gmail.com>
> >wrote:
> >Update:
> >I tried Chapel 1.11 and master; both give me the same result
> >(no output at all). Running with -v prints a single line:
> >expect .chpl-expect-# (where # is a number that varies from run to run)
> >
> >
> >On Wed, Sep 7, 2016 at 2:30 PM, Hui Zhang <wayne.huizh...@gmail.com>
> >wrote:
> >Hello, team
> >
> >I succeeded in running Chapel multi-locale on an InfiniBand
> >cluster with the default GASNet setting. Here's my script for
> >using GASNet with Slurm:
> >
> >export GASNET_SSH_OPTIONS="-o LogLevel=Error" # keep the login banner out of the output
> >export GASNET_SPAWNFN=C
> >export GASNET_CSPAWN_CMD='srun -N%N %C'
> >
> >./hello6-taskpar-dist -nl 4     (using _real won't work; any idea why?)
> >
> >
> >It works, but the output suggests using ibv-conduit instead of
> >udp-conduit for better performance, so I did:
> >1) export CHPL_COMM=gasnet
> >   export CHPL_LAUNCHER=slurm-gasnetrun_ibv
> >   export CHPL_COMM_SUBSTRATE=ibv
> >2) cd $CHPL_HOME && make
> >The build reports the same error as in
> >https://sourceforge.net/p/chapel/mailman/chapel-developers/thread/VI1PR06MB1181608c2323f3f4c6d95cf0d2...@vi1pr06mb1181.eurprd06.prod.outlook.com/
> >and it builds with the patch provided by Michael.
> >
> >However, when I recompiled hello6, then used the same script
> >to execute it, the job completed normally but it did not
> >output anything. If I use -v in the command, it only printed
> >out:
> >expect .chpl-expect-12045
> >
> >Am I missing something?
> >Thanks
> >
> >--
> >Best regards
> >
> >
> >Hui Zhang
>
>


-- 
Best regards


Hui Zhang
------------------------------------------------------------------------------
_______________________________________________
Chapel-developers mailing list
Chapel-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/chapel-developers
