Hi there,
Since a simple hello world application works, I tried running a Chapel
program on multiple nodes. The machine I am using has Slurm as its job
scheduler, and I am running into some problems. I first tried launching the
Chapel binary directly with srun, but that does not work:

 $ srun -p marvin -N 2 -n 4 -c 8 ./hello6-taskpar-dist
 error: Specify number of locales via -nl <#> or --numLocales=<#>
 error: Specify number of locales via -nl <#> or --numLocales=<#>
 error: Specify number of locales via -nl <#> or --numLocales=<#>
 error: Specify number of locales via -nl <#> or --numLocales=<#>

I understand why this error appears.


Then I read README.multilocale and README.launcher, where I found a few
notes on how to launch Chapel programs using Slurm. Based on those, I
exported the following:

 export CHPL_COMM=gasnet
 export CHPL_COMM_SUBSTRATE=ibv
 export CHPL_LAUNCHER_WALLTIME=00:15:00
 export GASNET_SPAWNFN=S
 export GASNET_SSH_SERVERS="reno lyra01"
 export SSH_CMD=ssh
 export SSH_OPTIONS=-x
 export GASNET_CSPAWN_CMD="srun -N%N %C"

And yes, I did recompile after setting all of this.


When I run

 ./hello6-taskpar-dist -nl 2

I get:

 Access denied: user bghimire (uid=3030) has no active jobs.
 Connection closed by 12.23.1.1
 connection to reno failed.
 Terminated


I am curious why Slurm is not being used in this case. The launcher is just
going through ssh and not doing anything with Slurm.

My other question is what -N%N %C actually means in

 export GASNET_CSPAWN_CMD="srun -N%N %C"

and how I can force the job to be submitted via Slurm to specific nodes and
cores.







Thank you,
Bibek