Hi Konstantina -

Have you tried running your program with -v? Sometimes it prints diagnostics that point to the problem.
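
For example, a minimal sketch of such a run, assuming the binary built from the taskParallel.chpl primer is ./taskParallel and that you want three locales (both are guesses based on the output you sent, not something I have checked on your system):

```shell
# Sketch only: binary name and locale count are assumptions, adjust to your setup.
prog=./taskParallel   # assumed: compiled from the taskParallel.chpl primer
nl=3                  # assumed: three locales, matching the three srun tasks in the error

# -nl sets the number of locales; -v turns on verbose output from the launcher.
cmd="$prog -nl $nl -v"

if [ -x "$prog" ]; then
  # Keep a copy of the verbose output so it can be pasted into a reply.
  $cmd 2>&1 | tee verbose.log
else
  echo "would run: $cmd"
fi
```

If the binary is not in the current directory the script only prints the command it would have run, so it is safe to try on the login node first.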
Are you able to run programs with CHPL_COMM_SUBSTRATE=udp and SSH spawning? Are you able to run programs locally?

The error message you included makes me think that something is going wrong with environment variable forwarding. Perhaps there is a way to configure SLURM to do that, or to explicitly forward the GASNET_IBV_SPAWNER variable.

It's also possible that something went wrong with the build. I usually need to build the Chapel runtime on a machine with access to InfiniBand, which sometimes means a compute node (and not a head node).

I wouldn't expect GASNET_IBV_SPAWNER=ssh to work unless you can SSH to the compute nodes without a password. On some systems in the past, I've had better luck configuring the spawner to launch with MPI (although I think I prefer the SSH one on principle...).

Have you tried running the GASNet tests, as the thread you found suggests?

Lastly, if InfiniBand is going to work, you should be able to run ibstat and it should say State: Active. It might be worth making sure you can run an InfiniBand benchmark, like ibping.

-michael

On 1/15/16, 12:13 PM, "Panagiotopoulou, Konstantina" <[email protected]> wrote:

>Hi Michael,
>
>I've got a version just after October's release but didn't have that
>(weird!)
>Anyway, I applied the patch and it does build, but it is still not working
>properly. I tried the taskParallel.chpl in primers and I get this:
>
>** FATAL ERROR: Requested spawner "(not set)" is unknown or not supported
>in this build
>WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
>gasneti_backtrace_init
>
>srun: error: gpu01: task 0: Aborted
>srun: error: gpu03: task 2: Aborted
>srun: error: gpu02: task 1: Aborted
>
>I found someone with the same issue (...ages ago)
>https://www.mail-archive.com/[email protected]/msg00095.html
>
>I also have GASNET_IBV_SPAWNER=ssh set, but I am not sure it makes any
>difference since I can only ssh to my login node...
>
>--Konstantina
>________________________________________
>From: Michael Ferguson <[email protected]>
>Sent: 15 January 2016 16:21
>To: Panagiotopoulou, Konstantina; [email protected]
>Subject: Re: [Chapel-developers] Building Chapel on AMD Infiniband
>cluster with slurm
>
>Hi Konstantina -
>
>Good to hear from you. I merged a PR fixing some compilation errors in
>August
> https://github.com/chapel-lang/chapel/pull/2299
>
>Are you using a version of Chapel from before that change? Or have you
>found
>other compilation errors? Perhaps you just need to apply that patch...
>
>Cheers,
>
>-michael
>
>On 1/15/16, 10:45 AM, "Panagiotopoulou, Konstantina" <[email protected]>
>wrote:
>
>>Hi team,
>>
>>
>>I am trying to build Chapel on an Infiniband cluster with slurm but I
>>keep getting these errors:
>>
>>
>>launch-slurm-gasnetrun_ibv.c: In function ‘genNumLocalesOptions’:
>>launch-slurm-gasnetrun_ibv.c:121:3: error: enumeration value ‘slurmpro’
>>not handled in switch [-Werror=switch]
>> switch (sbatch) {
>> ^
>>launch-slurm-gasnetrun_ibv.c:121:3: error: enumeration value ‘nccs’ not
>>handled in switch [-Werror=switch]
>>launch-slurm-gasnetrun_ibv.c:121:3: error: enumeration value ‘uma’ not
>>handled in switch [-Werror=switch]
>>launch-slurm-gasnetrun_ibv.c:121:3: error: enumeration value ‘unknown’
>>not handled in switch [-Werror=switch]
>>launch-slurm-gasnetrun_ibv.c:111:9: error: unused variable ‘queue’
>>[-Werror=unused-variable]
>> char* queue = getenv("CHPL_LAUNCHER_QUEUE");
>> ^
>>launch-slurm-gasnetrun_ibv.c: In function ‘chpl_launch_create_command’:
>>launch-slurm-gasnetrun_ibv.c:215:23: error: too many arguments for format
>>[-Werror=format-extra-args]
>> fprintf(expectFile, "--ntasks-per-node=1 ",numLocales);
>>
>>
>>For my experiments I need TASKS=fifo and LOCALE_MODEL=flat.
>>Am I missing something??
>>
>>
>>Thanks,
>>Konstantina
>>
>>
>>Here is my chplenv:
>>
>>
>>$ util/printchplenv
>>CHPL_HOST_PLATFORM: linux64 *
>>CHPL_HOST_COMPILER: gnu
>>CHPL_TARGET_PLATFORM: linux64
>>CHPL_TARGET_COMPILER: gnu *
>>CHPL_TARGET_ARCH: k8 * (same errors when set to "native")
>>CHPL_LOCALE_MODEL: flat
>>CHPL_COMM: gasnet *
>>  CHPL_COMM_SUBSTRATE: ibv *
>>  CHPL_GASNET_SEGMENT: large
>>CHPL_TASKS: fifo *
>>CHPL_LAUNCHER: slurm-gasnetrun_ibv *
>>CHPL_TIMERS: generic
>>CHPL_MEM: dlmalloc
>>CHPL_MAKE: gmake
>>CHPL_ATOMICS: intrinsics
>>  CHPL_NETWORK_ATOMICS: none
>>CHPL_GMP: gmp
>>CHPL_HWLOC: none
>>CHPL_REGEXP: re2
>>CHPL_WIDE_POINTERS: struct
>>CHPL_LLVM: none
>>CHPL_AUX_FILESYS: none
>>
>>
>>I set TARGET_ARCH to k8 (though it is an opteron) because of this:
>>https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/i386-and-x86-64-Options.html
>>‘k8’ ‘opteron’ ‘athlon64’ ‘athlon-fx’ Processors based on the AMD K8 core
>>with x86-64 instruction set support, including the AMD Opteron, Athlon
>>64, and Athlon 64 FX processors. (This supersets MMX, SSE, SSE2, 3DNow!,
>>enhanced 3DNow! and 64-bit instruction set extensions.)
>>
>>
>>and the cpu spec:
>>$ lscpu
>>Architecture: x86_64
>>CPU op-mode(s): 32-bit, 64-bit
>>Byte Order: Little Endian
>>CPU(s): 8
>>On-line CPU(s) list: 0-7
>>Thread(s) per core: 2
>>Core(s) per socket: 4
>>Socket(s): 1
>>NUMA node(s): 2
>>Vendor ID: AuthenticAMD
>>CPU family: 21
>>Model: 2
>>Stepping: 0
>>CPU MHz: 1400.000
>>BogoMIPS: 5600.37
>>Virtualization: AMD-V
>>L1d cache: 16K
>>L1i cache: 64K
>>L2 cache: 2048K
>>L3 cache: 6144K
>>NUMA node0 CPU(s): 0-3
>>NUMA node1 CPU(s): 4-7

_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers
