Hello Brad, tests are running. They all look like these:
*** Running ibv-conduit testcore2 in PAR/EVERYTHING mode ***
/users/staff/ifi/guerrera/chapel-1.8.0/third-party/gasnet/build/linux64-gnu/seg-everything/nodbg/ibv-conduit/contrib/gasnetrun_ibv -np 2 ./testcore2
Cleaning up orphaned processes...
*** FATAL ERROR: One or more processes died before setup was completed
WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before gasneti_backtrace_init
Command terminated by signal 6
0.03user 0.03system 0:00.29elapsed 24%CPU (0avgtext+0avgdata 14608maxresident)k
0inputs+0outputs (0major+6230minor)pagefaults 0swaps
ERROR: Test exited with failure code=134
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
*** Running ibv-conduit testcore3 in PAR/EVERYTHING mode ***
/users/staff/ifi/guerrera/chapel-1.8.0/third-party/gasnet/build/linux64-gnu/seg-everything/nodbg/ibv-conduit/contrib/gasnetrun_ibv -np 2 ./testcore3
Cleaning up orphaned processes...
*** FATAL ERROR: One or more processes died before setup was completed
WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before gasneti_backtrace_init
Command terminated by signal 6
0.04user 0.02system 0:00.28elapsed 25%CPU (0avgtext+0avgdata 14608maxresident)k
0inputs+0outputs (0major+6224minor)pagefaults 0swaps
ERROR: Test exited with failure code=134

and that's the error I get if I run the hello6-taskpar-dist example provided by Chapel

Greets,
Danilo

________________________________________
From: Brad Chamberlain [[email protected]]
Sent: Thursday, 3 April 2014 19:00
To: Danilo Guerrera
Cc: rafael; Public Chapel Bugs list; chapel-developers
Subject: Re: [Chapel-developers] [Chapel-bugs] problem with chapel locales (fwd)

Backing up several messages... maybe I missed it, but did you try running
the GASNet diagnostic tests as suggested in the original response? With
debugging on?

It seems to me that the duplicate cs-minta issue below (as well as the
previous issues) is very likely a GASNet configuration/launch issue rather
than a Chapel-specific thing, and that it would probably be easiest to debug
using GASNet's tests (e.g., testhello) if it is. And if it isn't, that would
be valuable information for debugging Chapel.

Also likely to be useful, if you haven't already found it, is the GASNet
README for the ibv conduit, which is in
third-party/gasnet/GASNet-*/ibv-conduit/README (or vapi-conduit/README for
older versions of GASNet -- I can't remember when they finally moved the
README into ibv-conduit). Among other things, this details the various
options for launching via MPI or ssh, etc.

-Brad

On Thu, 3 Apr 2014, Danilo Guerrera wrote:

> We use hydra for starting the parallel jobs.
> You don't need to load the mpi environment before using it, you can just
> (compile and) run an mpi program like this
> mpiexec -f hostfile ./app
>
> Greets,
>
> Danilo
>
> ________________________________________
> From: rafael [[email protected]]
> Sent: Thursday, 3 April 2014 10:17
> To: Danilo Guerrera
> Cc: chapel-developers; Public Chapel Bugs list
> Subject: Re: [Chapel-developers] [Chapel-bugs] problem with chapel locales
> (fwd)
>
>> Hello,
>>
>> I set GASNET_IBV_SPAWNER=mpi and then exported GASNET_BACKTRACE=1,
>>
>> when I run the program this is the only output I get:
>> ./hello_ibv2 -nl 2 -v
>> /usr/local/chapel-1.8.0/third-party/gasnet/install/linux64-gnu/seg-everything/nodbg/bin/gasnetrun_ibv
>> -n 2 ./hello_ibv2_real -nl 2 -v
>> executing on node 1 of 2 node(s): cs-minta
>> executing on node 0 of 2 node(s): cs-minta
>> Hello, world! (from locale 0 of 2 named cs-minta)
>> Hello, world! (from locale 1 of 2 named cs-minta)
>>
>> from the same node, instead of minta and mintb
>
> So it works on a node.
>
> Do you use a queue system in the cluster ? (slurm, PBS, …)
> How do you send mpi programs ?
> Do you need to load the mpi environment before using it ?
>
> Greets,
>
> Rafael
>
>
>> Greets,
>>
>> Danilo
>>
>> ________________________________________
>> From: rafael [[email protected]]
>> Sent: Thursday, 3 April 2014 9:35
>> To: Danilo Guerrera
>> Cc: chapel-developers; Public Chapel Bugs list
>> Subject: Re: [Chapel-developers] [Chapel-bugs] problem with chapel locales
>> (fwd)
>>
>> Hi,
>>
>> Have you modified the rights of uverbs in all the computers in the cluster ?
>>
>> Have you tried again with export GASNET_IBV_SPAWNER=mpi ?
>> Do you have mpi properly installed and configured ?
>>
>> What does it say when you set GASNET_BACKTRACE=1 ?
>>
>> Greets,
>>
>> Rafael
>>
>>> Hello,
>>>
>>>> Try adding
>>>>
>>>> export GASNET_IBV_SPAWNER=ssh
>>>>
>>>> to your exports (shouldn't need to recompile anything).
>>>> The GASNet/IBV launcher defaults to using MPI.
>>>
>>> I did it and now I get this error:
>>>
>>> ./hello_ibv -nl 2
>>> Cleaning up orphaned processes...
>>> *** FATAL ERROR: One or more processes died before setup was completed
>>> WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before
>>> gasneti_backtrace_init
>>> Aborted (core dumped)
>>>
>>>> Perhaps it is a problem with the IB HCA device file permissions.
>>>> If you do as your user:
>>>>
>>>> ibv_devinfo
>>>>
>>>> It should print the HCA information, otherwise that is the problem.
>>>
>>> It works and prints out the following:
>>> hca_id: mlx4_0
>>>     transport: InfiniBand (0)
>>>     fw_ver: 2.8.000
>>>     node_guid: 0025:90ff:ff16:c09c
>>>     sys_image_guid: 0025:90ff:ff16:c09f
>>>     vendor_id: 0x02c9
>>>     vendor_part_id: 26428
>>>     hw_ver: 0xB0
>>>     board_id: SM_2121000001000
>>>     phys_port_cnt: 1
>>>         port: 1
>>>             state: PORT_ACTIVE (4)
>>>             max_mtu: 4096 (5)
>>>             active_mtu: 4096 (5)
>>>             sm_lid: 4
>>>             port_lid: 1
>>>             port_lmc: 0x00
>>>
>>> I solved the IBV problem by modifying the rights on uverbs0 in
>>> /dev/infiniband/
>>>
>>> Greets,
>>>
>>> Danilo Guerrera
>>> Departement Mathematik & Informatik
>>> Universität Basel
>>> Klingelbergstrasse 50
>>> CH-4056 Basel Switzerland
>>> email: [email protected]
>>> Phone: +41 (0)61 267 15 18
>>>
>>> ________________________________________
>>> From: Michael Ferguson [[email protected]]
>>> Sent: Wednesday, 2 April 2014 20:09
>>> To: Danilo Guerrera
>>> Cc: Public Chapel Bugs list; chapel-developers
>>> Subject: Re: [Chapel-bugs] [Chapel-developers] problem with chapel locales
>>> (fwd)
>>>
>>> Hi Danilo -
>>>
>>> Try adding
>>>
>>> export GASNET_IBV_SPAWNER=ssh
>>>
>>> to your exports (shouldn't need to recompile anything).
>>> The GASNet/IBV launcher defaults to using MPI.
>>>
>>> -michael
>>>
>>> On 04/02/2014 12:08 PM, Danilo Guerrera wrote:
>>>> Hello Greg,
>>>>
>>>> yes, it's set to GASNET_SPAWNFN=S
>>>>
>>>> the following are our exports:
>>>>
>>>> export CHPL_COMM=gasnet
>>>> export CHPL_COMM_SUBSTRATE=ibv
>>>> export GASNET_SPAWNFN=S
>>>> export GASNET_SSH_SERVERS="ib-minta ib-mintb"
>>>> export SSH_CMD=ssh
>>>> export SSH_OPTIONS=-x
>>>>
>>>> Thanks,
>>>>
>>>> Danilo
>>>>
>>>> ________________________________________
>>>> From: [email protected] [[email protected]]
>>>> Sent: Wednesday, 2 April 2014 18:05
>>>> To: Brad Chamberlain
>>>> Cc: Danilo Guerrera; Public Chapel Bugs list; Chapel Sourceforge
>>>> Developers List
>>>> Subject: Re: [Chapel-developers] problem with chapel locales (fwd)
>>>>
>>>> Is GASNET_SPAWNFN set and if so, to what?
>>>>
>>>> greg
>>>>
>>>>
>>>> On Wed, 2 Apr 2014, Brad Chamberlain wrote:
>>>>
>>>>>
>>>>> For this thread's reference, here's a follow-up from Danilo that I hadn't
>>>>> found prior to sending:
>>>>>
>>>>> Good afternoon Mr. Chamberlain,
>>>>>
>>>>> I went through these errors and now I'm able to compile and run the
>>>>> hello-world with locales. The problem is that even if I provide
>>>>> GASNET_SSH_SERVERS with 2 nodes to be used, I always get as output
>>>>> 2 prints from the same node, so somehow it's not going through the
>>>>> infiniband and is executing 2 locales on the same machine. If you have
>>>>> any suggestion I would appreciate it; anyway, I will open an issue on
>>>>> the mailing list hoping to find a solution so that we can use chapel
>>>>> locales in our course.
>>>>>
>>>>>
>>>>> I think the original suggestion still holds, but this behavior may be
>>>>> familiar to others... It sounds vaguely familiar to me, but not enough
>>>>> for the solution to leap into my hands.
>>>>>
>>>>> -Brad
>>>>>
>>>>>
>>>>>
>>>>> On Wed, 2 Apr 2014, Brad Chamberlain wrote:
>>>>>
>>>>>>
>>>>>> Hi Danilo --
>>>>>>
>>>>>> I don't personally have enough experience with GASNet over ibv to
>>>>>> immediately recognize this error, but am Cc:ing the public chapel-bugs
>>>>>> list and chapel-developers list in case someone else does (Rafael?).
>>>>>>
>>>>>> [Note that your responses to chapel-developers will bounce unless you're
>>>>>> subscribed, but you should be able to post to chapel-bugs]
>>>>>>
>>>>>> It seems likely that the problem is with your GASNet installation rather
>>>>>> than something Chapel specific, so in cases like this, it's often helpful
>>>>>> to run GASNet's test suite that it ships with. To do so, cd to the
>>>>>> subdirectory of $CHPL_HOME/third-party/gasnet/build/ that corresponds to
>>>>>> your configuration (e.g., mine would be linux64-gnu/seg-fast/nodbg) and
>>>>>> then do 'make run-tests-par' or 'make run-tests' (see
>>>>>> third-party/gasnet/GASNet-*/README for more information).
>>>>>>
>>>>>> If that doesn't point out the problem, you may also want to turn on
>>>>>> GASNet's internal debugging assertion checks by setting the environment
>>>>>> variable CHPL_COMM_DEBUG and remaking. This will create a sibling to
>>>>>> 'nodbg' in the path above called 'debug' and will often be more verbose
>>>>>> about what's going wrong.
>>>>>>
>>>>>> Hope this is helpful,
>>>>>> -Brad
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ---------- Forwarded message ----------
>>>>>> Date: Wed, 2 Apr 2014 04:28:45 -0500
>>>>>> From: Danilo Guerrera <[email protected]>
>>>>>> To: "[email protected]" <[email protected]>
>>>>>> Subject: problem with chapel locales
>>>>>>
>>>>>> Good morning Mr. Chamberlain,
>>>>>>
>>>>>> I'm a PhD student at University of Basel in the High Performance and Web
>>>>>> Computing Group led by prof. H. Burkhart.
>>>>>> We introduced Chapel in our High Performance Computing course and now
>>>>>> wanted to exploit our cluster and try the locales. We have an infiniband
>>>>>> interconnection, so our first choice was to set CHPL_COMM_SUBSTRATE
>>>>>> to ibv. We followed the simple steps shown in the
>>>>>> $CHPL_HOME/doc/README.multilocale file, recompiled chapel and then
>>>>>> exported the environment variables as shown at point 5 of the README, in
>>>>>> particular giving
>>>>>> export GASNET_SSH_SERVERS="minta mintb" as locales to be used.
>>>>>> We were able to compile the hello6-taskpar-dist.chpl example, but when
>>>>>> running it with the syntax
>>>>>> ./hello6-taskpar-dist -nl 2
>>>>>>
>>>>>> we get this error:
>>>>>>
>>>>>>
>>>>>> GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem
>>>>>> with requested resource)
>>>>>>
>>>>>> at
>>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
>>>>>>
>>>>>> reason: unable to open any HCA ports
>>>>>>
>>>>>> GASNet
>>>>>> gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines
>>>>>> returning an error code: GASNET_ERR_RESOURCE (Problem with requested
>>>>>> resource)
>>>>>>
>>>>>> at
>>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
>>>>>>
>>>>>> *** Caught a fatal signal: SIGSEGV(11) on node 0/2
>>>>>>
>>>>>> NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the
>>>>>> environment to generate a backtrace.
>>>>>>
>>>>>> GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem
>>>>>> with requested resource)
>>>>>>
>>>>>> at
>>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
>>>>>>
>>>>>> reason: unable to open any HCA ports
>>>>>>
>>>>>> GASNet
>>>>>> gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines
>>>>>> returning an error code: GASNET_ERR_RESOURCE (Problem with requested
>>>>>> resource)
>>>>>>
>>>>>> at
>>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
>>>>>>
>>>>>> *** Caught a fatal signal: SIGSEGV(11) on node 1/2
>>>>>>
>>>>>> NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the
>>>>>> environment to generate a backtrace.
>>>>>>
>>>>>>
>>>>>> =====================================================================================
>>>>>>
>>>>>> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>>
>>>>>> = EXIT CODE: 139
>>>>>>
>>>>>> = CLEANING UP REMAINING PROCESSES
>>>>>>
>>>>>> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>>>>>
>>>>>> =====================================================================================
>>>>>>
>>>>>> APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal
>>>>>> 11)
>>>>>>
>>>>>> Are there particular additional configuration steps we have to make
>>>>>> before being able to run chapel locales properly? Or are we somehow
>>>>>> making mistakes?
>>>>>>
>>>>>> I hope you can help us.
>>>>>>
>>>>>> Kind regards,
>>>>>>
>>>>>> Danilo Guerrera
>>>>>> Departement Mathematik & Informatik
>>>>>> Universität Basel
>>>>>> Klingelbergstrasse 50
>>>>>> CH-4056 Basel Switzerland
>>>>>> email: [email protected]
>>>>>> Phone: +41 (0)61 267 15 18
>>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> _______________________________________________
>>>> Chapel-bugs mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/chapel-bugs
>>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> _______________________________________________
>>> Chapel-developers mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/chapel-developers
>>
>> --
>> Rafael Larrosa Jiménez
>> Centro de Supercomputación y Bioinformática - http://www.scbi.uma.es
>> Universidad de Málaga
>>
>> EMAIL: [email protected]         Edificio de Bioinnovación
>> TELEF: + 34951952788             C/ Severo Ochoa 34
>> FAX  : +34951952792              Parque Tecnológico de Andalucía
>>                                  29590 Málaga (SPAIN)
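
Note: the checks suggested across this thread (which launch variables are set, how many hosts are listed, whether the uverbs device file is accessible) can be gathered into one small pre-flight script. The following is only a sketch under assumptions: the function names show_env, count_hosts and check_device are hypothetical helpers invented for illustration, not part of Chapel or GASNet. The variable names and the /dev/infiniband/uverbs0 path are the ones quoted in the messages above.

```shell
#!/bin/sh
# Pre-flight sketch for a Chapel + GASNet/ibv run. All function names here
# are hypothetical helpers, not part of Chapel or GASNet.

# Print the launch-related variables mentioned in this thread, marking the
# ones that are not set.
show_env() {
    for var in CHPL_COMM CHPL_COMM_SUBSTRATE GASNET_IBV_SPAWNER \
               GASNET_SSH_SERVERS GASNET_SPAWNFN; do
        eval "val=\${$var-}"
        printf '%s=%s\n' "$var" "${val:-<unset>}"
    done
}

# Count the whitespace-separated host names in a server list string.
count_hosts() {
    # Unquoted on purpose: word splitting turns the list into arguments.
    set -- $1
    echo "$#"
}

# Succeed only if the current user can read and write the given device file
# (for example /dev/infiniband/uverbs0, the file whose rights were changed
# in this thread).
check_device() {
    [ -r "$1" ] && [ -w "$1" ]
}
```

A possible use, run as the same user that launches the Chapel program on each node: call show_env, compare the result of count_hosts "$GASNET_SSH_SERVERS" with the value passed to -nl, and run check_device /dev/infiniband/uverbs0 before trying ibv_devinfo and the GASNet tests.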
