> Hello,
> 
> I set GASNET_IBV_SPAWNER=mpi and then exported GASNET_BACKTRACE=1.
> 
> When I run the program, this is the only output I get:
> ./hello_ibv2 -nl 2 -v
> /usr/local/chapel-1.8.0/third-party/gasnet/install/linux64-gnu/seg-everything/nodbg/bin/gasnetrun_ibv -n 2 ./hello_ibv2_real -nl 2 -v
> executing on node 1 of 2 node(s): cs-minta
> executing on node 0 of 2 node(s): cs-minta
> Hello, world! (from locale 0 of 2 named cs-minta)
> Hello, world! (from locale 1 of 2 named cs-minta)
> 
> Both lines come from the same node, instead of from minta and mintb.

So it works on a single node.

Do you use a queue system on the cluster (Slurm, PBS, ...)?
How do you launch MPI programs?
Do you need to load the MPI environment before using it?
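
In case it helps, here is a rough way to check where a launcher really places the processes. It is only a sketch: count_hosts is a helper name I made up, and the mpirun line in the comment is an assumption that has to be adapted to your MPI installation and queue system.

```shell
# Count the distinct host names read from standard input, one name per line.
# count_hosts is a hypothetical helper, not part of Chapel, GASNet or MPI.
count_hosts() {
    sort -u | grep -c .
}

# Hypothetical usage: launch `hostname` once per process and count the
# distinct nodes that answer. The launcher name and the way hosts are
# selected differ between MPI installations, so adapt this line:
#   mpirun -np 2 hostname | count_hosts
```

If the count is 1 for a two-process run, as in your output above where both locales report cs-minta, then the launcher is keeping everything on one node.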

Greets,

Rafael


> Greets,
> 
> Danilo
> 
> ________________________________________
> From: rafael [[email protected]]
> Sent: Thursday, 3 April 2014 9:35
> To: Danilo Guerrera
> Cc: chapel-developers; Public Chapel Bugs list
> Subject: Re: [Chapel-developers] [Chapel-bugs] problem with chapel locales (fwd)
> 
> Hi,
> 
> Have you changed the permissions on the uverbs devices on all the computers in the cluster?
> 
> Have you tried again with export GASNET_IBV_SPAWNER=mpi?
> Do you have MPI properly installed and configured?
> 
> What does it say when you set GASNET_BACKTRACE=1?
> 
> Greets,
> 
> Rafael
> 
>> Hello,
>> 
>>> Try adding
>>> 
>>> export GASNET_IBV_SPAWNER=ssh
>>> 
>>> to your exports (shouldn't need to recompile anything).
>>> The GASNet/IBV launcher defaults to using MPI.
>> 
>> I did it and now I get this error:
>> 
>> ./hello_ibv -nl 2
>> Cleaning up orphaned processes...
>> *** FATAL ERROR: One or more processes died before setup was completed
>> WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before gasneti_backtrace_init
>> Aborted (core dumped)
>> 
>>> Perhaps it is a problem with the IB HCA device file permissions.
>>> If you do as your user:
>>> 
>>> ibv_devinfo
>>> 
>>> It should print the HCA information, otherwise that is the problem.
>> 
>> It works and prints out the following:
>> hca_id:       mlx4_0
>>      transport:                      InfiniBand (0)
>>      fw_ver:                         2.8.000
>>      node_guid:                      0025:90ff:ff16:c09c
>>      sys_image_guid:                 0025:90ff:ff16:c09f
>>      vendor_id:                      0x02c9
>>      vendor_part_id:                 26428
>>      hw_ver:                         0xB0
>>      board_id:                       SM_2121000001000
>>      phys_port_cnt:                  1
>>              port:   1
>>                      state:                  PORT_ACTIVE (4)
>>                      max_mtu:                4096 (5)
>>                      active_mtu:             4096 (5)
>>                      sm_lid:                 4
>>                      port_lid:               1
>>                      port_lmc:               0x00
>> 
>> I solved the IBV problem by changing the permissions on uverbs0 in
>> /dev/infiniband/.
>> 
>> Greets,
>> 
>> Danilo Guerrera
>> Departement Mathematik & Informatik
>> Universität Basel
>> Klingelbergstrasse 50
>> CH-4056 Basel Switzerland
>> email:         [email protected]
>> Phone:      +41 (0)61 267 15 18
>> 
>> ________________________________________
>> From: Michael Ferguson [[email protected]]
>> Sent: Wednesday, 2 April 2014 20:09
>> To: Danilo Guerrera
>> Cc: Public Chapel Bugs list; chapel-developers
>> Subject: Re: [Chapel-bugs] [Chapel-developers] problem with chapel locales (fwd)
>> 
>> Hi Danilo -
>> 
>> Try adding
>> 
>> export GASNET_IBV_SPAWNER=ssh
>> 
>> to your exports (shouldn't need to recompile anything).
>> The GASNet/IBV launcher defaults to using MPI.
>> 
>> -michael
>> 
>> On 04/02/2014 12:08 PM, Danilo Guerrera wrote:
>>> Hello Greg,
>>> 
>>> yes, it's set to GASNET_SPAWNFN=S
>>> 
>>> the following are our exports:
>>> 
>>> export CHPL_COMM=gasnet
>>> export CHPL_COMM_SUBSTRATE=ibv
>>> export GASNET_SPAWNFN=S
>>> export GASNET_SSH_SERVERS="ib-minta ib-mintb"
>>> export SSH_CMD=ssh
>>> export SSH_OPTIONS=-x
>>> 
>>> Thanks,
>>> 
>>> Danilo
>>> 
>>> ________________________________________
>>> From: [email protected] [[email protected]]
>>> Sent: Wednesday, 2 April 2014 18:05
>>> To: Brad Chamberlain
>>> Cc: Danilo Guerrera; Public Chapel Bugs list; Chapel Sourceforge Developers List
>>> Subject: Re: [Chapel-developers] problem with chapel locales (fwd)
>>> 
>>> Is GASNET_SPAWNFN set and if so, to what?
>>> 
>>> greg
>>> 
>>> 
>>> On Wed, 2 Apr 2014, Brad Chamberlain wrote:
>>> 
>>>> 
>>>> For this thread's reference, here's a follow-up from Danilo that I hadn't
>>>> found prior to sending:
>>>> 
>>>> Good afternoon Mr. Chamberlain,
>>>> 
>>>> I went through these errors and now I'm able to compile and run the
>>>> hello-world with locales. The problem is that even if I provide
>>>> GASNET_SSH_SERVERS with 2 nodes to be used, I always get 2 prints from
>>>> the same node as output, so somehow it's not going through the InfiniBand
>>>> and is executing 2 locales on the same machine. If you have any
>>>> suggestions I would appreciate them; anyway, I will open an issue on the
>>>> mailing list hoping to find a solution so that we can use Chapel locales
>>>> in our course.
>>>> 
>>>> 
>>>> I think the original suggestion still holds, but this behavior may be
>>>> familiar to others...  It sounds vaguely familiar to me, but not enough for
>>>> the solution to leap into my hands.
>>>> 
>>>> -Brad
>>>> 
>>>> 
>>>> 
>>>> On Wed, 2 Apr 2014, Brad Chamberlain wrote:
>>>> 
>>>>> 
>>>>> Hi Danilo --
>>>>> 
>>>>> I don't personally have enough experience with GASNet over ibv to
>>>>> immediately recognize this error, but am Cc:ing the public chapel-bugs
>>>>> list and chapel-developers list in case someone else does (Rafael?).
>>>>> 
>>>>> [Note that your responses to chapel-developers will bounce unless you're
>>>>> subscribed, but you should be able to post to chapel-bugs]
>>>>> 
>>>>> It seems likely that the problem is with your GASNet installation rather
>>>>> than something Chapel specific, so in cases like this, it's often helpful
>>>>> to run GASNet's test suite that it ships with.  To do so, cd to the
>>>>> subdirectory of $CHPL_HOME/third-party/gasnet/build/ that corresponds to
>>>>> your configuration (e.g., mine would be linux64-gnu/seg-fast/nodbg) and
>>>>> then do 'make run-tests-par' or 'make run-tests' (see
>>>>> third-party/gasnet/GASNet-*/README for more information).
>>>>> 
>>>>> If that doesn't point out the problem, you may also want to turn on
>>>>> GASNet's internal debugging assertion checks by setting the environment
>>>>> variable CHPL_COMM_DEBUG and remaking.  This will create a sibling to
>>>>> 'nodbg' in the path above called 'debug' and will often be more verbose
>>>>> about what's going wrong.
>>>>> 
>>>>> Hope this is helpful,
>>>>> -Brad
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> ---------- Forwarded message ----------
>>>>> Date: Wed, 2 Apr 2014 04:28:45 -0500
>>>>> From: Danilo Guerrera <[email protected]>
>>>>> To: "[email protected]" <[email protected]>
>>>>> Subject: problem with chapel locales
>>>>> 
>>>>> Good morning Mr. Chamberlain,
>>>>> 
>>>>> I'm a PhD student at University of Basel in the High Performance and Web
>>>>> Computing Group led by prof. H. Burkhart. We introduced Chapel in our High
>>>>> Performance Computing course and now wanted to exploit our cluster and try
>>>>> the locales. We have an InfiniBand interconnect, so our first choice
>>>>> was to set CHPL_COMM_SUBSTRATE to ibv. We followed the simple steps shown
>>>>> in the $CHPL_HOME/doc/README.multilocale file, recompiled chapel and then
>>>>> exported the environment variables as shown at point 5 of the README, in
>>>>> particular giving
>>>>> export GASNET_SSH_SERVERS="minta mintb" as locales to be used.
>>>>> We were able to compile the hello6-taskpar-dist.chpl example, but when
>>>>> running it with the syntax
>>>>> ./hello6-taskpar-dist -nl 2
>>>>> 
>>>>> we get this error:
>>>>> 
>>>>> 
>>>>> GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem
>>>>> with requested resource)
>>>>> 
>>>>> at
>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
>>>>> 
>>>>> reason: unable to open any HCA ports
>>>>> 
>>>>> GASNet
>>>>> gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines
>>>>> returning an error code: GASNET_ERR_RESOURCE (Problem with requested
>>>>> resource)
>>>>> 
>>>>> at
>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
>>>>> 
>>>>> *** Caught a fatal signal: SIGSEGV(11) on node 0/2
>>>>> 
>>>>> NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the
>>>>> environment to generate a backtrace.
>>>>> 
>>>>> GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem
>>>>> with requested resource)
>>>>> 
>>>>> at
>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
>>>>> 
>>>>> reason: unable to open any HCA ports
>>>>> 
>>>>> GASNet
>>>>> gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines
>>>>> returning an error code: GASNET_ERR_RESOURCE (Problem with requested
>>>>> resource)
>>>>> 
>>>>> at
>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
>>>>> 
>>>>> *** Caught a fatal signal: SIGSEGV(11) on node 1/2
>>>>> 
>>>>> NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the
>>>>> environment to generate a backtrace.
>>>>> 
>>>>> 
>>>>> =====================================================================================
>>>>> 
>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>> 
>>>>> =   EXIT CODE: 139
>>>>> 
>>>>> =   CLEANING UP REMAINING PROCESSES
>>>>> 
>>>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>>>> 
>>>>> =====================================================================================
>>>>> 
>>>>> APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
>>>>> 
>>>>> Are there additional configuration steps we have to take before
>>>>> being able to run Chapel locales properly? Or are we making a mistake
>>>>> somewhere?
>>>>> 
>>>>> I hope you can help us.
>>>>> 
>>>>> Kind regards,
>>>>> 
>>>>> Danilo Guerrera
>>>>> Departement Mathematik & Informatik
>>>>> Universität Basel
>>>>> Klingelbergstrasse 50
>>>>> CH-4056 Basel Switzerland
>>>>> email:         [email protected]
>>>>> Phone:      +41 (0)61 267 15 18
>>>> 
>>> 
>>> ------------------------------------------------------------------------------
>>> _______________________________________________
>>> Chapel-bugs mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/chapel-bugs
>>> 
>> 
>> 
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Chapel-developers mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/chapel-developers
> 
> --
> Rafael Larrosa Jiménez
> Centro de Supercomputación y Bioinformática - http://www.scbi.uma.es
> Universidad de Málaga
> 
> EMAIL: [email protected]      Edificio de Bioinnovación
> TELEF: +34951952788        C/ Severo Ochoa 34
> FAX  : +34951952792        Parque Tecnológico de Andalucía
>                            29590 Málaga (SPAIN)
> 
> 
