I forgot to add that the rights are now set properly on all of the nodes.

Danilo Guerrera
Departement Mathematik & Informatik
Universität Basel
Klingelbergstrasse 50
CH-4056 Basel Switzerland
email:         [email protected]
Phone:      +41 (0)61 267 15 18

________________________________________
From: rafael [[email protected]]
Sent: Thursday, 3 April 2014 9:35
To: Danilo Guerrera
Cc: chapel-developers; Public Chapel Bugs list
Subject: Re: [Chapel-developers] [Chapel-bugs] problem with chapel locales (fwd)

Hi,

Have you changed the permissions on the uverbs devices on every machine in the cluster?
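
A minimal sketch of one way to check this on a single node, assuming the devices live in /dev/infiniband (the path mentioned further down this thread) and that the user needs both read and write access to them. check_uverbs is a hypothetical helper name, not something that ships with GASNet or Chapel:

```shell
# Report whether the current user can read and write each uverbs device file.
# The directory defaults to /dev/infiniband; pass another one as $1 to override.
check_uverbs() {
    dir="${1:-/dev/infiniband}"
    result=0
    for dev in "$dir"/uverbs*; do
        # If the glob matched nothing, the pattern is left unexpanded.
        if [ ! -e "$dev" ]; then
            echo "no uverbs devices found in $dir"
            return 2
        fi
        if [ -r "$dev" ] && [ -w "$dev" ]; then
            echo "ok: $dev"
        else
            echo "no access: $dev"
            result=1
        fi
    done
    return "$result"
}
```

Run as the ordinary user on each node (for example over ssh to ib-minta and ib-mintb), it should print an "ok" line for every device; a "no access" line would point at the node whose permissions still need fixing.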

Have you tried again with export GASNET_IBV_SPAWNER=mpi?
Do you have MPI properly installed and configured?

What does it print when you run with GASNET_BACKTRACE=1?
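
When reporting the result, it can help to include the settings that were active for the run. A small sketch, assuming a POSIX shell; show_gasnet_env is a hypothetical helper, and the variable names are simply the ones that have come up in this thread:

```shell
# Print the Chapel/GASNet variables discussed in this thread.
# show_gasnet_env is a hypothetical helper, not part of Chapel or GASNet.
show_gasnet_env() {
    for name in CHPL_COMM CHPL_COMM_SUBSTRATE GASNET_SPAWNFN \
                GASNET_SSH_SERVERS GASNET_IBV_SPAWNER GASNET_BACKTRACE; do
        # Indirect lookup; the names come from the fixed list above.
        eval "value=\${$name:-}"
        echo "$name=${value:-(unset)}"
    done
}

export GASNET_IBV_SPAWNER=mpi   # or ssh, as suggested earlier
export GASNET_BACKTRACE=1       # ask GASNet for a backtrace on failure
show_gasnet_env
# Then rerun the failing program, e.g.: ./hello_ibv -nl 2
```

Pasting the printed lines together with the backtrace should make it easier to see which spawner and substrate were actually in effect.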

Greets,

Rafael

> Hello,
>
>> Try adding
>>
>> export GASNET_IBV_SPAWNER=ssh
>>
>> to your exports (shouldn't need to recompile anything).
>> The GASNet/IBV launcher defaults to using MPI.
>
> I did it and now I get this error:
>
> ./hello_ibv -nl 2
> Cleaning up orphaned processes...
> *** FATAL ERROR: One or more processes died before setup was completed
> WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before gasneti_backtrace_init
> Aborted (core dumped)
>
>> Perhaps it is a problem with the IB HCA device file permissions.
>> If you do as your user:
>>
>> ibv_devinfo
>>
>> It should print the HCA information, otherwise that is the problem.
>
> It works and prints out the following:
> hca_id:       mlx4_0
>       transport:                      InfiniBand (0)
>       fw_ver:                         2.8.000
>       node_guid:                      0025:90ff:ff16:c09c
>       sys_image_guid:                 0025:90ff:ff16:c09f
>       vendor_id:                      0x02c9
>       vendor_part_id:                 26428
>       hw_ver:                         0xB0
>       board_id:                       SM_2121000001000
>       phys_port_cnt:                  1
>               port:   1
>                       state:                  PORT_ACTIVE (4)
>                       max_mtu:                4096 (5)
>                       active_mtu:             4096 (5)
>                       sm_lid:                 4
>                       port_lid:               1
>                       port_lmc:               0x00
>
> I solved the IBV problem by changing the permissions on uverbs0 in
> /dev/infiniband/
>
> Greets,
>
> Danilo Guerrera
> Departement Mathematik & Informatik
> Universität Basel
> Klingelbergstrasse 50
> CH-4056 Basel Switzerland
> email:         [email protected]
> Phone:      +41 (0)61 267 15 18
>
> ________________________________________
> From: Michael Ferguson [[email protected]]
> Sent: Wednesday, 2 April 2014 20:09
> To: Danilo Guerrera
> Cc: Public Chapel Bugs list; chapel-developers
> Subject: Re: [Chapel-bugs] [Chapel-developers] problem with chapel locales (fwd)
>
> Hi Danilo -
>
> Try adding
>
> export GASNET_IBV_SPAWNER=ssh
>
> to your exports (shouldn't need to recompile anything).
> The GASNet/IBV launcher defaults to using MPI.
>
> -michael
>
> On 04/02/2014 12:08 PM, Danilo Guerrera wrote:
>> Hello Greg,
>>
>> yes, it's set to GASNET_SPAWNFN=S
>>
>> the following are our exports:
>>
>> export CHPL_COMM=gasnet
>> export CHPL_COMM_SUBSTRATE=ibv
>> export GASNET_SPAWNFN=S
>> export GASNET_SSH_SERVERS="ib-minta ib-mintb"
>> export SSH_CMD=ssh
>> export SSH_OPTIONS=-x
>>
>> Thanks,
>>
>> Danilo
>>
>> ________________________________________
>> From: [email protected] [[email protected]]
>> Sent: Wednesday, 2 April 2014 18:05
>> To: Brad Chamberlain
>> Cc: Danilo Guerrera; Public Chapel Bugs list; Chapel Sourceforge Developers List
>> Subject: Re: [Chapel-developers] problem with chapel locales (fwd)
>>
>> Is GASNET_SPAWNFN set and if so, to what?
>>
>> greg
>>
>>
>> On Wed, 2 Apr 2014, Brad Chamberlain wrote:
>>
>>>
>>> For this thread's reference, here's a follow-up from Danilo that I hadn't
>>> found prior to sending:
>>>
>>> Good afternoon Mr. Chamberlain,
>>>
>>> I went through these errors and now I'm able to compile and run the
>>> hello-world with locales. The problem is that even if I provide
>>> GASNET_SSH_SERVERS with 2 nodes to be used, I always get 2 prints from the
>>> same node as output, so somehow it's not going through the InfiniBand and
>>> is executing 2 locales on the same machine. If you have any suggestions I
>>> would appreciate them; in any case I will open an issue on the mailing list,
>>> hoping to find a solution so that we can use Chapel locales in our course.
>>>
>>>
>>> I think the original suggestion still holds, but this behavior may be
>>> familiar to others...  It sounds vaguely familiar to me, but not enough for
>>> the solution to leap into my hands.
>>>
>>> -Brad
>>>
>>>
>>>
>>> On Wed, 2 Apr 2014, Brad Chamberlain wrote:
>>>
>>>>
>>>> Hi Danilo --
>>>>
>>>> I don't personally have enough experience with GASNet over ibv to
>>>> immediately recognize this error, but am Cc:ing the public chapel-bugs list
>>>> and chapel-developers list in case someone else does (Rafael?).
>>>>
>>>> [Note that your responses to chapel-developers will bounce unless you're
>>>> subscribed, but you should be able to post to chapel-bugs]
>>>>
>>>> It seems likely that the problem is with your GASNet installation rather
>>>> than something Chapel specific, so in cases like this, it's often helpful
>>>> to run GASNet's test suite that it ships with.  To do so, cd to the
>>>> subdirectory of $CHPL_HOME/third-party/gasnet/build/ that corresponds to
>>>> your configuration (e.g., mine would be linux64-gnu/seg-fast/nodbg) and
>>>> then do 'make run-tests-par' or 'make run-tests' (see
>>>> third-party/gasnet/GASNet-*/README for more information).
>>>>
>>>> If that doesn't point out the problem, you may also want to turn on
>>>> GASNet's internal debugging assertion checks by setting the environment
>>>> variable CHPL_COMM_DEBUG and remaking.  This will create a sibling to
>>>> 'nodbg' in the path above called 'debug' and will often be more verbose
>>>> about what's going wrong.
>>>>
>>>> Hope this is helpful,
>>>> -Brad
>>>>
>>>>
>>>>
>>>>
>>>> ---------- Forwarded message ----------
>>>> Date: Wed, 2 Apr 2014 04:28:45 -0500
>>>> From: Danilo Guerrera <[email protected]>
>>>> To: "[email protected]" <[email protected]>
>>>> Subject: problem with chapel locales
>>>>
>>>> Good morning Mr. Chamberlain,
>>>>
>>>> I'm a PhD student at University of Basel in the High Performance and Web
>>>> Computing Group led by prof. H. Burkhart. We introduced Chapel in our High
>>>> Performance Computing course and now wanted to exploit our cluster and try
>>>> the locales. We have an InfiniBand interconnect, so our first choice
>>>> was to set CHPL_COMM_SUBSTRATE to ibv. We followed the simple steps shown
>>>> in the $CHPL_HOME/doc/README.multilocale file, recompiled chapel and then
>>>> exported the environment variables as shown at point 5 of the README, in
>>>> particular giving
>>>> export GASNET_SSH_SERVERS="minta mintb" as locales to be used.
>>>> We were able to compile the hello6-taskpar-dist.chpl example, but when
>>>> running it with the syntax
>>>> ./hello6-taskpar-dist -nl 2
>>>>
>>>> we get this error:
>>>>
>>>>
>>>> GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem with requested resource)
>>>>  at /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
>>>>  reason: unable to open any HCA ports
>>>> GASNet gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines returning an error code: GASNET_ERR_RESOURCE (Problem with requested resource)
>>>>  at /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
>>>> *** Caught a fatal signal: SIGSEGV(11) on node 0/2
>>>> NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.
>>>> GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem with requested resource)
>>>>  at /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
>>>>  reason: unable to open any HCA ports
>>>> GASNet gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines returning an error code: GASNET_ERR_RESOURCE (Problem with requested resource)
>>>>  at /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
>>>> *** Caught a fatal signal: SIGSEGV(11) on node 1/2
>>>> NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.
>>>>
>>>> =====================================================================================
>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>> =   EXIT CODE: 139
>>>> =   CLEANING UP REMAINING PROCESSES
>>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>>> =====================================================================================
>>>> APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
>>>>
>>>> Are there additional configuration steps we have to take before
>>>> being able to run Chapel locales properly? Or are we making a mistake
>>>> somewhere?
>>>>
>>>> I hope you can help us.
>>>>
>>>> Kind regards,
>>>>
>>>> Danilo Guerrera
>>>> Departement Mathematik & Informatik
>>>> Universität Basel
>>>> Klingelbergstrasse 50
>>>> CH-4056 Basel Switzerland
>>>> email:         [email protected]
>>>> Phone:      +41 (0)61 267 15 18
>>>
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Chapel-bugs mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/chapel-bugs
>>
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Chapel-developers mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/chapel-developers

--
Rafael Larrosa Jiménez
Centro de Supercomputación y Bioinformática - http://www.scbi.uma.es
Universidad de Málaga

EMAIL: [email protected]                  Edificio de Bioinnovación
TELEF: + 34951952788                    C/ Severo Ochoa 34
FAX  : +34951952792                             Parque Tecnológico de Andalucía
                                        29590 Málaga (SPAIN)



