We use Hydra to start the parallel jobs.
You don't need to load an MPI environment before using it; you can just
(compile and) run an MPI program like this:
mpiexec -f hostfile ./app
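
For reference, a minimal sketch of that workflow, assuming the two node names mentioned in this thread (minta and mintb) and a file called "hostfile" in the current directory. The helper names write_hostfile and run_hostname_check are made up for illustration; only "mpiexec -f hostfile" comes from the message above, and -n is Hydra's process-count option.

```shell
#!/bin/sh
# Sketch only: node names and file name are assumptions taken from this thread.

# write_hostfile FILE HOST... : write one host name per line to FILE,
# which is the simplest hostfile layout that Hydra's mpiexec accepts.
write_hostfile() {
    file=$1
    shift
    printf '%s\n' "$@" > "$file"
}

# run_hostname_check FILE N : start N copies of `hostname` through Hydra.
# Each process prints the name of the node it landed on (hypothetical helper).
run_hostname_check() {
    mpiexec -f "$1" -n "$2" hostname
}

# Create the hostfile for the two nodes of the cluster.
write_hostfile hostfile minta mintb
```

Running `run_hostname_check hostfile 2` afterwards should list both node names if MPI itself spreads processes across the machines; if it prints the same name twice, the placement problem is probably not specific to Chapel or GASNet.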

Greets,

Danilo

________________________________________
From: rafael [[email protected]]
Sent: Thursday, 3 April 2014 10:17
To: Danilo Guerrera
Cc: chapel-developers; Public Chapel Bugs list
Subject: Re: [Chapel-developers] [Chapel-bugs] problem with chapel locales (fwd)

> Hello,
>
> I set GASNET_IBV_SPAWNER=mpi and then exported GASNET_BACKTRACE=1,
>
> when I run the program this is the only output I get:
> ./hello_ibv2 -nl 2 -v
> /usr/local/chapel-1.8.0/third-party/gasnet/install/linux64-gnu/seg-everything/nodbg/bin/gasnetrun_ibv
>   -n 2 ./hello_ibv2_real -nl 2 -v
> executing on node 1 of 2 node(s): cs-minta
> executing on node 0 of 2 node(s): cs-minta
> Hello, world! (from locale 0 of 2 named cs-minta)
> Hello, world! (from locale 1 of 2 named cs-minta)
>
> Both locales run on the same node (cs-minta), instead of on minta and mintb.

So it works on a single node.

Do you use a queue system on the cluster (Slurm, PBS, ...)?
How do you launch MPI programs?
Do you need to load the MPI environment before using it?

Greets,

Rafael


> Greets,
>
> Danilo
>
> ________________________________________
> From: rafael [[email protected]]
> Sent: Thursday, 3 April 2014 9:35
> To: Danilo Guerrera
> Cc: chapel-developers; Public Chapel Bugs list
> Subject: Re: [Chapel-developers] [Chapel-bugs] problem with chapel locales (fwd)
>
> Hi,
>
> Have you changed the permissions on uverbs on all the computers in the cluster?
>
> Have you tried again with export GASNET_IBV_SPAWNER=mpi?
> Do you have MPI properly installed and configured?
>
> What does it print when you set GASNET_BACKTRACE=1?
>
> Greets,
>
> Rafael
>
>> Hello,
>>
>>> Try adding
>>>
>>> export GASNET_IBV_SPAWNER=ssh
>>>
>>> to your exports (shouldn't need to recompile anything).
>>> The GASNet/IBV launcher defaults to using MPI.
>>
>> I did it and now I get this error:
>>
>> ./hello_ibv -nl 2
>> Cleaning up orphaned processes...
>> *** FATAL ERROR: One or more processes died before setup was completed
>> WARNING: Ignoring call to gasneti_print_backtrace_ifenabled before 
>> gasneti_backtrace_init
>> Aborted (core dumped)
>>
>>> Perhaps it is a problem with the IB HCA device file permissions.
>>> If you do as your user:
>>>
>>> ibv_devinfo
>>>
>>> It should print the HCA information, otherwise that is the problem.
>>
>> It works and prints out the following:
>> hca_id:       mlx4_0
>>      transport:                      InfiniBand (0)
>>      fw_ver:                         2.8.000
>>      node_guid:                      0025:90ff:ff16:c09c
>>      sys_image_guid:                 0025:90ff:ff16:c09f
>>      vendor_id:                      0x02c9
>>      vendor_part_id:                 26428
>>      hw_ver:                         0xB0
>>      board_id:                       SM_2121000001000
>>      phys_port_cnt:                  1
>>              port:   1
>>                      state:                  PORT_ACTIVE (4)
>>                      max_mtu:                4096 (5)
>>                      active_mtu:             4096 (5)
>>                      sm_lid:                 4
>>                      port_lid:               1
>>                      port_lmc:               0x00
>>
>> I solved the IBV problem by changing the permissions on uverbs0 in
>> /dev/infiniband/
>>
>> Greets,
>>
>> Danilo Guerrera
>> Departement Mathematik & Informatik
>> Universität Basel
>> Klingelbergstrasse 50
>> CH-4056 Basel Switzerland
>> email:         [email protected]
>> Phone:      +41 (0)61 267 15 18
>>
>> ________________________________________
>> From: Michael Ferguson [[email protected]]
>> Sent: Wednesday, 2 April 2014 20:09
>> To: Danilo Guerrera
>> Cc: Public Chapel Bugs list; chapel-developers
>> Subject: Re: [Chapel-bugs] [Chapel-developers] problem with chapel locales (fwd)
>>
>> Hi Danilo -
>>
>> Try adding
>>
>> export GASNET_IBV_SPAWNER=ssh
>>
>> to your exports (shouldn't need to recompile anything).
>> The GASNet/IBV launcher defaults to using MPI.
>>
>> -michael
>>
>> On 04/02/2014 12:08 PM, Danilo Guerrera wrote:
>>> Hello Greg,
>>>
>>> yes, it's set to GASNET_SPAWNFN=S
>>>
>>> the following are our exports:
>>>
>>> export CHPL_COMM=gasnet
>>> export CHPL_COMM_SUBSTRATE=ibv
>>> export GASNET_SPAWNFN=S
>>> export GASNET_SSH_SERVERS="ib-minta ib-mintb"
>>> export SSH_CMD=ssh
>>> export SSH_OPTIONS=-x
>>>
>>> Thanks,
>>>
>>> Danilo
>>>
>>> ________________________________________
>>> From: [email protected] [[email protected]]
>>> Sent: Wednesday, 2 April 2014 18:05
>>> To: Brad Chamberlain
>>> Cc: Danilo Guerrera; Public Chapel Bugs list; Chapel Sourceforge Developers List
>>> Subject: Re: [Chapel-developers] problem with chapel locales (fwd)
>>>
>>> Is GASNET_SPAWNFN set and if so, to what?
>>>
>>> greg
>>>
>>>
>>> On Wed, 2 Apr 2014, Brad Chamberlain wrote:
>>>
>>>>
>>>> For this thread's reference, here's a follow-up from Danilo that I hadn't
>>>> found prior to sending:
>>>>
>>>> Good afternoon Mr. Chamberlain,
>>>>
>>>> I went through these errors and now I'm able to compile and run the
>>>> hello-world with locales. The problem is that even if I provide
>>>> GASNET_SSH_SERVERS with 2 nodes to be used, I only and always get as
>>>> output 2 prints from the same node, so somehow it's not going through the
>>>> InfiniBand and is executing 2 locales on the same machine. If you have any
>>>> suggestion I would appreciate it; anyway, I will open an issue on the
>>>> mailing list hoping to find a solution so that we can use Chapel locales
>>>> in our course.
>>>>
>>>>
>>>> I think the original suggestion still holds, but this behavior may be
>>>> familiar to others...  It sounds vaguely familiar to me, but not enough for
>>>> the solution to leap into my hands.
>>>>
>>>> -Brad
>>>>
>>>>
>>>>
>>>> On Wed, 2 Apr 2014, Brad Chamberlain wrote:
>>>>
>>>>>
>>>>> Hi Danilo --
>>>>>
>>>>> I don't personally have enough experience with GASNet over ibv to
>>>>> immediately recognize this error, but am Cc:ing the public chapel-bugs
>>>>> list and chapel-developers list in case someone else does (Rafael?).
>>>>>
>>>>> [Note that your responses to chapel-developers will bounce unless you're
>>>>> subscribed, but you should be able to post to chapel-bugs]
>>>>>
>>>>> It seems likely that the problem is with your GASNet installation rather
>>>>> than something Chapel specific, so in cases like this, it's often helpful
>>>>> to run GASNet's test suite that it ships with.  To do so, cd to the
>>>>> subdirectory of $CHPL_HOME/third-party/gasnet/build/ that corresponds to
>>>>> your configuration (e.g., mine would be linux64-gnu/seg-fast/nodbg) and
>>>>> then do 'make run-tests-par' or 'make run-tests' (see
>>>>> third-party/gasnet/GASNet-*/README for more information).
>>>>>
>>>>> If that doesn't point out the problem, you may also want to turn on
>>>>> GASNet's internal debugging assertion checks by setting the environment
>>>>> variable CHPL_COMM_DEBUG and remaking.  This will create a sibling to
>>>>> 'nodbg' in the path above called 'debug' and will often be more verbose
>>>>> about what's going wrong.
>>>>>
>>>>> Hope this is helpful,
>>>>> -Brad
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> Date: Wed, 2 Apr 2014 04:28:45 -0500
>>>>> From: Danilo Guerrera <[email protected]>
>>>>> To: "[email protected]" <[email protected]>
>>>>> Subject: problem with chapel locales
>>>>>
>>>>> Good morning Mr. Chamberlain,
>>>>>
>>>>> I'm a PhD student at the University of Basel in the High Performance and
>>>>> Web Computing Group led by Prof. H. Burkhart. We introduced Chapel in our
>>>>> High Performance Computing course and now wanted to exploit our cluster
>>>>> and try the locales. We have an InfiniBand interconnect, so our first
>>>>> choice was to set CHPL_COMM_SUBSTRATE to ibv. We followed the simple steps
>>>>> shown in the $CHPL_HOME/doc/README.multilocale file, recompiled Chapel and
>>>>> then exported the environment variables as shown at point 5 of the README,
>>>>> in particular giving
>>>>> export GASNET_SSH_SERVERS="minta mintb" as locales to be used.
>>>>> We were able to compile the hello6-taskpar-dist.chpl example, but when
>>>>> running it with the syntax
>>>>> ./hello6-taskpar-dist -nl 2
>>>>>
>>>>> we get this error:
>>>>>
>>>>>
>>>>> GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem
>>>>> with requested resource)
>>>>>
>>>>> at
>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
>>>>>
>>>>> reason: unable to open any HCA ports
>>>>>
>>>>> GASNet
>>>>> gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines
>>>>> returning an error code: GASNET_ERR_RESOURCE (Problem with requested
>>>>> resource)
>>>>>
>>>>> at
>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
>>>>>
>>>>> *** Caught a fatal signal: SIGSEGV(11) on node 0/2
>>>>>
>>>>> NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the
>>>>> environment to generate a backtrace.
>>>>>
>>>>> GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem
>>>>> with requested resource)
>>>>>
>>>>> at
>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
>>>>>
>>>>> reason: unable to open any HCA ports
>>>>>
>>>>> GASNet
>>>>> gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines
>>>>> returning an error code: GASNET_ERR_RESOURCE (Problem with requested
>>>>> resource)
>>>>>
>>>>> at
>>>>> /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
>>>>>
>>>>> *** Caught a fatal signal: SIGSEGV(11) on node 1/2
>>>>>
>>>>> NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the
>>>>> environment to generate a backtrace.
>>>>>
>>>>>
>>>>> =====================================================================================
>>>>>
>>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>>>
>>>>> =   EXIT CODE: 139
>>>>>
>>>>> =   CLEANING UP REMAINING PROCESSES
>>>>>
>>>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>>>>
>>>>> =====================================================================================
>>>>>
>>>>> APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
>>>>>
>>>>> Are there particular additional configuration steps we have to take before
>>>>> being able to run Chapel locales properly? Or are we making a mistake
>>>>> somewhere?
>>>>>
>>>>> I hope you can help us.
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> Danilo Guerrera
>>>>> Departement Mathematik & Informatik
>>>>> Universität Basel
>>>>> Klingelbergstrasse 50
>>>>> CH-4056 Basel Switzerland
>>>>> email:         [email protected]
>>>>> Phone:      +41 (0)61 267 15 18
>>>>
>>>
>>> ------------------------------------------------------------------------------
>>> _______________________________________________
>>> Chapel-bugs mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/chapel-bugs
>>>
>>
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Chapel-developers mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/chapel-developers
>
> --
> Rafael Larrosa Jiménez
> Centro de Supercomputación y Bioinformática - http://www.scbi.uma.es
> Universidad de Málaga
>
> EMAIL: [email protected]                  Edificio de Bioinnovación
> TELEF: + 34951952788                    C/ Severo Ochoa 34
> FAX  : +34951952792                     Parque Tecnológico de Andalucía
>                                         29590 Málaga (SPAIN)
>
>

--
Rafael Larrosa Jiménez
Centro de Supercomputación y Bioinformática - http://www.scbi.uma.es
Universidad de Málaga

EMAIL: [email protected]                  Edificio de Bioinnovación
TELEF: + 34951952788                    C/ Severo Ochoa 34
FAX  : +34951952792                     Parque Tecnológico de Andalucía
                                        29590 Málaga (SPAIN)



