For this thread's reference, here's a follow-up from Danilo that I hadn't
found prior to sending:

Good afternoon Mr. Chamberlain,

I went through these errors and now I'm able to compile and run the hello-world with locales. The problem is that even if I provide GASNET_SSH_SERVERS with 2 nodes to be used, I always get 2 prints from the same node as output, so somehow it's not going through the InfiniBand and is executing 2 locales on the same machine. If you have any suggestions I would appreciate them; in any case, I will open an issue on the mailing list hoping to find a solution so that we can use Chapel locales in our course.


I think the original suggestion still holds, but this behavior may be familiar to others... It sounds vaguely familiar to me, but not enough for the solution to leap into my hands.

-Brad



On Wed, 2 Apr 2014, Brad Chamberlain wrote:


Hi Danilo --

I don't personally have enough experience with GASNet over ibv to immediately recognize this error, but am Cc:ing the public chapel-bugs list and chapel-developers list in case someone else does (Rafael?).

[Note that your responses to chapel-developers will bounce unless you're subscribed, but you should be able to post to chapel-bugs]

It seems likely that the problem is with your GASNet installation rather than something Chapel-specific, so in cases like this, it's often helpful to run the test suite that GASNet ships with. To do so, cd to the subdirectory of $CHPL_HOME/third-party/gasnet/build/ that corresponds to your configuration (e.g., mine would be linux64-gnu/seg-fast/nodbg) and then do 'make run-tests-par' or 'make run-tests' (see third-party/gasnet/GASNet-*/README for more information).
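
A rough sketch of those steps in shell, for anyone following along. The configuration path linux64-gnu/seg-fast/nodbg is only the example given above, and the helper name gasnet_config_dir is made up for illustration; substitute whatever subdirectory matches your own build.

```shell
# Hypothetical helper: print the GASNet build directory for one configuration.
#   $1 = Chapel root (normally $CHPL_HOME)
#   $2 = configuration subpath, e.g. linux64-gnu/seg-fast/nodbg (yours may differ)
gasnet_config_dir() {
    printf '%s\n' "$1/third-party/gasnet/build/$2"
}

# Intended use (left commented out so that pasting this block runs nothing):
#   cd "$(gasnet_config_dir "$CHPL_HOME" linux64-gnu/seg-fast/nodbg)" &&
#       make run-tests-par    # or: make run-tests
```

Listing $CHPL_HOME/third-party/gasnet/build/ first is probably the easiest way to see which configuration directories actually exist on your system.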

If that doesn't point out the problem, you may also want to turn on GASNet's internal debugging assertion checks by setting the environment variable CHPL_COMM_DEBUG and remaking. This will create a sibling to 'nodbg' in the path above called 'debug' and will often be more verbose about what's going wrong.
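
As a sketch of what that looks like: the value 1 below is an assumption (my understanding is that setting the variable to any value is enough), and debug_sibling is a hypothetical helper that only illustrates how the 'debug' directory relates to the 'nodbg' one.

```shell
# Request GASNet's debug build. The value 1 is an assumption; Chapel has to be
# rebuilt afterwards for the setting to take effect:
#   cd "$CHPL_HOME" && make
export CHPL_COMM_DEBUG=1

# Hypothetical helper: map a .../nodbg build directory to its .../debug sibling.
# Assumes the path given in $1 ends in /nodbg.
debug_sibling() {
    printf '%s\n' "${1%/nodbg}/debug"
}
```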

Hope this is helpful,
-Brad




---------- Forwarded message ----------
Date: Wed, 2 Apr 2014 04:28:45 -0500
From: Danilo Guerrera <[email protected]>
To: "[email protected]" <[email protected]>
Subject: problem with chapel locales

Good morning Mr. Chamberlain,

I'm a PhD student at the University of Basel in the High Performance and Web Computing Group led by Prof. H. Burkhart. We introduced Chapel in our High Performance Computing course and now wanted to exploit our cluster and try the locales. We have an InfiniBand interconnect, so our first choice was to set CHPL_COMM_SUBSTRATE to ibv. We followed the simple steps shown in the $CHPL_HOME/doc/README.multilocale file, recompiled Chapel and then exported the environment variables as shown at point 5 of the README, in particular giving
export GASNET_SSH_SERVERS="minta mintb" as locales to be used.
We were able to compile the hello6-taskpar-dist.chpl example, but when running it with the syntax
./hello6-taskpar-dist -nl 2

we get this error:


GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem with requested resource)
at /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
 reason: unable to open any HCA ports
GASNet gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines returning an error code: GASNET_ERR_RESOURCE (Problem with requested resource)
at /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
*** Caught a fatal signal: SIGSEGV(11) on node 0/2
NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.

GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem with requested resource)
at /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
 reason: unable to open any HCA ports
GASNet gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines returning an error code: GASNET_ERR_RESOURCE (Problem with requested resource)
at /usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
*** Caught a fatal signal: SIGSEGV(11) on node 1/2
NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment to generate a backtrace.

=====================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 139
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
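
For anyone trying to reproduce this, the setup described above corresponds roughly to the sketch below. CHPL_COMM=gasnet is an assumption here (only CHPL_COMM_SUBSTRATE is named explicitly), and count_servers is a hypothetical helper for checking that the number of listed hosts matches the -nl argument.

```shell
# Build-time settings (Chapel has to be rebuilt after changing these).
export CHPL_COMM=gasnet            # assumption: not stated explicitly above
export CHPL_COMM_SUBSTRATE=ibv

# Run-time setting: hosts on which to launch the locales.
export GASNET_SSH_SERVERS="minta mintb"

# Hypothetical helper: count the hosts listed in GASNET_SSH_SERVERS.
count_servers() {
    # Left unquoted on purpose so the list splits on whitespace.
    set -- $GASNET_SSH_SERVERS
    echo "$#"
}

# Rerun with a backtrace, as the GASNet notice suggests (commented out):
#   GASNET_BACKTRACE=1 ./hello6-taskpar-dist -nl "$(count_servers)"
```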

Are there particular additional configuration steps we have to take before being able to run Chapel locales properly? Or are we making a mistake somewhere?

I hope you can help us.

Kind regards,

Danilo Guerrera
Departement Mathematik & Informatik
Universität Basel
Klingelbergstrasse 50
CH-4056 Basel Switzerland
email:         [email protected]
Phone:      +41 (0)61 267 15 18
------------------------------------------------------------------------------
_______________________________________________
Chapel-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-bugs
