For this thread's reference, here's a follow-up from Danilo that I hadn't
found prior to sending:
Good afternoon Mr. Chamberlain,
I went through these errors and now I'm able to compile and run the
hello-world with locales. The problem is that even if I set
GASNET_SSH_SERVERS to the 2 nodes to be used, I always get 2 prints from
the same node as output, so somehow it's not going through the
InfiniBand and is executing both locales on the same machine. If you have
any suggestions I would appreciate them. In any case, I will open an issue
on the mailing list, hoping to find a solution so that we can use Chapel
locales in our course.
I think the original suggestion still holds, but this behavior may be
familiar to others... It sounds vaguely familiar to me, but not enough
for the solution to leap into my hands.
-Brad
On Wed, 2 Apr 2014, Brad Chamberlain wrote:
Hi Danilo --
I don't personally have enough experience with GASNet over ibv to immediately
recognize this error, but am Cc:ing the public chapel-bugs list and
chapel-developers list in case someone else does (Rafael?).
[Note that your responses to chapel-developers will bounce unless you're
subscribed, but you should be able to post to chapel-bugs]
It seems likely that the problem is with your GASNet installation rather than
something Chapel-specific, so in cases like this, it's often helpful to run
the test suite that GASNet ships with. To do so, cd to the subdirectory of
$CHPL_HOME/third-party/gasnet/build/ that corresponds to your configuration
(e.g., mine would be linux64-gnu/seg-fast/nodbg) and then do 'make
run-tests-par' or 'make run-tests' (see third-party/gasnet/GASNet-*/README
for more information).
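The steps above can be sketched roughly as follows. This is only a sketch: the default CHPL_HOME is the install path that appears in the error log in this thread, the configuration subdirectory is Brad's example rather than necessarily yours, and GASNET_CFG is a hypothetical variable name used here for convenience.

```shell
# Sketch, assuming the paths below; adjust both to your installation.
CHPL_HOME="${CHPL_HOME:-/usr/local/chapel-1.8.0}"        # path seen in the error log
GASNET_CFG="${GASNET_CFG:-linux64-gnu/seg-fast/nodbg}"   # hypothetical helper variable
build_dir="$CHPL_HOME/third-party/gasnet/build/$GASNET_CFG"

if [ -d "$build_dir" ]; then
  # Run GASNet's own tests from the matching build directory.
  (cd "$build_dir" && make run-tests-par)   # or: make run-tests
else
  # Nothing to run: report the path and list the configurations that do exist.
  echo "no GASNet build directory at $build_dir" >&2
  ls "$CHPL_HOME/third-party/gasnet/build" 2>/dev/null || true
fi
```

If the directory is missing, the listing shows which configurations were actually built, which helps in picking the right subdirectory.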
If that doesn't point out the problem, you may also want to turn on GASNet's
internal debugging assertion checks by setting the environment variable
CHPL_COMM_DEBUG and remaking. This will create a sibling to 'nodbg' in the
path above called 'debug' and will often be more verbose about what's going
wrong.
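A minimal sketch of that debug rebuild, under two assumptions that the message does not spell out: that any non-empty value is enough for CHPL_COMM_DEBUG, and that "remaking" means running make from the top of the Chapel tree.

```shell
# Sketch only; the value "1" and the rebuild location are assumptions.
export CHPL_COMM_DEBUG=1
chpl_home="${CHPL_HOME:-/usr/local/chapel-1.8.0}"   # path seen in the error log

if [ -f "$chpl_home/Makefile" ]; then
  # Rebuild so that a 'debug' sibling of 'nodbg' is created.
  (cd "$chpl_home" && make)
else
  echo "no Makefile found in $chpl_home; set CHPL_HOME first" >&2
fi
```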
Hope this is helpful,
-Brad
---------- Forwarded message ----------
Date: Wed, 2 Apr 2014 04:28:45 -0500
From: Danilo Guerrera <[email protected]>
To: "[email protected]" <[email protected]>
Subject: problem with chapel locales
Good morning Mr. Chamberlain,
I'm a PhD student at the University of Basel in the High Performance and Web
Computing Group led by Prof. H. Burkhart. We introduced Chapel in our High
Performance Computing course and now want to use our cluster to try out
locales. We have an InfiniBand interconnect, so our first choice
was to set CHPL_COMM_SUBSTRATE to ibv. We followed the simple steps shown in
the $CHPL_HOME/doc/README.multilocale file, recompiled Chapel and then
exported the environment variables as shown at point 5 of the README, in
particular giving
export GASNET_SSH_SERVERS="minta mintb" as locales to be used.
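For reference, the settings named in this message can be collected as in the sketch below. CHPL_COMM=gasnet is not stated in the message and is included as an assumption about what the README steps set; the host count check is a hypothetical convenience, not part of Chapel or GASNet.

```shell
# Settings named in this message.
export CHPL_COMM_SUBSTRATE=ibv
export GASNET_SSH_SERVERS="minta mintb"
# Assumption: not stated in this message, but required for multilocale runs.
export CHPL_COMM=gasnet

# Hypothetical sanity check: count the hosts listed, to compare against -nl.
num_hosts=$(echo "$GASNET_SSH_SERVERS" | wc -w | tr -d ' ')
echo "hosts listed in GASNET_SSH_SERVERS: $num_hosts"
```

The count should be at least the value passed to -nl if each locale is meant to run on its own node.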
We were able to compile the hello6-taskpar-dist.chpl example, but when
running it with the syntax
./hello6-taskpar-dist -nl 2
we get this error:
GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem
with requested resource)
at
/usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
reason: unable to open any HCA ports
GASNet
gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines
returning an error code: GASNET_ERR_RESOURCE (Problem with requested
resource)
at
/usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
*** Caught a fatal signal: SIGSEGV(11) on node 0/2
NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment
to generate a backtrace.
GASNet gasnetc_init returning an error code: GASNET_ERR_RESOURCE (Problem
with requested resource)
at
/usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1606
reason: unable to open any HCA ports
GASNet
gasnet_init_GASNET_PARnopshmEVERYTHINGnodebugnotracenostatsnodebugmallocnosrclines
returning an error code: GASNET_ERR_RESOURCE (Problem with requested
resource)
at
/usr/local/chapel-1.8.0/third-party/gasnet/GASNet-1.20.2/vapi-conduit/gasnet_core.c:1849
*** Caught a fatal signal: SIGSEGV(11) on node 1/2
NOTICE: Before reporting bugs, run with GASNET_BACKTRACE=1 in the environment
to generate a backtrace.
=====================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
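The NOTICE lines in the output above suggest rerunning with GASNET_BACKTRACE=1. A minimal sketch of that rerun, assuming the binary sits in the current directory:

```shell
# Sketch: rerun the failing program with GASNet's backtrace option enabled.
prog=./hello6-taskpar-dist
if [ -x "$prog" ]; then
  GASNET_BACKTRACE=1 "$prog" -nl 2
else
  echo "$prog not found; compile hello6-taskpar-dist.chpl first" >&2
fi
```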
Are there additional configuration steps we have to take before we can
run Chapel locales properly? Or are we making a mistake somewhere?
I hope you can help us.
Kind regards,
Danilo Guerrera
Departement Mathematik & Informatik
Universität Basel
Klingelbergstrasse 50
CH-4056 Basel Switzerland
email: [email protected]
Phone: +41 (0)61 267 15 18