Great, thanks for checking!
Based on that, my thought is that for ibv we should use the 'fast' segment
(rationale: it's faster, it's more similar to what we do elsewhere, and we
test it more) and use the configure flags below to make it work. Rafael
(or anyone), do you see any reasons not to do this?
If not, Rafael, would this be a fairly simple patch for you to write,
test, and submit? (we don't have easy access to an ibv cluster, which is
part of the reason we haven't run into this ourselves). I think it should
be as simple as adding a case to util/chplenv/commSegment and then adding
some logic to third-party/gasnet/Makefile that adds the configuration
options based on CHPL_MAKE_COMM_SUBSTRATE being set to ibv. (?)
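
A rough sketch of the two pieces of logic described above, written as plain shell functions. The function names are hypothetical and this is not the actual content of util/chplenv/commSegment or third-party/gasnet/Makefile; it only illustrates the proposed mapping (ibv joins the substrates that default to 'fast', and ibv alone gets the extra GASNet configure options from this thread).

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the default GASNet segment for a given substrate.
# portals/gemini/aries -> fast matches what is described in this thread;
# ibv -> fast is the change being proposed here.
default_gasnet_segment() {
  case "$1" in
    portals|gemini|aries|ibv) echo fast ;;
    *) echo everything ;;
  esac
}

# Print any extra GASNet configure options needed for a given substrate.
# The ibv flags are the work-around quoted from the GASNet team.
gasnet_extra_configure_opts() {
  case "$1" in
    ibv) echo "--enable-pshm-sysv --disable-pshm-posix" ;;
    *) echo "" ;;
  esac
}
```

For example, `default_gasnet_segment "$CHPL_COMM_SUBSTRATE"` would print the segment to use, and the Makefile side would append the output of `gasnet_extra_configure_opts` to whatever options it already passes to GASNet's configure.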
Thanks,
-Brad
On Fri, 7 Feb 2014, Rafael Larrosa Jiménez wrote:
Hi,
Hi Rafael --
Here's another tip from the GASNet team. Would you be able to give this a
try and see if it helps with ibv + fast?
----
The Bad Address sounds like an interaction between IBV and POSIX Shared
Memory:
https://upc-bugs.lbl.gov/bugzilla/show_bug.cgi?id=2929
The work-around is to configure GASNet with "--enable-pshm-sysv
--disable-pshm-posix".
----
Yes, it works fine with those changes, sorry for the delay.
Greets,
Rafael
Thanks,
-Brad
On Thu, 30 Jan 2014, Brad Chamberlain wrote:
Hi Rafael --
Thanks for your analysis and description.
I don't have any problem with changing the default segment for 'ibv' and
agree that it sounds like the right thing to do if it works around this
issue. It seems like 'fast' would be preferable to 'large' from a
performance perspective (based on a quick glance at the docs -- I'm mostly
unfamiliar with 'large').
In trying to use the 'fast' segment, do you know whether the memory
allocator used changed to either tcmalloc or dlmalloc (either
automatically or manually)? That is probably necessary for correctness,
but should've (hopefully) happened automatically. printchplenv can be
used to verify.
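
As a minimal sketch of that check: the filter below keeps only the allocator and segment lines from printchplenv-style output. The helper name is hypothetical, and it assumes each setting is printed on its own line containing the variable name (CHPL_MEM, CHPL_GASNET_SEGMENT); the exact output format may differ between releases.

```shell
#!/bin/sh
# Hypothetical helper: reads printchplenv-style output on stdin and keeps
# only the lines that mention the memory allocator or the GASNet segment.
show_mem_and_segment() {
  grep -E 'CHPL_(MEM|GASNET_SEGMENT)'
}
```

Typical use would be something like `"$CHPL_HOME"/util/printchplenv | show_mem_and_segment`, then checking that the allocator shown is tcmalloc or dlmalloc when the segment is 'fast'.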
Note that I believe you should only need to set CHPL_GASNET_SEGMENT. The
CHPL_MAKE_ variables should follow automatically (and are not intended to
be set by the end-user).
Thanks again,
-Brad
On Thu, 30 Jan 2014, Rafael Larrosa Jiménez wrote:
On Thursday, 30 January 2014 12:21:11, Akihiro Hayashi wrote:
Hi,
Thanks for helping me to understand the problem.
I think now I understand the problem correctly.
Rafael Asenjo and Rafael Larrosa.
I'm pretty sure my assignment involved a contiguous block, and that
useBulkTransfer is active and useBulkTransferStride is not active in my
Chapel compiler. I just wanted to make sure that both my code
(useBulkTransfer) and Rafael Larrosa's code (useBulkTransferStride) can
fail due to the ibv-conduit bug
(https://upc-bugs.lbl.gov/bugzilla/show_bug.cgi?id=495).
First, I should say that I'm not the developer, just a user of that code.
Second, I have been writing this email for several hours, as I changed my
view of the problem while reading the IB verbs manuals, the IB code, etc.
In the end I found the solution: by default Chapel makes all memory
accessible by RDMA, which seems to be the problem, as it can be
"unpinned". If instead only "large" portions are used for RDMA, then it
works fine, as explained in one of the messages:
---
In a GASNET_SEGMENT_FAST or _LARGE configuration the segment is obtained
at startup via mmap() and is never unmapped. So, sending from inside the
GASNet segment in these two cases will ensure this bug cannot occur.
---
So if you define :
export CHPL_MAKE_GASNET_SEGMENT=large
export CHPL_MAKE_COMM_SEGMENT=large
export CHPL_GASNET_SEGMENT=large
Then do a make clobber followed by a make and recompile your program, it
should work fine.
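
The steps above can be sketched as a small shell helper. The function name and the DRY_RUN switch are hypothetical, added only so the commands can be inspected without running them. Following the reply elsewhere in this thread, the sketch sets only CHPL_GASNET_SEGMENT and assumes the CHPL_MAKE_ variables follow from it automatically.

```shell
#!/bin/sh
# Hypothetical helper: rebuild Chapel with a different GASNet segment.
# With DRY_RUN=1 the make commands are printed instead of executed.
rebuild_with_segment() {
  segment="$1"
  chpl_home="${CHPL_HOME:?CHPL_HOME must be set}"

  run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$*"
    else
      "$@"
    fi
  }

  # Select the segment for this build (for example: large).
  export CHPL_GASNET_SEGMENT="$segment"

  # Clean out the previous build, then rebuild.
  run make -C "$chpl_home" clobber &&
  run make -C "$chpl_home"
}
```

After `rebuild_with_segment large` finishes, the user program still has to be recompiled so that it picks up the new runtime.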
BTW, when using portals, gemini or aries, the fast segment is used, as
explained in :
doc/release/README.multilocale
---
3) Advanced GASNet users can set CHPL_GASNET_SEGMENT to choose a
   memory segment to use with GASNet. Current defaults are:

     When CHPL_COMM_SUBSTRATE is...   Chapel will choose...
       portals                          fast
       gemini                           fast
       (other)                          everything
---
But using fast with ibv gives this error:
*** FATAL ERROR: Unexpected error Bad address (rc=1 errno=14) when registering the segment
But that is not a problem, as large segments are the solution; I hope
they don't break other things :-)
I have tested it with Akihiro's code, and it seems to work fine now.
Perhaps it is a good idea to change the default to use large segments
with ibv.
Greets,
Rafael
--
Rafael Larrosa Jiménez
Centro de Supercomputación y Bioinformática - http://www.scbi.uma.es
Universidad de Málaga
Edificio de Bioinnovación
C/ Severo Ochoa 34
Parque Tecnológico de Andalucía
29590 Málaga (SPAIN)
EMAIL: [email protected]
TELEF: +34951952788
FAX: +34951952792
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers