Re: [OMPI devel] RFC: Add GPU Direct RDMA support to openib btl

2013-10-08 Thread Kenneth A. Lloyd
+1



From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Rolf vandeVaart
Sent: Tuesday, October 08, 2013 3:05 PM
To: de...@open-mpi.org
Subject: [OMPI devel] RFC: Add GPU Direct RDMA support to openib btl






[OMPI devel] RFC: Add GPU Direct RDMA support to openib btl

2013-10-08 Thread Rolf vandeVaart
WHAT: Add GPU Direct RDMA support to openib btl
WHY: Better latency for small GPU message transfers
WHERE: Several files, see ticket for list
WHEN: Friday, October 18, 2013 COB
More detail:
This RFC looks to make use of the GPU Direct RDMA support that is coming in 
future Mellanox libraries.  With GPU Direct RDMA, we can register GPU memory 
with ibv_reg_mr() calls, so we are simply piggybacking on the large-message 
RDMA support (RGET) that already exists in the PML and openib BTL.  For best 
performance, we want to use the RGET protocol for small messages and then 
switch to a pipeline protocol for larger messages.

To make use of this, we add some extra code paths that are followed when moving 
GPU buffers.  If the support is compiled in, then when we detect a GPU buffer 
we use the RGET protocol even for small messages.  When the messages get 
larger, we switch to the regular pipeline protocol.  There is some other 
support code as well: we add a flag to any GPU memory that is registered so we 
can detect cuMemAlloc/cuMemFree/cuMemAlloc sequences in which a freed buffer is 
reallocated at the same address.  Each GPU buffer has a buffer ID associated 
with it, so we can ensure that any registrations in the rcache are still valid.

To view the changes, go to https://svn.open-mpi.org/trac/ompi/ticket/3836 and 
click on gdr.diff.



---
This email message is for the sole use of the intended recipient(s) and may 
contain confidential information.  Any unauthorized review, use, disclosure or 
distribution is prohibited.  If you are not the intended recipient, please 
contact the sender by reply email and destroy all copies of the original 
message.
---


Re: [OMPI devel] 1.7.x support statement

2013-10-08 Thread marco atzeri

On 10/8/2013 4:42 PM, Jeff Squyres (jsquyres) wrote:



Could you also run the built-in "trivial" suite?  It's just hello world and MPI 
ring in C, C++, Fortran (if you have C++/Fortran support).

FWIW: this is basically what Absoft does for us: they do the builds, and then 
build/run the trivial suite just to make sure it's nominally working.



Attached is the only dump I found, plus the log of the run on 1.7.2.
Are these the expected results?  They seem a bit thin.

Regards
Marco



$VAR1 = {
  'trivial' => {
'append_path' => undef,
'result_message' => 'Success',
'full_section_name' => 'test get: trivial',
'test_result' => 1,
'module_data' => {
  'directory' => '/tmp/scratch1/sources/test_get__trivial'
},
'prepare_for_install' => 'MTT::Common::Copytree::PrepareForInstall',
'prepend_path' => undef,
'setenv' => undef,
'refcount' => 0,
'simple_section_name' => 'trivial',
'have_new' => 1,
'module_name' => 'MTT::Test::Get::Trivial',
'start_timestamp' => 1381258364,
'unsetenv' => undef
  }
};
*** MTT: ./client/mtt --scratch /tmp/scratch1 --file ../trivial.ini --verbose 
--print-time
*** Running on IT-Marco-Atzeri
*** Main scratch tree: /tmp/scratch1
*** Fast scratch tree: /tmp/scratch1
*** Reporter initializing
*** Reporter initialized
*** MPI Get phase starting
*** MPI Get phase complete
>> Phase: MPI Get
   Started:   Tue Oct  8 21:19:14 2013
   Stopped:   Tue Oct  8 21:19:14 2013
   Elapsed:   00:00:00 0.02u 0.02s
   Total elapsed: 00:00:00 0.02u 0.02s
*** MPI Install phase starting
*** MPI Install phase complete
>> Phase: MPI Install
   Started:   Tue Oct  8 21:19:14 2013
   Stopped:   Tue Oct  8 21:19:14 2013
   Elapsed:   00:00:00 0.00u 0.00s
   Total elapsed: 00:00:00 0.02u 0.02s
*** Test Get phase starting
>> Test Get: [test get: trivial]
   Checking for new test sources...
   Failed to get new test sources: File already exists: hello.c
*** Test Get phase complete
>> Phase: Test Get
   Started:   Tue Oct  8 21:19:14 2013
   Stopped:   Tue Oct  8 21:19:15 2013
   Elapsed:   00:00:01 0.02u 0.00s
   Total elapsed: 00:00:01 0.05u 0.03s
*** Test Build phase starting
>> Test Build [test build: trivial]
*** Test Build phase complete
>> Phase: Test Build
   Started:   Tue Oct  8 21:19:15 2013
   Stopped:   Tue Oct  8 21:19:15 2013
   Elapsed:   00:00:00 0.00u 0.00s
   Total elapsed: 00:00:01 0.05u 0.03s
*** Test Run phase starting
>> Test Run [trivial]
*** Run test phase complete
>> Phase: Test Run
   Started:   Tue Oct  8 21:19:15 2013
   Stopped:   Tue Oct  8 21:19:16 2013
   Elapsed:   00:00:01 0.00u 0.03s
   Total elapsed: 00:00:02 0.05u 0.06s
>> Phase: Trim
   Started:   Tue Oct  8 21:19:16 2013
   Stopped:   Tue Oct  8 21:19:16 2013
   Elapsed:   00:00:00 0.00u 0.00s
   Total elapsed: 00:00:02 0.05u 0.06s
*** Reporter finalizing
*** Reporter finalized


[OMPI devel] Changes to classes in OMPI

2013-10-08 Thread Ralph Castain
Hi folks

This was one item from the last devel meeting that can be done independent of 
other things:

• resolution: all opal and orte (and possibly ompi) classes need to have a 
  thread-safe and a thread-free interface
    • _st suffix: single thread (i.e., not thread-safe) variant
    • _mt suffix: multi thread (i.e., thread-safe) variant
        • functions that have both st/mt versions will *both* have suffixes
        • other functions (that do not have st/mt versions) will keep their 
          naked names
    • need to rename all classes that already have locking enabled (e.g., 
      opal_free_list)
        • so today, we rename all such functions (e.g., the opal_free_list 
          functions get the _mt suffix) throughout the code base
        • when someone needs the _st version, they create it and use it as 
          they see fit
    • Ralph will do the orte classes
    • Aurelien will do the ompi classes

I believe some of these have been done - I will take care of the ORTE classes 
this week, so consider this a "heads up" for that change.
Ralph



[OMPI devel] Moving BTLs to OPAL

2013-10-08 Thread Ralph Castain
During today's telecon, we discussed the need to continue making progress on 
full support for MPI_THREAD_MULTIPLE. The next obstacle is to have the BTLs 
moved down to the OPAL layer. This is viewed as a "blocker" to any further 
progress on this issue.

We know that UTK was working on this, but may have hit a roadblock since we 
haven't heard or seen anything. Anyone from there able to give us an update? Is 
there something that is creating the roadblock that we can help resolve?

If we can get access to your repo, or a patch from your current state, I'm 
willing to complete the work. I can start from scratch if necessary, but that 
seems wasteful as I believe you're pretty close.

Either way, we need to get this blocker out of the way as the thread_multiple 
support needs to get done. Several of us are going to review the notes from the 
last meeting and execute the parts that we can in the interim.

Thanks
Ralph





[OMPI devel] December dev meeting

2013-10-08 Thread Jeff Squyres (jsquyres)
On the call today, we decided the following:

 1. Let's have a meeting in December after the Forum meeting.

 2. I put up a wiki page: https://svn.open-mpi.org/trac/ompi/wiki/Dec13Meeting

>> Sign up on the wiki page if you're going to attend
>> Put up agenda items on the wiki page
>> Primary focus for the meeting will be threading
   ...so try to knock out your to-do items from 
https://svn.open-mpi.org/trac/ompi/wiki/Jun13Meeting before this Dec meeting

 3. We'll go from 2pm Thursday to 3pm Friday (although 2pm might be ambitious 
-- we have to get from downtown to Rosemont, which will take at least an 
hour...)

 4. We'll probably also want to have another meeting in Q1 CY2014 because a 
single 24 hour block isn't a whole lot of time; it won't be enough to cover all 
topics.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.7.x support statement

2013-10-08 Thread Jeff Squyres (jsquyres)
On Oct 6, 2013, at 3:13 PM, marco atzeri  wrote:

> automating build on my hardware is a bit too much ;-)
> as it is just a normal notebook and building on cygwin is a
> slow process.

Fair enough.

Any chance you could script this up (i.e., so you can trivially run with a 
single command, like "run-ompi-mtt"), and run it every once in a while?  Maybe 
once a week or something?  :-D

Even if you run less frequently than that, we can get you an MTT database 
password so that your results will show up on the web reporter for 
mtt.open-mpi.org.

> Attached the outcome of testing with the default developer.ini
> and the installed 1.7.2.
> I assume mtt is working as expected. Correct ?

Yes.

Could you also run the built-in "trivial" suite?  It's just hello world and MPI 
ring in C, C++, Fortran (if you have C++/Fortran support).

FWIW: this is basically what Absoft does for us: they do the builds, and then 
build/run the trivial suite just to make sure it's nominally working.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



[OMPI devel] More oshmem compile errors

2013-10-08 Thread Jeff Squyres (jsquyres)
With icc, I am getting errors about arithmetic on (void *) pointers.  See attached.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
Making all in include
make[1]: Entering directory `/home/jsquyres/svn/ompi2/oshmem/include'
make  all-am
make[2]: Entering directory `/home/jsquyres/svn/ompi2/oshmem/include'
make[2]: Leaving directory `/home/jsquyres/svn/ompi2/oshmem/include'
make[1]: Leaving directory `/home/jsquyres/svn/ompi2/oshmem/include'
Making all in shmem/c
make[1]: Entering directory `/home/jsquyres/svn/ompi2/oshmem/shmem/c'
Making all in profile
make[2]: Entering directory `/home/jsquyres/svn/ompi2/oshmem/shmem/c/profile'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/home/jsquyres/svn/ompi2/oshmem/shmem/c/profile'
make[2]: Entering directory `/home/jsquyres/svn/ompi2/oshmem/shmem/c'
make[2]: Nothing to be done for `all-am'.
make[2]: Leaving directory `/home/jsquyres/svn/ompi2/oshmem/shmem/c'
make[1]: Leaving directory `/home/jsquyres/svn/ompi2/oshmem/shmem/c'
Making all in shmem/fortran
make[1]: Entering directory `/home/jsquyres/svn/ompi2/oshmem/shmem/fortran'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/jsquyres/svn/ompi2/oshmem/shmem/fortran'
Making all in mca/atomic
make[1]: Entering directory `/home/jsquyres/svn/ompi2/oshmem/mca/atomic'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/jsquyres/svn/ompi2/oshmem/mca/atomic'
Making all in mca/memheap
make[1]: Entering directory `/home/jsquyres/svn/ompi2/oshmem/mca/memheap'
  CC   base/memheap_base_alloc.lo
base/memheap_base_alloc.c(248): error #1338: arithmetic on pointer to void or 
function type
  s->end = s->start + s->size;
^

base/memheap_base_alloc.c(300): error #1338: arithmetic on pointer to void or 
function type
  s->end = s->start + s->size;
^

compilation aborted for base/memheap_base_alloc.c (code 2)
make[1]: *** [base/memheap_base_alloc.lo] Error 1
  CC   base/memheap_base_static.lo
base/memheap_base_static.c(65): error #1338: arithmetic on pointer to void or 
function type
  s->size = s->end - s->start;
   ^

base/memheap_base_static.c(70): error #1338: arithmetic on pointer to void or 
function type
  total_mem += s->end - s->start;
  ^

compilation aborted for base/memheap_base_static.c (code 2)
make[1]: *** [base/memheap_base_static.lo] Error 1
  CC   base/memheap_base_register.lo
base/memheap_base_register.c(29): error #1338: arithmetic on pointer to void or 
function type
  MEMHEAP_VERBOSE(5,
  ^

base/memheap_base_register.c(54): error #1338: arithmetic on pointer to void or 
function type
  MEMHEAP_VERBOSE(5,
  ^

base/memheap_base_register.c(112): error #1338: arithmetic on pointer to void 
or function type
  s->mkeys = MCA_SPML_CALL(register((void *)(unsigned long)s->start,
 ^

compilation aborted for base/memheap_base_register.c (code 2)
make[1]: *** [base/memheap_base_register.lo] Error 1
  CC   base/memheap_base_mkey.lo
base/memheap_base_mkey.c(531): error #1338: arithmetic on pointer to void or 
function type
  return (void*) (remote_base > local_base ? (uintptr_t)va + (remote_base - 
local_base) :
  ^

base/memheap_base_mkey.c(532): error #1338: arithmetic on pointer to void or 
function type
  (uintptr_t)va - (local_base - remote_base));
  ^

base/memheap_base_mkey.c(602): error #1338: arithmetic on pointer to void or 
function type
  return ((s && s->is_active) ? (rva - s->mkeys_cache[pe][tr_id].va_base) : 
0);
 ^

base/memheap_base_mkey.c(621): error #1338: arithmetic on pointer to void or 
function type
 && (uintptr_t)va < (uintptr_t) (s->start + 
mca_memheap.memheap_size)) {
  ^

base/memheap_base_mkey.c(624): error #1338: arithmetic on pointer to void or 
function type
  assert( (uintptr_t)va >= (uintptr_t)(s->start + 
mca_memheap.memheap_size) && (uintptr_t)va < (uintptr_t)s->end);
  ^

compilation aborted for base/memheap_base_mkey.c (code 2)
make[1]: *** [base/memheap_base_mkey.lo] Error 1
make[1]: Target `all' not remade because of errors.
make[1]: Leaving directory `/home/jsquyres/svn/ompi2/oshmem/mca/memheap'
Making all in mca/scoll
make[1]: Entering directory `/home/jsquyres/svn/ompi2/oshmem/mca/scoll'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/jsquyres/svn/ompi2/oshmem/mca/scoll'
Making all in mca/spml
make[1]: Entering directory `/home/jsquyres/svn/ompi2/oshmem/mca/spml'
make[1]: Nothing to be done for `all'.
make[1]: 

[OMPI devel] 32 bit build breakage in oshmem

2013-10-08 Thread Jeff Squyres (jsquyres)
On RHEL6.4 with:

"CFLAGS=-g -pipe -m32" CXXFLAGS=-m32 FFLAGS=-m32 FCFLAGS=-m32 
--with-wrapper-cflags=-m32 --with-wrapper-cxxflags=-m32 
--with-wrapper-fflags=-m32 --with-wrapper-fcflags=-m32 
--enable-mpirun-prefix-by-default --disable-dlopen --enable-mpi-cxx

I get a ton of compile errors in oshmem/op/op.c, like this one:

-
op/op.c:194:1: note: in expansion of macro 'FUNC_OP_CREATE'
 FUNC_OP_CREATE(max, freal16, ompi_fortran_real16_t, __max_op);
op/op.c:135:15: error: 'b' undeclared (first use in this function)
 type *b = (type *) out; \
   ^
-

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/