The following simple test code exercises these routines:

start_pes()
shmalloc()
shmem_int_get() (or shmem_int_g(), which is what runs while ATOMIC is set to 1)
shmem_int_put()
shmem_barrier_all()

To compile:

shmemcc test_shmem.c -o test_shmem

To launch:

shmemrun -np 2 test_shmem

or, for those who prefer to launch with SLURM:

srun -n 2 test_shmem

Josh

-----Original Message-----
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Wednesday, August 14, 2013 5:32 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] [EXTERNAL] OpenSHMEM round 2

Can you point me to a test program that would exercise it? I'd like to give it a try first.

I'm okay with on by default as it builds its own separate library, and with the RFC

On Aug 14, 2013, at 2:03 PM, "Barrett, Brian W" <bwba...@sandia.gov> wrote:

> Josh -
>
> In general, I don't have a strong opinion of whether OpenSHMEM is on
> by default or not. It might cause unexpected behavior for some users
> (like on Crays, where one should really use Cray's SHMEM), but maybe
> it's better on other platforms.
>
> I also would have no objection to the RFC, provided the segfaults I
> found get resolved.
>
> Brian
>
> On 8/14/13 2:08 PM, "Joshua Ladd" <josh...@mellanox.com> wrote:
>
>> Ralph, and Brian
>>
>> Thanks a bunch for taking the time to review this. It is extremely
>> helpful. Let me comment on the building of OSHMEM and solicit some
>> feedback from you guys (along with the rest of the community.)
>> Originally we had planned to enable OSHMEM to build only if
>> '--with-oshmem' flag was passed at configure time. However,
>> (unbeknownst to me) this behavior was changed and now OSHMEM is built by
>> default, i.e.
>> yes, Ralph this is the intended behavior now. I am wondering if this
>> is such a good idea. Do folks have a strong opinion on this one way
>> or the other? From my perspective I can see arguments for both sides
>> of the coin.
>>
>> Other than cleaning up warnings and resolving the segfault that Brian
>> observed are we on a good course to getting this upstream? Is it
>> reasonable to file an RFC for three weeks out?
>>
>> Josh
>>
>> -----Original Message-----
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Barrett,
>> Brian W
>> Sent: Sunday, August 11, 2013 1:42 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] [EXTERNAL] OpenSHMEM round 2
>>
>> Ralph -
>>
>> I think those warnings are just because of when they last synced with
>> the trunk; it looks like they haven't updated in the last week, when
>> those (and some usnic fixes) went in.
>>
>> More concerning is the --enable-picky stuff and the disabling of
>> SHMEM in the right places.
>>
>> Brian
>>
>> On 8/11/13 11:24 AM, "Ralph Castain" <r...@open-mpi.org> wrote:
>>
>>> Turning off the enable_picky, I get it to compile with the following
>>> warnings:
>>>
>>> pget_elements_x_f.c:70: warning: no previous prototype for 'ompi_get_elements_x_f'
>>> pstatus_set_elements_x_f.c:70: warning: no previous prototype for 'ompi_status_set_elements_x_f'
>>> ptype_get_extent_x_f.c:69: warning: no previous prototype for 'ompi_type_get_extent_x_f'
>>> ptype_get_true_extent_x_f.c:69: warning: no previous prototype for 'ompi_type_get_true_extent_x_f'
>>> ptype_size_x_f.c:69: warning: no previous prototype for 'ompi_type_size_x_f'
>>>
>>> I also found that OpenShmem is still building by default. Is that
>>> intended? I thought you were only going to build if --with-shmem (or
>>> whatever option) was given.
>>>
>>> Looks like some cleanup is required
>>>
>>> On Aug 10, 2013, at 8:54 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>>> FWIW, I couldn't get it to build - this is on a simple Xeon-based
>>>> system under CentOS 6.2:
>>>>
>>>> cc1: warnings being treated as errors
>>>> spml_yoda_getreq.c: In function 'mca_spml_yoda_get_completion':
>>>> spml_yoda_getreq.c:98: error: pointer targets in passing argument 1 of 'opal_atomic_add_32' differ in signedness
>>>> ../../../../opal/include/opal/sys/amd64/atomic.h:174: note: expected 'volatile int32_t *' but argument is of type 'uint32_t *'
>>>> spml_yoda_getreq.c:98: error: signed and unsigned type in conditional expression
>>>> cc1: warnings being treated as errors
>>>> spml_yoda_putreq.c: In function 'mca_spml_yoda_put_completion':
>>>> spml_yoda_putreq.c:81: error: pointer targets in passing argument 1 of 'opal_atomic_add_32' differ in signedness
>>>> ../../../../opal/include/opal/sys/amd64/atomic.h:174: note: expected 'volatile int32_t *' but argument is of type 'uint32_t *'
>>>> spml_yoda_putreq.c:81: error: signed and unsigned type in conditional expression
>>>> make[2]: *** [spml_yoda_getreq.lo] Error 1
>>>> make[2]: *** Waiting for unfinished jobs....
>>>> make[2]: *** [spml_yoda_putreq.lo] Error 1
>>>> cc1: warnings being treated as errors
>>>> spml_yoda.c: In function 'mca_spml_yoda_put_internal':
>>>> spml_yoda.c:725: error: pointer targets in passing argument 1 of 'opal_atomic_add_32' differ in signedness
>>>> ../../../../opal/include/opal/sys/amd64/atomic.h:174: note: expected 'volatile int32_t *' but argument is of type 'uint32_t *'
>>>> spml_yoda.c:725: error: signed and unsigned type in conditional expression
>>>> spml_yoda.c: In function 'mca_spml_yoda_get':
>>>> spml_yoda.c:1107: error: pointer targets in passing argument 1 of 'opal_atomic_add_32' differ in signedness
>>>> ../../../../opal/include/opal/sys/amd64/atomic.h:174: note: expected 'volatile int32_t *' but argument is of type 'uint32_t *'
>>>> spml_yoda.c:1107: error: signed and unsigned type in conditional expression
>>>> make[2]: *** [spml_yoda.lo] Error 1
>>>> make[1]: *** [all-recursive] Error 1
>>>>
>>>> Only configure arguments:
>>>>
>>>> enable_picky=yes
>>>> enable_debug=yes
>>>>
>>>> gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3)
>>>>
>>>> On Aug 10, 2013, at 7:21 PM, "Barrett, Brian W" <bwba...@sandia.gov> wrote:
>>>>
>>>>> On 8/6/13 10:30 AM, "Joshua Ladd" <josh...@mellanox.com> wrote:
>>>>>
>>>>>> Dear OMPI Community,
>>>>>>
>>>>>> Please find on Bitbucket the latest round of OSHMEM changes based
>>>>>> on community feedback. Please git and test at your leisure.
>>>>>>
>>>>>> https://bitbucket.org/jladd_math/mlnx-oshmem.git
>>>>>
>>>>> Josh -
>>>>>
>>>>> In general, I think everything looks ok. However, the "right"
>>>>> thing doesn't happen if the CM PML is used (at least, when using
>>>>> the Portals 4 MTL).
>>>>> When configured with:
>>>>>
>>>>> ./configure --enable-mca-no-build=pml-ob1,pml-bfo,pml-v,btl,bml,mpool
>>>>>
>>>>> The build segfaults trying to run a SHMEM program:
>>>>>
>>>>> mpirun -np 2 ./bcast
>>>>> [shannon:90397] *** Process received signal ***
>>>>> [shannon:90397] Signal: Segmentation fault (11)
>>>>> [shannon:90397] Signal code: Address not mapped (1)
>>>>> [shannon:90397] Failing at address: (nil)
>>>>> [shannon:90398] *** Process received signal ***
>>>>> [shannon:90398] Signal: Segmentation fault (11)
>>>>> [shannon:90398] Signal code: Address not mapped (1)
>>>>> [shannon:90398] Failing at address: (nil)
>>>>> [shannon:90397] [ 0] /lib64/libpthread.so.0() [0x38b7a0f4a0]
>>>>> [shannon:90397] *** End of error message ***
>>>>> [shannon:90398] [ 0] /lib64/libpthread.so.0() [0x38b7a0f4a0]
>>>>> [shannon:90398] *** End of error message ***
>>>>> --------------------------------------------------------------------------
>>>>> mpirun noticed that process rank 1 with PID 90398 on node shannon
>>>>> exited on signal 11 (Segmentation fault).
>>>>> --------------------------------------------------------------------------
>>>>>
>>>>> Brian
>>>>>
>>>>> --
>>>>> Brian W. Barrett
>>>>> Scalable System Software Group
>>>>> Sandia National Laboratories
>>>>>
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> de...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
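
For anyone who wants to see the --enable-picky failure in isolation: the quoted log shows a 'uint32_t *' being passed where 'opal_atomic_add_32' expects a 'volatile int32_t *'. Below is a minimal sketch of two possible ways to make such a call site type-clean. The helper add_32() is a hypothetical, non-atomic stand-in that only borrows the parameter type from the compiler note; count_signed() and count_unsigned() are likewise made up for illustration, and none of this is the actual Open MPI code or the fix that went into the tree.

```c
#include <stdint.h>

/* Hypothetical stand-in for opal_atomic_add_32(). Only the parameter type
 * (volatile int32_t *) comes from the compiler note in the build log.
 * This version is NOT atomic and is not the Open MPI implementation. */
static int32_t add_32(volatile int32_t *addr, int32_t delta)
{
    *addr += delta;
    return *addr;
}

/* Option 1 (assumed fix): declare the counter with the signed type that
 * the helper expects, so no conversion is needed at the call site. */
static int32_t count_signed(int n)
{
    int32_t counter = 0;
    int i;

    for (i = 0; i < n; i++) {
        add_32(&counter, 1);
    }
    return counter;
}

/* Option 2 (assumed fix): keep the unsigned counter and make the
 * conversion explicit with a cast. Accessing a uint32_t object through
 * an int32_t lvalue is permitted by the C aliasing rules. */
static uint32_t count_unsigned(int n)
{
    uint32_t counter = 0;
    int i;

    for (i = 0; i < n; i++) {
        add_32((volatile int32_t *)&counter, 1);
    }
    return counter;
}
```

Passing &counter without the cast in count_unsigned() is what gcc reports as "pointer targets ... differ in signedness", and with warnings treated as errors that stops the build. Which of the two options fits the yoda SPML better depends on how the counter is used elsewhere, which the log does not show.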
#include <stdio.h>
#include <shmem.h>

#define N 100

/* File-scope arrays are symmetric, so remote PEs can access them. */
static int target[N];
static int source[N];

#define STATIC_CHECK  1  /* test the static symmetric arrays */
#define DYNAMIC_CHECK 1  /* test the shmalloc()'d arrays */
#define ATOMIC        1  /* 1: element-wise shmem_int_g(); 0: bulk shmem_int_get() */
#define PEER          1  /* remote PE used by PE 0; run with at least 2 PEs */

int main(void)
{
    int *source_d, *target_d;
    int i;

    start_pes(0);

    source_d = shmalloc(sizeof(*source_d) * N);
    target_d = shmalloc(sizeof(*target_d) * N);

    for (i = 0; i < N; i++) {
        source_d[i] = source[i] = 1;
        target[i] = target_d[i] = 9;
    }

    /* Every PE must finish initializing before PE 0 starts reading. */
    shmem_barrier_all();

    /* get check: PE 0 pulls the source buffers from PEER */
    if (_my_pe() == 0) {
#if STATIC_CHECK
#if ATOMIC
        for (i = 0; i < N; i++)
            target[i] = shmem_int_g(source + i, PEER);
#else
        shmem_int_get(target, source, N, PEER);
#endif
#endif
#if DYNAMIC_CHECK
#if ATOMIC
        for (i = 0; i < N; i++)
            target_d[i] = shmem_int_g(source_d + i, PEER);
#else
        shmem_int_get(target_d, source_d, N, PEER);
#endif
#endif
        for (i = 0; i < N; i++) {
#if DYNAMIC_CHECK
            if (target_d[i] != 1) {
                printf("Get dynamic error %d, target + i = %p, target[0] = %d, target[1] = %d\n",
                       i, (void *)(target_d + i), target_d[0], target_d[1]);
                fflush(stdout);
                return 1;
            }
#endif
#if STATIC_CHECK
            if (target[i] != 1) {
                printf("Get static error %d, target + i = %p, target[i] = %d\n",
                       i, (void *)(target + i), target[i]);
                fflush(stdout);
                return 1;
            }
#endif
        }
    }

    /* put check: reset the buffers, then PE 0 pushes its source to PEER */
    for (i = 0; i < N; i++) {
        source_d[i] = source[i] = 1;
        target[i] = target_d[i] = -9;
    }
    shmem_barrier_all();

    if (_my_pe() == 0) {
#if STATIC_CHECK
        shmem_int_put(target, source, N, PEER);
#endif
#if DYNAMIC_CHECK
        shmem_int_put(target_d, source_d, N, PEER);
#endif
    }

    /* The barrier also completes the outstanding puts before PEER checks. */
    shmem_barrier_all();

    if (_my_pe() == PEER) {
        for (i = 0; i < N; i++) {
#if DYNAMIC_CHECK
            if (target_d[i] != 1) {
                printf("Put dynamic error\n");
                fflush(stdout);
                return 1;
            }
#endif
#if STATIC_CHECK
            if (target[i] != 1) {
                printf("Put static error\n");
                fflush(stdout);
                return 1;
            }
#endif
        }
    }

    printf("All tests passed\n");
    fflush(stdout);
    return 0;
}
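
The test repeats the same "is every element equal to the expected value" loop four times. If that gets in the way when extending it, the check could be factored into one small helper. This is only a sketch: first_mismatch() is a hypothetical name, not part of OpenSHMEM or of the test above, and it runs on plain local memory.

```c
#include <stddef.h>

/* Hypothetical helper (not part of OpenSHMEM): return the index of the
 * first element of buf[0..n-1] that differs from expected, or -1 if
 * every element matches. */
static long first_mismatch(const int *buf, size_t n, int expected)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (buf[i] != expected) {
            return (long)i;
        }
    }
    return -1;
}
```

A caller on the checking PE would then do something like `if (first_mismatch(target_d, N, 1) >= 0) { ... report and return 1 ... }` after the get or after the second barrier of the put phase.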