[hwloc-devel] Create success (hwloc git dev-239-gfe0111e)
Creating nightly hwloc snapshot git tarball was a success.

Snapshot: hwloc dev-239-gfe0111e
Start time: Mon Sep 29 21:02:51 EDT 2014
End time: Mon Sep 29 21:04:18 EDT 2014

Your friendly daemon,
Cyrador
[OMPI devel] Problem on MPI_Type_create_resized and multiple BTL modules
Hi George,

Thank you for attending the meeting in Kyoto. As we discussed at the meeting, my colleague is suffering from a datatype problem.

See the attached create_resized.c. It creates a datatype with an LB marker using MPI_Type_create_struct and MPI_Type_create_resized. The expected contents of the output file (received_data) are:

0: t1 = 0.1, t2 = 0.2
1: t1 = 1.1, t2 = 1.2
2: t1 = 2.1, t2 = 2.2
3: t1 = 3.1, t2 = 3.2
4: t1 = 4.1, t2 = 4.2
... snip ...
1995: t1 = 1995.1, t2 = 1995.2
1996: t1 = 1996.1, t2 = 1996.2
1997: t1 = 1997.1, t2 = 1997.2
1998: t1 = 1998.1, t2 = 1998.2
1999: t1 = 1999.1, t2 = 1999.2

But if you run the program many times with multiple BTL modules and with small eager_limit and max_send_size values, on some runs you'll see:

0: t1 = 0.1, t2 = 0.2
1: t1 = 1.1, t2 = 1.2
2: t1 = 2.1, t2 = 2.2
3: t1 = 3.1, t2 = 3.2
4: t1 = 4.1, t2 = 4.2
... snip ...
470: t1 = 470.1, t2 = 470.2
471: t1 = 471.1, t2 = 471.2
472: t1 = 472.1, t2 = 472.2
473: t1 = 473.1, t2 = 473.2
474: t1 = 474.1, t2 = 0    <-- broken!
475: t1 = 0, t2 = 475.1
476: t1 = 0, t2 = 476.1
477: t1 = 0, t2 = 477.1
... snip ...
1995: t1 = 0, t2 = 1995.1
1996: t1 = 0, t2 = 1996.1
1997: t1 = 0, t2 = 1997.1
1998: t1 = 0, t2 = 1998.1
1999: t1 = 0, t2 = 1999.1

The index of the array at which the data start to break (474 in the run above) may change on every run. The same result appears on both trunk and v1.8.3.

You can reproduce this with the following options if you have multiple IB HCAs:

-n 2 --mca btl self,openib --mca btl_openib_eager_limit 256 --mca btl_openib_max_send_size 384

Or, if you don't have multiple NICs, with the following options:
-n 2 --host localhost --mca btl self,sm,vader --mca btl_vader_exclusivity 65536 --mca btl_vader_eager_limit 256 --mca btl_vader_max_send_size 384 --mca btl_sm_exclusivity 65536 --mca btl_sm_eager_limit 256 --mca btl_sm_max_send_size 384

My colleague found that the OPAL convertor on the receiving process seems to add the LB value twice when computing the receive-buffer write offset for out-of-order arrival of fragments. He created the patch below. Our program works fine with this patch, but we don't know whether this is a correct fix. Could you look into this issue?

Index: opal/datatype/opal_convertor.c
===================================================================
--- opal/datatype/opal_convertor.c (revision 32807)
+++ opal/datatype/opal_convertor.c (working copy)
@@ -362,11 +362,11 @@
     if( OPAL_LIKELY(0 == count) ) {
         pStack[1].type  = pElems->elem.common.type;
         pStack[1].count = pElems->elem.count;
-        pStack[1].disp  = pElems->elem.disp;
+        pStack[1].disp  = 0;
     } else {
         pStack[1].type  = OPAL_DATATYPE_UINT1;
         pStack[1].count = pData->size - count;
-        pStack[1].disp  = pData->true_lb + count;
+        pStack[1].disp  = count;
     }
     pStack[1].index = 0;  /* useless */

Best regards,
Takahiro Kawashima,
MPI development team, Fujitsu

/* np=2 */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

struct structure {
    double not_transfered;
    double transfered_1;
    double transfered_2;
};

int main(int argc, char *argv[])
{
    int i, n = 2000, myrank;
    struct structure *data;
    MPI_Datatype struct_type, temp_type;
    MPI_Datatype types[2] = {MPI_DOUBLE, MPI_DOUBLE};
    int blocklens[2] = {1, 1};
    MPI_Aint disps[3];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    data = malloc(sizeof(data[0]) * n);
    if (myrank == 0) {
        for (i = 0; i < n; i++) {
            data[i].transfered_1 = i + 0.1;
            data[i].transfered_2 = i + 0.2;
        }
    }

    MPI_Get_address(&data[0].transfered_1, &disps[0]);
    MPI_Get_address(&data[0].transfered_2, &disps[1]);
    MPI_Get_address(&data[0], &disps[2]);
    disps[1] -= disps[2]; /* 16 */
    disps[0] -= disps[2]; /* 8 */

    MPI_Type_create_struct(2, blocklens, disps, types, &temp_type);
    MPI_Type_create_resized(temp_type, 0, sizeof(data[0]), &struct_type);
    MPI_Type_commit(&struct_type);

    if (myrank == 0) {
        MPI_Send(data, n, struct_type, 1, 0, MPI_COMM_WORLD);
    } else if (myrank == 1) {
        MPI_Recv(data, n, struct_type, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Type_free(&temp_type);
    MPI_Type_free(&struct_type);

    if (myrank == 1) {
        FILE *fp;
        fp = fopen("received_data", "w");
        for (i = 0; i < n; i++) {
            fprintf(fp, "%d: t1 = %g, t2 = %g\n",
                    i, data[i].transfered_1, data[i].transfered_2);
        }
        fclose(fp);
    }

    free(data);
    MPI_Finalize();
    return 0;
}
Re: [OMPI devel] --enable-visibility ( OPAL_C_HAVE_VISIBILITY) behavior in trunk
I see a behavioral difference between 1.8.x and trunk for the OPAL_C_HAVE_VISIBILITY definition on the same build environment. Is this expected?

-Devendar

-----Original Message-----
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff Squyres (jsquyres)
Sent: Monday, September 29, 2014 4:25 PM
To: Open MPI Developers List
Subject: Re: [OMPI devel] --enable-visibility ( OPAL_C_HAVE_VISIBILITY) behavior in trunk

I can't quite parse what you are saying -- do you have a specific question?

On Sep 29, 2014, at 7:18 PM, Devendar Bureddy wrote:
> This is supposed to be enabled by default. In trunk, I see that
> OPAL_C_HAVE_VISIBILITY is defined to 0 by default. 1.8.x looks fine.
>
> Configure : ./configure --prefix=$PWD/install --enable-mpirun-prefix-by-default --disable-mpi-fortran --disable-vt --enable-debug --enable-oshmem --with-pmi
> GCC : gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC)
>
> -Devendar
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/09/15936.php

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

_______________________________________________
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: http://www.open-mpi.org/community/lists/devel/2014/09/15937.php
Re: [OMPI devel] --enable-visibility ( OPAL_C_HAVE_VISIBILITY) behavior in trunk
I can't quite parse what you are saying -- do you have a specific question?

On Sep 29, 2014, at 7:18 PM, Devendar Bureddy wrote:
> This is supposed to be enabled by default. In trunk, I see that
> OPAL_C_HAVE_VISIBILITY is defined to 0 by default. 1.8.x looks fine.
>
> Configure : ./configure --prefix=$PWD/install --enable-mpirun-prefix-by-default --disable-mpi-fortran --disable-vt --enable-debug --enable-oshmem --with-pmi
> GCC : gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC)
>
> -Devendar
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/09/15936.php

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI devel] --enable-visibility ( OPAL_C_HAVE_VISIBILITY) behavior in trunk
This is supposed to be enabled by default. In trunk, I see that OPAL_C_HAVE_VISIBILITY is defined to 0 by default. 1.8.x looks fine.

Configure : ./configure --prefix=$PWD/install --enable-mpirun-prefix-by-default --disable-mpi-fortran --disable-vt --enable-debug --enable-oshmem --with-pmi
GCC : gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC)

-Devendar
Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r32814 - trunk/ompi/mca/coll/ml
Hi Jeff,

Sure, if that's the preferred check inside ompi itself.

Howard

-----Original Message-----
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff Squyres (jsquyres)
Sent: Monday, September 29, 2014 3:59 PM
To: Open MPI Developers List
Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r32814 - trunk/ompi/mca/coll/ml

Howard --

Do you want to just check ompi_mpi_thread_provided (== MPI_THREAD_MULTIPLE), instead?

On Sep 29, 2014, at 5:02 PM, wrote:
> Author: hppritcha (Howard Pritchard)
> Date: 2014-09-29 17:02:15 EDT (Mon, 29 Sep 2014)
> New Revision: 32814
> URL: https://svn.open-mpi.org/trac/ompi/changeset/32814
>
> Log:
> disqualify coll ml for MPI_THREAD_MULTIPLE
>
> Text files modified:
>    trunk/ompi/mca/coll/ml/coll_ml_module.c | 7 +++
>    1 files changed, 7 insertions(+), 0 deletions(-)
>
> Modified: trunk/ompi/mca/coll/ml/coll_ml_module.c
> ==============================================================================
> --- trunk/ompi/mca/coll/ml/coll_ml_module.c Mon Sep 29 15:26:33 2014 (r32813)
> +++ trunk/ompi/mca/coll/ml/coll_ml_module.c 2014-09-29 17:02:15 EDT (Mon, 29 Sep 2014) (r32814)
> @@ -2896,6 +2896,13 @@
>          return NULL;
>      }
>
> +    if (opal_using_threads()) {
> +        ML_VERBOSE(10, ("coll:ml: MPI_THREAD_MULTIPLE not suppported; skipping this component"));
> +        *priority = -1;
> +        return NULL;
> +    }
> +
> +
>      /* NTH: Disabled this check until we have a better one. */
> #if 0
>      if (!ompi_rte_proc_is_bound) {
> _______________________________________________
> svn-full mailing list
> svn-f...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/

_______________________________________________
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: http://www.open-mpi.org/community/lists/devel/2014/09/15934.php
Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r32814 - trunk/ompi/mca/coll/ml
Howard --

Do you want to just check ompi_mpi_thread_provided (== MPI_THREAD_MULTIPLE), instead?

On Sep 29, 2014, at 5:02 PM, wrote:
> Author: hppritcha (Howard Pritchard)
> Date: 2014-09-29 17:02:15 EDT (Mon, 29 Sep 2014)
> New Revision: 32814
> URL: https://svn.open-mpi.org/trac/ompi/changeset/32814
>
> Log:
> disqualify coll ml for MPI_THREAD_MULTIPLE
>
> Text files modified:
>    trunk/ompi/mca/coll/ml/coll_ml_module.c | 7 +++
>    1 files changed, 7 insertions(+), 0 deletions(-)
>
> Modified: trunk/ompi/mca/coll/ml/coll_ml_module.c
> ==============================================================================
> --- trunk/ompi/mca/coll/ml/coll_ml_module.c Mon Sep 29 15:26:33 2014 (r32813)
> +++ trunk/ompi/mca/coll/ml/coll_ml_module.c 2014-09-29 17:02:15 EDT (Mon, 29 Sep 2014) (r32814)
> @@ -2896,6 +2896,13 @@
>          return NULL;
>      }
>
> +    if (opal_using_threads()) {
> +        ML_VERBOSE(10, ("coll:ml: MPI_THREAD_MULTIPLE not suppported; skipping this component"));
> +        *priority = -1;
> +        return NULL;
> +    }
> +
> +
>      /* NTH: Disabled this check until we have a better one. */
> #if 0
>      if (!ompi_rte_proc_is_bound) {
> _______________________________________________
> svn-full mailing list
> svn-f...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI devel] Broken abort backtrace functionality
It looks like OMPI_MCA_mpi_abort_print_stack=1 is broken. I'm seeing the following warning with it:

$ mpirun -np 2 -x OMPI_MCA_mpi_abort_print_stack=1 ./hello_c
--------------------------------------------------------------------------
WARNING: A user-supplied value attempted to override the default-only MCA variable named "mpi_abort_print_stack".

The user-supplied value was ignored.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: A user-supplied value attempted to override the default-only MCA variable named "mpi_abort_print_stack".

The user-supplied value was ignored.
--------------------------------------------------------------------------
Hello, world, I am 1 of 2,
Hello, world, I am 0 of 2,

It seems HAVE_BACKTRACE is not defined by any configuration, but the relevant code below is guarded with it:

#if OPAL_WANT_PRETTY_PRINT_STACKTRACE && defined(HAVE_BACKTRACE)
    0, OPAL_INFO_LVL_9, MCA_BASE_VAR_SCOPE_READONLY,
#else
    MCA_BASE_VAR_FLAG_DEFAULT_ONLY, OPAL_INFO_LVL_9, MCA_BASE_VAR_SCOPE_CONSTANT,
#endif

$ git grep HAVE_BACKTRACE
ompi/runtime/ompi_mpi_params.c:#if OPAL_WANT_PRETTY_PRINT_STACKTRACE && defined(HAVE_BACKTRACE)
$

-Devendar
[OMPI devel] release 1.9
Hi Folks,

The release managers for the 1.9/2.0 stream have been putting together notes on features for this series, what sort of code pruning to do, etc. See https://github.com/open-mpi/ompi/wiki/Releasev19

We will be discussing the contents of the table(s) at the bottom of the wiki at tomorrow's meeting.

Thanks,

Howard

-
Howard Pritchard
HPC-5
Los Alamos National Laboratory
Re: [OMPI devel] Valgrind warning in MPI_Win_allocate[_shared]()
Good catch - the problem is that ompi_info_get_bool returns "success" if the value isn't found, setting "flag" to false, but doesn't set the value of the param itself. So if you don't specify "blocking_fence" in the MPI_Info, the "blocking_fence" flag wasn't being set. Fixed in r32812 and scheduled for 1.8.4.

Thanks!
Ralph

On Sep 28, 2014, at 2:43 AM, Lisandro Dalcin wrote:

> Just built 1.8.3 for another round of testing with mpi4py. I'm getting
> the following valgrind warning:
>
> ==4718== Conditional jump or move depends on uninitialised value(s)
> ==4718==    at 0xD0D9F4C: component_select (osc_sm_component.c:333)
> ==4718==    by 0x4CF44F6: ompi_osc_base_select (osc_base_init.c:73)
> ==4718==    by 0x4C68B69: ompi_win_allocate (win.c:182)
> ==4718==    by 0x4CBB8C2: PMPI_Win_allocate (pwin_allocate.c:79)
> ==4718==    by 0x400898: main (in /home/dalcinl/Devel/BUGS-MPI/openmpi/a.out)
>
> The offending code is in ompi/mca/osc/sm/osc_sm_component.c; it seems
> you forgot to initialize "blocking_fence" to a default true or
> false value.
>
>    bool blocking_fence;
>    int flag;
>
>    if (OMPI_SUCCESS != ompi_info_get_bool(info, "blocking_fence",
>                                           &blocking_fence, &flag)) {
>        goto error;
>    }
>
>    if (blocking_fence) {
>
> --
> Lisandro Dalcin
>
> Research Scientist
> Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
> Numerical Porous Media Center (NumPor)
> King Abdullah University of Science and Technology (KAUST)
> http://numpor.kaust.edu.sa/
>
> 4700 King Abdullah University of Science and Technology
> al-Khawarizmi Bldg (Bldg 1), Office # 4332
> Thuwal 23955-6900, Kingdom of Saudi Arabia
> http://www.kaust.edu.sa
>
> Office Phone: +966 12 808-0459
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15925.php
Re: [OMPI devel] Neighbor collectives with periodic Cartesian topologies of size one
An equivalent change would need to be made for graph and dist graph as well. That will take a little more work. Also, I was avoiding changing anything in topo for 1.8.

-Nathan

On Mon, Sep 29, 2014 at 08:02:41PM +0900, Gilles Gouaillardet wrote:
> Nathan,
>
> why not just make the topology information available at that point, as you described it?
>
> The attached patch does this; could you please review it?
>
> Cheers,
>
> Gilles
>
> On 2014/09/26 2:50, Nathan Hjelm wrote:
> > On Tue, Aug 26, 2014 at 07:03:24PM +0300, Lisandro Dalcin wrote:
> > > I finally managed to track down some issues in mpi4py's test suite
> > > using Open MPI 1.8+. The code below should be enough to reproduce the
> > > problem. Run it under valgrind to make sense of my following
> > > diagnostics.
> > >
> > > In this code I'm creating a 2D, periodic Cartesian topology out of
> > > COMM_SELF. In this case, the process in COMM_SELF has 4 logical in/out
> > > links to itself. So we have size=1 but indegree=outdegree=4. However,
> > > in ompi/mca/coll/basic/coll_basic_module.c, "size * 2" requests are
> > > being allocated to manage communication:
> > >
> > >     if (OMPI_COMM_IS_INTER(comm)) {
> > >         size = ompi_comm_remote_size(comm);
> > >     } else {
> > >         size = ompi_comm_size(comm);
> > >     }
> > >     basic_module->mccb_num_reqs = size * 2;
> > >     basic_module->mccb_reqs = (ompi_request_t**)
> > >         malloc(sizeof(ompi_request_t *) * basic_module->mccb_num_reqs);
> > >
> > > I guess you have to also special-case for topologies and allocate
> > > indegree+outdegree requests (not sure about this number, just
> > > guessing).
> >
> > I wish this was possible but the topology information is not available
> > at that point. We may be able to change that but I don't see the work
> > completing anytime soon. I committed an alternative fix as r32796 and
> > CMR'd it to 1.8.3. I can confirm that the attached reproducer no longer
> > produces a SEGV. Let me know if you run into any more issues.
> >
> > -Nathan
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post:
> > http://www.open-mpi.org/community/lists/devel/2014/09/15915.php
>
> Index: ompi/mca/topo/base/topo_base_cart_create.c
> ===================================================================
> --- ompi/mca/topo/base/topo_base_cart_create.c (revision 32807)
> +++ ompi/mca/topo/base/topo_base_cart_create.c (working copy)
> @@ -163,10 +163,18 @@
>          return MPI_ERR_INTERN;
>      }
>
> +    assert(NULL == new_comm->c_topo);
> +    assert(!(new_comm->c_flags & OMPI_COMM_CART));
> +    new_comm->c_topo = topo;
> +    new_comm->c_topo->mtc.cart = cart;
> +    new_comm->c_topo->reorder = reorder;
> +    new_comm->c_flags |= OMPI_COMM_CART;
>      ret = ompi_comm_enable(old_comm, new_comm,
>                             new_rank, num_procs, topo_procs);
>      if (OMPI_SUCCESS != ret) {
>          /* something wrong happened during setting the communicator */
> +        new_comm->c_topo = NULL;
> +        new_comm->c_flags &= ~OMPI_COMM_CART;
>          ompi_comm_free (&new_comm);
>          free(topo_procs);
>          if(NULL != cart->periods) free(cart->periods);
> @@ -176,10 +184,6 @@
>          return ret;
>      }
>
> -    new_comm->c_topo = topo;
> -    new_comm->c_topo->mtc.cart = cart;
> -    new_comm->c_topo->reorder = reorder;
> -    new_comm->c_flags |= OMPI_COMM_CART;
>      *comm_topo = new_comm;
>
>      if( MPI_UNDEFINED == new_rank ) {
> Index: ompi/mca/coll/basic/coll_basic_module.c
> ===================================================================
> --- ompi/mca/coll/basic/coll_basic_module.c (revision 32807)
> +++ ompi/mca/coll/basic/coll_basic_module.c (working copy)
> @@ -13,6 +13,8 @@
>   * Copyright (c) 2012 Sandia National Laboratories. All rights reserved.
>   * Copyright (c) 2013 Los Alamos National Security, LLC. All rights
>   *                    reserved.
> + * Copyright (c) 2014 Research Organization for Information Science
> + *                    and Technology (RIST). All rights reserved.
>   * $COPYRIGHT$
>   *
>   * Additional copyrights may follow
> @@ -28,6 +30,7 @@
>  #include "mpi.h"
>  #include "ompi/mca/coll/coll.h"
>  #include "ompi/mca/coll/base/base.h"
> +#include "ompi/mca/topo/topo.h"
>  #include "coll_basic.h"
>
> @@ -70,6 +73,15 @@
>      } else {
>          size = ompi_comm_size(comm);
>      }
> +    if (comm->c_flags & OMPI_COMM_CART) {
> +        int cart_size;
> +        assert (NULL != comm->c_topo);
> +        comm->c_topo->topo.cart.cartdim_get(comm, &cart_size);
> +        cart_size *= 2;
> +        if (cart_size > size) {
> +            size = cart_size;
> +        }
> +    }
>      basic_module->mccb_num_reqs = size * 2;
>      basic_module->mccb_reqs = (ompi_request_t**)
>          malloc(sizeof(ompi_request_t *) * basic_module->mccb_num_reqs);
Re: [OMPI devel] Neighbor collectives with periodic Cartesian topologies of size one
Nathan,

why not just make the topology information available at that point, as you described it?

The attached patch does this; could you please review it?

Cheers,

Gilles

On 2014/09/26 2:50, Nathan Hjelm wrote:
> On Tue, Aug 26, 2014 at 07:03:24PM +0300, Lisandro Dalcin wrote:
>> I finally managed to track down some issues in mpi4py's test suite
>> using Open MPI 1.8+. The code below should be enough to reproduce the
>> problem. Run it under valgrind to make sense of my following
>> diagnostics.
>>
>> In this code I'm creating a 2D, periodic Cartesian topology out of
>> COMM_SELF. In this case, the process in COMM_SELF has 4 logical in/out
>> links to itself. So we have size=1 but indegree=outdegree=4. However,
>> in ompi/mca/coll/basic/coll_basic_module.c, "size * 2" requests are
>> being allocated to manage communication:
>>
>>     if (OMPI_COMM_IS_INTER(comm)) {
>>         size = ompi_comm_remote_size(comm);
>>     } else {
>>         size = ompi_comm_size(comm);
>>     }
>>     basic_module->mccb_num_reqs = size * 2;
>>     basic_module->mccb_reqs = (ompi_request_t**)
>>         malloc(sizeof(ompi_request_t *) * basic_module->mccb_num_reqs);
>>
>> I guess you have to also special-case for topologies and allocate
>> indegree+outdegree requests (not sure about this number, just
>> guessing).
>>
> I wish this was possible but the topology information is not available
> at that point. We may be able to change that but I don't see the work
> completing anytime soon. I committed an alternative fix as r32796 and
> CMR'd it to 1.8.3. I can confirm that the attached reproducer no longer
> produces a SEGV. Let me know if you run into any more issues.
>
> -Nathan
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/09/15915.php

Index: ompi/mca/topo/base/topo_base_cart_create.c
===================================================================
--- ompi/mca/topo/base/topo_base_cart_create.c (revision 32807)
+++ ompi/mca/topo/base/topo_base_cart_create.c (working copy)
@@ -163,10 +163,18 @@
         return MPI_ERR_INTERN;
     }

+    assert(NULL == new_comm->c_topo);
+    assert(!(new_comm->c_flags & OMPI_COMM_CART));
+    new_comm->c_topo = topo;
+    new_comm->c_topo->mtc.cart = cart;
+    new_comm->c_topo->reorder = reorder;
+    new_comm->c_flags |= OMPI_COMM_CART;
     ret = ompi_comm_enable(old_comm, new_comm,
                            new_rank, num_procs, topo_procs);
     if (OMPI_SUCCESS != ret) {
         /* something wrong happened during setting the communicator */
+        new_comm->c_topo = NULL;
+        new_comm->c_flags &= ~OMPI_COMM_CART;
         ompi_comm_free (&new_comm);
         free(topo_procs);
         if(NULL != cart->periods) free(cart->periods);
@@ -176,10 +184,6 @@
         return ret;
     }

-    new_comm->c_topo = topo;
-    new_comm->c_topo->mtc.cart = cart;
-    new_comm->c_topo->reorder = reorder;
-    new_comm->c_flags |= OMPI_COMM_CART;
     *comm_topo = new_comm;

     if( MPI_UNDEFINED == new_rank ) {
Index: ompi/mca/coll/basic/coll_basic_module.c
===================================================================
--- ompi/mca/coll/basic/coll_basic_module.c (revision 32807)
+++ ompi/mca/coll/basic/coll_basic_module.c (working copy)
@@ -13,6 +13,8 @@
  * Copyright (c) 2012 Sandia National Laboratories. All rights reserved.
  * Copyright (c) 2013 Los Alamos National Security, LLC. All rights
  *                    reserved.
+ * Copyright (c) 2014 Research Organization for Information Science
+ *                    and Technology (RIST). All rights reserved.
  * $COPYRIGHT$
  *
  * Additional copyrights may follow
@@ -28,6 +30,7 @@
 #include "mpi.h"
 #include "ompi/mca/coll/coll.h"
 #include "ompi/mca/coll/base/base.h"
+#include "ompi/mca/topo/topo.h"
 #include "coll_basic.h"

@@ -70,6 +73,15 @@
     } else {
         size = ompi_comm_size(comm);
     }
+    if (comm->c_flags & OMPI_COMM_CART) {
+        int cart_size;
+        assert (NULL != comm->c_topo);
+        comm->c_topo->topo.cart.cartdim_get(comm, &cart_size);
+        cart_size *= 2;
+        if (cart_size > size) {
+            size = cart_size;
+        }
+    }
     basic_module->mccb_num_reqs = size * 2;
     basic_module->mccb_reqs = (ompi_request_t**)
         malloc(sizeof(ompi_request_t *) * basic_module->mccb_num_reqs);