On Sep 27, 2013, at 17:31, Nathan Hjelm <hje...@lanl.gov> wrote:

> On Fri, Sep 27, 2013 at 01:01:01PM +0000, Jeff Squyres (jsquyres) wrote:
>> On Sep 27, 2013, at 3:27 AM, George Bosilca <bosi...@icl.utk.edu> wrote:
>>
>>> The addition of the neighborhood collectives to the
>>> mca_coll_base_comm_coll_t structure increased the size of the
>>> ompi_communicator_t structure over the limit of the predefined padding
>>> (PREDEFINED_COMMUNICATOR_PAD). This is not a small change; it will break
>>> the ABI with all past versions of Open MPI.
>>
>> This is going to be problematic for putting this in 1.7.4.
>>
>> Nathan: is there another way? Perhaps even just a stopgap way for the
>> 1.7/1.8 series, and we can keep the "real" way for 1.9+? I.e., perhaps:
>>
>> 1. keep PREDEFINED_COMMUNICATOR_PAD at current value for v1.7.x/1.8, but use
>> a secondary pointer system (which won't be *too* painful; the algorithms are
>> all simple/not optimized, anyway)
>>
>> 2. increase PREDEFINED_COMMUNICATOR_PAD on the trunk for v1.9+ (we might
>> want to increase it more than it is already increased, so that we actually
>> have some breathing room for 1.9+)
>>
>>> I pushed a temporary commit to allow the trunk to be built, but we might
>>> want a better solution.
>
> Ok, it looks like the structure was exactly 128 * sizeof (void *) without
> peruse. So enabling peruse would make it go over the max. Attached is a
> workaround so we don't have to increase the size of the communicator for
> 1.7.x. George, let me know if you think this solution is acceptable.

Let me check your patch. However, I don't have peruse enabled on the builds where I got the error this morning. Here are the flags I'm using:

  --prefix=/Users/bosilca/opt/trunk/debug --enable-shared --disable-static
  --enable-debug --disable-mpi-cxx --disable-io-romio
  --enable-contrib-no-build=vt,libtrace --enable-mpirun-prefix-by-default
  --disable-mpi-profile

  George.

>
>> Thanks George.
>>
>>> There are new warnings:
>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c: In function 'libnbc_comm_query':
>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c:196:48: warning: assignment from incompatible pointer type [enabled by default]
>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c:197:49: warning: assignment from incompatible pointer type [enabled by default]
>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c:198:47: warning: assignment from incompatible pointer type [enabled by default]
>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c:199:48: warning: assignment from incompatible pointer type [enabled by default]
>>> ../../../../../ompi/ompi/mca/coll/libnbc/coll_libnbc_component.c:200:48: warning: assignment from incompatible pointer type [enabled by default]
>>
>>
>> Nathan: please fix.
>
> Ok. Will commit a fix and add a comment to coll.h that increasing the size
> of mca_coll_base_comm_coll_t might require PREDEFINED_COMMUNICATOR_PAD to be
> increased. I didn't see an issue with the communicator size because I never
> modified the communicator directly.
>
> -Nathan
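
For readers following the padding discussion, here is a minimal sketch of how a compile-time check can catch this kind of overflow instead of relying on a comment in a header. Everything named demo_* or DEMO_* is hypothetical and is not the Open MPI source; the pad value simply reuses the 128 * sizeof (void *) figure quoted in the thread, and the union layout is an assumption chosen to keep the example short.

```c
#include <assert.h>   /* static_assert (C11) and assert */
#include <stddef.h>   /* size_t */

/* Hypothetical pad: mirrors the 128 * sizeof (void *) figure from the thread. */
#define DEMO_COMMUNICATOR_PAD (128 * sizeof(void *))

/* Hypothetical stand-in for a table of collective function pointers.
 * Adding entries here silently grows every structure that embeds it. */
typedef struct {
    int (*coll_fns[120])(void);
} demo_coll_table_t;

/* Hypothetical stand-in for the communicator that embeds the table. */
typedef struct {
    void *c_header;
    demo_coll_table_t c_coll;
} demo_communicator_t;

/* Fails the build as soon as the communicator outgrows the pad, so the
 * ABI break is caught at compile time rather than by a downstream user. */
static_assert(sizeof(demo_communicator_t) <= DEMO_COMMUNICATOR_PAD,
              "demo_communicator_t exceeds DEMO_COMMUNICATOR_PAD");

/* The predefined handle always occupies exactly the pad, whatever the
 * current size of the communicator is (as long as the assert holds). */
typedef union {
    demo_communicator_t comm;
    char padding[DEMO_COMMUNICATOR_PAD];
} demo_predefined_communicator_t;

/* Bytes still available before the pad has to grow. */
static size_t demo_pad_remaining(void)
{
    return DEMO_COMMUNICATOR_PAD - sizeof(demo_communicator_t);
}
```

With a check like this in place, growing the hypothetical demo_coll_table_t past the limit stops the build at the static_assert, and demo_pad_remaining() reports how much breathing room is left before that happens.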