On Wed, Feb 20, 2013 at 10:28:56AM -0800, Eugene Loh wrote:
> On 02/20/13 07:54, Jeff Squyres (jsquyres) wrote:
> >All MTT testing looks good for 1.6.4. There seems to be an MPI dynamics
> >problem when --enable-spare-groups is used, but this does not look like a
> >regression to me.
> >
> >I put out a final rc, because there was one more minor change to accommodate
> >an MXM API change; it's in the usual place:
> >
> >    http://www.open-mpi.org/software/ompi/v1.6/
> >
> >Unless something disastrous happens, I plan to release this as the final
> >1.6.4 tomorrow.
>
> I don't think this qualifies as "disastrous", but...
>
> I've been trying to do some 1.6 testing on Solaris. (Solaris 11,
> Oracle Studio compilers, both SPARC and x86) Results generally look
> good. The main issue appears to be:
>
> - SPARC
>     *AND*
> - compile with "-m32 -xmemalign=8s" (the latter means assume at most 8-byte
>   alignment, with sigbus for misalignment)
>     *AND*
> - openib
>
> There is a sigbus during MPI_Init. Specifically, if I go to
> btl_openib_frag.h out_constructor(), I see:
>
>     frag->sr_desc.wr_id = (uint64_t)(uintptr_t)frag;
>
> and the left-hand side is on a 4-byte (but not 8-byte) boundary. How hard
> would it be to get openib frags on 8-byte boundaries?

Very easy. Just adjust the parameters given to ompi_free_list_init(). There
are arguments for frag alignment and data alignment. Looking at
btl_openib_component.c, a number of free lists have the alignment set at 2.
Change those to 8 and see if that fixes the problem.

Anyone know why these were set with an alignment of 2 in the first place? I
would have expected 8 or opal_cache_line_size.

-Nathan
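
To illustrate why an alignment of 2 can produce the sigbus while 8 should
avoid it, here is a minimal sketch in C. It is not the Open MPI code:
struct demo_frag, align_up(), is_aligned() and wr_id_addr() are hypothetical
names made up for this example, and the arithmetic only assumes that a free
list rounds each element address up to the requested alignment. How
ompi_free_list_init() actually places elements is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an openib fragment. Only the 64-bit field
 * matters; it is placed first so its offset is 0 on every ABI. */
struct demo_frag {
    uint64_t wr_id;   /* 64-bit store target, like sr_desc.wr_id */
    uint32_t flags;   /* hypothetical trailing field */
};

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t align_up(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~((uintptr_t)align - 1);
}

/* Return 1 if addr is a multiple of align (a power of two), else 0. */
static int is_aligned(uintptr_t addr, size_t align)
{
    return (addr & ((uintptr_t)align - 1)) == 0;
}

/* Address that wr_id would have if a demo_frag were placed at the first
 * address >= raw that satisfies frag_align. */
static uintptr_t wr_id_addr(uintptr_t raw, size_t frag_align)
{
    return align_up(raw, frag_align) + offsetof(struct demo_frag, wr_id);
}
```

Taking a raw address of 0x1004 as an example, wr_id_addr(0x1004, 2) stays at
0x1004, which is 4-byte but not 8-byte aligned, so a 64-bit store there would
fault under -xmemalign=8s. wr_id_addr(0x1004, 8) moves the fragment to 0x1008,
where the store is safe.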