We saw these segvs too, both with and without setting the sm btl.

On Fri, Aug 7, 2009 at 10:51 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
> On Thu, Aug 6, 2009 at 3:18 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>
>> Ok, with Terry's help, I found a segv in the coll sm. If you run without
>> the sm btl, there's an obvious bad parameter that we're passing that results
>> in a segv.
>>
>> LANL -- can you confirm / deny that these are the segv's that you were
>> seeing?
>
> Yes we can deny that those are the segv's we were seeing - we definitely
> had the sm btl active. I'll rerun the test on Monday and add the stacktrace
> to your ticket.
>
> Ralph
>
>>
>> While fixing this, I noticed that the sm btl and sm coll are sharing an
>> mpool when both are running. This probably used to be a good idea way back
>> when (e.g., when we were using a lot more shmem than we needed and core
>> counts were lower), but it seems like a bad idea now (e.g., the btl/sm is
>> fairly specific about the size of the mpool that is created -- it's just big
>> enough for its data structures).
>>
>> I'm therefore going to change the mpool string names that btl/sm and
>> coll/sm are looking for so that they get unique sm mpool modules.
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>