It’s a tough call. This proposal will create significant differences between the debug and fast builds. As the entire objects will be set to zero this might reduce bugs in the debug build, bugs that will be horribly difficult to track in any non-debug builds. Moreover, if the structures are carefully accessed in our code, adding such a disruptive initialization just to prevent valgrind from reporting false-positive about uninitialized reads in memcpy is too costly as a solution (I am also conscient that it will be almost impossible to write a valgrind suppression rule for the specific case you mention).
Some parts of the code have (or at least had) some level of cleanness for the gaps in the structures. The solution was to minimally zero-fy the gaps, maintaining the same behavior between debug and non-debug builds. However, in order to do this one need to know the layout of the structure, so this is not a completely generic solution… George. On Oct 3, 2014, at 16:54 , Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > WHAT: change the malloc() to calloc() in opal_obj_new() (perhaps only in > debug builds?) > > WHY: Drastically reduces valgrind output > > WHERE: see > https://github.com/open-mpi/ompi/blob/master/opal/class/opal_object.h#L462-L467 > > TIMEOUT: teleconf, Tue, Oct 14 (there's no rush) > > MORE DETAIL: > > I was debugging some code today and came across a bunch of places where we > write structs down various IPC mechanisms, and the structs contain holes. In > most places, the performance doesn't matter / the readability of struct > members is more important, so we haven't re-ordered the structs to remove > holes. But consequently, those holes end up uninitialized, and therefore > memcpy()ing or write()ing instances of these structs causes valgrind to emit > warnings. > > The patch below eliminates most (all?) of these valgrind warnings -- in debug > builds, it changes the malloc() inside OBJ_NEW to a calloc(). > > Upon a little more thought, however, I wonder if we use OBJ_NEW in any fast > code paths (other than in bulk, such as when we need to grow a free list). > Specifically: would it be terrible to *always* calloc -- not just for debug > builds? > > ----- > diff --git a/opal/class/opal_object.h b/opal/class/opal_object.h > index 7012bac..585f13e 100644 > --- a/opal/class/opal_object.h > +++ b/opal/class/opal_object.h > @@ -464,7 +464,11 @@ static inline opal_object_t *opal_obj_new(opal_class_t * > cl > opal_object_t *object; > assert(cls->cls_sizeof >= sizeof(opal_object_t)); > > +#if OPAL_ENABLE_DEBUG > + object = (opal_object_t *) calloc(1, cls->cls_sizeof); > +#else > object = (opal_object_t *) malloc(cls->cls_sizeof); > +#endif > if (0 == cls->cls_initialized) { > opal_class_initialize(cls); > } > ----- > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > _______________________________________________ > devel mailing list > de...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel > Link to this post: > http://www.open-mpi.org/community/lists/devel/2014/10/16001.php