It’s a tough call. This proposal will create significant differences between 
the debug and fast builds. As the entire objects will be set to zero this might 
reduce bugs in the debug build, bugs that will be horribly difficult to track 
in any non-debug builds. Moreover, if the structures are carefully accessed in 
our code, adding such a disruptive initialization just to prevent valgrind from 
reporting false-positive about uninitialized reads in memcpy is too costly as a 
solution (I am also conscient that it will be almost impossible to write a 
valgrind suppression rule for the specific case you mention).

Some parts of the code have (or at least had) some level of cleanness for the 
gaps in the structures. The solution was to minimally zero-fy the gaps, 
maintaining the same behavior between debug and non-debug builds. However, in 
order to do this one need to know the layout of the structure, so this is not a 
completely generic solution…

  George.


On Oct 3, 2014, at 16:54 , Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> WHAT: change the malloc() to calloc() in opal_obj_new() (perhaps only in 
> debug builds?)
> 
> WHY: Drastically reduces valgrind output
> 
> WHERE: see 
> https://github.com/open-mpi/ompi/blob/master/opal/class/opal_object.h#L462-L467
> 
> TIMEOUT: teleconf, Tue, Oct 14 (there's no rush)
> 
> MORE DETAIL:
> 
> I was debugging some code today and came across a bunch of places where we 
> write structs down various IPC mechanisms, and the structs contain holes.  In 
> most places, the performance doesn't matter / the readability of struct 
> members is more important, so we haven't re-ordered the structs to remove 
> holes.  But consequently, those holes end up uninitialized, and therefore 
> memcpy()ing or write()ing instances of these structs causes valgrind to emit 
> warnings.
> 
> The patch below eliminates most (all?) of these valgrind warnings -- in debug 
> builds, it changes the malloc() inside OBJ_NEW to a calloc().
> 
> Upon a little more thought, however, I wonder if we use OBJ_NEW in any fast 
> code paths (other than in bulk, such as when we need to grow a free list).  
> Specifically: would it be terrible to *always* calloc -- not just for debug 
> builds?
> 
> -----
> diff --git a/opal/class/opal_object.h b/opal/class/opal_object.h
> index 7012bac..585f13e 100644
> --- a/opal/class/opal_object.h
> +++ b/opal/class/opal_object.h
> @@ -464,7 +464,11 @@ static inline opal_object_t *opal_obj_new(opal_class_t * 
> cl
>     opal_object_t *object;
>     assert(cls->cls_sizeof >= sizeof(opal_object_t));
> 
> +#if OPAL_ENABLE_DEBUG
> +    object = (opal_object_t *) calloc(1, cls->cls_sizeof);
> +#else
>     object = (opal_object_t *) malloc(cls->cls_sizeof);
> +#endif
>     if (0 == cls->cls_initialized) {
>         opal_class_initialize(cls);
>     }
> -----
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/10/16001.php

Reply via email to