Re: [Valgrind-users] (unavoidable) false positive?

Shachar Shemesh Mon, 05 May 2014 03:12:35 -0700

The following email contains discussion that can best be described as
"academic". If theoreticals are not for you, do yourself a favor and hit
"delete" right now.

On 05/05/14 02:26, John Reiser wrote:
> The *easy* way to avoid hassle is to clear any union (or struct which suffers
> alignment or padding) immediately after declaration:
>       memset(&u, 0, sizeof(u));
> In addition to keeping memcheck quiet, this tends to increase the 
> reproducibility
> of any bugs, and to insulate the application from unpleasant interactions
> when the protocol gets enhanced or extended.  Of course, the memset also
> tends to mask any actual uninit errors.  But these tend to be caught
> by other means because typically they cause incorrect results.
>
>
While I don't have any particular issue with valgrind reporting this
particular case as a problem (though I still think it is, at its core, a
false positive), I think the off handedness with which you offer memset
as an *easy* alternative is over-simplifying things.

Please consider the following rather simple C++ program:

> class A { char a; public: explicit A( char chr ) : a(chr) { } }; class
> B : public A { int b; public: explicit B( char chr ) : A(chr),
> b(chr+1) { } }; B func(int fd, char chr) { // Point 1 B b(chr); //
> Point 2 write(fd, &b, sizeof(B)); return b; } 

sizeof(A) is 1, but sizeof(B) is 8 (i.e. - three bytes of padding).
Despite the fact that the constructor initializes each and every member
of the type, there are three uninitialized bytes in each instance of B.
During the write, valgrind, correctly, complains about them.

Adding memset to this mix, however, is far from trivial. You cannot do
it in the constructor for A for three reasons:

 1. It is bad design. A doesn't have any padding.
 2. A doesn't know the size of the final object being allocated (B)
 3. A can only run the memset in the constructor's body, by which time
    its members are already initialized.

You cannot do it in the constructor for B because:

 4. B still doesn't know that it's the final object. Someone might be
    inheriting from it. They may need to repeat the memset, costing us
    unnecessary performance.
 5. By the time B's constructor runs, both the A instance inside B, as
    well as all of the members, are already initialized (same as point 3
    above)

You cannot do it by overloading the new operator, because, as is the
case here, B might be created on the stack.

You cannot do it in point 1, as at that point b has no storage
associated with it.

You cannot do it in point 2, as at that point b is already fully
constructed.

You options are, as far as I can see them:
Change B's constructor to read:

>     explicit B( char chr ) : A((memset(this, 0, sizeof(B)), chr)),
> b(chr+1)
>     {
>     }
This is ugly, hard to maintain (you'd need to do it to each constructor
individually), easy to miss, and uses an operator few remember exists,
and fewer still understand (the comma operator, unchanged from its C
days, and equally unused). This also suffers from problem 4 above.

You can use placement new. Replace the line between point 1 and point 2
with:

>     char b_storage[sizeof(B)];     memset(b_storage, 0, sizeof(B));
>     B &b( *new (b_storage) (chr) );     ScopeGuard b_guard( [b]()
> mutable { b.~B() } );

(this is C++11 code. With pre-C++11, this would be even uglier and less
readable).

The last option is to turn the padding explicit:

> class B : public A { char pad1, pad2, pad3; int b; public: explicit B(
> char chr ) : A(chr), pad1(0), pad2(0), pad3(0), b(chr+1) { } }; 

This code creates a dependency between B's implementation and A's
implementation, relies on alignment requirements of specific platforms,
and relies on undefined compiler behavior.

In the long run, all of the above solutions are much more likely to
cause bugs than to fix them.

The good news is that the more code uses constructs such as self
initializing data members, the less likely it is to have actual use of
uninitialized data. The same undefined behavior that cases the third
solution to be unacceptable also makes it unlikely that it would be okay
to send the buffer, byte for byte, to sendto. Cases such as fakeroot-ng,
where the process is guaranteed to be communicating with code run from
the same executable, should be very rare.

The point I was trying to make was that memset, while a good solution
for many of the cases, is not something that is always applicable.

Shachar

P.s.
If you feel you have just wasted three minutes of your time reading a
long complicated email with no action items attached to it, then you
should have listened to my advice from the first paragraph of the email :-)

------------------------------------------------------------------------------
Is your legacy SCM system holding you back? Join Perforce May 7 to find out:
&#149; 3 signs your SCM is hindering your productivity
&#149; Requirements for releasing software faster
&#149; Expert tips and advice for migrating your SCM now
http://p.sf.net/sfu/perforce

_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users

Re: [Valgrind-users] (unavoidable) false positive?

Reply via email to