Hi all,

Now that PR 2216 has been merged, many of the mentioned problems have been
alleviated. To prevent similar problems, I'll add, as Philipp suggested, a
debug-build check to the buffer builders which ensures that the Finish
method has been called before the builder is destroyed.

Unfortunately, not every buffer is built with the builders. Since buffers
are allocated uninitialized, it's really easy to forget about the padding.
The Buffer::Copy method, for example, leaves the padding of the target
buffer uninitialized, except in the rare case where size_ == capacity_.
Basically, before a newly allocated buffer is used, you have to make sure
it's zero-padded.

I experimented with changing the semantics of the Buffer::data method to
check, in debug builds, whether the buffer is zero-padded. This turned up a
number of problems (the Ray one included), but also a great deal of false
positives, so I'm not sure it's the right way forward.

I think we should at least add an AllocateBufferZeroPadded method and make
it explicit that you should use it if you don't plan on resizing the buffer
after allocation. A Buffer::ZeroPadding method, which memsets everything
between size_ and capacity_ to 0, might also be useful, since it would draw
programmers' attention to the need. Further checks can then be added as
needed.

Cheers,
Dimitri.

On Tue, Jul 3, 2018 at 4:53 PM Wes McKinney <[email protected]> wrote:

> hi Dimitri,
>
> Zeroing the padding is probably a good idea; I'd be interested to look
> at the diff to see how many code paths are impacted. We already have a
> number of places where we are zeroing buffers but, as you have found,
> it is not 100% consistent.
>
> - Wes
>
> On Tue, Jul 3, 2018 at 10:13 AM, Dimitri Vorona
> <[email protected]> wrote:
> > Hi,
> >
> > I misunderstood the issue: the zeroing of the padding is correctly
> > handled by the writers, as long as they are careful not to read beyond
> > the specified size of the buffers (as opposed to their capacity).
> >
> > The valgrind warning came from the way AppendNull(s) is implemented in
> > some of the builders: it just sets the null bitmap and increments the
> > lengths, but doesn't set the data, which leads to uninitialized memory
> > chunks where nulls were inserted. I'll send a PR which addresses it.
> >
> > I'm still not sure if we should zero the padding anyway, since the
> > buffers can be shared in any number of ways. Zeroing the last couple of
> > bytes shouldn't have any performance impact and could save a headache in
> > the future.
> >
> > Cheers.
> >
> > On Tue, Jul 3, 2018 at 1:15 PM Antoine Pitrou <[email protected]> wrote:
> >
> >>
> >> Hi,
> >>
> >> Writing uninitialized memory risks leaking private data.  This might
> >> lead to security issues.
> >>
> >> I'd go for option 2.  Option 3 sounds much more costly (we would be
> >> zero-initializing large memory areas instead of small padding areas).
> >>
> >> Regards
> >>
> >> Antoine.
> >>
> >>
> >> On 03/07/2018 13:11, Dimitri Vorona wrote:
> >> > Hi all,
> >> >
> >> > Currently, running json-integration-test with valgrind leads to the
> >> > following warning: "Syscall param write(buf) points to uninitialised
> >> > byte(s)". This is caused by PrimitiveBufferBuilder not initializing
> >> > its data memory. Note: we initialize null_bitmap_data_ by zeroing,
> >> > i.e. setting all values to not-null by default.
> >> >
> >> > Since having tests pass under valgrind might be desirable for the
> >> > CI, I think we should fix this warning. There are a couple of
> >> > possibilities:
> >> >
> >> > 1. Add a suppression. The spec doesn't require padding to have a
> >> > specific value, so we might consider this a false positive.
> >> > 2. Add initialization of the padding bytes to the
> >> > ArrayBuilder::FinishInternal implementations.
> >> > 3. Generally zero-initialize memory in PoolBuffer. Might be too
> >> > expensive.
> >> >
> >> > Of course there could be a number of other options, including "do
> >> > nothing". If we settle on the best option, I can make a PR.
> >> >
> >> > Cheers,
> >> > Dimitri.
> >> >
> >>
>
