Makes sense also, thank you for your help. Calling setValueCount after allocateNew apparently solved the problem, at least enough to let me round-trip some data. I never called setLastSet, so perhaps some duplicate work was done.

On Sun, Jul 26, 2020 at 10:30 AM Jacques Nadeau <[email protected]> wrote:

> On Sun, Jul 26, 2020 at 8:02 AM Chris Nuernberger <[email protected]>
> wrote:
>
>> It appears that those methods do not allocate the validity buffer *and*
>> the function `allocateValidityBuffer` is private.
>
> It allocates both of them at once. To reduce heap usage we colocate them
> since they are never resized independently.
>
>> Also it appears that allocate new fails to set the value count for
>> BaseVariableWidthVectors. And if you set the value count after you have
>> assigned data then it clears *only* the offset buffer but not the validity
>> or the data buffers.
>
> For direct operations on variable, you'll need to do the following steps:
> 1) allocateNew,
> 2) copy in data via memory operations,
> 3) call setLastSet()
> 4) call setValueCount()
>
> I'm guessing you skipped #3 and then setValueCount sees that you never set
> any values so it propagates the last offset to the value count. This is
> done so you can do something like:
> set(1,...)
> set(3,...)
> setValueCount(7)
> and then 4-6 ordinal positions will be offset filled even though you
> didn't set them explicitly. If you do your own work, you have to help the
> state model in the variable vector understand what you've done.
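
As a rough illustration of the state model described in the quoted steps, here is a toy Java sketch of the offset bookkeeping. It is not Arrow's implementation and does not use the Arrow library. The class `OffsetFillSketch` and the method `writeOffsetsDirectly` are hypothetical names, and the fill logic is an assumption based only on the behaviour described above: every index after the last one marked as set gets the previous offset propagated forward. It shows why skipping setLastSet makes setValueCount overwrite offsets that were copied in directly.

```java
import java.util.Arrays;

/** Toy model of a variable-width vector's offset bookkeeping (assumed behaviour). */
class OffsetFillSketch {
    private final int[] offsets; // offsets[i]..offsets[i + 1] delimit value i
    private int lastSet = -1;    // highest index the vector believes was written
    private int valueCount = 0;

    OffsetFillSketch(int capacity) {
        offsets = new int[capacity + 1];
    }

    /** Normal write path: fill skipped slots, then record this value's end offset. */
    void set(int index, int length) {
        fillHoles(index);
        offsets[index + 1] = offsets[index] + length;
        lastSet = index;
    }

    /** Stand-in for step 2, copying offsets in via memory operations. */
    void writeOffsetsDirectly(int[] raw) {
        System.arraycopy(raw, 0, offsets, 0, raw.length);
    }

    /** Step 3: tell the state model which index was written last. */
    void setLastSet(int index) {
        lastSet = index;
    }

    /** Step 4: offset-fill everything after lastSet up to the new count. */
    void setValueCount(int count) {
        fillHoles(count);
        valueCount = count;
        lastSet = count - 1;
    }

    /** Give every unset slot in (lastSet, index) zero length. */
    private void fillHoles(int index) {
        for (int i = lastSet + 1; i < index; i++) {
            offsets[i + 1] = offsets[i];
        }
    }

    int[] offsets() {
        return Arrays.copyOf(offsets, valueCount + 1);
    }

    public static void main(String[] args) {
        // set(1,...), set(3,...), setValueCount(7): positions 4-6 are offset filled.
        OffsetFillSketch viaSet = new OffsetFillSketch(8);
        viaSet.set(1, 3);
        viaSet.set(3, 2);
        viaSet.setValueCount(7);
        System.out.println(Arrays.toString(viaSet.offsets())); // [0, 0, 3, 3, 5, 5, 5, 5]

        // Direct copy without setLastSet: the copied offsets are overwritten.
        OffsetFillSketch skipped = new OffsetFillSketch(8);
        skipped.writeOffsetsDirectly(new int[] {0, 2, 5, 9});
        skipped.setValueCount(3);
        System.out.println(Arrays.toString(skipped.offsets())); // [0, 0, 0, 0]

        // Direct copy followed by setLastSet(2): the copied offsets survive.
        OffsetFillSketch complete = new OffsetFillSketch(8);
        complete.writeOffsetsDirectly(new int[] {0, 2, 5, 9});
        complete.setLastSet(2);
        complete.setValueCount(3);
        System.out.println(Arrays.toString(complete.offsets())); // [0, 2, 5, 9]
    }
}
```

In this model only the offsets are touched by the fill, which is consistent with the observation that the validity and data buffers were left alone.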
