Perfect, thank you. I tried setCapacity and setValueCount together, but that didn't give the result I was hoping for. The methods you pointed to are what I was looking for.
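
The bulk-copy route described in the quoted reply means filling the validity buffer as well as the data buffer. Below is a minimal sketch of what those two buffers have to contain for an int column. It uses plain java.nio direct buffers as stand-ins for the vector's own buffers rather than Arrow's classes, and the class and method names (BulkCopySketch, copyData, buildValidity) are hypothetical. It assumes the Arrow columnar layout: little-endian values, and a validity bitmap with one bit per cell, least-significant bit first, where 1 means non-null.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in: shows the buffer contents a bulk copy must produce,
// using java.nio buffers instead of Arrow's own buffer classes.
final class BulkCopySketch {

    /** Bytes needed for a validity bitmap covering valueCount cells (1 bit per cell). */
    static int validityBytes(int valueCount) {
        return (valueCount + 7) / 8;
    }

    /** Copies an int column into a little-endian direct buffer with one bulk put. */
    static ByteBuffer copyData(int[] values) {
        ByteBuffer data = ByteBuffer.allocateDirect(values.length * Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        // The int view inherits the byte order set above; the bulk put
        // replaces one set call per cell.
        data.asIntBuffer().put(values);
        return data;
    }

    /** Builds the validity bitmap: bit i is 1 when cell i is non-null (LSB first). */
    static ByteBuffer buildValidity(boolean[] isValid) {
        // Direct buffers start zeroed, so every cell is initially "null".
        ByteBuffer validity = ByteBuffer.allocateDirect(validityBytes(isValid.length));
        for (int i = 0; i < isValid.length; i++) {
            if (isValid[i]) {
                int byteIndex = i >> 3;
                validity.put(byteIndex,
                        (byte) (validity.get(byteIndex) | (1 << (i & 7))));
            }
        }
        return validity;
    }

    public static void main(String[] args) {
        int[] values = {10, 20, 30, 40, 50, 60, 70, 80, 90};
        boolean[] isValid = {true, true, false, true, true, true, true, true, true};

        ByteBuffer data = copyData(values);
        ByteBuffer validity = buildValidity(isValid);

        System.out.println(data.getInt(3 * Integer.BYTES));                  // prints 40
        System.out.println(Integer.toBinaryString(validity.get(0) & 0xFF));  // prints 11111011
        System.out.println(validity.capacity());                             // prints 2
    }
}
```

Nine cells need two validity bytes, and the null at index 2 shows up as the cleared third bit of the first byte. With a real vector the same bytes would go into the buffers the vector already owns after an exact allocation, and the value count would presumably still have to be set afterwards.
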

On Sat, Jul 25, 2020 at 5:22 PM Jacques Nadeau <[email protected]> wrote:

> You can allocate exactly for both fixed [1] and variable types [2].
>
> 1:
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseFixedWidthVector.java#L292
> 2:
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java#L401
>
> You can then use the set method per cell or just grab the memory address
> (e.g. getDataBufferAddress()) and use Unsafe to bulk copy. The latter
> obviously is more advanced and requires you to do things like set the
> validity buffers as well.
>
>
> On Sat, Jul 25, 2020 at 6:02 AM Chris Nuernberger <[email protected]>
> wrote:
>
>> Hey,
>>
>> I would like to have bulk methods for copying data into a vector.
>> Specifically, I have an existing data table so I know the desired lengths
>> of the columns. I can also precalculate the necessary buffer sizes for any
>> variable sized column.
>>
>>
>> What I don't see is how to pre-allocate columns of a given size. When I
>> use setValueCount on a column and then use the set method I get a netty
>> error. What I was hoping for is some allocation method, especially for
>> primitive data, that allocates the desired uninitialized memory for the
>> validity and data buffers and then hands those two buffers back to me so
>> I can use memcpy and friends as opposed to repeated calls to setSafe.
>>
>>
>> Repeated calls to setSafe are time consuming, not parallelizable, and
>> unnecessary when I know the data rectangle I would like to transfer into a
>> record batch.
>>
>>
>> In my case I have the data pre-cut. How would you recommend copying bulk
>> portions of data (that may be in java arrays or in some more abstract
>> interface) into a record batch?
>>
>> Thanks for any help,
>>
>> Chris
>>
>
