On Sat, Sep 8, 2018 at 8:00 AM Zhenyuan Zhao <[email protected]> wrote:
> After digging into it a little deeper, I have more questions:
>
> First, vector takes allocator. Zero copy means we should not do any
> additional allocation which implies a dummy allocator with (at most)
> capability of allocating zero length (getEmpty) ArrowBuf is sufficient.
>

I don't see the downside to using the existing allocator as opposed to creating a dummy. If it doesn't allocate anything, what's the problem?

> However, there are places in vector that require more allocation:
>
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BaseFixedWidthVector.java#L511
> https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/BitVectorHelper.java#L180
>
> Vector will allocate in the case of all null or non null. It does seem like
> an optimization that can be done, but why does it reallocate without looking
> into if validity buffer is really empty? Take fixed width vector as example, it
> in fact does check buffers count is two, and for my simple test case, I saw
> validity buffer is still being sent in non null case.
>

Agree that ideally we should only allocate if the source was not provided. Seems like that could be improved.

> Second, arrow made a decision to only support off-heap buffer. Why? Doesn't
> affect my use case, but sounds like this can be more flexible.
>

Perf and GC.
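
To make the "only allocate if the source was not provided" idea concrete, here is a rough sketch. It is plain Java on top of java.nio.ByteBuffer, not the actual Arrow code path: the class and method names (ValiditySketch, validityFor, validityBytes) are made up for illustration, and a direct ByteBuffer stands in for ArrowBuf.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper, not part of Arrow: reuses or builds a validity bitmap. */
final class ValiditySketch {

    private ValiditySketch() {}

    /** Bytes needed to hold one validity bit per value. */
    static int validityBytes(int valueCount) {
        return (valueCount + 7) / 8;
    }

    /**
     * Reuses the incoming validity buffer when the sender actually provided one,
     * and allocates an all-set (all non-null) bitmap only when it is missing
     * or too short.
     */
    static ByteBuffer validityFor(ByteBuffer source, int valueCount) {
        int needed = validityBytes(valueCount);
        if (source != null && source.remaining() >= needed) {
            // Zero copy: a view over the same memory, no new buffer memory.
            return source.slice();
        }
        // Fallback: no usable bitmap was sent, so materialize "all non-null".
        // allocateDirect keeps the bitmap off-heap (assumption: mirrors ArrowBuf).
        ByteBuffer fresh = ByteBuffer.allocateDirect(needed);
        for (int i = 0; i < needed; i++) {
            fresh.put(i, (byte) 0xFF);
        }
        return fresh;
    }
}
```

With something along these lines, the receive path would touch the allocator only in the fallback branch, so passing the existing allocator costs nothing when the sender ships the validity buffer.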
