rymurr commented on a change in pull request #6402:
URL: https://github.com/apache/arrow/pull/6402#discussion_r426491071
##########
File path:
java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java
##########
@@ -740,10 +740,16 @@ private void splitAndTransferOffsetBuffer(int startIndex,
int length, BaseVariab
final int start = offsetBuffer.getInt(startIndex * OFFSET_WIDTH);
final int end = offsetBuffer.getInt((startIndex + length) * OFFSET_WIDTH);
final int dataLength = end - start;
- target.allocateOffsetBuffer((length + 1) * OFFSET_WIDTH);
- for (int i = 0; i < length + 1; i++) {
- final int relativeSourceOffset = offsetBuffer.getInt((startIndex + i) *
OFFSET_WIDTH) - start;
- target.offsetBuffer.setInt(i * OFFSET_WIDTH, relativeSourceOffset);
+
+ if (startIndex == 0) {
+ target.offsetBuffer = offsetBuffer.slice(0, (1 + length) * OFFSET_WIDTH);
+ target.offsetBuffer.getReferenceManager().retain();
Review comment:
I think we have an inconsistency in the way the value buffer is handled
vs the validity/offset buffer. If the testSplitAndTransfer tests were run with
two child buffer allocators I think they would fail. We transfer ownership of
the data buffer to the target allocator and depending on the branch we just
increment the ref count of the 'source' vectors offset/validity buffer (or
create a new buffer owned by the source vectors allocator). So if the source
buffer allocator is closed while we are still using the target buffer we will
end up having a bad day.
I guess the question is does transfer mean 'transfer the data (ie copy)' or
does it mean 'transfer ownership'. My feeling is that we mean transfer
ownership and that the handling of the validity buffer (in both fixed and
variable width vectors) and offset buffer isn't quite right. The reason I think
we mean transfer ownership is that we already have `copy` methods and we have
the concept of a `TransferPair` which explicitly handles 2 allocators.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]