rtadepalli commented on code in PR #777:
URL: https://github.com/apache/arrow-java/pull/777#discussion_r2115033521

##########
vector/src/main/java/org/apache/arrow/vector/BaseValueVector.java:
##########
@@ -248,4 +252,114 @@ public void copyFrom(int fromIndex, int thisIndex, ValueVector from) {
   public void copyFromSafe(int fromIndex, int thisIndex, ValueVector from) {
     throw new UnsupportedOperationException();
   }
+
+  /**
+   * Transfer the validity buffer from `validityBuffer` to the target vector's `validityBuffer`.
+   * Start at `startIndex` and copy `length` number of elements. If the starting index is 8 byte
+   * aligned, then the buffer is sliced from that index and ownership is transferred. If not,
+   * individual bytes are copied.
+   *
+   * @param startIndex starting index
+   * @param length number of elements to be copied
+   * @param target target vector
+   */
+  protected void splitAndTransferValidityBuffer(
+      int startIndex, int length, BaseValueVector target) {

Review Comment:
   Accepting `BaseValueVector` rather than `ValueVector` here to keep the API as `(int, int, vector)`. If I accepted `ValueVector`, there would need to be some lambda wrangling to pass in the vector-specific `allocateValidityBuffer`, or that function would have to be added to `ValueVector`. Neither option seems great. From what I can tell, everything except `StructVector` descends from `BaseValueVector`, so I chose it as the parameter type.

##########
vector/src/main/java/org/apache/arrow/vector/BaseValueVector.java:
##########
@@ -248,4 +252,114 @@ public void copyFrom(int fromIndex, int thisIndex, ValueVector from) {
   public void copyFromSafe(int fromIndex, int thisIndex, ValueVector from) {
     throw new UnsupportedOperationException();
   }
+
+  /**
+   * Transfer the validity buffer from `validityBuffer` to the target vector's `validityBuffer`.
+   * Start at `startIndex` and copy `length` number of elements. If the starting index is 8 byte
+   * aligned, then the buffer is sliced from that index and ownership is transferred. If not,
+   * individual bytes are copied.
+   *
+   * @param startIndex starting index
+   * @param length number of elements to be copied
+   * @param target target vector
+   */
+  protected void splitAndTransferValidityBuffer(
+      int startIndex, int length, BaseValueVector target) {
+    int offset = startIndex % 8;
+
+    if (length <= 0) {
+      return;
+    }
+    if (offset == 0) {
+      sliceAndTransferValidityBuffer(startIndex, length, target);
+    } else {
+      copyValidityBuffer(startIndex, length, target);
+    }
+  }
+
+  /**
+   * If the start index is 8 byte aligned, slice `validityBuffer` and transfer ownership to
+   * `target`'s `validityBuffer`.
+   *
+   * @param startIndex starting index
+   * @param length number of elements to be copied
+   * @param target target vector
+   */
+  protected void sliceAndTransferValidityBuffer(

Review Comment:
   This implementation is the one used by the vectors inside the `complex/` directory. `BaseValueVector#validityBuffer` is protected, so overriding implementations there can't assign to `target.validityBuffer`. Vectors in the top-level `org/apache/arrow/vector` directory can add their own overrides. If some other vector in `complex/` later wants a different version of this function, I am afraid it will need to override `splitAndTransferValidityBuffer` entirely. Presumably we don't want to expose a public `setValidityBuffer` on `BaseValueVector`.
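
For readers following the aligned/unaligned branch in `splitAndTransferValidityBuffer`, here is a minimal sketch of the same decision on a plain `byte[]` bitmap. It is not the PR's code and does not use Arrow's `ArrowBuf`: the class `ValiditySplitSketch` and its `split` method are hypothetical names, and `Arrays.copyOfRange` merely stands in for the slice-and-transfer-ownership step. It assumes the LSB-first bit order used by Arrow validity bitmaps and that `startIndex + length` stays within the source bitmap.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the validity split, using byte[] instead of ArrowBuf. */
final class ValiditySplitSketch {

  /** Number of bytes needed to hold the given number of validity bits. */
  static int byteCount(int bits) {
    return (bits + 7) / 8;
  }

  /** Returns a new bitmap holding bits [startIndex, startIndex + length) of validity. */
  static byte[] split(byte[] validity, int startIndex, int length) {
    if (length <= 0) {
      return new byte[0];
    }
    int firstByte = startIndex / 8;
    int offset = startIndex % 8; // bit position inside the first source byte
    int outBytes = byteCount(length);
    byte[] out;
    if (offset == 0) {
      // Start falls on a byte boundary: the real code can slice the buffer and
      // transfer ownership. A range copy is used here only as a stand-in.
      out = Arrays.copyOfRange(validity, firstByte, firstByte + outBytes);
    } else {
      // Start is mid-byte: each output byte is stitched from two source bytes.
      out = new byte[outBytes];
      for (int i = 0; i < outBytes; i++) {
        int lo = (validity[firstByte + i] & 0xFF) >>> offset;
        int next = firstByte + i + 1;
        int hi = next < validity.length ? (validity[next] & 0xFF) << (8 - offset) : 0;
        out[i] = (byte) (lo | hi);
      }
    }
    // Clear bits past `length` in the last byte so the result is deterministic.
    int tail = length % 8;
    if (tail != 0) {
      out[outBytes - 1] &= (byte) ((1 << tail) - 1);
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] validity = {(byte) 0xB5, (byte) 0x0F};
    System.out.println(Arrays.toString(split(validity, 8, 4))); // aligned start
    System.out.println(Arrays.toString(split(validity, 3, 8))); // unaligned start
  }
}
```

The unaligned path is the reason the copy branch exists at all: a slice can only begin on a byte boundary, so any start index that is not a multiple of 8 forces the bits to be shifted into a freshly allocated buffer.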