kbendick commented on a change in pull request #1334:
URL: https://github.com/apache/iceberg/pull/1334#discussion_r471172748



##########
File path: data/src/main/java/org/apache/iceberg/data/orc/GenericOrcWriters.java
##########
@@ -231,7 +231,23 @@ public void nonNullWrite(int rowId, String data, 
ColumnVector output) {
 
     @Override
     public void nonNullWrite(int rowId, ByteBuffer data, ColumnVector output) {
-      ((BytesColumnVector) output).setRef(rowId, data.array(), 0, 
data.array().length);
+      // When there is a backing heap based byte array, we avoid the overhead 
of
+      // copying it.
+      if (data.hasArray()) {
+        // Don't assume the we're reading in the entire backing array.
+        // Using slice as any mutations to the data at rowId of output
+        // would also be visible in `data`.
+        ByteBuffer slice = data.slice();
+        ((BytesColumnVector) output).setRef(rowId, slice.array(), 0, 
slice.array().length);

Review comment:
       Actually, my solution with slice causes a ByteBufferBackingArray 
warning, which could be suppressed as we called `.hasArray` before. But at this 
point suppression seems silly given the work that went in to correcting this.
   
   If anybody has any solutions which doesn't involve a deep copy, I'm open to 
suggestions.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to