chasebradford commented on code in PR #7833:
URL: https://github.com/apache/iceberg/pull/7833#discussion_r1232861840


##########
data/src/main/java/org/apache/iceberg/data/DeleteFilter.java:
##########
@@ -275,17 +273,12 @@ private CloseableIterable<Record> openDeletes(DeleteFile 
deleteFile, Schema dele
     InputFile input = getInputFile(deleteFile.path().toString());
     switch (deleteFile.format()) {
       case AVRO:
-        return Avro.read(input)
-            .project(deleteSchema)
-            .reuseContainers()
-            .createReaderFunc(DataReader::create)
-            .build();
+        return 
Avro.read(input).project(deleteSchema).createReaderFunc(DataReader::create).build();
 
       case PARQUET:
         Parquet.ReadBuilder builder =
             Parquet.read(input)
                 .project(deleteSchema)
-                .reuseContainers()

Review Comment:
   Yes, but ByteBuffer isn't one of those and it's ByteBuffer that is used for 
binary fields.
   ```
     public static class BytesReader extends PrimitiveReader<ByteBuffer> {
       public BytesReader(ColumnDescriptor desc) {
         super(desc);
       }
   
       @Override
       public ByteBuffer read(ByteBuffer reuse) {
         Binary binary = column.nextBinary();
         ByteBuffer data = binary.toByteBuffer();
         if (reuse != null && reuse.hasArray() && reuse.capacity() >= 
data.remaining()) {
           data.get(reuse.array(), reuse.arrayOffset(), data.remaining());
           reuse.position(0);
           reuse.limit(binary.length());
           return reuse;
         } else {
           byte[] array = new byte[data.remaining()];
           data.get(array, 0, data.remaining());
           return ByteBuffer.wrap(array);
         }
       }
     }
     ```
   
   There's also the fact that even if the structure were reused here for 
performance, it'll still be copied a few lines above.
   
   I'd argue that the `Record::copy` is also broken, because I assume it's 
expected to do a deep copy. I worry the changes needed to ensure it's working 
correctly would be much more bigger, perhaps intractable, and carry larger 
risks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to