szehon-ho commented on code in PR #7833:
URL: https://github.com/apache/iceberg/pull/7833#discussion_r1254943672


##########
data/src/main/java/org/apache/iceberg/data/DeleteFilter.java:
##########
@@ -275,17 +273,12 @@ private CloseableIterable<Record> openDeletes(DeleteFile 
deleteFile, Schema dele
     InputFile input = getInputFile(deleteFile.path().toString());
     switch (deleteFile.format()) {
       case AVRO:
-        return Avro.read(input)
-            .project(deleteSchema)
-            .reuseContainers()
-            .createReaderFunc(DataReader::create)
-            .build();
+        return 
Avro.read(input).project(deleteSchema).createReaderFunc(DataReader::create).build();
 
       case PARQUET:
         Parquet.ReadBuilder builder =
             Parquet.read(input)
                 .project(deleteSchema)
-                .reuseContainers()

Review Comment:
   This is a great find.
   
   I think it would be a reasonable alternative to fix GenericRecord.copy as 
well, for ByteBuffers.  (looks like its assuming primitives, which ByteBuffer 
is not..).
   
   Re:  reuseContainer, point taken that we may probably incur the similar cost 
if we are to copy.  But I am not familiar with all the cases though, how about 
perf of nested columns (or would that also break due to GenericRecord copy?)   
My observation being, we generally dont put reuseContainers that much in the 
code.
   
   I would like to see the opinion of @rdblue @RussellSpitzer  as well, as I 
think Ryan wrote the original GenericRecord.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to