rdblue commented on issue #106: Deep copy maps and lists in GenericDataFile.
URL: https://github.com/apache/incubator-iceberg/pull/106#issuecomment-465769700
 
 
   @rdsr, the data files themselves are not modified, although you could build 
a process that gathers these stats from a file and replaces the old DataFile 
with a new one that includes them.
   
   The case that this is addressing is reuse of container objects while 
scanning manifest files. To avoid object creation, Iceberg will reuse Record, 
Map, and List objects and fill them with new data. That cuts down on object 
churn when most records are discarded because a file was deleted or doesn't 
match a filter. When a file is selected for a scan in `planFiles`, the DataFile 
that is returned is a copy because the next record read from the manifest will 
refill the reused record with new data.
   
   Unfortunately, that copy was shallow: it did not deep copy these maps and 
lists, so the returned DataFile still shared them with the reused record and 
the wrong data was returned.
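
   A minimal sketch of the failure mode, not the actual Iceberg code: 
`FileEntry`, `shallowCopy`, `deepCopy`, and `scan` are hypothetical names 
used only for illustration. It simulates a reader that refills one reused 
entry and shows why a copy that shares the map ends up reporting the next 
record's stats. Copying the map one level is enough here because the keys and 
values are immutable boxed numbers.

```java
import java.util.HashMap;
import java.util.Map;

public class ReusedContainerDemo {
  // Hypothetical stand-in for a manifest entry; not Iceberg's GenericDataFile.
  static final class FileEntry {
    String path;
    Map<Integer, Long> valueCounts;

    // Copies the reference only, so the copy shares the reused map.
    FileEntry shallowCopy() {
      FileEntry copy = new FileEntry();
      copy.path = path;
      copy.valueCounts = valueCounts;
      return copy;
    }

    // Copies the map contents, so later refills do not affect the copy.
    FileEntry deepCopy() {
      FileEntry copy = new FileEntry();
      copy.path = path;
      copy.valueCounts = valueCounts == null ? null : new HashMap<>(valueCounts);
      return copy;
    }
  }

  // Simulates a reader that reuses one entry and one map for every record.
  // Returns the count seen through the shallow copy and through the deep copy.
  static long[] scan() {
    FileEntry reused = new FileEntry();
    reused.valueCounts = new HashMap<>();

    // First record: this is the file selected for the scan.
    reused.path = "a.parquet";
    reused.valueCounts.clear();
    reused.valueCounts.put(1, 100L);
    FileEntry shallow = reused.shallowCopy();
    FileEntry deep = reused.deepCopy();

    // Second record: the reader refills the same containers.
    reused.path = "b.parquet";
    reused.valueCounts.clear();
    reused.valueCounts.put(1, 7L);

    return new long[] {shallow.valueCounts.get(1), deep.valueCounts.get(1)};
  }

  public static void main(String[] args) {
    long[] counts = scan();
    System.out.println("shallow copy sees: " + counts[0]);
    System.out.println("deep copy sees: " + counts[1]);
  }
}
```

   The shallow copy reports the second record's count because it still points 
at the reused map, while the deep copy keeps the count from the record that 
was actually selected.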
