vincentpoon commented on issue #5936:
URL: https://github.com/apache/iceberg/issues/5936#issuecomment-1278209330

   @rdblue Hmm, I guess it depends on what "correct" behavior means here, but 
if the partition stats reflect values that can never be returned by a query 
(because the files containing those values have been deleted), then that seems 
incorrect to me.
   
   Changing the behavior would also be a performance improvement, particularly 
when the manifests are quite large, as they are in our use case.  Filtering on 
the partition stats at the manifest list level means that manifests which 
cannot match the query don't have to be read at all.  With incorrect partition 
stats, those manifests are read even when they contain no live files that can 
answer the query.
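
   To make the pruning argument concrete, here is a minimal sketch in Python.
It is a toy model, not Iceberg's API: `Entry`, `summary`, and `must_read` are
hypothetical names, and the partition is assumed to be a single integer field.
Only the status codes (0 = existing, 1 = added, 2 = deleted) come from the
table spec.

```python
from dataclasses import dataclass

# Manifest entry status codes from the Iceberg table spec.
EXISTING, ADDED, DELETED = 0, 1, 2


@dataclass
class Entry:
    """Hypothetical stand-in for a manifest entry (not an Iceberg class)."""
    status: int
    partition_value: int  # assumption: one integer partition field


def summary(entries, include_deleted):
    """Return (lower, upper) partition bounds, or None if no entry counts."""
    values = [e.partition_value for e in entries
              if include_deleted or e.status != DELETED]
    return (min(values), max(values)) if values else None


def must_read(bounds, wanted):
    """The manifest must be opened only if `wanted` falls within its bounds."""
    return bounds is not None and bounds[0] <= wanted <= bounds[1]


# One manifest: two live files plus one entry kept with status 2 (deleted).
manifest = [Entry(EXISTING, 1), Entry(EXISTING, 2), Entry(DELETED, 9)]

stale = summary(manifest, include_deleted=True)   # bounds widened by the deleted entry
live = summary(manifest, include_deleted=False)   # bounds over live files only

# Query: partition_value == 9, which no live file can satisfy.
read_with_stale = must_read(stale, 9)
read_with_live = must_read(live, 9)
```

   With the bounds computed over all entries, the manifest has to be opened
for a query on partition value 9 even though no live file matches; with bounds
computed over live entries only, it can be skipped at the manifest list level.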
   
   Agreed that a mode that simply drops files rather than keeping references 
to them would solve the problem.  But then I would ask: what is the purpose of 
keeping deleted files in the manifests with "Status: 2" (deleted)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

