johnclara opened a new pull request #1739:
URL: https://github.com/apache/iceberg/pull/1739


   This is the patch we used to work around the issue we hit in 
https://github.com/apache/iceberg/issues/1637
   
   A similar issue was reported in 
https://github.com/apache/iceberg/issues/1511, 
   which resulted in the patch https://github.com/apache/iceberg/pull/1514
   
   Patch 1514 solves the issue we ran into in a better way than this patch set.
   
   However, patch 1514 can lead to the same DataFile being read multiple 
times. For my use case, that's totally fine and I will drop this patch when we 
upgrade. But if someone needs to skip duplicate files within a single scan, 
they can hopefully find this PR.
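   
   For anyone who lands here needing that behavior, the general idea of 
skipping duplicates can be sketched as below. This is only an illustration 
under assumptions, not the code in this PR and not Iceberg API: the class 
`DedupSketch`, the method `skipDuplicates`, and the `pathOf` parameter are 
hypothetical names, and it assumes each planned file can be identified by its 
path string.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper, not part of Iceberg: keeps the first occurrence of
// each file (identified by its path) and drops later duplicates.
class DedupSketch {

    // pathOf is an assumed accessor that returns the file's location string.
    static <T> List<T> skipDuplicates(List<T> planned, Function<T, String> pathOf) {
        Set<String> seenPaths = new HashSet<>();
        List<T> unique = new ArrayList<>();
        for (T item : planned) {
            // Set.add returns false when the path was already seen.
            if (seenPaths.add(pathOf.apply(item))) {
                unique.add(item);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        // Plain strings stand in for planned files in this sketch.
        List<String> planned = List.of(
            "s3://bucket/a.parquet",
            "s3://bucket/b.parquet",
            "s3://bucket/a.parquet");
        List<String> unique = skipDuplicates(planned, p -> p);
        System.out.println(unique); // [s3://bucket/a.parquet, s3://bucket/b.parquet]
    }
}
```

   The sketch preserves the original order and holds every seen path in 
memory, which may matter for scans that plan a very large number of files.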
   
   I'll leave this open until @mehtaashish23 confirms that the duplicates are 
known and intended. Then I'll close this.

