johnclara opened a new pull request #1739: URL: https://github.com/apache/iceberg/pull/1739
This is the patch we used to work around the issue we hit here: https://github.com/apache/iceberg/issues/1637

It looks like a similar issue was reported in https://github.com/apache/iceberg/issues/1511, which resulted in this patch: https://github.com/apache/iceberg/pull/1514

Patch 1514 solves the issue we ran into in a better way than this patch set. However, it can lead to the same DataFile being read more than once. For my use case that's totally fine, and I will drop this patch when we upgrade. But if someone needs to skip duplicate files within a single scan, they can hopefully find this PR.

I'll leave this open until @mehtaashish23 confirms that the duplicates are known and intended. Then I'll close it.
