rizaon commented on PR #4518: URL: https://github.com/apache/iceberg/pull/4518#issuecomment-1099703296
[c032b7a](https://github.com/apache/iceberg/pull/4518/commits/c032b7ab2f2db941fe5433c13a38dfcb8bf538ef) implements caching as a new FileIO class, CachingHadoopFileIO. A new Tables class, CachingHadoopTables, is also added to assist with testing. We try to avoid calling `lazyStats()` as much as possible by checking the cache first, before checking the stream length. This comes with a risk of keeping stale data in memory after the actual file has been deleted, as shown in `CachingHiveTableTest.testMissingMetadataWontCauseHang()`. This is probably fine, given the immutable nature of Iceberg metadata. Otherwise, the application should invalidate the cache entry first to force a re-read and a `lazyStats()` call.
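
To illustrate the trade-off described above, here is a minimal, self-contained Java sketch of a cache-first length lookup. It is not the code from the commit: `CacheFirstLengthSketch`, `statLookup`, `statCalls()`, and the `"v1.metadata.json"` location are all hypothetical names, and the `ToLongFunction` only stands in for whatever storage call `lazyStats()` ends up making. The sketch assumes that call fails when the file is missing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.ToLongFunction;

/** Hypothetical sketch of a cache-first length lookup; not the PR's CachingHadoopFileIO. */
final class CacheFirstLengthSketch {
  // location -> cached file content
  private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
  // Stand-in (assumption) for the storage round trip that lazyStats() would trigger.
  private final ToLongFunction<String> statLookup;
  private final AtomicInteger statCalls = new AtomicInteger();

  CacheFirstLengthSketch(ToLongFunction<String> statLookup) {
    this.statLookup = statLookup;
  }

  void put(String location, byte[] content) {
    cache.put(location, content);
  }

  long getLength(String location) {
    byte[] cached = cache.get(location);
    if (cached != null) {
      // Cache hit: answer from memory and skip the stat call entirely.
      return cached.length;
    }
    // Cache miss: fall back to the storage lookup.
    statCalls.incrementAndGet();
    return statLookup.applyAsLong(location);
  }

  void invalidate(String location) {
    cache.remove(location);
  }

  int statCalls() {
    return statCalls.get();
  }

  public static void main(String[] args) {
    // Fake storage: location -> length. A missing key models a deleted file.
    Map<String, Long> storage = new HashMap<>();
    storage.put("v1.metadata.json", 3L);

    CacheFirstLengthSketch io = new CacheFirstLengthSketch(loc -> {
      Long length = storage.get(loc);
      if (length == null) {
        throw new IllegalStateException("File does not exist: " + loc);
      }
      return length;
    });
    io.put("v1.metadata.json", new byte[] {1, 2, 3});

    storage.remove("v1.metadata.json"); // the file is deleted behind the cache
    System.out.println("stale length: " + io.getLength("v1.metadata.json"));

    io.invalidate("v1.metadata.json"); // force the next call to hit storage
    try {
      io.getLength("v1.metadata.json");
    } catch (IllegalStateException e) {
      System.out.println("after invalidate: " + e.getMessage());
    }
  }
}
```

In this sketch the deleted file is still served from memory until `invalidate` is called, which mirrors the stale-entry risk mentioned above. Only after invalidation does the lookup reach storage and surface the missing file.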
