pvary commented on issue #2319: URL: https://github.com/apache/iceberg/issues/2319#issuecomment-797277486
> Also, it's inconsistent with how Hive and Presto handle Iceberg tables; but also how Spark handles queries to non-Iceberg tables.

Hive should also use the same snapshot of the table at the query level, but a refresh is expected between sessions and transactions (currently queries). Since Hive query execution spans multiple JVMs, we have to find our own way of snapshotting tables. We have already started working on this (see BaseTable serialization).

> I agree, this would solve caching for saving resources. However, this does not address the self-join concerns mentioned before, since they rely on looking at the same snapshot.

I think the current CachingCatalog is too specific for general use, but it still has its own use cases. Also, since this is a released feature, some users might depend on its specific behavior. I would suggest creating a new one alongside it, and when it is ready we can decide whether to deprecate the old one. What do you think?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
