rdblue commented on pull request #2659: URL: https://github.com/apache/iceberg/pull/2659#issuecomment-861698721
> Is this the only place in iceberg that a table would be cached?

The table will be referenced by Spark plans as well. I think the problem was that those plans weren't being invalidated when you ran `REFRESH TABLE t`, because the catalog's `invalidateTable` method calls `refresh` on the table reference that it loads. So if a table was cleared from the cache, the existing references in Spark would no longer be updated by the catalog's `invalidateTable` call.

That seems like a Spark problem rather than a catalog problem to me, which is why I think we should revisit this decision. Shouldn't Spark invalidate cached plans that reference a table when `REFRESH TABLE` runs, rather than assuming that the catalog can do it? We may also want to purposely keep a table's state separate while it is referenced by a cached plan.

@aokolnychyi, what did we decide was the "correct" behavior when a query is cached?
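To make the failure mode concrete, here is a minimal toy sketch (not Iceberg's actual `CachingCatalog` or `Table` classes, just hypothetical stand-ins) of why `invalidateTable` only helps references that are still in the cache: refreshing the cached instance updates every holder of that shared reference, but once the entry is evicted, a later `invalidateTable` has nothing to refresh and stale references held by cached Spark plans are left behind.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a table whose metadata can be reloaded.
class Table {
    int version = 1;

    void refresh() {
        version++; // simulate reloading the latest metadata
    }
}

// Hypothetical stand-in for a caching catalog that shares Table instances.
class CachingCatalog {
    private final Map<String, Table> cache = new HashMap<>();

    Table loadTable(String name) {
        // Every caller (e.g. a Spark plan) gets the same cached instance.
        return cache.computeIfAbsent(name, n -> new Table());
    }

    void invalidateTable(String name) {
        Table cached = cache.get(name);
        if (cached != null) {
            // Refreshing the shared instance updates all existing references.
            cached.refresh();
        }
        // If the entry was already evicted, references held elsewhere
        // (such as in cached Spark plans) are never refreshed.
    }

    void evict(String name) {
        cache.remove(name);
    }
}

public class Demo {
    public static void main(String[] args) {
        CachingCatalog catalog = new CachingCatalog();
        Table planRef = catalog.loadTable("t"); // reference held by a Spark plan

        catalog.invalidateTable("t");
        System.out.println(planRef.version); // shared instance was refreshed

        catalog.evict("t");          // table cleared from the cache
        catalog.invalidateTable("t"); // no-op: nothing left to refresh
        System.out.println(planRef.version); // planRef is now stale
    }
}
```

The sketch shows why clearing a cache entry and invalidating a table are not interchangeable: after eviction, only Spark itself still knows about the plan's reference, which is the argument for Spark invalidating cached plans on `REFRESH TABLE` instead of relying on the catalog.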
