[GitHub] [iceberg] rdblue commented on pull request #3837: Spark: Fix `Table UUID does not match` problem when enable CachingCatalog

GitBox Fri, 07 Jan 2022 15:09:53 -0800


rdblue commented on pull request #3837:
URL: https://github.com/apache/iceberg/pull/3837#issuecomment-1007809606



   Thanks for working on this, @smallx! I've been thinking about this since it 
wasn't clear at first what the behavior of a caching catalog _should_ be when a 
table is dropped and re-created. In the end, I think that the behavior in this 
PR is correct: if you call `REFRESH TABLE` then it should invalidate the cache 
and the next call to `load` should return whatever table is referenced by the 
given name.
   
   I think the right way to implement this is the more direct approach, 
closer to your current version than to the initial one. The `Catalog` should 
expose an `invalidateTable` method that evicts the table from the cache, so 
that a subsequent call to `loadTable` reloads it. That differs slightly from 
the current version in two ways:
   1. We should add `invalidateTable` to `Catalog` so that you don't need to 
cast the catalog to `CachingCatalog`
       * The default implementation should be a no-op
   2. `SparkCatalog.invalidateTable` should not call `load`. Instead, it should 
only evict the cached entry and leave the reload to the next table load.
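
To make the two points concrete, here is a minimal sketch of the shape this could take. It is an illustration only: `SketchCatalog`, `SketchTable`, `InMemoryCatalog`, and `SketchCachingCatalog` are hypothetical, simplified stand-ins and not the real Iceberg `Catalog` or `CachingCatalog` classes, and table names are plain strings instead of identifiers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for a table; only the name and UUID matter here.
final class SketchTable {
    final String name;
    final UUID uuid;

    SketchTable(String name, UUID uuid) {
        this.name = name;
        this.uuid = uuid;
    }
}

// Hypothetical stand-in for the catalog interface.
interface SketchCatalog {
    SketchTable loadTable(String name);

    // Default is a no-op, so non-caching catalogs need no changes and
    // callers never have to cast to a caching implementation.
    default void invalidateTable(String name) {
    }
}

// Backing catalog without a cache: every load reflects the current state.
final class InMemoryCatalog implements SketchCatalog {
    private final Map<String, SketchTable> tables = new HashMap<>();

    void createTable(String name) {
        tables.put(name, new SketchTable(name, UUID.randomUUID()));
    }

    void dropTable(String name) {
        tables.remove(name);
    }

    @Override
    public SketchTable loadTable(String name) {
        SketchTable table = tables.get(name);
        if (table == null) {
            throw new IllegalArgumentException("No such table: " + name);
        }
        return table;
    }
}

// Caching wrapper: invalidateTable only evicts; it does not reload.
final class SketchCachingCatalog implements SketchCatalog {
    private final SketchCatalog delegate;
    private final Map<String, SketchTable> cache = new HashMap<>();

    SketchCachingCatalog(SketchCatalog delegate) {
        this.delegate = delegate;
    }

    @Override
    public SketchTable loadTable(String name) {
        // Loads from the delegate only when nothing is cached for this name.
        return cache.computeIfAbsent(name, delegate::loadTable);
    }

    @Override
    public void invalidateTable(String name) {
        cache.remove(name);
        delegate.invalidateTable(name);
    }
}
```

With this shape, a drop and re-create in the backing catalog stays invisible through the cache until `invalidateTable` is called. After that, the next `loadTable` returns whatever table the name currently refers to, which is the behavior `REFRESH TABLE` should produce.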


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


