bryanck commented on PR #8555:
URL: https://github.com/apache/iceberg/pull/8555#issuecomment-1718003552

   > I am very concerned about this simplified solution. We should consider:
   > 
   > * Concurrency issues, when the table metadata has been changed between 
reloads. Maybe we should only refresh the table metadata during checkpoints, 
so we can localize and handle changes in one place?
   > * Load issues, as every TaskManager with a writer task connects to the 
catalog, and does so periodically. This will become costly very soon. I would 
prioritize a centralized solution.
   > * The current default implementation does not solve the schema evolution 
problem, since that would mean on-demand table reload instead of periodic 
reload.
   > 
   > If the main goal here is to refresh the credentials, then maybe a proxy 
above the accessor could be a better solution, like the 
https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java
where we get a new Kerberos token when the previous one has expired.
   > 
   > What do you think?
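
   (For reference on the proxy idea above: a minimal, hypothetical sketch of a 
RetryingMetaStoreClient-style wrapper is below. The CatalogClient interface, 
the reauthenticate hook, and the failure handling are placeholders, not actual 
Iceberg or Hive classes; a real version would only re-authenticate on an 
auth/expiry error.)

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical client interface; stands in for whatever accessor talks to the catalog.
interface CatalogClient {
  String loadTableMetadata(String tableName);
}

// Every call goes through an InvocationHandler that re-authenticates and retries
// once when the delegate call fails (a real version would check for an auth/expiry error).
final class RefreshingClientProxy implements InvocationHandler {
  private final CatalogClient delegate;
  private final Runnable reauthenticate; // e.g. re-login or fetch a new token

  private RefreshingClientProxy(CatalogClient delegate, Runnable reauthenticate) {
    this.delegate = delegate;
    this.reauthenticate = reauthenticate;
  }

  static CatalogClient wrap(CatalogClient delegate, Runnable reauthenticate) {
    return (CatalogClient) Proxy.newProxyInstance(
        CatalogClient.class.getClassLoader(),
        new Class<?>[] {CatalogClient.class},
        new RefreshingClientProxy(delegate, reauthenticate));
  }

  @Override
  public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
    try {
      return method.invoke(delegate, args);
    } catch (InvocationTargetException e) {
      reauthenticate.run();                  // refresh credentials, then retry once
      return method.invoke(delegate, args);
    }
  }
}
```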
   
   To address your points:
   * This PR was updated so the committer will now only reload the table at 
checkpoints. The committer was already updating the table metadata as part of 
the Iceberg commit and upon restart. Writers will always use the original 
table metadata and only use the reloaded table to get an updated FileIO, 
though the hope is that writers will use the reloaded metadata in the future 
(a rough sketch of this flow follows the list).
   * This PR was designed so that a different, more optimal table supplier 
solution can be plugged in later. There were limitations with the options we 
explored for centralized table reload, such as using the new token delegation 
framework or using a broadcast (I'd be happy to discuss details if you're 
interested). We went with this initial solution as a starting point. The 
feature is disabled by default and marked as experimental, so it should only 
be used when it is known that a job will not overburden the catalog with load 
requests.
   * This PR was not meant to solve the schema evolution problem, but rather 
to take a step towards that long-term goal.
   * The short-term goal for our use case was to refresh credentials, so we 
initially explored adding a FileIO-specific refresh. However, the long-term 
goal is to support table reload in a running Flink job so that schema 
evolution becomes possible. We felt that adding the table refresh capability 
would be a better step towards that goal, and it also addresses the 
particular issue we are having.
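
   To make the first point concrete, here is a rough, hypothetical sketch of 
the kind of table supplier described above (class and method names are 
placeholders, not the actual classes in this PR): the committer triggers the 
reload when it commits at a checkpoint, while writers keep their original 
metadata and only pick up the reloaded table's FileIO.

```java
import java.io.Serializable;
import java.util.function.Supplier;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.io.FileIO;

// Hypothetical supplier: the table loaded at start is kept until the committer
// explicitly refreshes it, which happens at checkpoint / Iceberg commit time.
class CheckpointRefreshedTableSupplier implements Supplier<Table>, Serializable {
  private final Catalog catalog;            // assumed to be a serializable catalog handle
  private final TableIdentifier identifier;
  private transient volatile Table current;

  CheckpointRefreshedTableSupplier(Catalog catalog, TableIdentifier identifier) {
    this.catalog = catalog;
    this.identifier = identifier;
  }

  @Override
  public Table get() {
    if (current == null) {
      current = catalog.loadTable(identifier);
    }
    return current;
  }

  // Called by the committer when it commits at a checkpoint, so metadata
  // changes are picked up in one place.
  void refresh() {
    current = catalog.loadTable(identifier);
  }

  // Writers keep the schema/spec they started with, but always take FileIO from
  // the most recently reloaded table so refreshed credentials take effect.
  FileIO io() {
    return get().io();
  }
}
```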
   
   cc @danielcweeks @rdblue in case you have additional perspective or 
feedback...
   

