rdblue commented on pull request #4254: URL: https://github.com/apache/iceberg/pull/4254#issuecomment-1061291885
@danielcweeks and @jackye1995, are we sure that dynamic loading is appropriate here? I don't think that this fits with where we've used dynamic loading in the past.

Dynamic loading is used where we want to be able to customize the implementation from the catalog. For example, a catalog may dynamically choose GCS or S3 or another FileIO implementation based on the catalog settings and table details (like root location). But the metrics that this affects are reported back to a processing engine, which is determined by the environment. What we've been discussing for Flink is to have some way to connect FileIO metrics to Flink jobs and tasks, which likely means passing in a MetricsContext from Flink. This conflicts with the idea of passing down configuration that defines it. It doesn't make sense to me that we might allow a catalog or table to override something that is engine-specific.

One thing that Iceberg has been fairly successful at so far is putting configuration in the right place. I think this is really similar: we should not use table or catalog configuration for engine concerns.
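To make the distinction concrete, here is a minimal sketch of the split described above: the catalog side picks the storage implementation from the table location, while the engine hands in its own metrics context. Every name in it (SketchMetricsContext, EngineMetricsContext, SketchFileIO, chooseImplementation, the "write.bytes" counter) is hypothetical and simplified for illustration; none of it is Iceberg's actual FileIO or MetricsContext API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-in for a metrics context (not Iceberg's interface).
interface SketchMetricsContext {
  void increment(String name, long amount);
}

// Hypothetical engine-side implementation, such as a Flink or Spark integration
// might supply so that counters are tied to its own jobs and tasks.
class EngineMetricsContext implements SketchMetricsContext {
  private final Map<String, AtomicLong> counters = new HashMap<>();

  @Override
  public void increment(String name, long amount) {
    counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(amount);
  }

  long value(String name) {
    AtomicLong counter = counters.get(name);
    return counter == null ? 0L : counter.get();
  }
}

// Hypothetical FileIO-like class: storage choice comes from the catalog side,
// the metrics sink comes from the engine.
class SketchFileIO {
  private final String implementation;
  private final SketchMetricsContext metrics;

  SketchFileIO(String tableLocation, SketchMetricsContext metrics) {
    this.implementation = chooseImplementation(tableLocation);
    this.metrics = metrics;
  }

  // Catalog concern: select an implementation name from the table's root location.
  static String chooseImplementation(String tableLocation) {
    if (tableLocation.startsWith("s3://")) {
      return "s3";
    } else if (tableLocation.startsWith("gs://")) {
      return "gcs";
    }
    return "local";
  }

  String implementation() {
    return implementation;
  }

  // A real implementation would write to storage; this sketch only records the metric.
  void write(String path, byte[] data) {
    metrics.increment("write.bytes", data.length);
  }
}

class MetricsPlacementSketch {
  public static void main(String[] args) {
    EngineMetricsContext engineMetrics = new EngineMetricsContext();
    SketchFileIO io = new SketchFileIO("s3://bucket/warehouse/db/table", engineMetrics);
    io.write("s3://bucket/warehouse/db/table/data/file-a", new byte[10]);
    System.out.println(io.implementation() + " " + engineMetrics.value("write.bytes"));
  }
}
```

In this arrangement nothing in the catalog or table properties can replace the metrics sink; only the caller that constructs the IO object, standing in for the engine, decides where metrics go.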
