kainoa21 commented on issue #3044:
URL: https://github.com/apache/iceberg/issues/3044#issuecomment-908026571


   Thanks for putting this [PR](https://github.com/apache/iceberg/pull/3047) up 
so quickly @jackye1995.
   I think these changes are reasonable, but I have a couple of high-level 
comments.
   
   1. Additional updates to the FlinkCatalogFactory are still needed on top of 
these changes in order to fully remove the hadoop dependency.  One possible 
solution is to add `glue` as one of the known `catalog-type`'s and avoid 
calling `clusterHadoopConf()` when the user sets this value (I tested this 
change with a local build and it worked in a non-hadoop environment).
   2. While this addresses the issue for the GlueCatalog, it will not fix this 
issue for other catalog implementation which also should be able to work 
outside of a hadoop environment (e.g. DynamoCatalog).
   
   I am not completely sure how best to address concern 2.  When I look at both 
the GlueCatalog and DynamoCatalog, it seems the only reason for the 
`Configurable` interface is for supporting dynamic loading of FileIO 
implementations which may or may not need to inspect hadoop conf (currently 
only `HadoopFileIO`).   `HadoopCatalog` and `HiveCatalog` both look for 
additional configuration settings in the hadoop conf (beyond just for 
`HadoopFileIO`).
   
   Perhaps it makes sense to decouple these two dependencies such that a FileIO 
implementation can be provided to any catalog independent of that catalog's 
interaction with hadoop (or lack thereof).  I think this would also simplify 
the story for the `GlueCatalog` without having to have two distinct classes.  
(There is a potential strange place a user could get into trying to use the 
base `GlueCatalog` for the `catalog-impl` and `HadoopFileIO` for the `io-impl`).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to