[GitHub] [iceberg] jackye1995 commented on issue #3044: Unable to use GlueCatalog in flink environments without hadoop

GitBox Sun, 29 Aug 2021 10:11:11 -0700


jackye1995 commented on issue #3044:
URL: https://github.com/apache/iceberg/issues/3044#issuecomment-907829687



   > If the only reason to have the GlueCatalog implement Configurable is to 
satisfy the current constraints of dynamic FileIO loading, likely we can change 
that (I believe dynamic FileIO loading was added when maybe the second 
additional Catalog beyond the basic Hadoop and Hive ones were added).
   
   It was added because some users still want to use `GlueCatalog` with 
`HadoopFileIO`. One example is users on the EMR file system who have to use 
the `EmrFileSystem` plugin to make sure all their file system accesses are 
synchronized. (I know EmrFS is on a deprecation path, but it's a sticky 
dependency that is not easy to migrate away from.)
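
For illustration, a minimal sketch of the catalog properties such a user might pass to pair `GlueCatalog` with `HadoopFileIO`. The property keys `catalog-impl`, `io-impl` and `warehouse` are the standard Iceberg catalog property names as far as I know; the class name `GlueWithHadoopFileIOProps` and the bucket path are hypothetical, and the snippet only builds the map rather than initializing a real catalog.

```java
import java.util.HashMap;
import java.util.Map;

public class GlueWithHadoopFileIOProps {

  // Builds catalog properties that select GlueCatalog as the catalog
  // implementation and HadoopFileIO as the FileIO implementation.
  static Map<String, String> catalogProperties(String warehouse) {
    Map<String, String> props = new HashMap<>();
    props.put("catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog");
    // Overrides the catalog's default FileIO so file access goes through
    // Hadoop's FileSystem API (and therefore through plugins such as EmrFileSystem).
    props.put("io-impl", "org.apache.iceberg.hadoop.HadoopFileIO");
    props.put("warehouse", warehouse);
    return props;
  }

  public static void main(String[] args) {
    // Hypothetical warehouse location, used only for the example.
    Map<String, String> props = catalogProperties("s3://my-bucket/warehouse");
    props.forEach((k, v) -> System.out.println(k + "=" + v));
  }
}
```

Choosing `HadoopFileIO` this way is what pulls in the Hadoop `Configuration` dependency discussed in this thread.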
   
   That said, I agree that the Glue catalog should work without a Hadoop 
installation. That is the most important reason to have such an independent 
implementation.
   
   One approach I can think of is to add a separate `HadoopGlueCatalog` that 
supports Hadoop configuration, and remove the configurable aspect of the Glue 
catalog. Let me put up a PR for this so we can discuss it in more detail.
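
A rough sketch of how such a split could look, under loud assumptions: the comment does not say whether `HadoopGlueCatalog` would subclass or wrap `GlueCatalog`, so subclassing here is a guess. Every type below (`HadoopConf`, `FileIO`, `S3LikeFileIO`, `HadoopLikeFileIO`, and the simplified `GlueCatalog`) is a hypothetical stand-in defined inside the snippet, not the real Iceberg or Hadoop class.

```java
import java.util.HashMap;
import java.util.Map;

public class GlueCatalogSplitSketch {

  // Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
  static class HadoopConf {
    private final Map<String, String> values = new HashMap<>();
    void set(String key, String value) { values.put(key, value); }
    String get(String key) { return values.get(key); }
  }

  // Hypothetical, minimal FileIO abstraction.
  interface FileIO {
    String describe();
  }

  // Stand-in for a FileIO that needs no Hadoop classes at all.
  static class S3LikeFileIO implements FileIO {
    public String describe() { return "s3-like"; }
  }

  // Stand-in for a FileIO that is backed by a Hadoop configuration.
  static class HadoopLikeFileIO implements FileIO {
    private final HadoopConf conf;
    HadoopLikeFileIO(HadoopConf conf) { this.conf = conf; }
    public String describe() { return "hadoop-like fs=" + conf.get("fs.impl"); }
  }

  // Base catalog: no Hadoop type appears in its signatures, so it can be
  // loaded in an environment without a Hadoop installation.
  static class GlueCatalog {
    private FileIO io;

    void initialize(String name, Map<String, String> properties) {
      this.io = loadFileIO(properties);
    }

    protected FileIO loadFileIO(Map<String, String> properties) {
      return new S3LikeFileIO();
    }

    FileIO io() { return io; }
  }

  // Hadoop-aware variant: the only place that accepts a Hadoop configuration.
  static class HadoopGlueCatalog extends GlueCatalog {
    private HadoopConf conf = new HadoopConf();

    void setConf(HadoopConf conf) { this.conf = conf; }

    @Override
    protected FileIO loadFileIO(Map<String, String> properties) {
      return new HadoopLikeFileIO(conf);
    }
  }

  public static void main(String[] args) {
    GlueCatalog plain = new GlueCatalog();
    plain.initialize("glue", new HashMap<>());
    System.out.println(plain.io().describe());

    HadoopConf conf = new HadoopConf();
    conf.set("fs.impl", "EmrFileSystem");
    HadoopGlueCatalog hadoopAware = new HadoopGlueCatalog();
    hadoopAware.setConf(conf);
    hadoopAware.initialize("glue", new HashMap<>());
    System.out.println(hadoopAware.io().describe());
  }
}
```

The point of the split is that only the Hadoop-aware class references the configuration type, so users without Hadoop on the classpath never load it.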


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


