kainoa21 opened a new issue #3044:
URL: https://github.com/apache/iceberg/issues/3044


   When attempting to use the GlueCatalog implementation (or really any implementation) in Flink, Hadoop is expected to be on the classpath.
   
   The [FlinkCatalogFactory](https://github.com/apache/iceberg/blob/4eb0853cd787bf2f5778195558d22a45ecf6c601/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java) always [attempts to load](https://github.com/apache/iceberg/blob/4eb0853cd787bf2f5778195558d22a45ecf6c601/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java#L118) the Hadoop config from Flink, but Flink does not guarantee that a valid Hadoop environment is present. In environments where Hadoop is not available (e.g. AWS Kinesis Data Analytics), this throws `java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration`.
   
   Presently, most of the catalog implementations implement `Configurable`, and thus util functions like [loadCatalog](https://github.com/apache/iceberg/blob/4eb0853cd787bf2f5778195558d22a45ecf6c601/core/src/main/java/org/apache/iceberg/CatalogUtil.java#L170) expect to be passed a Hadoop `Configuration` instance. In catalogs like GlueCatalog and DynamoCatalog, the only reason for the `Configurable` interface is to enable [dynamic FileIO loading](https://github.com/apache/iceberg/blob/4eb0853cd787bf2f5778195558d22a45ecf6c601/aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java#L110).
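   One possible mitigation is to probe for Hadoop on the classpath before touching any `org.apache.hadoop` class, and pass `null` (or skip configuration entirely) when it is absent. The sketch below is purely illustrative — `HadoopGuard` and its methods are hypothetical helpers, not part of Iceberg or Flink — but it shows the reflection-based guard pattern that avoids the `NoClassDefFoundError` at class-loading time:

   ```java
   // Hypothetical sketch: guard all Hadoop access behind a reflective
   // classpath check so catalogs can be created without Hadoop present.
   public class HadoopGuard {

     // True if org.apache.hadoop.conf.Configuration can be loaded.
     static boolean hadoopAvailable() {
       try {
         Class.forName("org.apache.hadoop.conf.Configuration");
         return true;
       } catch (ClassNotFoundException e) {
         return false;
       }
     }

     // Instantiate the Hadoop Configuration reflectively, so this class
     // has no compile-time or link-time dependency on Hadoop.
     // Returns null when Hadoop is not on the classpath.
     static Object loadHadoopConfOrNull() {
       if (!hadoopAvailable()) {
         return null;
       }
       try {
         return Class.forName("org.apache.hadoop.conf.Configuration")
             .getDeclaredConstructor()
             .newInstance();
       } catch (ReflectiveOperationException e) {
         return null;
       }
     }
   }
   ```

   Because the Hadoop class name only ever appears as a string, the JVM never tries to link against Hadoop unless the check succeeds, so a Hadoop-free deployment simply receives `null` instead of crashing.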
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
