rdblue commented on a change in pull request #1844:
URL: https://github.com/apache/iceberg/pull/1844#discussion_r533015622
##########
File path: core/src/main/java/org/apache/iceberg/CatalogUtil.java
##########
@@ -171,45 +172,57 @@ public static Catalog loadCatalog(
/**
* Load a custom {@link FileIO} implementation.
+ */
+ public static FileIO loadFileIO(
+ String impl,
+ Map<String, String> properties,
+ Configuration hadoopConf) {
+ return loadCatalogConfigurable(impl, properties, hadoopConf, FileIO.class);
+ }
+
+ /**
+ * Load a custom implementation of a {@link CatalogConfigurable}.
* <p>
* The implementation must have a no-arg constructor.
- * If the class implements {@link Configurable},
+ * If the class implements Hadoop's {@link Configurable},
* a Hadoop config will be passed using {@link Configurable#setConf(Configuration)}.
- * {@link FileIO#initialize(Map properties)} is called to complete the initialization.
+ * {@link CatalogConfigurable#initialize(Map properties)} is called to complete the initialization.
*
* @param impl full class name of a custom FileIO implementation
* @param hadoopConf hadoop configuration
+ * @param resultClass the final return class type
* @return FileIO class
* @throws IllegalArgumentException if class path not found or
* right constructor not found or
* the loaded class cannot be cast to the given interface type
*/
- public static FileIO loadFileIO(
+ public static <T extends CatalogConfigurable> T loadCatalogConfigurable(
String impl,
Map<String, String> properties,
- Configuration hadoopConf) {
- LOG.info("Loading custom FileIO implementation: {}", impl);
- DynConstructors.Ctor<FileIO> ctor;
+ Configuration hadoopConf,
+ Class<T> resultClass) {
+ LOG.info("Loading custom {} implementation: {}", resultClass.getName(), impl);
+ DynConstructors.Ctor<T> ctor;
try {
- ctor = DynConstructors.builder(FileIO.class).impl(impl).buildChecked();
+ ctor = DynConstructors.builder(resultClass).impl(impl).buildChecked();
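The loading pattern in the diff above can be sketched outside Iceberg. This is a minimal, hypothetical stand-in: `Initializable`, `ExampleIO`, and `load` are illustrative names (the real code uses `CatalogConfigurable` and Iceberg's `DynConstructors`), but the contract is the same: a no-arg constructor, a cast check against the requested interface, then `initialize(Map)` to complete configuration from a string map.

```java
import java.util.Map;

public class LoaderSketch {

  // Hypothetical stand-in for Iceberg's CatalogConfigurable interface.
  public interface Initializable {
    void initialize(Map<String, String> properties);
  }

  // A sample implementation with the required no-arg constructor.
  public static class ExampleIO implements Initializable {
    String warehouse;

    public ExampleIO() {}  // no-arg constructor, as the javadoc requires

    @Override
    public void initialize(Map<String, String> properties) {
      this.warehouse = properties.get("warehouse");
    }
  }

  // Reflective load: mirrors loadCatalogConfigurable without DynConstructors.
  public static <T extends Initializable> T load(
      String impl, Map<String, String> properties, Class<T> resultClass) throws Exception {
    Object instance = Class.forName(impl).getDeclaredConstructor().newInstance();
    if (!resultClass.isInstance(instance)) {
      throw new IllegalArgumentException(
          impl + " does not implement " + resultClass.getName());
    }
    T result = resultClass.cast(instance);
    result.initialize(properties);  // complete initialization from the string map
    return result;
  }

  public static String demo() throws Exception {
    Initializable io = load(
        ExampleIO.class.getName(), Map.of("warehouse", "s3://bucket"), ExampleIO.class);
    return ((ExampleIO) io).warehouse;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(demo());
  }
}
```

The real method additionally passes the Hadoop `Configuration` via `Configurable#setConf` when the class implements Hadoop's `Configurable`; that step is omitted here to keep the sketch dependency-free.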
Review comment:
> it's also a little ridiculous having to serialize all the config for FileIO and others into Map<String, String> so that it can be loaded from properties if it's already serializable.

I don't think I'm following what you are suggesting. The configuration
passed in comes from a string map, like a Spark config or Hadoop config.
Serialization is a concern after catalogs are configured because the components
that were created on the driver need to be serialized to be sent for use on
executors. Those are separate enough that I'm not sure how to improve on this.
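The two concerns in the reply can be separated in a small sketch, with hypothetical names (`ConfiguredIO` is illustrative, not an Iceberg class): configuration happens once from a string map on the driver, and Java serialization then carries the already-configured object to executors.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;

public class SerializationSketch {

  // Illustrative FileIO-like component: configured from a string map on the
  // driver, then Java-serialized to be shipped to executors.
  public static class ConfiguredIO implements Serializable {
    private String warehouse;

    public void initialize(Map<String, String> properties) {
      this.warehouse = properties.get("warehouse");
    }

    public String warehouse() {
      return warehouse;
    }
  }

  // Round-trip through Java serialization, as a Spark driver -> executor send would do.
  public static ConfiguredIO roundTrip(ConfiguredIO io) throws Exception {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(io);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (ConfiguredIO) in.readObject();
    }
  }

  public static String demo() throws Exception {
    ConfiguredIO io = new ConfiguredIO();
    io.initialize(Map.of("warehouse", "s3://bucket"));  // config from a Spark-style string map
    return roundTrip(io).warehouse();                   // configured state survives serialization
  }

  public static void main(String[] args) throws Exception {
    System.out.println(demo());
  }
}
```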