codope commented on code in PR #11259:
URL: https://github.com/apache/hudi/pull/11259#discussion_r1607618163


##########
hudi-common/src/main/java/org/apache/hudi/storage/HoodieStorageUtils.java:
##########
@@ -33,6 +35,13 @@ public static HoodieStorage getStorage(String basePath, 
StorageConfiguration<?>
   }
 
   public static HoodieStorage getStorage(StoragePath path, 
StorageConfiguration<?> conf) {
-    return getIOFactory(conf).getStorage(path);
+    String storageClass = 
conf.getString(HoodieStorageConfig.HOODIE_STORAGE_CLASS.key())
+        .orElse(HoodieStorageConfig.HOODIE_STORAGE_CLASS.defaultValue());
+    try {
+      return (HoodieStorage) ReflectionUtils.loadClass(

Review Comment:
   Reflection is still being used to create `HoodieStorage`. What was the 
problem before this change? I mean `HoodieIOFactory` also holds 
`HoodieStorage`, so why can't we follow the previous approach where reflection 
was being used only to get `HoodieIOFactory`, which returned storage held by 
it? 



##########
hudi-common/src/main/java/org/apache/hudi/metadata/AbstractHoodieTableMetadata.java:
##########
@@ -35,16 +36,18 @@
 public abstract class AbstractHoodieTableMetadata implements 
HoodieTableMetadata {
 
   protected transient HoodieEngineContext engineContext;
+  protected transient HoodieStorage storage;

Review Comment:
   Trying to understand what necessitated this change? I mean why do we need 
`HoodieStorage` in addition to `StorageConfiguration`?



##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieStorageConfig.java:
##########
@@ -243,7 +243,12 @@ public class HoodieStorageConfig extends HoodieConfig {
       .withDocumentation("The fully-qualified class name of the factory class 
to return readers and writers of files used "
           + "by Hudi. The provided class should implement 
`org.apache.hudi.io.storage.HoodieIOFactory`.");
 
-
+  public static final ConfigProperty<String> HOODIE_STORAGE_CLASS = 
ConfigProperty

Review Comment:
   Do we need a separate config? Can we infer from `HOODIE_IO_FACTORY_CLASS`? 
Or maybe we can add infer function.



##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieStorageConfig.java:
##########
@@ -243,7 +243,12 @@ public class HoodieStorageConfig extends HoodieConfig {
       .withDocumentation("The fully-qualified class name of the factory class 
to return readers and writers of files used "
           + "by Hudi. The provided class should implement 
`org.apache.hudi.io.storage.HoodieIOFactory`.");
 
-
+  public static final ConfigProperty<String> HOODIE_STORAGE_CLASS = 
ConfigProperty
+      .key("hoodie.storage.class")
+      .defaultValue("org.apache.hudi.storage.hadoop.HoodieHadoopStorage")

Review Comment:
   let's also add `withValidValues(...)`?



##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileReaderFactory.java:
##########
@@ -40,9 +39,9 @@
  */
 public class HoodieFileReaderFactory {
 
-  protected final StorageConfiguration<?> storageConf;
-  public HoodieFileReaderFactory(StorageConfiguration<?> storageConf) {
-    this.storageConf = storageConf;
+  protected final HoodieStorage storage;
+  public HoodieFileReaderFactory(HoodieStorage storage) {

Review Comment:
   `StorageConfiguration` is serializable. Now that is replaced by 
`HoodieStorage` in file readers, writers, and io factory, so do we need 
`HoodieStorage` to be serializable too?



##########
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##########
@@ -384,13 +384,12 @@ public Boolean isMetadataTable() {
 
   public HoodieStorage getStorage() {
     if (storage == null) {
+      HoodieStorage newStorage = HoodieStorageUtils.getStorage(metaPath, 
getStorageConf());

Review Comment:
   In the query engine code, I guess meta client is created exactly once 
throughout the lifecycle of the query? 
   Also, is `HoodieStorage` used by workers/executors in any way? Since 
`HoodieStorage` is not serialized, so we should be careful.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to