codope commented on code in PR #11259:
URL: https://github.com/apache/hudi/pull/11259#discussion_r1607618163
##########
hudi-common/src/main/java/org/apache/hudi/storage/HoodieStorageUtils.java:
##########
@@ -33,6 +35,13 @@ public static HoodieStorage getStorage(String basePath,
StorageConfiguration<?>
}
public static HoodieStorage getStorage(StoragePath path,
StorageConfiguration<?> conf) {
- return getIOFactory(conf).getStorage(path);
+ String storageClass =
conf.getString(HoodieStorageConfig.HOODIE_STORAGE_CLASS.key())
+ .orElse(HoodieStorageConfig.HOODIE_STORAGE_CLASS.defaultValue());
+ try {
+ return (HoodieStorage) ReflectionUtils.loadClass(
Review Comment:
Reflection is still being used to create `HoodieStorage`. What was the
problem before this change? I mean `HoodieIOFactory` also holds
`HoodieStorage`, so why can't we follow the previous approach where reflection
was being used only to get `HoodieIOFactory`, which returned storage held by
it?
##########
hudi-common/src/main/java/org/apache/hudi/metadata/AbstractHoodieTableMetadata.java:
##########
@@ -35,16 +36,18 @@
public abstract class AbstractHoodieTableMetadata implements
HoodieTableMetadata {
protected transient HoodieEngineContext engineContext;
+ protected transient HoodieStorage storage;
Review Comment:
Trying to understand what necessitated this change? I mean why do we need
`HoodieStorage` in addition to `StorageConfiguration`?
##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieStorageConfig.java:
##########
@@ -243,7 +243,12 @@ public class HoodieStorageConfig extends HoodieConfig {
.withDocumentation("The fully-qualified class name of the factory class
to return readers and writers of files used "
+ "by Hudi. The provided class should implement
`org.apache.hudi.io.storage.HoodieIOFactory`.");
-
+ public static final ConfigProperty<String> HOODIE_STORAGE_CLASS =
ConfigProperty
Review Comment:
Do we need a separate config? Can we infer from `HOODIE_IO_FACTORY_CLASS`?
Or maybe we can add infer function.
##########
hudi-common/src/main/java/org/apache/hudi/common/config/HoodieStorageConfig.java:
##########
@@ -243,7 +243,12 @@ public class HoodieStorageConfig extends HoodieConfig {
.withDocumentation("The fully-qualified class name of the factory class
to return readers and writers of files used "
+ "by Hudi. The provided class should implement
`org.apache.hudi.io.storage.HoodieIOFactory`.");
-
+ public static final ConfigProperty<String> HOODIE_STORAGE_CLASS =
ConfigProperty
+ .key("hoodie.storage.class")
+ .defaultValue("org.apache.hudi.storage.hadoop.HoodieHadoopStorage")
Review Comment:
let's also add `withValidValues(...)`?
##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileReaderFactory.java:
##########
@@ -40,9 +39,9 @@
*/
public class HoodieFileReaderFactory {
- protected final StorageConfiguration<?> storageConf;
- public HoodieFileReaderFactory(StorageConfiguration<?> storageConf) {
- this.storageConf = storageConf;
+ protected final HoodieStorage storage;
+ public HoodieFileReaderFactory(HoodieStorage storage) {
Review Comment:
`StorageConfiguration` is serializable. Now that is replaced by
`HoodieStorage` in file readers, writers, and io factory, so do we need
`HoodieStorage` to be serializable too?
##########
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##########
@@ -384,13 +384,12 @@ public Boolean isMetadataTable() {
public HoodieStorage getStorage() {
if (storage == null) {
+ HoodieStorage newStorage = HoodieStorageUtils.getStorage(metaPath,
getStorageConf());
Review Comment:
In the query engine code, I guess meta client is created exactly once
throughout the lifecycle of the query?
Also, is `HoodieStorage` used by workers/executors in any way? Since
`HoodieStorage` is not serialized, so we should be careful.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]