CTTY commented on code in PR #12407:
URL: https://github.com/apache/hudi/pull/12407#discussion_r1881347866


##########
rfc/rfc-60/rfc-60.md:
##########
@@ -74,37 +116,78 @@ not meet their use-case.
 
 ## Design
 
-### Interface
+### Config
+We will add two table-level parameters, namely Hoodie Table Config:
+
+The first one, `hoodie.storage.path` allows users to specify additional 
physical storage locations apart from the Base Path, separated by commas if 
there are multiple.
+
+The second one, `hoodie.storage.strategy.class` allows users to specify the 
storage strategy for the current Hudi table, such as 
+HoodieDefaultStorageStrategy (default), HoodieCacheLayerStorageStrategy, and 
HoodieObjectStorageStrategy.
+Take HoodieCacheLayerStorageStrategy as example, related table config should 
like this
+```text
+[sss@ssss ~]$ hdfs dfs -cat 
hdfs://nsxxx/user/jdr_lakehouse/gdm.db/xxx/.hoodie/hoodie.properties
+#Updated at 2024-08-07T11:49:10.252Z
+#Wed Aug 07 19:49:10 CST 2024
+hoodie.table.keygenerator.class=org.apache.hudi.keygen.ComplexAvroKeyGenerator
+hoodie.table.precombine.field=receive_ts
+hoodie.table.version=5
+hoodie.datasource.write.hive_style_partitioning=true
+hoodie.table.checksum=318067208
+hoodie.table.create.schema={xxxx}
+hoodie.table.cdc.enabled=false
+hoodie.archivelog.folder=archived
+hoodie.table.name=xxxx
+hoodie.table.type=COPY_ON_WRITE
+hoodie.datasource.write.partitionpath.urlencode=false
+hoodie.datasource.write.drop.partition.columns=false
+hoodie.timeline.layout.version=1
+hoodie.table.recordkey.fields=mid,dt
+hoodie.table.partition.fields=dt,application,business
+hoodie.storage.path=hdfs\://nsxxxx/user/jdr_lakehouse/jdr_lakehouse_traffic/gdm.db/gdm_jdr_xxxxx
+hoodie.storage.strategy.class=org.apache.hudi.common.storage.HoodieCacheLayerStorageStrategy
+```
+
+
+### Core Abstraction
 
 ```java
 /**
  * Interface for providing storage file locations.
  */
 public interface HoodieStorageStrategy extends Serializable {
   /**
-   * Return a storage location for the given filename.
+   * Return a storage location for the given path
    *
-   * @param fileId data file ID
-   * @return a storage location string for a data file
+   * @param path
+   * @param configMap
+   * @return Append the appropriate prefix based on the Path and return
    */
-  String storageLocation(String fileId);
+  StoragePath storageLocation(String path, Map<String, String> configMap);

Review Comment:
   I took a look at #12460 and I'm still hesitant on passing a config map to 
storage strategy API instead of fixed configs. Passing a config map makes it 
harder to enforce what should be passed in, and will eventually cause failures 
for certain strategy implementations that require specific props. 
   
   One way to solve this is to add a new class `StrategySpec` as a container 
and enforces certain props to be passed in when creating an instance of it. But 
even this is worse than simply asking for props like `fileId` and `instantTime` 
in the API directly.
   
   I have some rough thoughts on using `StorageStrategFactory` to dynamicly 
create strategies instead of passing props/configs when calling API, something 
like below: 
   ```
   Factory {
     getInstance(String path, tableConfig, Map<> props) {
       // some logic to infer strategy from path and tableConfig
       Reflection.new(className, props)
     }
   }
   ```
   
   I haven't got the time to dig deeper though, will need to circle back on 
this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to