zhangyue19921010 commented on code in PR #12407:
URL: https://github.com/apache/hudi/pull/12407#discussion_r1868853356
##########
rfc/rfc-60/rfc-60.md:
##########
@@ -84,23 +119,85 @@ public interface HoodieStorageStrategy extends Serializable {
/**
* Return a storage location for the given filename.
*
- * @param fileId data file ID
+ * @param path fileName
Review Comment:
The path can be either a `relative partition path` or a `relative file path with a filename`.
Whichever form is passed in, `storageLocation` lets the configured storage strategy prepend the appropriate prefix to the path.

For `HoodieWriteHandle`, the input is a relative partition path.

For example, with `HoodieCacheLayerStorageStrategy`, a relative path like `tableName/dt=2024-12-04` is passed into `HoodieWriteHandle`. The strategy determines that the current write should go to the cache layer and returns a full path with the cache layer prefix, such as
`hdfs://nsxxx/user/jdr_lakehouse_cache_layer/jdr_lakehouse_traffic/gdm.db/gdm_jdrxxxxx/`

Similarly, with `HoodieObjectStorageStrategy`, a relative path like `tableName/dt=2024-12-04` is passed into `HoodieWriteHandle`. The strategy returns a full path of the form
`s3://<table_storage_bucket>/0bfb3d6e/<hudi_table_name>/country=india/`
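
To make the prefixing behaviour concrete, here is a minimal sketch of two strategies that turn a relative path into a full path. Everything in it is an assumption for illustration: `SketchStorageStrategy` is a simplified stand-in for `HoodieStorageStrategy` that returns a `String`, the class names and base paths are hypothetical, and the hash segment uses `String.hashCode()` only as a placeholder, not the hashing the RFC or the PR actually uses.

```java
import java.io.Serializable;

public class StorageStrategySketch {

    /** Hypothetical, simplified stand-in for HoodieStorageStrategy. */
    public interface SketchStorageStrategy extends Serializable {
        String storageLocation(String path, String instantTime);
    }

    /** Prepends one fixed base path, e.g. a cache-layer root (assumed behaviour). */
    public static final class FixedPrefixStrategy implements SketchStorageStrategy {
        private final String basePath;

        public FixedPrefixStrategy(String basePath) {
            this.basePath = stripTrailingSlashes(basePath);
        }

        @Override
        public String storageLocation(String path, String instantTime) {
            return basePath + "/" + stripLeadingSlashes(path);
        }
    }

    /** Prepends a bucket plus a hash segment derived from the relative path. */
    public static final class HashedPrefixStrategy implements SketchStorageStrategy {
        private final String bucket;

        public HashedPrefixStrategy(String bucket) {
            this.bucket = stripTrailingSlashes(bucket);
        }

        @Override
        public String storageLocation(String path, String instantTime) {
            String relative = stripLeadingSlashes(path);
            // Placeholder hash: 8 hex digits of String.hashCode(), illustration only.
            String hash = String.format("%08x", relative.hashCode());
            return bucket + "/" + hash + "/" + relative;
        }
    }

    static String stripTrailingSlashes(String s) {
        while (s.endsWith("/")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    static String stripLeadingSlashes(String s) {
        while (s.startsWith("/")) {
            s = s.substring(1);
        }
        return s;
    }

    public static void main(String[] args) {
        String partition = "tableName/dt=2024-12-04";
        String instant = "20241204000000000";
        // Hypothetical base paths; substitute real ones as needed.
        System.out.println(new FixedPrefixStrategy("hdfs://example-ns/cache_layer/")
            .storageLocation(partition, instant));
        System.out.println(new HashedPrefixStrategy("s3://example-bucket")
            .storageLocation(partition, instant));
    }
}
```

Both strategies accept the same relative input, so a caller does not need to know which physical layout is in effect.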

For `HoodieCommitMetadata#getFileIdAndFullPaths`, the input is a relative path that includes the filename, such as
`dt=2021-01-05/00000000-3fd6-4d4f-b1e7-4b9768ee0911-0_0-8-7_20241104162125935.parquet`.
```
public HashMap<String, String> getFileIdAndFullPaths(String instantTime, HoodieStorageStrategy strategy) {
  HashMap<String, String> fullPaths = new HashMap<>();
  for (Map.Entry<String, String> entry : getFileIdAndRelativePaths().entrySet()) {
    String fullPath = entry.getValue() != null
        ? strategy.storageLocation(entry.getValue(), instantTime).toString() : null;
    fullPaths.put(entry.getKey(), fullPath);
  }
  return fullPaths;
}
```
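
For readers who want to try the file-path case in isolation, the following is a self-contained sketch of the same resolution loop. It is not the Hudi implementation: the strategy is modelled as a plain `BiFunction<String, String, String>` taking `(relativePath, instantTime)`, and the class name, method name, and base path are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

public class FullPathResolutionSketch {

    /**
     * Maps each file ID to a full path by passing its relative path through the
     * given strategy function. Null relative paths are kept as null values.
     */
    public static Map<String, String> resolveFullPaths(
            Map<String, String> fileIdToRelativePath,
            String instantTime,
            BiFunction<String, String, String> strategy) {
        Map<String, String> fullPaths = new HashMap<>();
        for (Map.Entry<String, String> entry : fileIdToRelativePath.entrySet()) {
            String relativePath = entry.getValue();
            String fullPath = relativePath != null
                ? strategy.apply(relativePath, instantTime)
                : null;
            fullPaths.put(entry.getKey(), fullPath);
        }
        return fullPaths;
    }

    public static void main(String[] args) {
        Map<String, String> relative = new HashMap<>();
        relative.put("00000000-3fd6-4d4f-b1e7-4b9768ee0911-0",
            "dt=2021-01-05/00000000-3fd6-4d4f-b1e7-4b9768ee0911-0_0-8-7_20241104162125935.parquet");

        // Hypothetical strategy: prepend a fixed base path and ignore the instant time.
        BiFunction<String, String, String> strategy =
            (path, instant) -> "hdfs://example-ns/table_base/" + path;

        resolveFullPaths(relative, "20241104162125935", strategy)
            .forEach((fileId, fullPath) -> System.out.println(fileId + " -> " + fullPath));
    }
}
```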