CTTY commented on code in PR #12460:
URL: https://github.com/apache/hudi/pull/12460#discussion_r1881323281
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##########
@@ -112,8 +115,21 @@ private String makeWriteToken() {
return FSUtils.makeWriteToken(getPartitionId(), getStageId(),
getAttemptId());
}
+ protected StoragePath getPartitionPath(String partitionPath) {
Review Comment:
nit: we can change the method name to `toPhysicalPath`
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieCreateHandle.java:
##########
@@ -96,8 +96,7 @@ public HoodieCreateHandle(HoodieWriteConfig config, String
instantTime, HoodieTa
try {
HoodiePartitionMetadata partitionMetadata = new
HoodiePartitionMetadata(storage, instantTime,
- new StoragePath(config.getBasePath()),
- FSUtils.constructAbsolutePath(config.getBasePath(), partitionPath),
+ new StoragePath(config.getBasePath()),
getPartitionPath(partitionPath),
Review Comment:
Why do we need to store physical path in partition metadata?
##########
hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java:
##########
@@ -376,11 +383,13 @@ private void ensurePartitionsLoadedCorrectly(List<String>
partitionList) {
// Not loaded yet
try {
LOG.debug("Building file system view for partitions: " +
partitionSet);
-
+ HoodieStorage storage = metaClient.getStorage();
+ HashMap<String, String> config = new HashMap<>();
+ config.put(HoodieOSSStorageStrategy.TABLE_BASE_PATH,
metaClient.getBasePath().getPathWithoutSchemeAndAuthority().toUri().getPath());
// Pairs of relative partition path and absolute partition path
List<Pair<String, StoragePath>> absolutePartitionPathList =
partitionSet.stream()
.map(partition -> Pair.of(
- partition,
FSUtils.constructAbsolutePath(metaClient.getBasePath(), partition)))
+ partition, storage.getAllLocations(partition,
config).stream().findFirst().get()))
Review Comment:
Just curious, would `findFirst()` work here even in the production?
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java:
##########
@@ -429,13 +432,21 @@ public Option<HoodieMetadataColumnStats>
getColumnStatMetadata() {
/**
* Returns the files added as part of this record.
*/
- public List<StoragePathInfo> getFileList(HoodieStorage storage, StoragePath
partitionPath) {
+ public List<StoragePathInfo> getFileList(HoodieTableMetaClient
dataMetaClient, StoragePath partitionPath) {
+ HoodieStorage storage = dataMetaClient.getStorage();
+ String tableName = dataMetaClient.getTableConfig().getTableName();
+ StoragePath dataBasePath = dataMetaClient.getBasePath();
long blockSize = storage.getDefaultBlockSize(partitionPath);
Review Comment:
How would this line leverage storage strategy?
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieWriteHandle.java:
##########
@@ -112,8 +115,21 @@ private String makeWriteToken() {
return FSUtils.makeWriteToken(getPartitionId(), getStageId(),
getAttemptId());
}
+ protected StoragePath getPartitionPath(String partitionPath) {
+ boolean isMDT =
hoodieTable.getMetaClient().getTableConfig().getTableName().contains("metadata");
+ if (isMDT) {
+ return FSUtils.constructAbsolutePath(config.getBasePath(),
partitionPath);
+ } else {
+ HashMap<String, String> configMap = new HashMap<>();
+ Path basePath = new Path(config.getBasePath());
+ configMap.put(HoodieOSSStorageStrategy.FILE_ID_KEY, fileId);
+ configMap.put(HoodieOSSStorageStrategy.TABLE_BASE_PATH,
basePath.toUri().getPath());
+ return storage.storageLocation(partitionPath, configMap);
+ }
+ }
+
public StoragePath makeNewPath(String partitionPath) {
- StoragePath path = FSUtils.constructAbsolutePath(config.getBasePath(),
partitionPath);
+ StoragePath path = getPartitionPath(partitionPath);
Review Comment:
I assume this change is just for the POC? Ideally the conversion from
logical path to physical path should happen within `HoodieStorage` at L134
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]