boneanxs commented on code in PR #8452:
URL: https://github.com/apache/hudi/pull/8452#discussion_r1191878586
##########
hudi-common/src/main/java/org/apache/hudi/expression/Expression.java:
##########
@@ -40,14 +51,19 @@ public enum Operator {
}
}
- private final List<Expression> children;
+ List<Expression> getChildren();
Review Comment:
I changed this to an interface, which doesn't allow specifying that modifier.
##########
hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java:
##########
@@ -84,6 +96,19 @@ public List<String> getAllPartitionPaths() throws IOException {
return getPartitionPathWithPathPrefixes(Collections.singletonList(""));
}
+ @Override
+ public List<String> getPartitionPathByExpression(List<String> relativePathPrefixes,
Review Comment:
Sure, will add tests soon
##########
hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java:
##########
@@ -58,13 +62,21 @@ public class FileSystemBackedTableMetadata implements HoodieTableMetadata {
private final SerializableConfiguration hadoopConf;
private final String datasetBasePath;
private final boolean assumeDatePartitioning;
+ private final boolean hiveStylePartitioningEnabled;
+ private final boolean urlEncodePartitioningEnabled;
public FileSystemBackedTableMetadata(HoodieEngineContext engineContext,
SerializableConfiguration conf, String datasetBasePath,
boolean assumeDatePartitioning) {
this.engineContext = engineContext;
this.hadoopConf = conf;
this.datasetBasePath = datasetBasePath;
this.assumeDatePartitioning = assumeDatePartitioning;
+ HoodieTableMetaClient metaClient = HoodieTableMetaClient.builder()
Review Comment:
`FileSystemBackedTableMetadata` directly implements the interface
`HoodieTableMetadata`, while `metaClient` is initialized in the class
`BaseTableMetadata`.
Maybe I should make `metaClient` a field instead of a local variable, in case
other code needs it as well.
##########
hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java:
##########
@@ -95,11 +120,38 @@ public List<String>
getPartitionPathWithPathPrefixes(List<String> relativePathPr
}).collect(Collectors.toList());
}
+ private int getRelativePathPartitionLevel(Types.RecordType partitionFields, String relativePathPrefix) {
+ if (StringUtils.isNullOrEmpty(relativePathPrefix) || partitionFields == null || partitionFields.fields().size() == 1) {
+ return 0;
+ }
+
+ int level = 0;
+ for (int i = 1; i < relativePathPrefix.length() - 1; i++) {
Review Comment:
partitionFields holds all partition columns, while relativePathPrefix contains
only a leading subset of the partitions.
For example, partitionFields could be <region, date, hour> while
relativePathPrefix is `/region=US`; in that case we should return 1 to indicate
the starting partition index.
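
To illustrate the example above, here is a minimal sketch that counts how many partition columns a prefix already covers. The class and method names are hypothetical, it is not the code in this PR, and it assumes each partition value occupies exactly one path segment (no `/` inside a value):

```java
class PartitionLevelSketch {
  // Hypothetical helper: returns the number of non-empty path segments in the
  // prefix, i.e. the index of the first partition column NOT covered by it.
  // Assumption: one path segment per partition column.
  static int partitionLevel(String relativePathPrefix) {
    if (relativePathPrefix == null || relativePathPrefix.isEmpty()) {
      return 0;
    }
    int level = 0;
    for (String segment : relativePathPrefix.split("/")) {
      if (!segment.isEmpty()) {
        level++;
      }
    }
    return level;
  }

  public static void main(String[] args) {
    // With <region, date, hour>, the prefix covers only `region`.
    System.out.println(partitionLevel("/region=US"));
    System.out.println(partitionLevel("/region=US/date=2023-05-12"));
  }
}
```

With partitionFields <region, date, hour>, a result of 1 means the remaining filter should start at `date`.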
##########
hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java:
##########
@@ -95,11 +120,38 @@ public List<String> getPartitionPathWithPathPrefixes(List<String> relativePathPr
}).collect(Collectors.toList());
}
+ private int getRelativePathPartitionLevel(Types.RecordType partitionFields, String relativePathPrefix) {
+ if (StringUtils.isNullOrEmpty(relativePathPrefix) || partitionFields == null || partitionFields.fields().size() == 1) {
+ return 0;
+ }
+
+ int level = 0;
+ for (int i = 1; i < relativePathPrefix.length() - 1; i++) {
Review Comment:
By the way, is it possible to have more than one partition column while
`hoodie.datasource.write.partitionpath.urlencode` is disabled?
For example, partitionFields is <region, date, hour> and a partition path
is "/US/2023/05/12/10" (meaning region=US, date=2023/05/12, hour=10). We have
only 3 partition columns, but we could get 4 partition levels.
We can handle a single partition column whose values contain `/` (like
date=2023/05/12), but it looks difficult to identify which value corresponds
to which column when there are several columns. Do you have any suggestions?
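
To make the ambiguity concrete, the sketch below enumerates every way the path segments could be grouped, in order, into the given number of column values. The class and method names are hypothetical and the code is only an illustration, not a proposal for the PR:

```java
import java.util.ArrayList;
import java.util.List;

class PartitionSplitAmbiguity {
  // Hypothetical helper: lists every ordered grouping of the path segments
  // into exactly `columns` non-empty values (segments rejoined with "/").
  static List<List<String>> candidateSplits(String partitionPath, int columns) {
    List<String> segments = new ArrayList<>();
    for (String s : partitionPath.split("/")) {
      if (!s.isEmpty()) {
        segments.add(s);
      }
    }
    List<List<String>> results = new ArrayList<>();
    collect(segments, 0, columns, new ArrayList<>(), results);
    return results;
  }

  private static void collect(List<String> segments, int start, int remaining,
                              List<String> current, List<List<String>> results) {
    if (remaining == 0) {
      if (start == segments.size()) {
        results.add(new ArrayList<>(current));
      }
      return;
    }
    // Leave at least one segment for each column still to be filled.
    int lastEnd = segments.size() - (remaining - 1);
    for (int end = start + 1; end <= lastEnd; end++) {
      current.add(String.join("/", segments.subList(start, end)));
      collect(segments, end, remaining - 1, current, results);
      current.remove(current.size() - 1);
    }
  }

  public static void main(String[] args) {
    // <region, date, hour> with non-encoded values containing "/"
    for (List<String> split : candidateSplits("/US/2023/05/12/10", 3)) {
      System.out.println(split);
    }
  }
}
```

For "/US/2023/05/12/10" and 3 columns this produces six candidate assignments, only one of which is the intended [US, 2023/05/12, 10], so the path alone does not determine the mapping.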
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]