zhangyue19921010 commented on code in PR #13060:
URL: https://github.com/apache/hudi/pull/13060#discussion_r2025037864
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/prune/PrimaryKeyPruners.java:
##########
@@ -45,7 +45,7 @@ public class PrimaryKeyPruners {
public static final int BUCKET_ID_NO_PRUNING = -1;
-  public static int getBucketId(List<ResolvedExpression> hashKeyFilters, Configuration conf) {
+  public static int getBucketFieldHashing(List<ResolvedExpression> hashKeyFilters, Configuration conf) {
Review Comment:
Sorry Danny, I didn't get this. Is it possible to get the full partition path during the original dataBucket computation?
```java
@Override
public Result applyFilters(List<ResolvedExpression> filters) {
  List<ResolvedExpression> simpleFilters = filterSimpleCallExpression(filters);
  Tuple2<List<ResolvedExpression>, List<ResolvedExpression>> splitFilters =
      splitExprByPartitionCall(simpleFilters, this.partitionKeys, this.tableRowType);
  this.predicates = ExpressionPredicates.fromExpression(splitFilters.f0);
  this.columnStatsProbe = ColumnStatsProbe.newInstance(splitFilters.f0);
  this.partitionPruner = createPartitionPruner(splitFilters.f1, columnStatsProbe);
  this.dataBucket = getDataBucket(splitFilters.f0);
  // refuse all the filters now
  return SupportsFilterPushDown.Result.of(new ArrayList<>(splitFilters.f1), new ArrayList<>(filters));
}
```
What this PR does is compute the hashing value and pass it to `getFilesInPartitions`, then compute numBuckets, and finally compute the final bucket id as `hashing value % numBuckets`.
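To make the two-phase split concrete, here is a minimal sketch of that flow. The method names (`hashKeyFieldsHashing`, `bucketId`) and the string-join hashing are illustrative assumptions, not the actual Hudi `BucketIdentifier` implementation; the point is only that the hashing can be computed before the per-partition bucket count is known, and the modulo applied later:

```java
import java.util.Arrays;
import java.util.List;

public class BucketIdSketch {

  // Phase 1 (at filter push-down time): derive a hashing value from the
  // hash-key field values alone; numBuckets is not yet known here.
  static int hashKeyFieldsHashing(List<String> hashKeyValues) {
    // Mask with Integer.MAX_VALUE to keep the value non-negative,
    // since hashCode() may return a negative int.
    return String.join(",", hashKeyValues).hashCode() & Integer.MAX_VALUE;
  }

  // Phase 2 (e.g. inside getFilesInPartitions, once the partition's
  // bucket count is resolved): final bucket id = hashing % numBuckets.
  static int bucketId(int hashing, int numBuckets) {
    return hashing % numBuckets;
  }

  public static void main(String[] args) {
    int hashing = hashKeyFieldsHashing(Arrays.asList("uuid-001"));
    int id = bucketId(hashing, 8);
    System.out.println("bucket id = " + id); // always in [0, 8)
  }
}
```

The deferral matters because the bucket count can vary per partition, so the modulo cannot be applied until the target partition is known.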
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]