pltbkd commented on code in PR #20415:
URL: https://github.com/apache/flink/pull/20415#discussion_r939520164
##########
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSource.java:
##########
@@ -247,6 +254,30 @@ public void applyPartitions(List<Map<String, String>> remainingPartitions) {
         }
     }
+    @Override
+    public List<String> applyDynamicFiltering(List<String> candidateFilterFields) {
+        if (catalogTable.getPartitionKeys() != null
+                && catalogTable.getPartitionKeys().size() != 0) {
+            checkArgument(
+                    !candidateFilterFields.isEmpty(),
+                    "At least one field should be provided for dynamic filtering");
+            checkState(
+                    dynamicPartitionKeys == null, "Dynamic filtering should not be applied twice.");
+
+            // only accept partition fields to do dynamic partition pruning
+            this.dynamicPartitionKeys = new ArrayList<>();
+            for (String field : candidateFilterFields) {
+                if (catalogTable.getPartitionKeys().contains(field)) {
Review Comment:
In my opinion, even rejecting all fields is acceptable, which is the case for non-partitioned Hive tables. The result is still correct, and since the planner should not add a dynamic filtering data collector in such cases, there should be no impact on performance. So I think it's fine to skip the check and let the job run normally, but maybe a warning log can be added here to notify users about this.
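
A rough sketch of what that could look like (just to illustrate the suggestion, not the actual patch; the `LOG` field is assumed to be an SLF4J logger on `HiveTableSource`, and the handling of the non-partitioned case here is an assumption):

```java
@Override
public List<String> applyDynamicFiltering(List<String> candidateFilterFields) {
    List<String> partitionKeys = catalogTable.getPartitionKeys();
    if (partitionKeys == null || partitionKeys.isEmpty()) {
        // Non-partitioned table: nothing can be pruned, so accept no fields.
        // The result is still correct and the planner should not add a
        // dynamic filtering data collector in this case.
        return Collections.emptyList();
    }
    checkState(
            dynamicPartitionKeys == null, "Dynamic filtering should not be applied twice.");

    // Only accept partition fields for dynamic partition pruning; warn about
    // (rather than reject) any candidate field that is not a partition key.
    this.dynamicPartitionKeys = new ArrayList<>();
    for (String field : candidateFilterFields) {
        if (partitionKeys.contains(field)) {
            dynamicPartitionKeys.add(field);
        } else {
            LOG.warn(
                    "Field '{}' is not a partition key and will be ignored for "
                            + "dynamic partition pruning.",
                    field);
        }
    }
    return dynamicPartitionKeys;
}
```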
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]