pltbkd commented on code in PR #20415:
URL: https://github.com/apache/flink/pull/20415#discussion_r939520164
##########
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSource.java:
##########
@@ -247,6 +254,30 @@ public void applyPartitions(List<Map<String, String>> remainingPartitions) {
         }
     }
+    @Override
+    public List<String> applyDynamicFiltering(List<String> candidateFilterFields) {
+        if (catalogTable.getPartitionKeys() != null
+                && catalogTable.getPartitionKeys().size() != 0) {
+            checkArgument(
+                    !candidateFilterFields.isEmpty(),
+                    "At least one field should be provided for dynamic filtering");
+            checkState(
+                    dynamicPartitionKeys == null, "Dynamic filtering should not be applied twice.");
+
+            // only accept partition fields to do dynamic partition pruning
+            this.dynamicPartitionKeys = new ArrayList<>();
+            for (String field : candidateFilterFields) {
+                if (catalogTable.getPartitionKeys().contains(field)) {
Review Comment:
In my opinion, even rejecting all fields is acceptable, which is the case for non-partitioned Hive tables. The result is still correct, and since the planner should not add a dynamic filtering data collector in such cases, there should be no impact on performance. So I think it's fine to skip the check and let the job run normally, but maybe a warning log can be added here to notify users about this.
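
A rough sketch of what that could look like (just to illustrate the suggestion, not the actual patch; the `LOG` field is assumed to be an SLF4J logger on `HiveTableSource`, and the handling of the non-partitioned case here is an assumption):

```java
@Override
public List<String> applyDynamicFiltering(List<String> candidateFilterFields) {
    List<String> partitionKeys = catalogTable.getPartitionKeys();
    if (partitionKeys == null || partitionKeys.isEmpty()) {
        // Non-partitioned table: nothing can be pruned, so accept no fields.
        // The result is still correct and the planner should not add a
        // dynamic filtering data collector in this case.
        return Collections.emptyList();
    }
    checkState(
            dynamicPartitionKeys == null, "Dynamic filtering should not be applied twice.");

    // Only accept partition fields for dynamic partition pruning; warn about
    // (rather than reject) any candidate field that is not a partition key.
    this.dynamicPartitionKeys = new ArrayList<>();
    for (String field : candidateFilterFields) {
        if (partitionKeys.contains(field)) {
            dynamicPartitionKeys.add(field);
        } else {
            LOG.warn(
                    "Field '{}' is not a partition key and will be ignored for "
                            + "dynamic partition pruning.",
                    field);
        }
    }
    return dynamicPartitionKeys;
}
```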
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]