swuferhong commented on code in PR #20084:
URL: https://github.com/apache/flink/pull/20084#discussion_r922935948
##########
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/HiveTableSource.java:
##########
@@ -261,6 +279,93 @@ public DynamicTableSource copy() {
return source;
}
+    @Override
+    public TableStats reportStatistics() {
+        try {
+            // only support BOUNDED source
+            if (isStreamingSource()) {
+                return TableStats.UNKNOWN;
+            }
+            if (flinkConf.get(HiveOptions.SOURCE_REPORT_STATISTICS)
+                    != FileSystemConnectorOptions.FileStatisticsType.ALL) {
+                return TableStats.UNKNOWN;
+            }
+
+            HiveSourceBuilder sourceBuilder =
+                    new HiveSourceBuilder(jobConf, flinkConf, tablePath, hiveVersion, catalogTable)
+                            .setProjectedFields(projectedFields)
+                            .setLimit(limit);
Review Comment:
> we should consider how to handle the case after limit push and filter push down
Currently, the Hive source doesn't support filter push-down. As for limit push-down, `PushLimitIntoTableSourceScanRule` runs after `FlinkRecomputeStatisticsProgram`, and that rule can recompute the new row count itself. So I think there is no need to add stats-recomputation logic to HiveTableSource.
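
For context, a minimal sketch of the recomputation I mean: once a `LIMIT n` is pushed into the scan, the row count estimate can simply be capped at `n`. `applyLimit` below is a hypothetical helper for illustration only, standing in for what `PushLimitIntoTableSourceScanRule` does internally; `TableStats` is the real Flink class.

```java
import org.apache.flink.table.plan.stats.TableStats;

public class LimitStatsSketch {

    // Hypothetical helper: cap the reported row count at the pushed-down limit.
    static TableStats applyLimit(TableStats reported, long limit) {
        long rowCount = reported.getRowCount();
        if (rowCount < 0) {
            // TableStats.UNKNOWN reports -1; nothing to refine.
            return TableStats.UNKNOWN;
        }
        // After LIMIT n is pushed into the scan, at most n rows can be produced.
        return new TableStats(Math.min(rowCount, limit));
    }

    public static void main(String[] args) {
        TableStats reported = new TableStats(1_000_000L);
        // A LIMIT 100 pushed into the scan caps the estimate at 100 rows.
        System.out.println(applyLimit(reported, 100L).getRowCount()); // 100
    }
}
```

Since the rule runs after `FlinkRecomputeStatisticsProgram`, the stats the source reported are already available at that point, so the cap can be applied without the source itself knowing about the limit.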
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]