[GitHub] [doris] weizhengte commented on a diff in pull request #18069: [Improvement](statistics) Support for statistics removing and incremental collection

via GitHub Fri, 24 Mar 2023 05:17:53 -0700


weizhengte commented on code in PR #18069:
URL: https://github.com/apache/doris/pull/18069#discussion_r1147507991



##########
fe/fe-core/src/main/java/org/apache/doris/statistics/AnalysisManager.java:
##########
@@ -94,6 +97,22 @@ public void createAnalysisJob(AnalyzeStmt analyzeStmt) {
         Set<String> partitionNames = analyzeStmt.getPartitionNames();
         Map<Long, AnalysisTaskInfo> analysisTaskInfos = new HashMap<>();
         long jobId = Env.getCurrentEnv().getNextId();
+
+        // If the analysis is not incremental, need to delete existing statistics.
+        if (!analyzeStmt.isIncrement) {
+            long dbId = analyzeStmt.getDbId();
+            TableIf table = analyzeStmt.getTable();
+            Set<Long> tblIds = Sets.newHashSet(table.getId());
+            Set<Long> partIds = partitionNames.stream()
+                    .map(p -> table.getPartition(p).getId())
+                    .collect(Collectors.toSet());
+            if (analyzeStmt.isHistogram) {
+                StatisticsRepository.dropHistogram(dbId, tblIds, colNames, partIds);

Review Comment:
   Indeed, this is true for histograms.
   
   For other statistics, however, partition statistics need to be removed when 
they are not collected incrementally, so that the table statistics summarized 
from them are correct. One example is collecting statistics for only a few 
partitions.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


