weizhengte commented on PR #18069: URL: https://github.com/apache/doris/pull/18069#issuecomment-1482727600
> After reading your commits, I still don't understand how they support incremental updates for stats. Here's what I think: our current logic for collecting statistics is to collect partition statistics first and then aggregate them into table statistics. In the previous PR, we added support for collecting statistics for specified partitions, so the INCREMENTAL syntax introduced by this PR should enable incremental collection of statistics. If INCREMENTAL is not specified, we delete all partition statistics for the corresponding columns, collect new statistics for the relevant partitions, and then aggregate them into table statistics; this can be considered "full collection". If INCREMENTAL is specified, we do not delete the existing partition statistics. We collect statistics only for the specified partitions, rely on the Unique Key model to update the corresponding partition statistics automatically, and then aggregate the new statistics into the table statistics. This is "incremental collection".
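To make the difference between the two modes concrete, here is a minimal Python sketch of the flow described above. It is not the Doris implementation: `PartitionStats`, `scan_partition`, and `analyze` are hypothetical names, the statistics table is modeled as a plain dict keyed by (column, partition) to imitate unique-key replacement, and only row and null counts are tracked.

```python
from dataclasses import dataclass


@dataclass
class PartitionStats:
    row_count: int
    null_count: int


def scan_partition(data, column, partition):
    """Compute stats for one column of one partition (hypothetical helper)."""
    values = data[partition][column]
    return PartitionStats(
        row_count=len(values),
        null_count=sum(v is None for v in values),
    )


def analyze(store, data, column, partitions, incremental):
    """Collect stats for `partitions`, then aggregate to table level.

    `store` maps (column, partition) -> PartitionStats. Writing to an
    existing key overwrites it, which stands in for unique-key semantics.
    """
    if not incremental:
        # Full collection: drop every partition stat of this column first.
        for key in [k for k in store if k[0] == column]:
            del store[key]

    # Collect only the requested partitions; same key means replace.
    for partition in partitions:
        store[(column, partition)] = scan_partition(data, column, partition)

    # Aggregate whatever partition stats are now stored into table stats.
    column_stats = [s for (c, _), s in store.items() if c == column]
    return PartitionStats(
        row_count=sum(s.row_count for s in column_stats),
        null_count=sum(s.null_count for s in column_stats),
    )
```

With this sketch, an incremental run over a single partition leaves the other partitions' stored statistics in place and still aggregates over all of them, while a non-incremental run over the same single partition discards the others before aggregating.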
