jialiangCHOU opened a new issue, #9959:
URL: https://github.com/apache/incubator-doris/issues/9959

   ## Background:
   Apache Doris stores only the NDV (number of distinct values) as column 
statistics. When the optimizer estimates predicate selectivity from NDV alone, 
the error is large, especially when the data distribution is uneven. Column 
contour histogram statistics are therefore needed for more accurate cost 
estimation, so that the optimizer can find the truly best plan.
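
   To make the problem concrete, here is a minimal Python sketch of how an 
NDV-only estimate behaves on skewed data. It assumes the common textbook rule 
that an equality predicate selects `1 / ndv` of the rows; whether Doris applies 
exactly this rule is an assumption, and the function names and sample column 
are hypothetical.

```python
from collections import Counter

def ndv_selectivity(values):
    """Estimate the selectivity of `col = v` from NDV alone.

    Assumes every distinct value is equally frequent (uniform distribution).
    """
    return 1.0 / len(set(values))

def true_selectivity(values, v):
    """Actual fraction of rows where the column equals v."""
    return Counter(values)[v] / len(values)

# Hypothetical skewed column: value 1 fills 90 rows, values 2..11 appear once.
skewed = [1] * 90 + list(range(2, 12))

estimated = ndv_selectivity(skewed)      # 1 / 11, about 0.09
actual = true_selectivity(skewed, 1)     # 90 / 100 = 0.9
```

For the predicate `col = 1` the NDV-only estimate is off by roughly a factor 
of ten, which is the kind of error a histogram is meant to reduce.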
   
   ## Requirements:
   
   1. Use the existing statistics collection framework to collect column 
contour histogram statistics (only manually triggered collection needs to be 
supported). The collection process should not affect normal task execution on 
the cluster.
   
   2. Extend the existing statistics storage framework so that it can store 
column contour histogram statistics.
   
   3. Support querying column contour histogram data.
   
   4. Implement the logic that derives predicate selectivity from the column 
contour histogram, and make sure the statistics derivation framework uses it 
correctly.
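
   Requirements 1 and 4 can be sketched as follows. This is a minimal 
illustration, not Doris code: it assumes that "column contour histogram" means 
an equi-height histogram (buckets holding roughly equal row counts), and the 
`Bucket` class and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Bucket:
    lower: float  # smallest value in the bucket
    upper: float  # largest value in the bucket
    count: int    # number of rows in the bucket
    ndv: int      # number of distinct values in the bucket

def build_equi_height(values: List[float], num_buckets: int) -> List[Bucket]:
    """Cut the sorted values into buckets of roughly equal row count.

    A run of equal values is never split, so a very frequent value can
    produce an oversized bucket and fewer buckets than requested.
    """
    data = sorted(values)
    target = max(1, len(data) // num_buckets)
    buckets, start = [], 0
    while start < len(data):
        end = min(start + target, len(data))
        # Extend the bucket so that equal values stay together.
        while end < len(data) and data[end] == data[end - 1]:
            end += 1
        chunk = data[start:end]
        buckets.append(Bucket(chunk[0], chunk[-1], len(chunk), len(set(chunk))))
        start = end
    return buckets

def selectivity_eq(buckets: List[Bucket], x: float) -> float:
    """Estimate `col = x`: assume uniformity only inside the matching bucket."""
    total = sum(b.count for b in buckets)
    for b in buckets:
        if b.lower <= x <= b.upper:
            return (b.count / b.ndv) / total
    return 0.0

def selectivity_le(buckets: List[Bucket], x: float) -> float:
    """Estimate `col <= x`: full buckets below x plus a linear share of one."""
    total = sum(b.count for b in buckets)
    rows = 0.0
    for b in buckets:
        if x >= b.upper:
            rows += b.count
        elif x >= b.lower:
            # Linear interpolation inside the partially covered bucket.
            rows += b.count * (x - b.lower) / (b.upper - b.lower)
    return rows / total

# Hypothetical skewed column: value 1 fills 90 rows, values 2..11 appear once.
skewed = [1] * 90 + list(range(2, 12))
buckets = build_equi_height(skewed, 4)

eq_hot = selectivity_eq(buckets, 1)    # 0.9, matches the real frequency
eq_rare = selectivity_eq(buckets, 5)   # 0.01
```

Keeping a per-bucket row count and NDV is one possible design choice. It lets 
the estimator fall back to the uniform assumption only within a bucket, rather 
than across the whole column.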
   
   ## Project output requirements:
   
   Using the column contour histogram, derive a more accurate predicate 
selectivity so that the optimizer can produce a better query plan. Ultimately, 
this should help Apache Doris improve query performance on the TPC-H 
benchmark.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

