liyafan82 opened a new pull request, #3015:
URL: https://github.com/apache/calcite/pull/3015

   In the current implementation, RelMdUtil#numDistinctVals returns zero
when the domain size is too large.
   However, the true result should be a positive number. The formula being
computed is `N * [1 - (1 - 1 / N) ^ n]`, where N is the domain size and n is
the number of selections. We solve the problem by handling 3 separate cases:
    1. If neither N nor n is too big, we calculate the formula directly
as `N * [1 - exp(n * ln(1 - 1 / N))]`.
    2. If N or n is big, but the ratio n / N is not close to 0, we expand
the exponent `n * ln(1 - 1 / N)` as a Taylor series.
    3. If N or n is big, and the ratio n / N is close to 0, we expand
the whole formula as a Taylor series.
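
   The three cases above can be sketched roughly as follows. This is only an
illustration, not the code in this pull request: the class name, the
thresholds `LARGE` and `SMALL_RATIO`, and the number of Taylor terms kept are
all assumptions chosen for the example.

```java
// Illustrative sketch only; names and thresholds are hypothetical.
final class NumDistinctValsSketch {
  // Assumed cut-off above which N or n counts as "big".
  private static final double LARGE = 1e8;
  // Assumed cut-off below which n / N counts as "close to 0".
  private static final double SMALL_RATIO = 1e-3;

  private NumDistinctValsSketch() {}

  /** Estimates N * [1 - (1 - 1 / N) ^ n] for domain size N and n selections. */
  static double numDistinctVals(double domainSize, double numSelected) {
    final double N = domainSize;
    final double n = numSelected;

    if (N < LARGE && n < LARGE) {
      // Case 1: evaluate the formula directly.
      return N * (1.0 - Math.exp(n * Math.log(1.0 - 1.0 / N)));
    }

    final double ratio = n / N;
    if (ratio > SMALL_RATIO) {
      // Case 2: expand the exponent using
      // ln(1 - 1/N) = -(1/N + 1/(2 N^2) + ...), keeping two terms.
      final double exponent = -ratio * (1.0 + 1.0 / (2.0 * N));
      return N * (1.0 - Math.exp(exponent));
    }

    // Case 3: expand the whole formula (binomial series), keeping three terms:
    // n - n(n-1)/(2N) + n(n-1)(n-2)/(6 N^2)
    return n
        - ratio * (n - 1.0) / 2.0
        + ratio * ((n - 1.0) / N) * (n - 2.0) / 6.0;
  }
}
```

   For example, with N = 1e18 and n = 100, the direct form rounds `1 - 1 / N`
to exactly 1.0 in double precision and so yields 0, whereas the case 3 branch
of this sketch returns a value close to n.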

