xudong963 commented on code in PR #15296:
URL: https://github.com/apache/datafusion/pull/15296#discussion_r2002299377
##########
datafusion/expr-common/src/statistics.rs:
##########
@@ -857,6 +857,143 @@ pub fn compute_variance(
     ScalarValue::try_from(target_type)
 }

+/// Merges two distributions into a single distribution that represents their combined statistics.
+/// This creates a more general distribution that approximates the mixture of the input distributions.
+pub fn merge_distributions(a: &Distribution, b: &Distribution) -> Result<Distribution> {
+    let range_a = a.range()?;
+    let range_b = b.range()?;
+
+    // Determine data type and create combined range
+    let combined_range = if range_a.is_unbounded() || range_b.is_unbounded() {

Review Comment:
   Great. One concern: I found that `Interval::union` only works with intervals of the same data type. It seems we could loosen that requirement so that intervals of different numeric types, such as `Int64` with `Int32` or an integer with a float, can also be unioned.
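
To illustrate the suggestion, here is a minimal, self-contained sketch of widening both intervals to a common numeric type before taking the union. Everything in it (`Num`, `ToyInterval`, `widen_to`, `union_widened`) is hypothetical and made up for illustration; it does not use DataFusion's `Interval` or `ScalarValue`, and the widening order (Int32, then Int64, then Float64) is an assumption, not the project's actual coercion rules.

```rust
/// Toy numeric value; variants are listed from narrowest to widest.
/// Hypothetical stand-in for a typed scalar, not DataFusion's `ScalarValue`.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
enum Num {
    I32(i32),
    I64(i64),
    F64(f64),
}

impl Num {
    /// Position of the variant in the assumed widening order.
    fn rank(self) -> u8 {
        match self {
            Num::I32(_) => 0,
            Num::I64(_) => 1,
            Num::F64(_) => 2,
        }
    }

    /// Casts the value up to the variant with the given rank.
    /// Values already at or above that rank are returned unchanged.
    fn widen_to(self, rank: u8) -> Num {
        match (self, rank) {
            (Num::I32(v), 1) => Num::I64(i64::from(v)),
            (Num::I32(v), 2) => Num::F64(f64::from(v)),
            // i64 -> f64 may round for magnitudes above 2^53.
            (Num::I64(v), 2) => Num::F64(v as f64),
            (other, _) => other,
        }
    }
}

/// Toy closed interval; both bounds are assumed to share one variant.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ToyInterval {
    lower: Num,
    upper: Num,
}

/// Widens both inputs to the wider of the two types, then returns the
/// smallest interval that contains both of them.
fn union_widened(a: ToyInterval, b: ToyInterval) -> ToyInterval {
    let rank = a.lower.rank().max(b.lower.rank());
    let (al, au) = (a.lower.widen_to(rank), a.upper.widen_to(rank));
    let (bl, bu) = (b.lower.widen_to(rank), b.upper.widen_to(rank));
    // After widening, all four bounds share a variant, so the derived
    // `PartialOrd` compares the inner values directly.
    ToyInterval {
        lower: if al <= bl { al } else { bl },
        upper: if au >= bu { au } else { bu },
    }
}
```

With this sketch, an `I32` interval [1, 5] unioned with an `I64` interval [3, 10] yields an `I64` interval [1, 10], and unioning it with an `F64` interval [-2.5, 3.0] yields an `F64` interval [-2.5, 5.0]. Whether the real coercion should live inside `Interval::union` or at the call site in `merge_distributions` is a separate design question.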