dmitrybugakov commented on code in PR #10510:
URL: https://github.com/apache/datafusion/pull/10510#discussion_r1602926580
##########
datafusion/expr/src/interval_arithmetic.rs:
##########
@@ -1469,6 +1472,8 @@ pub enum NullableInterval {
     MaybeNull { values: Interval },
     /// The value is definitely not null, and is within the specified range.
     NotNull { values: Interval },
+    /// Added to handle cases with insufficient statistics
+    Unknown,
 }

Review Comment:
@alamb I have conducted some additional tests on various queries and observed the following results:

```
CREATE TABLE data_table (
    id INT,
    value INT
);
```

```
INSERT INTO data_table (id, value) VALUES
    (1, 100),
    (2, 200),
    (3, 300),
    (4, 400),
    (5, 500),
    (6, 600),
    (7, 700),
    (8, 800),
    (9, 900),
    (10, 1000);
```

```
SELECT id, value
FROM data_table
WHERE value > 500;
```

_Log:_

`Schema: id: Int32, value: Int32`

`Column 0: ColumnStatistics { null_count: Inexact(0), max_value: Exact(Int32(NULL)), min_value: Exact(Int32(NULL)), distinct_count: Absent }`

`Column 1: ColumnStatistics { null_count: Inexact(0), max_value: Inexact(Int32(NULL)), min_value: Inexact(Int32(501)), distinct_count: Absent }`

```
SELECT AVG(value) AS average_value, SUM(value) AS total_value
FROM data_table;
```

_Log:_

`Schema: average_value: Float64, total_value: Int64`

`Column 0: ColumnStatistics { null_count: Absent, max_value: Absent, min_value: Absent, distinct_count: Absent }`

`Column 1: ColumnStatistics { null_count: Absent, max_value: Absent, min_value: Absent, distinct_count: Absent }`

```
SELECT a.id AS id_a, a.value AS value_a, b.id AS id_b, b.value AS value_b
FROM data_table a
CROSS JOIN data_table b;
```

_Log:_

`Schema: id_a: Int32, value_a: Int32, id_b: Int32, value_b: Int32`

`Column 0: ColumnStatistics { null_count: Exact(0), max_value: Absent, min_value: Absent, distinct_count: Absent }`

`Column 1: ColumnStatistics { null_count: Exact(0), max_value: Absent, min_value: Absent, distinct_count: Absent }`

`Column 2: ColumnStatistics { null_count: Exact(0), max_value: Absent, min_value: Absent, distinct_count: Absent }`

`Column 3: ColumnStatistics { null_count: Exact(0), max_value: Absent, min_value: Absent, distinct_count: Absent }`
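To make the connection between these logs and the proposed `Unknown` variant concrete, here is a rough sketch of how such statistics might be classified. Everything in it is an assumption for illustration: `Precision`, `ColumnStats`, `Nullable` and `classify` are simplified stand-ins written for this comment, not DataFusion's actual types or the code in this PR. The sketch also assumes that only exact, non-NULL bounds are trustworthy enough to build a range, which may be stricter than what the real implementation should do.

```rust
// Simplified stand-ins for illustration only; NOT DataFusion's real types.

/// Hypothetical mirror of the Exact / Inexact / Absent values seen in the logs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision<T> {
    Exact(T),
    Inexact(T),
    Absent,
}

/// Hypothetical, reduced column statistics (only the fields used below).
/// A `None` bound models the `Int32(NULL)` entries in the logs.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ColumnStats {
    null_count: Precision<usize>,
    min_value: Precision<Option<i32>>,
    max_value: Precision<Option<i32>>,
}

/// Hypothetical three-way result, loosely shaped like the enum in the diff.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Nullable {
    MaybeNull { min: i32, max: i32 },
    NotNull { min: i32, max: i32 },
    Unknown,
}

/// Assumption: a range is built only from exact, non-NULL bounds.
/// Anything else (inexact, NULL, or absent) falls back to `Unknown`.
fn classify(stats: &ColumnStats) -> Nullable {
    match (stats.min_value, stats.max_value) {
        (Precision::Exact(Some(min)), Precision::Exact(Some(max))) => {
            if stats.null_count == Precision::Exact(0) {
                Nullable::NotNull { min, max }
            } else {
                Nullable::MaybeNull { min, max }
            }
        }
        _ => Nullable::Unknown,
    }
}

fn main() {
    // Column 1 of the filter query: inexact bounds, max is NULL.
    let filtered = ColumnStats {
        null_count: Precision::Inexact(0),
        min_value: Precision::Inexact(Some(501)),
        max_value: Precision::Inexact(None),
    };
    // Any column of the CROSS JOIN query: exact null count, no bounds.
    let cross_join = ColumnStats {
        null_count: Precision::Exact(0),
        min_value: Precision::Absent,
        max_value: Precision::Absent,
    };
    // Made-up case (not from the logs) where the bounds are exact.
    let exact = ColumnStats {
        null_count: Precision::Exact(0),
        min_value: Precision::Exact(Some(100)),
        max_value: Precision::Exact(Some(1000)),
    };

    println!("filtered:   {:?}", classify(&filtered));
    println!("cross join: {:?}", classify(&cross_join));
    println!("exact:      {:?}", classify(&exact));
}
```

Under these assumptions, every column logged above would end up as `Unknown`, since none of them carries exact, non-NULL bounds on both sides; only the made-up third case yields a usable range.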