LemonCL opened a new issue, #55804: URL: https://github.com/apache/doris/issues/55804
### Search before asking - [x] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues. ### Version doris-2.11 ### What's Wrong? ## Problem Description When querying Iceberg tables with STRING-typed partition columns, predicate pushdown fails when using CAST operations to convert the partition column to datetime types. ### Example Query ``` SELECT * FROM iceberg_table WHERE CAST(partition_col AS datetime) >= '2025-06-10 00:00:00' Expected Behavior ``` Expected Behavior The query should correctly filter data based on the datetime comparison and utilize Iceberg's predicate pushdown capabilities. Actual Behavior The predicate pushdown fails silently or produces incorrect results due to string comparison issues in the Iceberg layer. Root Cause Analysis 1. Type Mismatch Iceberg Storage: Partition column is stored as STRING type (e.g., "2025-06-10") SQL Query: Uses CAST(partition_col AS datetime) for datetime comparison Predicate Pushdown: The datetime literal gets converted to full datetime string format 2. String Comparison Issue The conversion results in incorrect string comparisons: // Incorrect conversion (before fix) DateLiteral('2025-06-10 00:00:00') → "2025-06-10 00:00:00" // Iceberg comparison becomes "2025-06-10" >= "2025-06-10 00:00:00" // Returns false due to string length difference 3. Specific Failure Cases >= operations fail: CAST(pt AS datetime) >= '2025-06-10 00:00:00' = operations may fail: Depending on string formatting > operations may work: Due to dictionary ordering coincidences Reproduction Steps 1. Create an Iceberg table with STRING-typed partition column: ``` CREATE TABLE test_table ( id INT, event_time STRING ) PARTITIONED BY (event_time); ``` 2. Insert data with date strings: ``` INSERT INTO test_table VALUES (1, '2025-06-09'), (2, '2025-06-10'), (3, '2025-06-11'); ``` 3. Run problematic query: ``` SELECT * FROM test_table WHERE CAST(event_time AS datetime) >= '2025-06-10 00:00:00'; ``` 4. Actual: return empty ### What You Expected? Expected: Returns rows with dates >= 2025-06-10 ### How to Reproduce? _No response_ ### Anything Else? _No response_ ### Are you willing to submit PR? - [x] Yes I am willing to submit a PR! ### Code of Conduct - [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
