LemonCL opened a new issue, #55804:
URL: https://github.com/apache/doris/issues/55804

   ### Search before asking
   
   - [x] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Version
   
   doris-2.11
   
   ### What's Wrong?
   
   ## Problem Description
   
   When querying Iceberg tables with STRING-typed partition columns, predicate 
pushdown fails when using CAST operations to convert the partition column to 
datetime types.
   
   ### Example Query
   ```
   SELECT * FROM iceberg_table 
   WHERE CAST(partition_col AS datetime) >= '2025-06-10 00:00:00'
   Expected Behavior
   ```
   Expected Behavior
   The query should correctly filter data based on the datetime comparison and 
utilize Iceberg's predicate pushdown capabilities.
   
   Actual Behavior
   The predicate pushdown fails silently or produces incorrect results due to 
string comparison issues in the Iceberg layer.
   
   Root Cause Analysis
   1. Type Mismatch
   Iceberg Storage: Partition column is stored as STRING type (e.g., 
"2025-06-10")
   SQL Query: Uses CAST(partition_col AS datetime) for datetime comparison
   Predicate Pushdown: The datetime literal gets converted to full datetime 
string format
   2. String Comparison Issue
   The conversion results in incorrect string comparisons:
   
   // Incorrect conversion (before fix)
   DateLiteral('2025-06-10 00:00:00') → "2025-06-10 00:00:00"
   
   // Iceberg comparison becomes
   "2025-06-10" >= "2025-06-10 00:00:00"  // Returns false due to string length 
difference
   
   3. Specific Failure Cases
   >= operations fail: CAST(pt AS datetime) >= '2025-06-10 00:00:00'
   = operations may fail: Depending on string formatting
   > operations may work: Due to dictionary ordering coincidences
   
   Reproduction Steps
   
   1. Create an Iceberg table with STRING-typed partition column:
   ```
   CREATE TABLE test_table (
       id INT,
       event_time STRING
   ) PARTITIONED BY (event_time);
   ```
   2. Insert data with date strings:
   ```
   INSERT INTO test_table VALUES 
   (1, '2025-06-09'),
   (2, '2025-06-10'), 
   (3, '2025-06-11');
   ```
   
   3. Run problematic query:
   ```
   SELECT * FROM test_table 
   WHERE CAST(event_time AS datetime) >= '2025-06-10 00:00:00';
   ```
   4. Actual: return empty
   
   ### What You Expected?
   
   Expected: Returns rows with dates >= 2025-06-10
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to