Dandandan opened a new pull request, #21665:
URL: https://github.com/apache/datafusion/pull/21665

   ## Which issue does this PR close?
   
   - Improves ClickBench Q36-Q42 performance by eliminating per-row CAST 
operations in date filters
   
   ## Rationale for this change
   
   ClickBench queries Q36-Q42 all filter on `EventDate`, which is stored as 
`UInt16` but exposed as `Date32` via a view. Each bound of the date range, 
for example `EventDate >= '2013-07-01'`, becomes:
   
   ```
   CAST(CAST(EventDate AS Int32) AS Date32) >= Date32("2013-07-01")
   ```
   
   With a lower and an upper bound per query, this evaluates **4 CAST 
operations per row** (2 CASTs × 2 bounds). The existing 
`unwrap_cast_in_comparison` optimizer can invert this by casting the literal 
instead of the column, but it did not support `Date32`/`Date64` types.
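
As a rough illustration of "cast the literal instead of the column", the sketch below moves a Date32 bound into the column's `UInt16` domain once, at plan time. The helper name `date32_literal_to_u16` is hypothetical and is not the DataFusion API; Date32 is modelled as a plain `i32` day count. The assumption is that the rewrite only applies when the literal fits in the column's type.

```rust
/// Hypothetical helper (not part of DataFusion): try to express a Date32
/// literal, given as days since the Unix epoch, in the UInt16 column domain.
/// Returns None when the value does not fit, meaning the comparison could
/// not be rewritten this way and the CAST on the column would have to stay.
fn date32_literal_to_u16(days: i32) -> Option<u16> {
    if (0..=i32::from(u16::MAX)).contains(&days) {
        // The range check above guarantees this narrowing is lossless.
        Some(days as u16)
    } else {
        None
    }
}
```

For the bound used in this PR, `date32_literal_to_u16(15887)` yields `Some(15887)`, while a negative day count (a date before 1970) yields `None`.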
   
   ## What changes are included in this PR?
   
   Add `Date32` and `Date64` to `is_supported_numeric_type` and the 
corresponding match arms in `try_cast_numeric_literal` 
(`datafusion/expr-common/src/casts.rs`). Date32 is internally `i32` (days since 
epoch) and Date64 is `i64` (ms since epoch), so they participate in numeric 
comparisons identically to their integer counterparts.
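
To make the integer representation concrete, here is a small standalone sketch that converts a calendar date to its Date32 (days since epoch) and Date64 (milliseconds since epoch) values. The function names are illustrative only, and the day-count arithmetic is a standard proleptic Gregorian formula rather than code from this PR.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date (illustrative helper).
/// Months January and February are treated as the end of the previous year,
/// so the leap day falls at the end of the shifted year.
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400); // 400-year cycle index
    let yoe = y.rem_euclid(400); // year within the cycle, 0..=399
    let mp = (month + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + day - 1; // day within the shifted year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day within the cycle
    era * 146_097 + doe - 719_468 // shift so that 1970-01-01 maps to 0
}

/// Date32 value: days since epoch as i32.
fn date32_value(year: i64, month: i64, day: i64) -> i32 {
    days_from_civil(year, month, day) as i32
}

/// Date64 value: milliseconds since epoch as i64.
fn date64_value(year: i64, month: i64, day: i64) -> i64 {
    days_from_civil(year, month, day) * 86_400_000
}
```

With these definitions, `date32_value(2013, 7, 1)` is 15887 and `date32_value(2013, 7, 31)` is 15917, matching the literals in the rewritten plan.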
   
   **Result for ClickBench Q36 (and Q37-Q42):**
   
   ```
   -- Before (FilterExec):
   CounterID@1 = 62
     AND CAST(CAST(EventDate@0 AS Int32) AS Date32) >= 2013-07-01
     AND CAST(CAST(EventDate@0 AS Int32) AS Date32) <= 2013-07-31
   
   -- After:
   CounterID@1 = 62
     AND EventDate@0 >= 15887
     AND EventDate@0 <= 15917
   ```
   
   Both CASTs are fully unwrapped in the filter, predicate, and 
pruning_predicate, reducing the per-row CAST count from 4 to 0.
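
The equivalence of the two plan shapes can be sanity-checked outside DataFusion. The sketch below models `EventDate` as a raw `u16` and Date32 as an `i32` day count, then compares both predicates over every possible `UInt16` value. The function names `before`, `after`, and `plans_agree` are made up for this illustration.

```rust
/// "Before" shape: the column value is widened on every row before the
/// comparison, standing in for CAST(CAST(EventDate AS Int32) AS Date32).
fn before(event_date: u16) -> bool {
    let as_date32 = i32::from(event_date);
    as_date32 >= 15887 && as_date32 <= 15917
}

/// "After" shape: the raw UInt16 value is compared with integer literals.
fn after(event_date: u16) -> bool {
    event_date >= 15887 && event_date <= 15917
}

/// Exhaustively compare both predicates over the whole UInt16 domain.
fn plans_agree() -> bool {
    (0..=u16::MAX).all(|v| before(v) == after(v))
}
```

Because the column domain has only 65536 values, the exhaustive check is cheap; both predicates accept exactly the 31 day numbers of July 2013.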
   
   ## Are these changes tested?
   
   - All 19 existing `casts` unit tests pass
   - ClickBench sqllogictest updated with new expected plans for Q36-Q42
   - Query results unchanged
   
   ## Are there any user-facing changes?
   
   Filters that compare a column cast to Date32/Date64 against a literal will 
now have the CAST eliminated at plan time, improving filter performance.
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

