parthchandra opened a new pull request, #4008:
URL: https://github.com/apache/datafusion-comet/pull/4008

   ## Which issue does this PR close?
   Part of https://github.com/apache/datafusion-comet/issues/286
   Part of https://github.com/apache/datafusion-comet/issues/378
   
   
   ## Rationale for this change
   
   We currently fall back to Spark for `timestamp_ntz` casts, so they do not run natively in Comet.
   
   ## What changes are included in this PR?
   
   Add native support for casting to and from `TimestampNTZType` (timestamp without time zone).
                                       
   **Implemented cast directions:**

   - TimestampNTZ -> String (timezone-independent)
   - TimestampNTZ -> Date (timezone-independent)
   - TimestampNTZ -> Timestamp (session-TZ dependent)
   - Date -> TimestampNTZ (timezone-independent)
   - Timestamp -> TimestampNTZ (session-TZ dependent)

   **Not yet implemented:**

   - String -> TimestampNTZ (marked `Incompatible`, tracked in #378)
                      
   ### Key implementation details

   - **Timezone-independent casts** (NTZ↔Date, NTZ→String): Pure arithmetic on epoch microseconds; the session timezone has no effect on results.
   - **Timezone-dependent casts** (NTZ↔Timestamp): Interpret or produce local datetimes in the session timezone. A `resolve_local_datetime()` helper handles DST ambiguity (fall-back) and gaps (spring-forward), matching Spark's `ZonedDateTime` semantics.
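
The two cast families above can be sketched in a few lines of Python. This is only an illustration, not the code in this PR: the function names and the hard-coded `TRANSITIONS` table (imitating America/New_York in 2021) are assumptions, and a real implementation would consult the time zone database. The gap and overlap rules follow the documented behavior of Java's `ZonedDateTime`: a local time in a gap is shifted forward by the length of the gap, and an ambiguous local time resolves to the earlier offset.

```python
from datetime import datetime, timedelta

MICROS_PER_HOUR = 3_600_000_000
MICROS_PER_DAY = 24 * MICROS_PER_HOUR
EPOCH = datetime(1970, 1, 1)


def to_micros(dt: datetime) -> int:
    """Microseconds from 1970-01-01T00:00:00 to a naive datetime."""
    return (dt - EPOCH) // timedelta(microseconds=1)


def ntz_to_date(ntz_micros: int) -> int:
    """NTZ -> Date: days since epoch. Floor division keeps pre-1970 values correct."""
    return ntz_micros // MICROS_PER_DAY


def date_to_ntz(days: int) -> int:
    """Date -> NTZ: midnight of that day, in epoch microseconds."""
    return days * MICROS_PER_DAY


# Hypothetical transition table imitating America/New_York in 2021:
# (UTC instant of the change, offset before, offset after), all in microseconds.
TRANSITIONS = [
    (to_micros(datetime(2021, 3, 14, 7)), -5 * MICROS_PER_HOUR, -4 * MICROS_PER_HOUR),
    (to_micros(datetime(2021, 11, 7, 6)), -4 * MICROS_PER_HOUR, -5 * MICROS_PER_HOUR),
]


def offset_at(utc_micros: int) -> int:
    """UTC offset in effect at a given instant."""
    offset = TRANSITIONS[0][1]
    for instant, _before, after in TRANSITIONS:
        if utc_micros >= instant:
            offset = after
    return offset


def timestamp_to_ntz(utc_micros: int) -> int:
    """Timestamp -> NTZ: shift the instant to local wall-clock time."""
    return utc_micros + offset_at(utc_micros)


def ntz_to_timestamp(local_micros: int) -> int:
    """NTZ -> Timestamp: resolve a wall-clock time to an instant."""
    for instant, before, after in TRANSITIONS:
        # Spring-forward gap: the wall clock jumps from instant+before to instant+after.
        if instant + before <= local_micros < instant + after:
            return local_micros - before  # shifted forward by the length of the gap
    offsets = {o for _instant, before, after in TRANSITIONS for o in (before, after)}
    valid = [local_micros - o for o in offsets
             if offset_at(local_micros - o) == o]
    # A fall-back overlap yields two valid instants; take the earlier one.
    return min(valid)
```

With this table, the nonexistent local time 2021-03-14 02:30 resolves to 07:30 UTC (03:30 local), and the ambiguous 2021-11-07 01:30 resolves to 05:30 UTC, the first of its two occurrences.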
   
   ## How are these changes tested?
 
   ### Cast-specific test coverage

   | Cast | Test method | Timezone coverage | Notes |
   |------|-------------|-------------------|-------|
   | Date → NTZ | `cast DateType to TimestampNTZType` | 17 representative zones | Includes half-hour (Kolkata +5:30) and quarter-hour (Kathmandu +5:45, Chatham +12:45) offsets |
   | Timestamp → NTZ | `cast TimestampType to TimestampNTZType` | 17 zones | Exercises DST transitions (Sao Paulo, Sydney, New York) |
   | NTZ → String | `cast TimestampNTZType to StringType` | N/A (TZ-independent) | |
   | NTZ → Date | `cast TimestampNTZType to DateType` | 17 zones | |
   | NTZ → Timestamp | `cast TimestampNTZType to TimestampType` | 17 zones | |
   | String → NTZ | `cast StringType to TimestampNTZType` | N/A | Ignored; not yet implemented |
                      
   ### SQL integration tests (`cast_timestamp_ntz.sql`)

   - NTZ → String, Date, Timestamp
   - Date → NTZ, Timestamp → NTZ
   - Literal casts (e.g. `CAST(TIMESTAMP_NTZ'2020-01-01 12:34:56.789' AS string)`)
                      
   ### Test data

   `generateTimestampNTZ()` reuses `generateTimestampLiterals()`, which covers the epoch, modern dates, DST-transition dates, and values with sub-second precision.
                 
   ### Timezone diversity

   The `representativeTimezones` list (17 zones) was chosen to cover:

   - Standard offsets: UTC, UTC+8 (Shanghai), UTC+9 (Tokyo), UTC-5/-4 (New York)
   - Half-hour offsets: Asia/Kolkata (UTC+5:30)
   - Quarter-hour offsets: Asia/Kathmandu (UTC+5:45), Pacific/Chatham (UTC+12:45)
   - DST-transitioning zones: New York, Sydney, London, Sao Paulo
   - Non-DST zones: Dubai, Cairo, Johannesburg
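
To illustrate why fractional offsets are worth covering, the sketch below converts one instant to wall-clock time under several fixed offsets. It is an assumption-laden simplification: the `OFFSETS` table and the `to_ntz` helper are hypothetical, and each zone is reduced to its standard offset with no DST rules, which the real test suite does not do.

```python
from datetime import datetime, timedelta, timezone

# Fixed standard offsets standing in for real zones (assumption: DST is ignored).
OFFSETS = {
    "UTC": timedelta(0),
    "Asia/Kolkata": timedelta(hours=5, minutes=30),
    "Asia/Kathmandu": timedelta(hours=5, minutes=45),
    "Pacific/Chatham": timedelta(hours=12, minutes=45),
    "America/New_York": timedelta(hours=-5),
}


def to_ntz(instant: datetime, offset: timedelta) -> datetime:
    """Timestamp -> NTZ: wall-clock time at the given offset, with tzinfo dropped."""
    return instant.astimezone(timezone(offset)).replace(tzinfo=None)


instant = datetime(2020, 1, 1, 18, 30, tzinfo=timezone.utc)
for zone, offset in OFFSETS.items():
    print(f"{zone:18} {to_ntz(instant, offset)}")
```

For this instant, the +5:30 offset lands exactly on midnight of the next day and +5:45 lands fifteen minutes past it, so an implementation that rounded offsets to whole hours would produce the wrong wall-clock time and, for the first case, the wrong date.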
                      
   ### ANSI mode

   Each `castTimestampTest` invocation tests both `ANSI_ENABLED=false` (null on invalid input) and `ANSI_ENABLED=true` (exception on invalid input), plus `try_cast()`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

