andygrove opened a new issue, #3179:
URL: https://github.com/apache/datafusion-comet/issues/3179

   ## Summary
   
   When casting a string to `TimestampNTZ`, Comet incorrectly produces a 
timestamp with a UTC timezone instead of a timestamp without a timezone. This 
changes the semantics of the value: a TimestampNTZ is a wall-clock reading, 
whereas a timestamp with a timezone denotes an absolute instant.
   
   ## Root Cause
   
   In `native/spark-expr/src/conversion_funcs/cast.rs`, the 
`cast_string_to_timestamp` function uses the pattern `DataType::Timestamp(_, 
_)`, which matches both:
   - `Timestamp(Microsecond, Some("UTC"))` - timestamp with timezone
   - `Timestamp(Microsecond, None)` - TimestampNTZ (no timezone)
   
   The `cast_utf8_to_timestamp!` macro (lines 416-434) **unconditionally 
creates a timestamp WITH timezone**:
   ```rust
   let mut cast_array = PrimitiveArray::<$array_type>::builder(len).with_timezone("UTC");
   ```
   
   ## Expected Behavior
   
   For **TimestampNTZ** (Timestamp without timezone):
   - The result should be `Timestamp(Microsecond, None)` - no timezone
   - Values should be stored as-is without timezone conversion
   
   For **Timestamp** (with timezone):
   - The result should be `Timestamp(Microsecond, Some("UTC"))` - with UTC 
timezone
   - Values should be converted to UTC
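
To make the difference concrete, here is a minimal std-only sketch of the two value semantics. It is not Comet's parser: the helper names (`parse_wall_clock_micros`, `cast_to_timestamp_ntz`, `cast_to_timestamp`) are hypothetical, only the `YYYY-MM-DD HH:MM:SS` format is handled, and the session timezone is assumed to be a fixed offset from UTC in seconds.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known civil-date algorithm; March is treated as month 0).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400);
    let mp = (m + 9) % 12;
    let doy = (153 * mp + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

/// Hypothetical helper: encode the wall-clock fields of "YYYY-MM-DD HH:MM:SS"
/// as microseconds, with no timezone adjustment at all.
fn parse_wall_clock_micros(s: &str) -> Option<i64> {
    let (date, time) = s.trim().split_once(' ')?;
    let mut d = date.splitn(3, '-').map(|p| p.parse::<i64>().ok());
    let (y, mo, day) = (d.next()??, d.next()??, d.next()??);
    let mut t = time.splitn(3, ':').map(|p| p.parse::<i64>().ok());
    let (h, mi, sec) = (t.next()??, t.next()??, t.next()??);
    // Coarse range checks only; day-of-month is not validated per month.
    let valid = (1..=12).contains(&mo)
        && (1..=31).contains(&day)
        && (0..24).contains(&h)
        && (0..60).contains(&mi)
        && (0..60).contains(&sec);
    if !valid {
        return None;
    }
    let secs = days_from_civil(y, mo, day) * 86_400 + h * 3_600 + mi * 60 + sec;
    Some(secs * 1_000_000)
}

/// TimestampNTZ: store the wall-clock value as-is.
fn cast_to_timestamp_ntz(s: &str) -> Option<i64> {
    parse_wall_clock_micros(s)
}

/// Timestamp: read the string in the session timezone (assumed here to be a
/// fixed offset in seconds east of UTC) and convert to a UTC instant.
fn cast_to_timestamp(s: &str, session_offset_secs: i64) -> Option<i64> {
    parse_wall_clock_micros(s).map(|wall| wall - session_offset_secs * 1_000_000)
}
```

With these helpers, `cast_to_timestamp_ntz("2024-01-01 12:00:00")` yields `Some(1_704_110_400_000_000)`, the wall-clock fields encoded unchanged, while `cast_to_timestamp("2024-01-01 12:00:00", -8 * 3600)` yields `Some(1_704_139_200_000_000)`, which is 20:00:00 UTC on the same day.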
   
   ## Suggested Fix
   
   Separate the handling for Timestamp and TimestampNTZ:
   
   ```rust
   match to_type {
       DataType::Timestamp(unit, Some(tz)) => {
           // Timestamp with timezone - apply timezone conversion
           cast_utf8_to_timestamp!(... with_timezone(tz) ...)
       }
       DataType::Timestamp(unit, None) => {
           // TimestampNTZ - no timezone, store as-is
           cast_utf8_to_timestamp_ntz!(... no timezone ...)
       }
   }
   ```
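
A compilable illustration of the same dispatch, written against stand-in types rather than the real `arrow` crate. `TimeUnit`, `DataType`, and `TimestampColumn` below only mirror the shape of the Arrow types, and the two parser callbacks (`parse_instant`, `parse_wall_clock`) are hypothetical placeholders for whatever parsing Comet ends up using. The point it tries to show is that the output type is taken from the requested type, so the NTZ arm can never pick up a timezone.

```rust
// Stand-ins that mirror the shape of Arrow's types; these are NOT the arrow crate.
#[derive(Debug, Clone, PartialEq)]
enum TimeUnit {
    Microsecond,
}

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Timestamp(TimeUnit, Option<String>),
}

// Hypothetical result container: the produced type plus the parsed values.
#[derive(Debug, PartialEq)]
struct TimestampColumn {
    data_type: DataType,
    values: Vec<Option<i64>>,
}

fn cast_string_to_timestamp(
    input: &[Option<&str>],
    to_type: &DataType,
    // Parses a string as an instant, given the target timezone name.
    parse_instant: impl Fn(&str, &str) -> Option<i64>,
    // Parses a string as a wall-clock value with no timezone conversion.
    parse_wall_clock: impl Fn(&str) -> Option<i64>,
) -> Result<TimestampColumn, String> {
    let values: Vec<Option<i64>> = match to_type {
        // Timestamp with timezone: apply timezone-aware parsing.
        DataType::Timestamp(_, Some(tz)) => input
            .iter()
            .map(|v| v.and_then(|s| parse_instant(s, tz.as_str())))
            .collect(),
        // TimestampNTZ: no timezone, store the wall-clock value as-is.
        DataType::Timestamp(_, None) => input
            .iter()
            .map(|v| v.and_then(|s| parse_wall_clock(s)))
            .collect(),
        other => return Err(format!("unsupported cast target: {:?}", other)),
    };
    // The output type is exactly the requested type, timezone (or lack of it) included.
    Ok(TimestampColumn {
        data_type: to_type.clone(),
        values,
    })
}
```

Keeping the two arms separate, rather than sharing one wildcard arm, makes it harder for a later change to reintroduce a hard-coded `"UTC"` on the NTZ path.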
   
   ## Impact
   
   This bug affects any operation that involves casting strings to TimestampNTZ 
in Comet. Currently, TimestampNTZ casts are marked as `Incompatible` and fall 
back to Spark, so this bug is not exposed in production. However, it would be a 
blocker for enabling full TimestampNTZ support.
   
   ## Related
   
   - Issue #378 tracks adding full TimestampNTZ support to `supportedTypes`
   - The Scala layer correctly marks TimestampNTZ casting as incompatible in 
`CometCast.scala`
   
   ---
   
   > **Note:** This issue was generated with AI assistance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

