alamb opened a new issue, #5827:
URL: https://github.com/apache/arrow-rs/issues/5827

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   This is in the context of implementing `date_bin` for timestamps with 
timezones: https://github.com/apache/datafusion/issues/10602
   
   I made https://github.com/apache/arrow-rs/pull/5826 to document the behavior 
of casting timestamps and found it very confusing. Specifically, when you cast 
from `Timestamp(None)` to `Timestamp(Some(tz))` and then back to 
`Timestamp(None)`, the underlying timestamp values change, as shown in this 
example:
   
   ```rust
   use arrow_array::Int64Array;
   use arrow_array::types::{TimestampSecondType};
   use arrow_cast::{cast, display};
   use arrow_array::cast::AsArray;
   use arrow_schema::{DataType, TimeUnit};
   let data_type = DataType::Timestamp(TimeUnit::Second, None);
   let data_type_tz =
       DataType::Timestamp(TimeUnit::Second, Some("-05:00".into()));
   let a = Int64Array::from(vec![1_000_000_000, 2_000_000_000, 3_000_000_000]);
   let b = cast(&a, &data_type).unwrap(); // cast to timestamp without timezone
   let b = b.as_primitive::<TimestampSecondType>(); // downcast to result type
   assert_eq!(2_000_000_000, b.value(1)); // values are still the same
   
   // Convert timestamps without a timezone to timestamps with a timezone
   let c = cast(&b, &data_type_tz).unwrap();
   let c = c.as_primitive::<TimestampSecondType>(); // downcast to result type
   assert_eq!(2_000_018_000, c.value(1)); // value has been adjusted by offset
   
   // Convert from timestamp with timezone back to timestamp without timezone
   let d = cast(&c, &data_type).unwrap();
   let d = d.as_primitive::<TimestampSecondType>(); // downcast to result type
   assert_eq!(2_000_018_000, d.value(1)); // <---- **** THIS VALUE IS DIFFERENT
   // THAN IT WAS INITIALLY
   assert_eq!(
       "2033-05-18T08:33:20",
       display::array_value_to_string(&d, 1).unwrap()
   );
   ```
   
   Thus I wanted to discuss whether we should change the behavior to make it 
less surprising, or whether there is a reason to keep the current behavior.
   
   
   
   **Describe the solution you'd like**
   
   I propose making `casting timestamp with a timezone to timestamp without a 
timezone` do the inverse of `casting timestamp without a timezone to timestamp 
with a timezone`.
   
   This would mean the final value of `d` in the above example is 
`2_000_000_000`, not `2_000_018_000`.
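
   To make the arithmetic behind the proposal concrete, here is a minimal 
sketch using plain `i64` values rather than arrow arrays. The helper names 
`local_to_utc` and `utc_to_local` and the constant `OFFSET_SECS` are 
hypothetical and are not part of `arrow_cast`. It assumes a fixed offset such 
as `-05:00`; named timezones with daylight saving transitions would need more 
care.

   ```rust
   /// Fixed offset of "-05:00" from UTC, in seconds (hypothetical constant).
   const OFFSET_SECS: i64 = -5 * 3600; // -18_000

   /// Treat `local_secs` as wall-clock time in the given fixed offset and
   /// return the corresponding UTC epoch seconds. This mirrors the
   /// `Timestamp(None)` -> `Timestamp(Some(tz))` cast in the example above.
   fn local_to_utc(local_secs: i64, offset_secs: i64) -> i64 {
       local_secs - offset_secs
   }

   /// Proposed inverse: turn UTC epoch seconds back into wall-clock seconds
   /// in the given fixed offset, as the `Timestamp(Some(tz))` ->
   /// `Timestamp(None)` cast would do under this proposal.
   fn utc_to_local(utc_secs: i64, offset_secs: i64) -> i64 {
       utc_secs + offset_secs
   }
   ```

   With these definitions, `local_to_utc(2_000_000_000, OFFSET_SECS)` yields 
`2_000_018_000`, and applying `utc_to_local` to that result returns the 
original `2_000_000_000`, so the round trip is lossless.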
   
   
   **Describe alternatives you've considered**
   Leave existing behavior
   
   **Additional context**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
