alamb commented on code in PR #3795:
URL: https://github.com/apache/arrow-rs/pull/3795#discussion_r1124995968


##########
arrow-cast/src/parse.rs:
##########
@@ -112,34 +80,44 @@ pub fn string_to_timestamp_nanos(s: &str) -> Result<i64, 
ArrowError> {
     // without a timezone specifier as a local time, using T as a separator
     // Example: 2020-09-08T13:42:29.190855
     if let Ok(ts) = NaiveDateTime::parse_from_str(s, "%Y-%m-%dT%H:%M:%S%.f") {
-        return to_timestamp_nanos(ts);
+        if let Some(offset) = 
timezone.offset_from_local_datetime(&ts).single() {

Review Comment:
   : 
https://docs.rs/chrono-tz/0.8.1/chrono_tz/enum.Tz.html#method.offset_from_local_datetime



##########
arrow-cast/src/parse.rs:
##########
@@ -153,6 +131,42 @@ pub fn string_to_timestamp_nanos(s: &str) -> Result<i64, 
ArrowError> {
     )))
 }
 
+/// Accepts a string in RFC3339 / ISO8601 standard format and some
+/// variants and converts it to a nanosecond precision timestamp.
+///
+/// See [`string_to_datetime`] for the full set of supported formats
+///
+/// Implements the `to_timestamp` function to convert a string to a
+/// timestamp, following the model of spark SQL’s to_`timestamp`.
+///
+/// Internally, this function uses the `chrono` library for the
+/// datetime parsing
+///
+/// We hope to extend this function in the future with a second
+/// parameter to specifying the format string.
+///
+/// ## Timestamp Precision
+///
+/// Function uses the maximum precision timestamps supported by
+/// Arrow (nanoseconds stored as a 64-bit integer) timestamps. This
+/// means the range of dates that timestamps can represent is ~1677 AD
+/// to 2262 AM
+///
+/// ## Timezone / Offset Handling
+///
+/// Numerical values of timestamps are stored compared to offset UTC.
+///
+/// This function interprets string without an explicit time zone timestamps

Review Comment:
   ```suggestion
   /// This function interprets string without an explicit time zone as 
timestamps
   ```



##########
arrow-cast/src/parse.rs:
##########
@@ -153,6 +131,42 @@ pub fn string_to_timestamp_nanos(s: &str) -> Result<i64, 
ArrowError> {
     )))
 }
 
+/// Accepts a string in RFC3339 / ISO8601 standard format and some
+/// variants and converts it to a nanosecond precision timestamp.
+///
+/// See [`string_to_datetime`] for the full set of supported formats
+///
+/// Implements the `to_timestamp` function to convert a string to a
+/// timestamp, following the model of spark SQL’s to_`timestamp`.
+///
+/// Internally, this function uses the `chrono` library for the
+/// datetime parsing
+///
+/// We hope to extend this function in the future with a second
+/// parameter to specifying the format string.
+///
+/// ## Timestamp Precision
+///
+/// Function uses the maximum precision timestamps supported by
+/// Arrow (nanoseconds stored as a 64-bit integer) timestamps. This
+/// means the range of dates that timestamps can represent is ~1677 AD
+/// to 2262 AM
+///
+/// ## Timezone / Offset Handling
+///
+/// Numerical values of timestamps are stored compared to offset UTC.
+///
+/// This function interprets string without an explicit time zone timestamps
+/// relative to UTC, see [`string_to_datetime`] for alternative semantics
+///
+/// For example, both `1997-01-31 09:26:56.123Z`, `1997-01-31T09:26:56.123`,
+/// and `1997-01-31T14:26:56.123-05:00` will be parsed as the same value
+///

Review Comment:
   
   I think some context got lost -- I only think this statement is true if the 
system timezone is set to UTC - 5:
   
   ```suggestion
   /// For example, `1997-01-31 09:26:56.123Z`, `1997-01-31T09:26:56.123`,
   /// and `1997-01-31T14:26:56.123-05:00` will be parsed as the same value
   /// if the system timezone is set to Americas/New_York (UTC-5).
   ```
   



##########
arrow-cast/src/parse.rs:
##########
@@ -76,12 +42,14 @@ use chrono::prelude::*;
 ///     "2023-01-01 040506 +07:30:00",
 ///     "2023-01-01 04:05:06.789 PST",
 ///     "2023-01-01 04:05:06.789 -08",
-#[inline]
-pub fn string_to_timestamp_nanos(s: &str) -> Result<i64, ArrowError> {
+pub fn string_to_datetime<T: TimeZone>(

Review Comment:
   I don't fully understand this question 
   
   I would think you could call `string_to_datetime<Utc>(...)` which would give 
you a chrono `DateTime<Utc>` and you can then then do whatever you want with 
the result (e.g. convert it into a nanosecond timestamp, etc)
   



##########
arrow-cast/src/parse.rs:
##########
@@ -614,6 +629,34 @@ mod tests {
             naive_datetime.timestamp_nanos(),
             parse_timestamp("2020-09-08 13:42:29").unwrap()
         );
+
+        let tz: Tz = "+02:00".parse().unwrap();
+        let date = string_to_datetime(&tz, "2020-09-08 13:42:29").unwrap();
+        let utc = date.naive_utc().to_string();
+        assert_eq!(utc, "2020-09-08 11:42:29");
+        let local = date.naive_local().to_string();
+        assert_eq!(local, "2020-09-08 13:42:29");
+
+        let date = string_to_datetime(&tz, "2020-09-08 13:42:29Z").unwrap();
+        let utc = date.naive_utc().to_string();
+        assert_eq!(utc, "2020-09-08 13:42:29");
+        let local = date.naive_local().to_string();
+        assert_eq!(local, "2020-09-08 15:42:29");
+
+        let dt =
+            NaiveDateTime::parse_from_str("2020-09-08T13:42:29Z", 
"%Y-%m-%dT%H:%M:%SZ")
+                .unwrap();
+        let local: Tz = "+08:00".parse().unwrap();

Review Comment:
   👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to