lidavidm commented on a change in pull request #11358:
URL: https://github.com/apache/arrow/pull/11358#discussion_r725266070



##########
File path: cpp/src/arrow/compute/kernels/scalar_string_test.cc
##########
@@ -1429,6 +1430,17 @@ TYPED_TEST(TestStringKernels, Strptime) {
   this->CheckUnary("strptime", input1, timestamp(TimeUnit::MICRO), output1, 
&options);
 }
 
+TYPED_TEST(TestStringKernels, StrptimeZoneOffset) {
+  if (!arrow::internal::kStrptimeSupportsZone) {
+    GTEST_SKIP() << "strptime does not support %z on this platform";
+  }
+  std::string input1 = R"(["5/1/2020 +01", null, "12/11/1900 -01:30"])";
+  std::string output1 =
+      R"(["2020-04-30T23:00:00.000000", null, "1900-12-11T01:30:00.000000"])";
+  StrptimeOptions options("%m/%d/%Y %z", TimeUnit::MICRO);
+  this->CheckUnary("strptime", input1, timestamp(TimeUnit::MICRO), output1, 
&options);

Review comment:
       Sorry, I think I made things unclear here. This is what I'm worried 
about. CSV uses the ISO8601 parser by default. All these CSV files previously 
and currently result in a timestamp without timezone column:
   
   ```
   ts
   2010-01-01T00:00:00
   ```
   
   ```
   ts
   2010-01-01T00:00:00-0100
   ```
   
   ```
   ts
   2010-01-01T00:00:00-0100
   2010-01-01T00:00:00
   ```
   
   Now, I think the first one is reasonable to continue interpreting in this 
way. And the second seems reasonable to interpret as `timestamp[s, "UTC"]`. 
However, this is a user-visible change in behavior. Do we also want to reflect 
that in other places we use the ISO8601 parser?
   
   And what about the third? Do we error? (This wasn't previously parseable 
before.) Joris touches on this above as well.
   
   For strptime, yes, I meant that if the user provides `%z` then it seems 
reasonable to expect that they wanted a timestamp with timezone as the result, 
and otherwise it's reasonable to give back timestamp without timezone. I don't 
think strptime is a worry here, but rather the ISO8601 parser.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to