jorisvandenbossche commented on a change in pull request #11358:
URL: https://github.com/apache/arrow/pull/11358#discussion_r727463318

##########
File path: cpp/src/arrow/compute/kernels/scalar_string_test.cc
##########
@@ -1429,6 +1430,17 @@ TYPED_TEST(TestStringKernels, Strptime) {
   this->CheckUnary("strptime", input1, timestamp(TimeUnit::MICRO), output1, &options);
 }
 
+TYPED_TEST(TestStringKernels, StrptimeZoneOffset) {
+  if (!arrow::internal::kStrptimeSupportsZone) {
+    GTEST_SKIP() << "strptime does not support %z on this platform";
+  }
+  std::string input1 = R"(["5/1/2020 +01", null, "12/11/1900 -01:30"])";
+  std::string output1 =
+      R"(["2020-04-30T23:00:00.000000", null, "1900-12-11T01:30:00.000000"])";
+  StrptimeOptions options("%m/%d/%Y %z", TimeUnit::MICRO);
+  this->CheckUnary("strptime", input1, timestamp(TimeUnit::MICRO), output1, &options);

Review comment:
> If it is a matter of varying offsets (but all aware timestamps) then it would probably be also correct to just pick a timezone (e.g. UTC) and use that for everything. In fact, it would probably even be valid to just always use UTC if all timestamps are aware.

Yes, I agree we could always use UTC, even if the offsets are all the same. Varying offsets are quite normal when the data spans a DST transition, so I think we should handle that case by default (and return the result in UTC).
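
To make the "always return UTC" idea concrete, here is a minimal sketch of the normalization step, assuming each value has already been parsed into a wall-clock time plus the offset that `%z` matched. `ParsedTimestamp`, `ToUtc` and `NormalizeToUtc` are hypothetical names for illustration, not Arrow APIs, and times are kept as plain second counts relative to an arbitrary reference.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper type (not part of Arrow): one parsed, offset-aware value.
struct ParsedTimestamp {
  std::chrono::seconds local;       // wall-clock time, relative to some reference
  std::chrono::minutes utc_offset;  // +60 for "+01", -90 for "-01:30"
};

// UTC instant = local wall-clock time minus the UTC offset.
inline std::chrono::seconds ToUtc(const ParsedTimestamp& ts) {
  return ts.local - ts.utc_offset;
}

// Normalize a column of aware values to UTC. Offsets may differ per value,
// e.g. for data recorded on both sides of a DST transition.
inline std::vector<std::chrono::seconds> NormalizeToUtc(
    const std::vector<ParsedTimestamp>& values) {
  std::vector<std::chrono::seconds> out;
  out.reserve(values.size());
  for (const auto& v : values) {
    out.push_back(ToUtc(v));
  }
  return out;
}
```

With local midnight as the reference, an offset of `+01` gives -3600 s (23:00 on the previous day) and `-01:30` gives +5400 s (01:30), which matches the expected values in the test above. Two readings at 01:30 `+01:00` and 03:30 `+02:00` are two hours apart on the wall clock but only one hour apart once normalized, which is why a single output timezone works even when the offsets vary.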