jhmannok commented on PR #36519: URL: https://github.com/apache/arrow/pull/36519#issuecomment-1654840506
> The logic is hard to follow (the original code sure does not help...) But I'm not sure if the changes here are defensible without more explanation of the logic in the code. It might help to make sure we have clear test cases of expected behavior first. I think its worth to clarify what my logic/presumptions are. So essentially, the logic in the change is that: 1. The timestamp value is retrieved from the database (ie. database shows (2013-03-28 00:00:00.000)). The database does not store any timezone information but the timestamp of 2013-03-28 00:00:00 has an inherent timezone that is either outlined by the database itself or the users themselves know what the timezone is for this timestamp. Thus, the calendar object that could be provided into the accessor methods allows users to assert this information. 2. When this value is retrieved from the database, the accessor method for getTimestamp first creates a LocalDateTime object using the epochMilliseconds of the timestamp (ArrowFlightJdbcTimeStampVectorAccessor: line 93) 3. We then determine what timezone the LocalDateTime should be (ArrowFlightJdbcTimeStampVectorAccessor: line 97-103). 4. Since in ArrowFlightJdbcTimeStampVectorAccessor#getTimestamp line 135 uses Timestamp.valueOf(LocalDateTime) to create the timestamp object that is returned to the client, ArrowFlightJdbcTimeStampVectorAccessor: line 105 essentially creates a ZonedDateTime using the LocalDateTime + the timezone determined in step 3, essentially attaching the timezone to the LocalDateTime value (no timezone conversion takes place yet). Then we convert the timestamp to the system default timezone and return the timestamp as a LocalDateTime. What ended up happening before was that the LocalDateTime that was returned by the getLocalDateTime method reflects the timestamp being converted from the timezone of the calendar input (or system default when it is null) to UTC (since in the original code, the timezone of the vector defaulted to UTC). But Timestamp.valueOf(LocalDateTime) interprets the LocalDateTime passed in as it is in the system default timezone, thus causing the issue where we did get the correct time and date value but the timezone of the Timestamp object is the system timezone instead of UTC (or whatever the timezone of the Vector is) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
