[
https://issues.apache.org/jira/browse/ARROW-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17191169#comment-17191169
]
Micah Kornfield commented on ARROW-9915:
----------------------------------------
[~mjadczak_gsa] thank you for the detailed analysis. The last time this was
[discussed on the mailing
list|[https://lists.apache.org/thread.html/bd9286591fa8bd682322c625f4176701af78e2510005a0488da359b3%40%3Cdev.arrow.apache.org%3E]],
it was decided not to make these changes (but possibly add utility methods).
we can potentially discuss again.
> [Java] getObject API for temporal types is inconsistent and in some cases
> incorrect
> -----------------------------------------------------------------------------------
>
> Key: ARROW-9915
> URL: https://issues.apache.org/jira/browse/ARROW-9915
> Project: Apache Arrow
> Issue Type: Bug
> Components: Java
> Affects Versions: 0.13.0, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.16.0, 0.17.0,
> 0.17.1, 1.0.0
> Reporter: Matt Jadczak
> Priority: Major
>
> It seems that the work which has been tracked in ARROW-2015 and merged in
> [https://github.com/apache/arrow/pull/2966] to change the return types of the
> various Time and Date vector types when using the getObject API missed some
> of the vector types which are temporal and so should return a temporal type,
> and provided an incorrect implementation for others (some of this was pointed
> out in the initial PR review, but it seems that it slipped through the cracks
> and was not addressed before merging).
> Here is a table of the various temporal vector types, what they currently
> return from getObject, and what they should return, in my opinion (I have
> included ones in which the implementation is correct for completeness, and
> coloured them green).
>
>
> ||Vector class||Current return type||Proposed return type||Comments||
> |DateDayVector|Integer|LocalDate|Currently returns the raw value of days
> since epoch, should return the actual date|
> |DateMilliVector|LocalDateTime|LocalDate|This type is supposed to encode a
> date, not a datetime, so even though epoch millis are used, the result should
> be a LocalDate|
> |{color:#00875a}DurationVector{color}|{color:#00875a}Duration{color}|{color:#00875a}Duration{color}|{color:#00875a}Correct.{color}|
> |IntervalDayVector|Duration|Period|As per
> [https://github.com/apache/arrow/blob/master/format/Schema.fbs#L251] ,
> Interval should be a calendar-based datatype, not a time-based one. This is
> represented in Java by a Period type. However, I note that the Java Period
> class does not support milliseconds, unlike the Arrow type, which might be
> why Duration is being returned. Some discussion may be needed on the best way
> to deal with this.|
> |{color:#00875a}IntervalYearVector{color}|{color:#00875a}Period{color}|{color:#00875a}Period{color}|{color:#00875a}Correct.{color}|
> |TimeMicroVector|Long|LocalTime|Currently returns the raw number of micros,
> should return the actual time|
> |TimeMilliVector|LocalDateTime|LocalTime|Currently returns a datetime on
> 1970-01-01 with the correct time component, should just return the time|
> |TimeNanoVector|Long|LocalTime|Currently returns the raw number of nanos,
> should return the actual time|
> |TimeSecVector|Integer|LocalTime|Currently returns the raw number of seconds,
> should return the actual time|
> |{color:#00875a}TimeStampMicroVector{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}Correct.{color}|
> |{color:#00875a}TimeStampMilliVector{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}Correct.{color}|
> |{color:#00875a}TimeStampNanoVector{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}Correct.{color}|
> |{color:#00875a}TimeStampSecVector{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}Correct.{color}|
> |TimeStampMicroTZVector|Long|ZonedDateTime|Currently returns the underlying
> micros, and TZ has to be obtained separately. Should return the actual
> datetime with timezone|
> |TimeStampMilliTZVector|Long|ZonedDateTime|Currently returns the underlying
> millis, and TZ has to be obtained separately. Should return the actual
> datetime with timezone|
> |TimeStampNanoTZVector|Long|ZonedDateTime|Currently returns the underlying
> nanos, and TZ has to be obtained separately. Should return the actual
> datetime with timezone|
> |TimeStampSecTZVector|Long|ZonedDateTime|Currently returns the underlying
> seconds, and TZ has to be obtained separately. Should return the actual
> datetime with timezone|
> I am happy to submit a PR to fix this if there is no other reason for this to
> persist other than these types being rarely used, but note that these would
> all be breaking API changes.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)