[ 
https://issues.apache.org/jira/browse/ARROW-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192720#comment-17192720
 ] 

Micah Kornfield commented on ARROW-9915:
----------------------------------------

[~mjadczak_gsa] discussing it more sounds fine.  For topics like API breaking 
changes it is best to have a discussion on the dev mailing list.

 

 

> [Java] getObject API for temporal types is inconsistent and in some cases 
> incorrect
> -----------------------------------------------------------------------------------
>
>                 Key: ARROW-9915
>                 URL: https://issues.apache.org/jira/browse/ARROW-9915
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Java
>    Affects Versions: 0.13.0, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.16.0, 0.17.0, 
> 0.17.1, 1.0.0
>            Reporter: Matt Jadczak
>            Priority: Major
>
> It seems that the work which has been tracked in ARROW-2015 and merged in 
> [https://github.com/apache/arrow/pull/2966] to change the return types of the 
> various Time and Date vector types when using the getObject API missed some 
> of the vector types which are temporal and so should return a temporal type, 
> and provided an incorrect implementation for others (some of this was pointed 
> out in the initial PR review, but it seems that it slipped through the cracks 
> and was not addressed before merging).
> Here is a table of the various temporal vector types, what they currently 
> return from getObject, and what they should return, in my opinion (I have 
> included ones in which the implementation is correct for completeness, and 
> coloured them green).
>  
>  
> ||Vector class||Current return type||Proposed return type||Comments||
> |DateDayVector|Integer|LocalDate|Currently returns the raw value of days 
> since epoch, should return the actual date|
> |DateMilliVector|LocalDateTime|LocalDate|This type is supposed to encode a 
> date, not a datetime, so even though epoch millis are used, the result should 
> be a LocalDate|
> |{color:#00875a}DurationVector{color}|{color:#00875a}Duration{color}|{color:#00875a}Duration{color}|{color:#00875a}Correct.{color}|
> |IntervalDayVector|Duration|Period|As per 
> [https://github.com/apache/arrow/blob/master/format/Schema.fbs#L251] , 
> Interval should be a calendar-based datatype, not a time-based one. This is 
> represented in Java by a Period type. However, I note that the Java Period 
> class does not support milliseconds, unlike the Arrow type, which might be 
> why Duration is being returned. Some discussion may be needed on the best way 
> to deal with this.|
> |{color:#00875a}IntervalYearVector{color}|{color:#00875a}Period{color}|{color:#00875a}Period{color}|{color:#00875a}Correct.{color}|
> |TimeMicroVector|Long|LocalTime|Currently returns the raw number of micros, 
> should return the actual time|
> |TimeMilliVector|LocalDateTime|LocalTime|Currently returns a datetime on 
> 1970-01-01 with the correct time component, should just return the time|
> |TimeNanoVector|Long|LocalTime|Currently returns the raw number of nanos, 
> should return the actual time|
> |TimeSecVector|Integer|LocalTime|Currently returns the raw number of seconds, 
> should return the actual time|
> |{color:#00875a}TimeStampMicroVector{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}Correct.{color}|
> |{color:#00875a}TimeStampMilliVector{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}Correct.{color}|
> |{color:#00875a}TimeStampNanoVector{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}Correct.{color}|
> |{color:#00875a}TimeStampSecVector{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}LocalDateTime{color}|{color:#00875a}Correct.{color}|
> |TimeStampMicroTZVector|Long|ZonedDateTime|Currently returns the underlying 
> micros, and TZ has to be obtained separately. Should return the actual 
> datetime with timezone|
> |TimeStampMilliTZVector|Long|ZonedDateTime|Currently returns the underlying 
> millis, and TZ has to be obtained separately. Should return the actual 
> datetime with timezone|
> |TimeStampNanoTZVector|Long|ZonedDateTime|Currently returns the underlying 
> nanos, and TZ has to be obtained separately. Should return the actual 
> datetime with timezone|
> |TimeStampSecTZVector|Long|ZonedDateTime|Currently returns the underlying 
> seconds, and TZ has to be obtained separately. Should return the actual 
> datetime with timezone|
> I am happy to submit a PR to fix this if there is no other reason for this to 
> persist other than these types being rarely used, but note that these would 
> all be breaking API changes.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to