MaxGekk commented on issue #25022: [SPARK-24695][SQL] Move `CalendarInterval` to org.apache.spark.sql.types package
URL: https://github.com/apache/spark/pull/25022#issuecomment-533190773
 
 
   > Similarly, in standard SQL it's hard to add a conceptual day, as the 
day-time interval is actually stored as seconds.
   
   That's why both `java.time.Period` and `java.time.Duration` exist simultaneously. You can update the year, month and day fields of a local date/timestamp field by field:
   ```scala
   scala> java.time.LocalDate.of(2019, 9, 19).plus(java.time.Period.of(-1, 2, 3))
   res0: java.time.LocalDate = 2018-11-22
   ```
   or, if you need to add a concrete number of micros (nanoseconds), you can add them to a time instant:
   ```scala
   scala> java.time.LocalDate.of(2019, 9, 19).atStartOfDay(java.time.ZoneOffset.UTC).plus(java.time.Duration.ofNanos(100000000000000L))
   res7: java.time.ZonedDateTime = 2019-09-20T03:46:40Z
   ```
   Both types solve different problems: `Period` allows updating logical time components, while `Duration` allows adding a concrete interval in micros.
   That's why I propose converting Catalyst's interval types to either `Period` or `Duration`. These types open up the power of the standard library and other libraries.
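   
   For instance, here is a minimal sketch of the difference (the `Europe/Berlin` zone and the date are chosen only to hit a DST switch; they are not from the PR):
   ```scala
   import java.time.{Duration, Period, ZoneId, ZonedDateTime}

   // The evening before the 2019 spring DST switch in Europe/Berlin
   val t = ZonedDateTime.of(2019, 3, 30, 18, 0, 0, 0, ZoneId.of("Europe/Berlin"))

   // Period updates logical (calendar) fields: the wall-clock time is kept
   t.plus(Period.ofDays(1))      // 2019-03-31T18:00+02:00[Europe/Berlin]

   // Duration adds physical time: exactly 24 hours elapse
   t.plus(Duration.ofHours(24))  // 2019-03-31T19:00+02:00[Europe/Berlin]
   ```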
   
   Having Spark's `CalendarInterval`, users face additional restrictions (see the sketch after this list):
   - They cannot compare instances of `CalendarInterval`, whereas `Duration` is comparable.
   - They cannot update the `year`, `month` and `day` components of a `LocalDate`/`LocalDateTime`, because `CalendarInterval` contains micros and you have to apply it in a time zone.
   - They cannot add a `CalendarInterval` to an `Instant`, because `CalendarInterval` contains months. So, you have to convert the `Instant` to a `ZonedDateTime` before adding the `CalendarInterval`.
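   
   The first and third points are easy to demonstrate with `java.time` (a minimal sketch, the values are purely illustrative):
   ```scala
   import java.time.{Duration, Instant, Period, ZoneOffset}

   // Duration implements Comparable, so instances can be ordered directly
   assert(Duration.ofHours(1).compareTo(Duration.ofMinutes(90)) < 0)

   // Instant does not accept month-based amounts; uncommenting the next line
   // throws UnsupportedTemporalTypeException: Unsupported unit: Months
   // Instant.now().plus(Period.ofMonths(2))

   // You have to go through a zoned representation first
   Instant.now().atZone(ZoneOffset.UTC).plus(Period.ofMonths(2))
   ```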
   
   > I don't think so. To reach the same result you have to look at the 
timestamp and calculate how many days to add to reach to the next 2 months.
   
   You are right, I just want to say that the `day-time interval` is internally consistent and can express any interval. Why do we need months in `CalendarInterval` if its `microseconds` field can represent any time interval between any two valid timestamps? It seems we are just wasting users' memory, and memory on the Spark side as well.
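   
   As a minimal sketch (the timestamps below are just examples), the exact gap between any two timestamps can be captured and re-applied without a months field at all:
   ```scala
   import java.time.{Duration, Instant}

   val start = Instant.parse("2019-07-01T00:00:00Z")
   val end   = Instant.parse("2019-09-19T12:34:56Z")

   // The whole gap fits into a single seconds/nanos (or microseconds) value
   val gap = Duration.between(start, end)   // PT1932H34M56S
   assert(start.plus(gap) == end)
   ```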
   
   > Users can use them, but there is no backward compatibility guarantee from 
Spark.
   
   It slightly contradicts Reynold's objections in #25678 regarding modifying user app code and libraries.
