MaxGekk commented on issue #25022: [SPARK-24695][SQL] Move `CalendarInterval` to org.apache.spark.sql.types package
URL: https://github.com/apache/spark/pull/25022#issuecomment-533501405

> `CalendarInterval` should contain 3 ints for months, days and seconds.

@cloud-fan Sorry, this design choice is not clear to me. As you describe above, an interval type can be "conceptual" like `Period` or concrete like `Duration`. In the first case, we should store `year`, `month`, `day`, and I would continue with `hour`, `minute`, `second`, `microsecond`, so that you can shift any local timestamp by its components. The second type stores the difference between any two valid timestamps. In our case, a `long` is enough to store the difference in microseconds.

The first type can store the date-time components in a compact way, `months` (int) + `microseconds` (long), because `year` can be calculated from `months`, and (`days`, `hours`, `minutes`, `seconds`) from `microseconds`. We can consider Spark's `CalendarInterval` from this perspective. But you propose to store `days` separately. Why (besides the fact that Parquet does the same)? The third component, `seconds`, confuses me as well. Why seconds and not microseconds? Do you want to lose precision when subtracting two timestamps?
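To make the two representations concrete, here is a minimal Java sketch of the compact "conceptual" form (`months` + `microseconds`) next to the "concrete" form (a single `long` of microseconds). The names `IntervalSketch`, `CompactInterval`, and `microsBetween` are hypothetical and exist only for illustration. This is not Spark's `CalendarInterval`, and it assumes non-negative components for simplicity.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration only; not Spark's CalendarInterval.
class IntervalSketch {
    static final long MICROS_PER_SECOND = 1_000_000L;
    static final long MICROS_PER_MINUTE = 60L * MICROS_PER_SECOND;
    static final long MICROS_PER_HOUR = 60L * MICROS_PER_MINUTE;
    static final long MICROS_PER_DAY = 24L * MICROS_PER_HOUR;

    // "Conceptual" interval stored compactly: months (int) + microseconds (long).
    // Assumes non-negative values to keep the arithmetic simple.
    static final class CompactInterval {
        final int months;
        final long microseconds;

        CompactInterval(int months, long microseconds) {
            this.months = months;
            this.microseconds = microseconds;
        }

        // Year and month-of-year are derived from the months field.
        int years() { return months / 12; }
        int monthsOfYear() { return months % 12; }

        // Days, hours, minutes, seconds are derived from the microseconds field.
        long days() { return microseconds / MICROS_PER_DAY; }
        long hours() { return (microseconds % MICROS_PER_DAY) / MICROS_PER_HOUR; }
        long minutes() { return (microseconds % MICROS_PER_HOUR) / MICROS_PER_MINUTE; }
        long seconds() { return (microseconds % MICROS_PER_MINUTE) / MICROS_PER_SECOND; }
        long micros() { return microseconds % MICROS_PER_SECOND; }

        // Shift a local timestamp: calendar months first, then the exact microseconds.
        LocalDateTime addTo(LocalDateTime ts) {
            return ts.plusMonths(months).plus(microseconds, ChronoUnit.MICROS);
        }
    }

    // "Concrete" interval: the difference between two timestamps in microseconds.
    static long microsBetween(LocalDateTime start, LocalDateTime end) {
        return ChronoUnit.MICROS.between(start, end);
    }

    public static void main(String[] args) {
        CompactInterval i = new CompactInterval(14, 90_061_000_001L);
        System.out.println(i.years() + "y " + i.monthsOfYear() + "mo " + i.days() + "d "
            + i.hours() + "h " + i.minutes() + "m " + i.seconds() + "s " + i.micros() + "us");

        LocalDateTime a = LocalDateTime.of(2019, 1, 1, 0, 0, 0);
        LocalDateTime b = LocalDateTime.of(2019, 1, 1, 0, 0, 0, 1_000); // 1 microsecond later
        long diffMicros = microsBetween(a, b);
        // Truncating the difference to whole seconds discards the sub-second part.
        System.out.println(diffMicros + " us vs " + (diffMicros / MICROS_PER_SECOND) + " s");
    }
}
```

The last two lines of `main` illustrate the precision concern: a one-microsecond difference between two timestamps is preserved in a microsecond field but becomes zero if the field only holds whole seconds.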
