MaxGekk commented on issue #25022: [SPARK-24695][SQL] Move `CalendarInterval` 
to org.apache.spark.sql.types package
URL: https://github.com/apache/spark/pull/25022#issuecomment-533501405
 
 
   > `CalendarInterval` should contain 3 ints for months, days and seconds.
   
   @cloud-fan Sorry, this design choice is not clear to me. As you describe 
above, an interval type can be "conceptual" like `Period` or concrete like 
`Duration`. In the first case, we should store `year`, `month`, `day`, and I 
would also add `hour`, `minute`, `second`, `microsecond`, so that you can shift 
any local timestamp by its components. The second type stores the difference 
between two valid timestamps. In our case, a `long` is enough to store that 
difference in microseconds.
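
   To make the distinction concrete, here is a minimal sketch using `java.time`. 
The class and method names (`IntervalKinds`, `addConceptual`, `diffMicros`) are 
hypothetical and only illustrate the two kinds of interval; they are not part 
of Spark.

```java
import java.time.LocalDateTime;
import java.time.Period;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration only; not Spark code.
public class IntervalKinds {
    // "Conceptual" interval: shifts a local timestamp by calendar components.
    static LocalDateTime addConceptual(LocalDateTime ts, Period p) {
        return ts.plus(p);
    }

    // "Concrete" interval: exact difference between two timestamps,
    // which fits in a single long when measured in microseconds.
    static long diffMicros(LocalDateTime start, LocalDateTime end) {
        return ChronoUnit.MICROS.between(start, end);
    }

    public static void main(String[] args) {
        LocalDateTime jan31 = LocalDateTime.of(2019, 1, 31, 0, 0);
        // One calendar month after Jan 31 is clamped to the end of February.
        LocalDateTime shifted = addConceptual(jan31, Period.ofMonths(1));
        System.out.println(shifted);                    // 2019-02-28T00:00
        // The same span as a concrete difference: 28 days in microseconds.
        System.out.println(diffMicros(jan31, shifted)); // 2419200000000
    }
}
```

   The same "one month" has a different concrete length depending on the 
timestamp it is applied to, which is why the two kinds cannot be converted into 
each other without a reference point.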
   
   The first type can store date-time components in a compact form: 
`months` (int) + `microseconds` (long), because `year` can be calculated from 
`months`, and (`days`, `hours`, `minutes`, `seconds`) from `microseconds`. We 
can consider Spark's `CalendarInterval` from this perspective.
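
   A minimal sketch of that compact form, assuming a fixed 24-hour day when 
deriving `days` from `microseconds`. `CompactInterval` and its accessors are 
hypothetical names for illustration, not the actual `CalendarInterval` API.

```java
// Hypothetical illustration of the months + microseconds layout; not Spark code.
public class CompactInterval {
    static final long MICROS_PER_SECOND = 1_000_000L;
    static final long MICROS_PER_MINUTE = 60 * MICROS_PER_SECOND;
    static final long MICROS_PER_HOUR = 60 * MICROS_PER_MINUTE;
    // Assumption: every day is exactly 24 hours long.
    static final long MICROS_PER_DAY = 24 * MICROS_PER_HOUR;

    final int months;
    final long microseconds;

    CompactInterval(int months, long microseconds) {
        this.months = months;
        this.microseconds = microseconds;
    }

    // Year and month-of-year are derived from the total month count.
    int years() { return months / 12; }
    int monthsOfYear() { return months % 12; }

    // Day and time components are derived from the microsecond count.
    long days() { return microseconds / MICROS_PER_DAY; }
    long hours() { return (microseconds % MICROS_PER_DAY) / MICROS_PER_HOUR; }
    long minutes() { return (microseconds % MICROS_PER_HOUR) / MICROS_PER_MINUTE; }
    long seconds() { return (microseconds % MICROS_PER_MINUTE) / MICROS_PER_SECOND; }
    long micros() { return microseconds % MICROS_PER_SECOND; }

    public static void main(String[] args) {
        // 14 months, and 1 day 02:03:04.000005 expressed in microseconds.
        CompactInterval i = new CompactInterval(14, 93_784_000_005L);
        System.out.println(i.years() + "y " + i.monthsOfYear() + "m "
            + i.days() + "d " + i.hours() + "h " + i.minutes() + "min "
            + i.seconds() + "s " + i.micros() + "us");
        // 1y 2m 1d 2h 3min 4s 5us
    }
}
```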
   
   But you propose to store `days` separately. Why (besides the fact that 
Parquet does the same)?
   
   The third component, `seconds`, confuses me as well. Why seconds and not 
microseconds? Do you want to lose precision when subtracting two timestamps?
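
   A small sketch of the precision concern, using `java.time` rather than 
Spark's own types. The class name `PrecisionLoss` and its helpers are 
hypothetical.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration only; not Spark code.
public class PrecisionLoss {
    // Difference truncated to whole seconds.
    static long diffSeconds(LocalDateTime start, LocalDateTime end) {
        return ChronoUnit.SECONDS.between(start, end);
    }

    // Difference kept at microsecond precision.
    static long diffMicros(LocalDateTime start, LocalDateTime end) {
        return ChronoUnit.MICROS.between(start, end);
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2019, 9, 20, 12, 0, 0);
        // Two timestamps that are 1.5 seconds apart.
        LocalDateTime end = start.plus(1_500_000, ChronoUnit.MICROS);
        System.out.println(diffSeconds(start, end)); // 1
        System.out.println(diffMicros(start, end));  // 1500000
    }
}
```

   Storing only whole seconds drops the fractional half second, so adding the 
interval back to `start` would no longer reproduce `end`.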
     
