felipecrv commented on code in PR #6906:
URL: https://github.com/apache/arrow-rs/pull/6906#discussion_r1894369802


##########
arrow-arith/src/numeric.rs:
##########
@@ -550,6 +568,21 @@ date!(Date64Type);
 trait IntervalOp: ArrowPrimitiveType {
     fn add(left: Self::Native, right: Self::Native) -> Result<Self::Native, 
ArrowError>;
     fn sub(left: Self::Native, right: Self::Native) -> Result<Self::Native, 
ArrowError>;
+    fn mul_int(left: Self::Native, right: i32) -> Result<Self::Native, 
ArrowError>;
+    fn mul_float(left: Self::Native, right: f64) -> Result<Self::Native, 
ArrowError>;
+    fn div_int(left: Self::Native, right: i32) -> Result<Self::Native, 
ArrowError>;
+    fn div_float(left: Self::Native, right: f64) -> Result<Self::Native, 
ArrowError>;

Review Comment:
   > Unfortunately intervals cannot be coerced to a duration as proposed...
   
   For the purpose of multiplication/division, some of the interval types can 
(internally to the kernel) -- before returning the output array, they are 
converted back to interval type to preserve their semantic meaning.
   
   ## YEAR_MONTH
   
   Example: the double of `1 year and 1 month` is `2 years and 2 months` which 
is what you get if you convert `(1y, 1m)` to `13m` then 2x to `26m` which 
should then become `(2y, 2m)`. This is all fine because every year, no matter 
what, has 12 months. So 12 months can become a year in the interval. This is 
why the underlying implementation itself is in number of months.
   
   Daylight savings and calendar consideration only come into play when you add 
an interval to a specific timestamp/date which is not the case here. It would 
be wrong to add `1y+1m` to `2024-12-20` as just adding the number of seconds 
that interval in 30-day months converts to, but the operation we are doing here 
is the scaling of the interval by some factor while preserving the semantic 
meaning.
   
   But maybe the **multiplication by a float** should not be available. Any 
system that wants to support `interval[YEAR_MONTH]*float` multiplication should 
round the float first or convert the interval to an interval type with more 
resolution.
   
   ## DAY_TIME
   
   `IntervalDayTimeType` expects the millis part to be less than a whole day 
worth of milliseconds (i.e. leap seconds are not accounted for) [1][2]. Perhaps 
the output of scaling operations on them (mul/div) should be a `MONTH_DAY_NANO` 
interval. Because unlike years that always have 12 months, days don't always 
have the same number of millis in them.
   
   ## MONTH_DAY_NANO
   
   For `MonthDayNano` my approach breaks indeed -- the components should be 
scaled independently. And the number of nanos is unbounded -- it doesn't have 
to be less than a day's worth of nanos.
   
   Multiplication by float and division gets really confusing though.
   
   [1] 
https://github.com/apache/arrow/blob/02a165922e46e5ed6dd3ed2446141cd0922a7c54/format/Schema.fbs#L398
   [2] 
https://github.com/apache/arrow/commit/7f7f72da6fa75f955bd69ac147b8494608c9aeb4



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to