waynexia commented on PR #11471:
URL: https://github.com/apache/datafusion/pull/11471#issuecomment-2264460545

   Sorry for the huge delay! (I'm catching up on things in past weeks)
   
   I used to use custom interval types to match our definition of intervals. 
Because of two reasons:
   - Substrait has restricted the value range of its two interval types to 
`[-10,000..10,000]` years
   - Our interval types have different ranges with substraits. E.g., 
`IntervalDayTime` may overflow 
[here](https://github.com/apache/datafusion/pull/11471/files#diff-474e53672159d74dae38992a914a74eba81b8350ebe161f11d755f06414ed7b4R1779)
 because it only has one `i32` as millisecond for substrait's two `i32`s 
(second and millisecond).
   
   For the consumer side, we can support both types and report errors when 
substrait's overflow for wider scenarios. But I'm not sure if we should only 
produce substrait types for the above reasons. Given this kind of type 
information should be static, we cannot decide which type to use depending on 
the value at runtime.
   
   One possible solution is not to emit `IntervalYearMonth` and 
`IntervalDayTime`, which may overflow. And use `IntervalMonthDayNano` instead 
as it's the widest one. But this is not ideal as well. This seems to be a 
nontrivial task that leaps the gap between two type systems while keeping the 
difference of the type system between our system and substrait as small as 
possible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to