waynexia commented on PR #11471: URL: https://github.com/apache/datafusion/pull/11471#issuecomment-2264460545
Sorry for the huge delay! (I'm catching up on things in past weeks) I used to use custom interval types to match our definition of intervals. Because of two reasons: - Substrait has restricted the value range of its two interval types to `[-10,000..10,000]` years - Our interval types have different ranges with substraits. E.g., `IntervalDayTime` may overflow [here](https://github.com/apache/datafusion/pull/11471/files#diff-474e53672159d74dae38992a914a74eba81b8350ebe161f11d755f06414ed7b4R1779) because it only has one `i32` as millisecond for substrait's two `i32`s (second and millisecond). For the consumer side, we can support both types and report errors when substrait's overflow for wider scenarios. But I'm not sure if we should only produce substrait types for the above reasons. Given this kind of type information should be static, we cannot decide which type to use depending on the value at runtime. One possible solution is not to emit `IntervalYearMonth` and `IntervalDayTime`, which may overflow. And use `IntervalMonthDayNano` instead as it's the widest one. But this is not ideal as well. This seems to be a nontrivial task that leaps the gap between two type systems while keeping the difference of the type system between our system and substrait as small as possible. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org