* I think "safe" means "no exception", so in case of overflow safe operators return NULL. But here we are talking about real exceptions. (At least the existing "safe" operations in Calcite work this way). * Having an operator table that uses a different implementation for arithmetic sounds to me like a recipe for confusion. Moreover, this would not allow mixing checked and non-checked arithmetic in the same program (not sure whether there are use cases for this, though) * The visitor proposal would use the same operator table, but switch operations to checked/unchecked depending on the type system. Moreover, using this visitor is optional. * For aggregation we already have two kinds of issues: * the result type of aggregation is unspecified by the standard; Calcite uses the same type as the input data type https://issues.apache.org/jira/browse/CALCITE-6324, and * The type of intermediate results in aggregates is unspecified https://issues.apache.org/jira/browse/CALCITE-6516, https://issues.apache.org/jira/browse/CALCITE-6427 * Moreover, I think the standard says that the order of aggregation is unspecified; this already makes a difference in the absence of overflow for floating point, where addition is not associative
I think the aggregates can be solved in general by allowing them to compute on a wider data type and then casting the final result. This would work even when using checked arithmetic. Maybe the problem is that Calcite has already made a choice for both of these types (intermediate results and final results), and it is not the same choice that other databases make. But that is a separate problem.

________________________________
From: James Starr <jamesst...@gmail.com>
Sent: Friday, October 4, 2024 12:09 PM
To: dev@calcite.apache.org <dev@calcite.apache.org>
Subject: Re: Checked arithmetic

I am reasonably sure checked arithmetic is left to the implementation to define because of the mixed concerns of different DBs. If you define checked arithmetic, it limits certain optimizations that may be made, such as pushing an aggregate below a join. For instance, if a rewrite could change the order of operations such that an overflow does or does not happen, what is the correct answer in this case: the error, the garbage result from the overflow, or the result that did not hit the overflow?

I believe some systems have CURRENCY and BIG DECIMAL types, and this might be the reason. Maybe the CURRENCY types disallow rewrites to at least make the query repeatable and consistent.

An alternative to a visitor would be to use a different operator table with safe operators. Considering Calcite as a planner, I think the operator/type needs to expose whether overflow is defined or not, as opposed to checked or unchecked, which is an implementation issue.

James
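
To make the reordering point above concrete, here is a small Java sketch (purely illustrative, not Calcite code): with checked arithmetic, the associativity chosen by the plan determines whether an overflow is raised, even though the mathematically exact result is the same.

    // Illustrative only: reassociating a checked sum changes whether overflow occurs.
    public final class ReorderingOverflow {
      public static void main(String[] args) {
        int a = Integer.MAX_VALUE, b = 1, c = -1;

        // Right-associated plan: b + c = 0 is computed first, so there is no overflow.
        int ok = Math.addExact(a, Math.addExact(b, c));
        System.out.println(ok); // 2147483647

        // Left-associated plan: a + b overflows immediately.
        try {
          int boom = Math.addExact(Math.addExact(a, b), c);
          System.out.println(boom);
        } catch (ArithmeticException e) {
          System.out.println("checked overflow raised");
        }
      }
    }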