This (overflow) is an excellent point, but it also affects aggregations, which were introduced a long time ago. They already inherit Java semantics for all of the relevant types (silent wraparound). We probably want to be consistent, meaning either changing aggregations (which incurs the cost of changing the API) or continuing with the Java semantics here.
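To make the two overflow behaviours concrete, here is a minimal Java sketch. The class and method names are hypothetical and purely illustrative; this is not the aggregation code itself, only the plain Java semantics it is said to inherit, next to the throw-on-overflow and widening alternatives raised in the thread.

```java
// Hypothetical illustration only; not taken from the Cassandra code base.
final class OverflowDemo {

    // Plain Java addition: wraps around silently on overflow.
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // Math.addExact throws ArithmeticException instead of wrapping.
    static int checkedAdd(int a, int b) {
        return Math.addExact(a, b);
    }

    // Widening both operands to long first makes int overflow impossible.
    static long wideningAdd(int a, int b) {
        return (long) a + b;
    }

    public static void main(String[] args) {
        System.out.println(wrappingAdd(Integer.MAX_VALUE, 1)); // -2147483648
        System.out.println(wideningAdd(Integer.MAX_VALUE, 1)); // 2147483648
        try {
            checkedAdd(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

Widening only postpones the problem for the widest integer type, so some decision between wrapping and throwing is still needed there.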
This is why having these discussions explicitly in the community before a release is so critical, in my view. It’s very easy for these semantic changes to go unnoticed on a JIRA, and then ossify.

> On 2 Oct 2018, at 15:48, Ariel Weisberg <ar...@weisberg.ws> wrote:
>
> Hi,
>
> I think we should decide based on what is least surprising as you mention,
> but isn't overridden by some other concern.
>
> It seems to me the priorities are
>
> * Correctness
> * Performance
> * User visible complexity
> * Developer visible complexity
>
> Defaulting to silent implicit data loss is not ideal from a correctness
> standpoint.
>
> Doing something better like using wider types doesn't seem like a performance
> issue.
>
> From a user standpoint doing something less lossy doesn't look more complex
> as long as it's consistent, and documented and doesn't change from version to
> version.
>
> There is some developer complexity, but this is a public API and we only get
> one shot at this.
>
> I wonder about how overflow is handled as well. In VoltDB I think we threw on
> overflow and tended to just do widening conversions to make that less common.
> We didn't imitate another database (as far as I know) we just went with what
> was least likely to silently corrupt data.
> https://github.com/VoltDB/voltdb/blob/master/src/ee/common/NValue.hpp#L2213
> https://github.com/VoltDB/voltdb/blob/master/src/ee/common/NValue.hpp#L3764
>
> Ariel
>
> On Tue, Oct 2, 2018, at 7:30 AM, Benedict Elliott Smith wrote:
>> [...] introduced arithmetic operators, and alongside these
>> came implicit casts for their operands. There is a semantic decision to
>> be made, and I think the project would do well to explicitly raise this
>> kind of question for wider input before release, since the project is
>> bound by them forever more.
>>
>> In this case, the choice is between lossy and lossless casts for
>> operations involving integers and floating point numbers. In essence,
>> should:
>>
>> (1) float + int = float, double + bigint = double; or
>> (2) float + int = double, double + bigint = decimal; or
>> (3) float + int = decimal, double + bigint = decimal
>>
>> Option 1 performs a lossy implicit cast from int -> float, or bigint ->
>> double. Simply casting between these types changes the value. This is
>> what MS SQL Server does.
>> Options 2 and 3 cast without loss of precision, and 3 (or thereabouts)
>> is what PostgreSQL does.
>>
>> The question I’m interested in is not just which is the right decision,
>> but how the right decision should be arrived at. My view is that we
>> should primarily aim for least surprise to the user, but I’m keen to
>> hear from others.
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: dev-h...@cassandra.apache.org
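
For reference, the three result-type options listed above can be mimicked in plain Java. This is a rough sketch with hypothetical class and method names; it models CQL int, bigint, float, double and decimal with Java's int, long, float, double and BigDecimal, which is an assumption made for illustration and not a description of how the operators are implemented.

```java
import java.math.BigDecimal;

// Hypothetical illustration only; not taken from the Cassandra code base.
final class ImplicitCastDemo {

    // Option 1: float + int = float. The int is cast to float, which is lossy.
    static float option1(float f, int i) {
        return f + (float) i;
    }

    // Option 2: float + int = double. Both operands fit in a double exactly.
    static double option2(float f, int i) {
        return (double) f + (double) i;
    }

    // Option 3: float + int = decimal. Both conversions to BigDecimal are exact.
    static BigDecimal option3(float f, int i) {
        return new BigDecimal((double) f).add(BigDecimal.valueOf(i));
    }

    // bigint -> double -> bigint round trip, showing that the cast alone
    // can change the value.
    static long roundTripThroughDouble(long value) {
        return (long) (double) value;
    }

    public static void main(String[] args) {
        int i = 16_777_217;   // 2^24 + 1, not representable as a float
        float f = 0.5f;
        System.out.println(option1(f, i)); // 1.6777216E7
        System.out.println(option2(f, i)); // 1.67772175E7
        System.out.println(option3(f, i)); // 16777217.5

        long big = 9_007_199_254_740_993L; // 2^53 + 1, not representable as a double
        System.out.println(roundTripThroughDouble(big)); // 9007199254740992
    }
}
```

With these inputs option 1 loses both the trailing 1 of the integer and the 0.5, while options 2 and 3 return the exact sum.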