TL;DR: We need to add software engineering processes to ensure that Calcite is consistent with other SQL systems. What should those be?
There has been a lot of activity recently adding functions such as LOG2 [1] and ensuring that existing functions such as LOG and SQRT [2] are consistent with other implementations. Another case was about whether Calcite should allow casting a VARCHAR to a VARBINARY [3]. These discussions have been time-consuming. For example, the LOG2 PR received almost a hundred comments, and that is a function that should be simple to specify and implement. If a DB did not have LOG2, instead of LOG2(n) I would just write LOG(n) / LOG(2) and it would work. In one of the discussions a statement was made that "SQL rules say that SQRT should return NULL for negative values". I checked Oracle -- the definitive SQL implementation for many years -- and its doc says that it returns NaN. In another incident[4], the TO_CHAR function was removed from the MySQL library and then added back when someone pointed out that it is present in MariaDB, and our MySQL library covers MariaDB. We just can't keep doing this. We need to quickly and efficiently arrive at the correct specification for when Calcite is attempting to emulate other SQL systems, and then we need to stay with that specification. I propose that we use software engineering. One idea is to create a Quidem script that can be run against MySQL, and passes, and when run against Calcite-pretending-to-be-MySQL, it also passes. Anyone adding to that script must test it against a MySQL instance before merging to main. And similarly for other SQL dialects. As we encounter underspecified areas - e.g. the behavior of SQRT when applied to -0 - we can resolve them by adding to that script. Does anyone have any ideas for how we can solve this using automated tools, backed by changes to our process, rather than by endless discussions? (By the way, this discussion relates to libraries (the "fun" parameter), conformance and our parser settings (the "lex" parameter, quoted and unquoted casing, etc.). We could have a similar discussion about dialects - the SQL generated by the JDBC adapter - but please let's have that discussion in another thread.) Julian [1] https://issues.apache.org/jira/browse/CALCITE-6224 [2] https://issues.apache.org/jira/browse/CALCITE-5638 [3] https://issues.apache.org/jira/browse/CALCITE-6210 [4] https://issues.apache.org/jira/browse/CALCITE-6222