Re: Misuse of the SqlKind field

Julian Hyde Tue, 06 Aug 2024 13:27:20 -0700

Here’s what’s in the javadoc:

> Enumerates the possible types of SqlNode.
>
> The values are immutable, canonical constants, so you can use Kinds to find
> particular types of expressions quickly. To identify a call to a common
> operator such as '=', use SqlNode.isA:
>
>   exp.isA(EQUALS)
>
> Only commonly-used nodes have their own type; other nodes are of type OTHER.
> Some of the values, such as SET_QUERY, represent aggregates.
>
> To quickly choose between a number of options, use a switch statement:
>
>   switch (exp.getKind()) {
>   case EQUALS:
>       ...;
>   case NOT_EQUALS:
>       ...;
>   default:
>       throw new AssertionError("unexpected");
>   }
>
> Note that we do not even have to check that a SqlNode is a SqlCall.
>
> To identify a category of expressions, use SqlNode.isA with an aggregate
> SqlKind. The following expression will return true for calls to '=' and '>=',
> but false for the constant '5', or a call to '+':
>
>   exp.isA(SqlKind.COMPARISON)
>
> RexNode also has a getKind method; SqlKind values are preserved
> during translation from SqlNode to RexNode, where applicable.
>
> There is no water-tight definition of "common", but that's OK. There
> will always be operators that don't have their own kind, and for
> these we use the SqlOperator. But for the really common ones, e.g.
> the many places where we are looking for AND, OR and EQUALS,
> the enum helps.
>
> (If we were using Scala, SqlOperator would be a case class, and we
> wouldn't need SqlKind. But we're not.)

I wrote most of that in 2013 and 2014 and it still holds up. As you can see, 
there is a tension in the definition of ‘common’.

On the one hand, we don’t want people to have to define a new SqlKind for every
single operator instance. But on the other hand, rules that use == or .equals on
instances of SqlOperator seem fussy and can drag a load of dependencies into
our library of rules. Operators have no unique identifier (remember, different
SQL dialects can have functions with the same name that have different
semantics).
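
A minimal sketch of that second point, using hypothetical stand-in types
rather than Calcite's real classes (Op, Kind, and the two SUBSTR instances are
all made up for illustration): the name cannot identify an operator, an
instance comparison ties the rule to one dialect's operator object, and a kind
comparison needs only the enum.

```java
// Hypothetical stand-ins for SqlKind and SqlOperator; not Calcite's real API.
class OperatorIdentitySketch {
  enum Kind { EQUALS, SUBSTR_DIALECT_A, SUBSTR_DIALECT_B, OTHER_FUNCTION }

  static final class Op {
    final String name;
    final Kind kind;

    Op(String name, Kind kind) {
      this.name = name;
      this.kind = kind;
    }
  }

  // Two dialects, each with its own SUBSTR and (by assumption) its own semantics.
  static final Op SUBSTR_DIALECT_A = new Op("SUBSTR", Kind.SUBSTR_DIALECT_A);
  static final Op SUBSTR_DIALECT_B = new Op("SUBSTR", Kind.SUBSTR_DIALECT_B);

  // Name comparison: matches both operators, so it is not a unique identifier.
  static boolean matchesByName(Op op) {
    return op.name.equals("SUBSTR");
  }

  // Instance comparison: precise, but the rule must reference the dialect's
  // operator object, which pulls in whatever defines that object.
  static boolean matchesByInstance(Op op) {
    return op == SUBSTR_DIALECT_A;
  }

  // Kind comparison: precise, and the rule depends only on the enum.
  static boolean matchesByKind(Op op) {
    return op.kind == Kind.SUBSTR_DIALECT_A;
  }
}
```

In this sketch the price of the kind comparison is one enum value per
operator, which is the other side of the tension.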

My working definition of ‘common’ is ‘common enough to be used in a rewrite
rule’, plus the sense that the code is ‘tidier’ if I compare the SqlKind rather
than the operator instance. Subjective, but it’s got us this far.
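
To make ‘used in a rewrite rule’ concrete, here is a rough sketch of a
rule-like helper that works purely on kinds, combining the two idioms from the
javadoc above (an aggregate kind and a switch). The Kind enum and the negate
method are simplified stand-ins written for this example, not Calcite's
actual definitions.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for SqlKind; only a few values are modeled.
class RewriteByKindSketch {
  enum Kind {
    EQUALS, NOT_EQUALS, LESS_THAN, GREATER_THAN_OR_EQUAL, PLUS, OTHER_FUNCTION;

    // Aggregate kind: a set of values that represents a category of expressions.
    static final Set<Kind> COMPARISON =
        EnumSet.of(EQUALS, NOT_EQUALS, LESS_THAN, GREATER_THAN_OR_EQUAL);
  }

  // Supports rewriting NOT (a <op> b) to (a <negated op> b) by looking only
  // at the kind. Returns null when the kind is not a comparison, meaning
  // "this rule does not apply".
  static Kind negate(Kind kind) {
    if (!Kind.COMPARISON.contains(kind)) {
      return null;
    }
    switch (kind) {
    case EQUALS:
      return Kind.NOT_EQUALS;
    case NOT_EQUALS:
      return Kind.EQUALS;
    case LESS_THAN:
      return Kind.GREATER_THAN_OR_EQUAL;
    case GREATER_THAN_OR_EQUAL:
      return Kind.LESS_THAN;
    default:
      throw new AssertionError("unexpected: " + kind);
    }
  }
}
```

Nothing in the helper references an operator instance, so it needs no
knowledge of which library defined the operator.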

Julian

> On Aug 6, 2024, at 12:14 PM, Mihai Budiu wrote:
> 
> Hello all,
> 
> I am beginning to believe that the SqlKind field is being misused in Calcite 
> by using it to denote custom function implementations (e.g. 
> SqlKind.SUBSTR_BIG_QUERY).
> 
> I have filed an issue about function name resolution when using multiple 
> libraries: https://issues.apache.org/jira/browse/CALCITE-6518. But I am 
> having difficulties solving this issue fully.
> 
> The code that does name resolution for functions (in SqlUtil.lookupRoutine) 
> expects functions that are not standard to have the kind 
> SqlKind.OTHER_FUNCTION. But this is not true for many functions that have 
> been updated recently. If there are multiple matches for a function name, 
> SqlUtil.lookupSubjectRoutine keeps under consideration only functions with 
> the kind OTHER_FUNCTION.
> 
> If I am right, the real fix would be to remove all the newly introduced kinds.
> 
> I would appreciate a confirmation from someone who has been around long 
> enough to know what SqlKind is supposed to represent.
> 
> Alternatively, we can make the SqlKind enum have multiple fields, one of 
> which could be used to represent "OTHER_FUNCTION".
> 
> Thank you,
> Mihai
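
The lookup behavior described in the quoted message can be modeled roughly as
follows. This is a simplified illustration with hypothetical types (Fn, Kind,
resolve), not the actual code of SqlUtil.lookupRoutine; it only reproduces the
stated rule that, when several functions match a name, only those of kind
OTHER_FUNCTION stay under consideration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the behavior described in the quoted message;
// this is not Calcite's lookup code.
class LookupFilterSketch {
  enum Kind { OTHER_FUNCTION, SUBSTR_DIALECT_A }

  static final class Fn {
    final String name;
    final Kind kind;

    Fn(String name, Kind kind) {
      this.name = name;
      this.kind = kind;
    }
  }

  // Collects candidates whose name matches. If more than one matches,
  // keeps only those of kind OTHER_FUNCTION.
  static List<Fn> resolve(String name, List<Fn> candidates) {
    List<Fn> matches = new ArrayList<>();
    for (Fn f : candidates) {
      if (f.name.equalsIgnoreCase(name)) {
        matches.add(f);
      }
    }
    if (matches.size() <= 1) {
      return matches;
    }
    List<Fn> kept = new ArrayList<>();
    for (Fn f : matches) {
      if (f.kind == Kind.OTHER_FUNCTION) {
        kept.add(f);
      }
    }
    return kept;
  }
}
```

Under this model, a function that was given its own kind drops out whenever
another function shares its name, which matches the symptom reported above.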
