The SQL standard hints into implicit casts into several places but it is
not 100% explicit about it (no pun intended).

For instance, "4.11 Data conversions" in Part 2 (2011 edition) states:

> Implicit type conversion can occur in expressions, fetch operations,
> single row select operations, inserts, deletes,
> and updates.

Explicit type conversions can be specified by the use of the CAST operator.
> A CAST operator defines how values of a source data type are converted
> into a value of a target data type according to the Syntax Rules and General
> Rules of Subclause 6.13, “<cast specification>”.


If my understanding is correct, if you provide an ambiguous expression,
it's to be expected that the validator will have to resolve it to
accomplish its task, and I agree with Mihai on that part.

We have explicit CAST to force something different than what the validator
would do by itself, and that's generally enough.

For a system having an EQUALS function accepting parameters of different
types, I'd expect it to be a separate function from the "standard" one,
maybe registered as a user-defined function, so again there would be no
ambiguity at validation time?

Concerning the transformations like translating to CASE, I agree that it
should be configurable.

I think that historically we have those transformations in place to
restrict the syntactical variants on expressions, as we are finally
manipulating and matching expressions syntactically other than semantically
(similar considerations arise often for the IN -> OR translation and
similar), but it's true that for cases like that of Matthew we might be too
constraining.

Best regards,
Alessandro

On Sat, 8 Mar 2025 at 20:08, Steven Phillips <ste...@dremio.com.invalid>
wrote:

> I think it's reasonable for a system to not want these transformations. A
> given execution engine could choose to implement  an EQUALS function
> between the two different types, rather than using implicit cast. Also, for
> use cases like Matthew's, it seems reasonable that we could configure
> SqlValidator and SqlToRel to not do any type coercion.
>
> On Sat, Mar 8, 2025 at 10:18 AM Mihai Budiu <mbu...@gmail.com> wrote:
>
> > Yes, these transformations are essential for compiling programs, because
> > they make the program's meaning unambiguous. Consider your example where
> > you check equality between two columns of different types A and B.  The
> > meaning of the program is very different depending on what casts are
> > inserted:
> >
> > a = b
> >
> > Can be interpreted as
> >
> > (B)a = b
> >
> > or as
> >
> > a = (A)b
> >
> > These will give different results in some cases. The Validator makes it
> > clear what the choice of the compiler is.
> >
> > Mihai
> >
> > ________________________________
> > From: Matthew McMillian <matthewgmcmill...@gmail.com>
> > Sent: Friday, March 7, 2025 10:34 PM
> > To: dev@calcite.apache.org <dev@calcite.apache.org>
> > Subject: Questions Regarding Type Coercion and CAST Transformations in
> > SqlValidator and RexBuilder
> >
> > Hi Calcite community,
> >
> > I'm using Calcite to analyze SQL queries and have encountered some issues
> > with transformations that interfere with my use case. I hope the
> community
> > can provide guidance.
> >
> > My current workflow:
> >
> >    1. Parse the query into a SqlNode.
> >    2. Validate the SqlNode using SqlValidator (callRewrite=false),
> applying
> >    type coercion and other transformations.
> >    3. Walk the validated SqlNode to perform custom validation, using
> >    SqlValidator and SqlValidatorScope for column resolution.
> >    4. Convert the SqlNode to a RelNode.
> >    5. Walk the RelNode for further custom validation.
> >
> > For my use case, I care deeply about the original syntax of the query. I
> > need to preserve the original query structure rather than an equivalent
> > transformation. Steps #2 and #4 introduce changes that create challenges.
> > For example, a comparison between two columns of different types.
> >
> >    - <bool_column> = <int_column> is rewritten in step #2 as
> > CAST(<bool_column>
> >    AS INT) = <int_column> (AbstractTypeCoercion
> >    <
> >
> https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L157
> > >
> >    ).
> >    - In step #4, it further transforms into CASE(IS NOT
> >    NULL(<bool_column>), CASE(<bool_column>, 1, 0), null) (RexBuilder
> >    <
> >
> https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/rex/RexBuilder.java#L940
> > >
> >    ).
> >
> > I have two questions:
> >
> >    1. Are the aforementioned types of transformations essential for
> >    Calcite to compile queries? Is there a way to disable them while still
> >    compiling queries?
> >    2. Is my current approach reasonable, or is there a better way to
> >    achieve my goal within Calcite?
> >
> >
> > Any insights would be greatly appreciated.
> >
> > Thanks,
> > Matthew McMillian
> >
>

Reply via email to