The problem is that I don't know how to fix this. Would be glad to do the work 
if someone helped design a way to do it.

Turns out that the OperandTypeChecker can mutate the RexNodes it receives as 
arguments, but I am not sure that this is "safe", because it looks like the API 
has been designed with immutability in mind. For example, these combinators 
that can combine multiple type-checkers seem to assume immutability.

Mihai
________________________________
From: Julian Hyde <jhyde.apa...@gmail.com>
Sent: Friday, September 6, 2024 12:59 PM
To: dev@calcite.apache.org <dev@calcite.apache.org>
Subject: Re: QUESTION: SUBSTRING implementation

I totally agree.

Julian

> On Sep 6, 2024, at 12:57, Mihai Budiu <mbu...@gmail.com> wrote:
>
> A problem with the way Calcite handles such functions is that the 
> type-checker and the runtime are separate and do not communicate with each 
> other. In the type checking code that people usually write for a function 
> they write an OperandTypeChecker, which makes some assumptions about the 
> range of implicit casts that the function allows. For example, for SUBSTRING 
> such a type checker would allow any numeric types, or even string types, with 
> the assmption that they would be converted to integer. As a result of the 
> type-checking, the function is accepted, but the implicit casts that were 
> inferred are not inserted as explicit casts to the arguments of the RexCall.
>
> So when the runtime has to implement the function it has no idea about the 
> kinds of arguments that may be expected, and what it's supposed to do with 
> them. The runtime has to handle many more cases, e.g,, INTEGER, TINYINT, 
> DECIMAL, STRING. This makes it difficult to keep the runtime in sync with the 
> type-checker as people implement new versions of the many functions from 
> different dialects. Moreover, the runtime may incorrectly convert such 
> values. For example, if you call decimal.intValue() you are not checking for 
> overflow and not rounding in the same way a CAST(decimal AS INTEGER) may do.
>
> Ideally the type-checker for functions would be allowed to insert implicit 
> casts, this would simplify a lot the task of both generating code in the 
> runtime, and of implementing the runtime itself.
>
> For substring the type-checker could insert a cast from DECIMAL to INTEGER. 
> The runtime now only has to worry about INTEGER values.
>
> Mihai
> ________________________________
> From: Julian Hyde <jhyde.apa...@gmail.com>
> Sent: Friday, September 6, 2024 12:44 PM
> To: dev@calcite.apache.org <dev@calcite.apache.org>
> Subject: Re: QUESTION: SUBSTRING implementation
>
> The trouble is, we are going beyond the standard. The standard says
>
>  The declared type of a <start position>, <string length>,
>  <regex occurrence>, or <regex capture group>
>  shall be exact numeric with scale 0 (zero).
>
> People often misunderstand the “shall” in standards. It is talking about the 
> SQL query, not the implementation. A standards-compliant query will only have 
> integer values of <start position>, and compliant implementation (such as 
> Calcite) must handle such queries.
>
> The standard doesn’t say anything about what should happen if the query is 
> not standards-compliant, viz <start position> is not an integer. In 
> particular, it doesn’t say we should fail when we receive such queries. But 
> we can do anything, like return -77 or rounding the integer up, or rounding 
> it down, and still be compliant.
>
> What we have decided to do in Calcite is to apply our usual rules for 
> converting decimal arguments to integers. I am saying let’s define those 
> ‘usual rules’, and apply it to all functions, whether they are in the SQL 
> standard, in a library (such as Postgres or BigQuery) or user-defined 
> functions.
>
>
> Julian
>
>
>
>> On Sep 6, 2024, at 1:02 AM, stanilovsky evgeny <estanilovs...@gridgain.com> 
>> wrote:
>>
>> I understand you are talking about, but ... LEFT is just extension under 
>> standard functions collection and DB implementors are free here, they can 
>> write restrictions in documentation [1] or treat fractional types as integer 
>> implicitly. I don`t think we need to change (LEFT, RIGHT and so on string 
>> functions), they already restricted in documentation, while standard 
>> (substring, overlay) need to be improved.
>>
>> [1] 
>> https://learn.microsoft.com/en-us/sql/t-sql/functions/left-transact-sql?view=sql-server-ver16
>>
>>> On Thu, 05 Sep 2024 21:07:27 +0300, Julian Hyde <jh...@apache.org> wrote:
>>>
>>> Thanks for taking the time to look up the SQL standard. But
>>> unfortunately the SQL standard isn't much use here, because it doesn't
>>> say much about implicit conversions - it leaves that to the
>>> implementor, such as Calcite or Postgres.
>>>
>>> Calcite has its own rules for implicit conversions of arguments. It
>>> also has lots of functions that take numeric and integer arguments.
>>> Calcite's users expect that the conversion rules are consistent across
>>> all functions. Picking one at random, LEFT(string, length) is enabled
>>> in several libraries including Postgres. You should implement the
>>> conversion for SUBSTRING in such a way that LEFT gets it for free.
>>>
>>> On Thu, Sep 5, 2024 at 7:20 AM stanilovsky evgeny
>>> <estanilovs...@gridgain.com> wrote:
>>>>
>>>> Thank all for reply, in section : 6.18 <string value function>
>>>> I see only OVERLAY function that not corresponds here too:
>>>>
>>>> <character overlay function> ::=
>>>> OVERLAY <left paren> <character value expression> PLACING <character value
>>>> expression>
>>>> FROM <start position> [ FOR <string length> ]
>>>> [ USING <char length units> ] <right paren>
>>>>
>>>> <start position> ::= <numeric value expression>
>>>> <string length> ::= <numeric value expression>
>>>>
>>>> or Julian you are talking not only about this kind of functions ?
>>>> In such a case it consumes a lot of time for such a check.
>>>>
>>>> Why we can`t move here sequentially ? Just fix operand types ? What
>>>> generic approach you are talking about ?
>>>> SqlSingleOperandTypeChecker STRING_INTEGER is used only in : SqlFunction
>>>> REPEAT
>>>> SqlSingleOperandTypeChecker STRING_INTEGER_INTEGER only in
>>>> SqlSubstringFunction
>>>> STRING_STRING_INTEGER, STRING_STRING_INTEGER_INTEGER only in
>>>> SqlOverlayFunction
>>>>
>>>>
>>>>> Is the desired behavior specific to the SUBSTRING function? Or should it
>>>>> be generic, for all functions that have an argument of type INTEGER?
>>>>>
>>>>> If it’s generic, can you give some other functions as examples where we
>>>>> would want this behavior.
>>>>>
>>>>> Also, if it’s generic, the code should probably not be part of the
>>>>> SUBSTRING function.
>>>>>
>>>>>
>>>>>> On Sep 4, 2024, at 8:31 AM, Norman Jordan
>>>>>> <norman.jor...@improving.com.INVALID> wrote:
>>>>>>
>>>>>> This makes sense. Running a quick test with MySQL, I can see that
>>>>>> decimal values do not give an error. It appears that MySQL will round a
>>>>>> decimal value to the nearest integer value.
>>>>>> ________________________________
>>>>>> From: stanilovsky evgeny <estanilovs...@gridgain.com>
>>>>>> Sent: Wednesday, September 4, 2024 5:17 AM
>>>>>> To: dev@calcite.apache.org <dev@calcite.apache.org>
>>>>>> Subject: QUESTION: SUBSTRING implementation
>>>>>>
>>>>>> Hi all, i want to discuss current SUBSTRING func implementation.
>>>>>>
>>>>>> Lets take a standard and found:
>>>>>> <character substring function> ::=
>>>>>> SUBSTRING <left paren> <character value expression> FROM <start
>>>>>> position>
>>>>>> [ FOR <string length> ] <right paren>
>>>>>>
>>>>>> and further : <start position> ::= <numeric value expression>
>>>>>>
>>>>>> thus it not restrict <start position> for only integer types
>>>>>>
>>>>>> Calcite documentation says:
>>>>>> SUBSTRING(string FROM integer FOR integer) (we see restrictions here)
>>>>>>
>>>>>> Lets dig deeper:
>>>>>> Calcite implementation operands checker not restrict operands too :
>>>>>> 1. OperandTypes.STRING_NUMERIC - (1 param: substring ('asd', 2)) (not
>>>>>> restricted params)
>>>>>> 2. OperandTypes.STRING_INTEGER_INTEGER - (2 params: substring ('asd', 2,
>>>>>> 3)) (only integer)
>>>>>>
>>>>>> So if i call "SELECT SUBSTRING('asd', 1.2)" runtime exception will
>>>>>> occur:
>>>>>> java.lang.RuntimeException: while resolving method 'substring[class
>>>>>> java.lang.String, class java.math.BigDecimal]' in class class
>>>>>> org.apache.calcite.runtime.SqlFunctions
>>>>>>>             at
>>>>>>> org.apache.calcite.adapter.enumerable.EnumUtils.call(EnumUtils.java:770)
>>>>>>>             at
>>>>>>> org.apache.calcite.adapter.enumerable.RexImpTable$MethodImplementor.call(RexImpTable.java:2866)
>>>>>>>             at
>>>>>>> org.apache.calcite.adapter.enumerable.RexImpTable$MethodImplementor.implementSafe(RexImpTable.java:2847)
>>>>>>
>>>>>> So i appeal to align (1 and 2 operands checker implementation, so for 2
>>>>>> operands it need: STRING_NUMERIC_NUMERIC) and append appropriate
>>>>>> implementation (with will cut off fractional numeric part) into
>>>>>> SqlFunctions.
>>>>>>
>>>>>> wdyt ? if there will be no objections i will fill an issue.
>>>>>>
>>>>>> thanks !
>>>>>> Warning: The sender of this message could not be validated and may not
>>>>>> be the actual sender.
>

Reply via email to