Re:Re: [DISCUSS] FLIP-387: Support named parameters for functions and call procedures

Xuyang Wed, 20 Dec 2023 18:40:07 -0800

Hi, Benchao.


When Feng Jin and I tried the poc together, we found that when using udaf, 
Calcite directly using the function's input parameters from 
SqlCall#getOperandList. But in fact, these input parameters may use named 
arguments, the order of parameters may be wrong, and they may not include 
optional parameters that need to set default values. Actually, we should use 
new SqlCallBinding(this, scope, call).operands() to let this method correct the 
order and add default values. (You can see the modification in 
SqlToRelConverter in poc branch[1])


We have not reported this bug to the calcite community yet. Our original plan 
was to report this bug to the calcite community during the process of doing 
this flip, and fix it separately in flink's own calcite file. Because the time 
for Calcite to release the version is uncertain. And the time to upgrade flink 
to the latest calcite version is also unknown.


The link to the poc code is at the bottom of the flip[2]. I'm post it here 
again[1].


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
[2] 
https://github.com/apache/flink/compare/master...hackergin:flink:poc_named_argument



--

    Best！
    Xuyang





在 2023-12-20 13:31:26，"Benchao Li" <libenc...@apache.org> 写道：
>I didn't see your POC code, so I assumed that you'll need to add
>SqlStdOperatorTable#DEFAULT and
>SqlStdOperatorTable#ARGUMENT_ASSIGNMENT to FlinkSqlOperatorTable, am I
>right?
>
>If yes, this would enable many builtin functions to allow default and
>optional arguments, for example, `select md5(DEFAULT)`, I guess this
>is not what we want to support right? If so, I would suggest to throw
>proper errors for these unexpected usages.
>
>Benchao Li <libenc...@apache.org> 于2023年12月20日周三 13:16写道：
>>
>> Thanks Feng for driving this, it's a very useful feature.
>>
>> In the FLIP, you mentioned that
>> > During POC verification, bugs were discovered in Calcite that caused 
>> > issues during the validation phase. We need to modify the SqlValidatorImpl 
>> > and SqlToRelConverter to address these problems.
>>
>> Could you log bugs in Calcite, and reference the corresponding Jira
>> number in your code. We want to upgrade Calcite version to latest as
>> much as possible, and maintaining many bugfixes in Flink will add
>> additional burdens for upgrading Calcite. By adding corresponding
>> issue numbers, we can easily make sure that we can remove these Flink
>> hosted bugfixes when we upgrade to a version that already contains the
>> fix.
>>
>> Feng Jin <jinfeng1...@gmail.com> 于2023年12月14日周四 19:30写道：
>> >
>> > Hi Timo
>> > Thanks for your reply.
>> >
>> > >  1) ArgumentNames annotation
>> >
>> > I'm sorry for my incorrect expression. argumentNames is a method of
>> > FunctionHints. We should introduce a new arguments method to replace this
>> > method and return Argument[].
>> > I updated the FLIP doc about this part.
>> >
>> > >  2) Evolution of FunctionHint
>> >
>> > +1 define DataTypeHint as part of ArgumentHint. I'll update the FLIP doc.
>> >
>> > > 3)  Semantical correctness
>> >
>> > I realized that I forgot to submit the latest modification of the FLIP
>> > document. Xuyang and I had a prior discussion before starting this discuss.
>> > Let's restrict it to supporting only one eval() function, which will
>> > simplify the overall design.
>> >
>> > Therefore, I also concur with not permitting overloaded named parameters.
>> >
>> >
>> > Best,
>> > Feng
>> >
>> > On Thu, Dec 14, 2023 at 6:15 PM Timo Walther <twal...@apache.org> wrote:
>> >
>> > > Hi Feng,
>> > >
>> > > thank you for proposing this FLIP. This nicely completes FLIP-65 which
>> > > is great for usability.
>> > >
>> > > I read the FLIP and have some feedback:
>> > >
>> > >
>> > > 1) ArgumentNames annotation
>> > >
>> > >  > Deprecate the ArgumentNames annotation as it is not user-friendly for
>> > > specifying argument names with optional configuration.
>> > >
>> > > Which annotation does the FLIP reference here? I cannot find it in the
>> > > Flink code base.
>> > >
>> > > 2) Evolution of FunctionHint
>> > >
>> > > Introducing @ArgumentHint makes a lot of sense to me. However, using it
>> > > within @FunctionHint looks complex, because there is both `input=` and
>> > > `arguments=`. Ideally, the @DataTypeHint can be defined inline as part
>> > > of the @ArgumentHint. It could even be the `value` such that
>> > > `@ArgumentHint(@DataTypeHint("INT"))` is valid on its own.
>> > >
>> > > We could deprecate `input=`. Or let both `input` and `arguments=`
>> > > coexist but never be defined at the same time.
>> > >
>> > > 3) Semantical correctness
>> > >
>> > > As you can see in the `TypeInference` class, named parameters are
>> > > prepared in the stack already. However, we need to watch out between
>> > > helpful explanation (see `InputTypeStrategy#getExpectedSignatures`) and
>> > > named parameters (see `TypeInference.Builder#namedArguments`) that can
>> > > be used in SQL.
>> > >
>> > > If I remember correctly, named parameters can be reordered and don't
>> > > allow overloading of signatures. Thus, only a single eval() should have
>> > > named parameters. Looking at the FLIP it seems you would like to support
>> > > multiple parameter lists. What changes are you planning to TypeInference
>> > > (which is also declared as @PublicEvoving)? This should also be
>> > > documented as the annotations should compile into this class.
>> > >
>> > > In general, I would prefer to keep it simple and don't allow overloading
>> > > named parameters. With the optionality, users can add an arbitrary
>> > > number of parameters to the signature of the same eval method.
>> > >
>> > > Regards,
>> > > Timo
>> > >
>> > > On 14.12.23 10:02, Feng Jin wrote:
>> > > > Hi all,
>> > > >
>> > > >
>> > > > Xuyang and I would like to start a discussion of FLIP-387: Support
>> > > > named parameters for functions and call procedures [1]
>> > > >
>> > > > Currently, when users call a function or call a procedure, they must
>> > > > specify all fields in order. When there are a large number of
>> > > > parameters, it is easy to make mistakes and cannot omit specifying
>> > > > non-mandatory fields.
>> > > >
>> > > > By using named parameters, you can selectively specify the required
>> > > > parameters, reducing the probability of errors and making it more
>> > > > convenient to use.
>> > > >
>> > > > Here is an example of using Named Procedure.
>> > > > ```
>> > > > -- for scalar function
>> > > > SELECT my_scalar_function(param1 => ‘value1’, param2 => ‘value2’’) FROM
>> > > []
>> > > >
>> > > > -- for table function
>> > > > SELECT  *  FROM TABLE(my_table_function(param1 => 'value1', param2 =>
>> > > 'value2'))
>> > > >
>> > > > -- for agg function
>> > > > SELECT my_agg_function(param1 => 'value1', param2 => 'value2') FROM []
>> > > >
>> > > > -- for call procedure
>> > > > CALL  procedure_name(param1 => ‘value1’, param2 => ‘value2’)
>> > > > ```
>> > > >
>> > > > For UDX and Call procedure developers, we introduce a new annotation
>> > > > to specify the parameter name, indicate if it is optional, and
>> > > > potentially support specifying default values in the future
>> > > >
>> > > > ```
>> > > > public @interface ArgumentHint {
>> > > >      /**
>> > > >       * The name of the parameter, default is an empty string.
>> > > >       */
>> > > >      String name() default "";
>> > > >
>> > > >      /**
>> > > >       * Whether the parameter is optional, default is true.
>> > > >       */
>> > > >      boolean isOptional() default true;
>> > > > }}
>> > > > ```
>> > > >
>> > > > ```
>> > > > // Call Procedure Development
>> > > >
>> > > > public static class NamedArgumentsProcedure implements Procedure {
>> > > >
>> > > >     // Example usage: CALL myNamedProcedure(in1 => 'value1', in2 =>
>> > > 'value2')
>> > > >
>> > > >     // Example usage: CALL myNamedProcedure(in1 => 'value1', in2 =>
>> > > > 'value2', in3 => 'value3')
>> > > >
>> > > >     @ProcedureHint(
>> > > >             input = {@DataTypeHint(value = "STRING"),
>> > > > @DataTypeHint(value = "STRING"), @DataTypeHint(value = "STRING")},
>> > > >             output = @DataTypeHint("STRING"),
>> > > >                  arguments = {
>> > > >                  @ArgumentHint(name = "in1", isOptional = false),
>> > > >                  @ArgumentHint(name = "in2", isOptional = true)
>> > > >                  @ArgumentHint(name = "in3", isOptional = true)})
>> > > >     public String[] call(ProcedureContext procedureContext, String
>> > > > arg1, String arg2, String arg3) {
>> > > >         return new String[]{arg1 + ", " + arg2 + "," + arg3};
>> > > >     }
>> > > > }
>> > > > ```
>> > > >
>> > > >
>> > > > Currently, we offer support for two scenarios when calling a function
>> > > > or procedure:
>> > > >
>> > > > 1. The corresponding parameters can be specified using the parameter
>> > > > name, without a specific order.
>> > > > 2. Unnecessary parameters can be omitted.
>> > > >
>> > > >
>> > > > There are still some limitations when using Named parameters:
>> > > > 1. Named parameters do not support variable arguments.
>> > > > 2. UDX or procedure classes that support named parameters can only
>> > > > have one eval method.
>> > > > 3. Due to the current limitations of Calcite-947[2], we cannot specify
>> > > > a default value for omitted parameters, which is Null by default.
>> > > >
>> > > >
>> > > >
>> > > > Also, thanks very much for the suggestions and help provided by Zelin
>> > > > and Lincoln.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > 1.
>> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-387%3A+Support+named+parameters+for+functions+and+call+procedures
>> > > .
>> > > >
>> > > > 2. https://issues.apache.org/jira/browse/CALCITE-947
>> > > >
>> > > >
>> > > >
>> > > > Best,
>> > > >
>> > > > Feng
>> > > >
>> > >
>> > >
>>
>>
>>
>> --
>>
>> Best,
>> Benchao Li
>
>
>
>-- 
>
>Best,
>Benchao Li

Re:Re: [DISCUSS] FLIP-387: Support named parameters for functions and call procedures

Reply via email to