[ 
https://issues.apache.org/jira/browse/DRILL-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899992#comment-16899992
 ] 

ASF GitHub Bot commented on DRILL-7337:
---------------------------------------

arina-ielchiieva commented on issue #1835: DRILL-7337: Add vararg UDFs support
URL: https://github.com/apache/drill/pull/1835#issuecomment-518187686
 
 
   +1, please squash the commits.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add vararg UDFs support
> -----------------------
>
>                 Key: DRILL-7337
>                 URL: https://issues.apache.org/jira/browse/DRILL-7337
>             Project: Apache Drill
>          Issue Type: Sub-task
>    Affects Versions: 1.16.0
>            Reporter: Volodymyr Vysotskyi
>            Assignee: Volodymyr Vysotskyi
>            Priority: Major
>              Labels: doc-impacting
>             Fix For: 1.17.0
>
>
> The aim of this Jira is to add support for vararg UDFs, which simplifies 
> UDF creation when a function must accept a varying number of arguments.
> h2. Requirements for vararg UDFs:
>  * It should be possible to register vararg UDFs with the same name, but with 
> different argument types;
>  * Only vararg UDFs with a single variable-length argument placed after all 
> other arguments should be allowed;
>  * A vararg UDF should have lower priority than a regular one when both 
> are suitable;
>  * Besides simple functions, vararg support should be added to the aggregate 
> functions.
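
As a rough illustration of the second requirement above, the following sketch checks that a vararg declaration has exactly one array parameter and that it comes last. `ParamDecl` and `VarargSignatureCheck` are hypothetical names invented for this example, not Drill classes, and treating array parameters in non-vararg functions as invalid is an assumption of the sketch.

```java
import java.util.List;

// Hypothetical stand-in for one declared UDF parameter; not a Drill class.
final class ParamDecl {
  final String type;
  final boolean isArray;

  ParamDecl(String type, boolean isArray) {
    this.type = type;
    this.isArray = isArray;
  }
}

// Hypothetical validator illustrating the "single trailing vararg" rule.
final class VarargSignatureCheck {
  private VarargSignatureCheck() {
  }

  static boolean isValid(boolean isVarArg, List<ParamDecl> params) {
    for (int i = 0; i < params.size(); i++) {
      boolean last = i == params.size() - 1;
      // An array parameter is accepted only as the final parameter
      // of a function that is marked as vararg.
      if (params.get(i).isArray && !(isVarArg && last)) {
        return false;
      }
    }
    // A function marked as vararg must actually end with an array parameter.
    return !isVarArg
        || (!params.isEmpty() && params.get(params.size() - 1).isArray);
  }
}
```

Under these assumptions, a signature such as `(VarCharHolder, VarCharHolder[])` passes, while `(VarCharHolder[], VarCharHolder)` and `(VarCharHolder[], VarCharHolder[])` are rejected.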
> h2. Implementation details
> The lifecycle of a UDF is as follows:
>  * UDF is validated in {{FunctionConverter}} class and for the case when 
> there is no problem (UDF has required fields with required types, required 
> annotations, etc.), it is converted to the {{DrillFuncHolder}} to be 
> registered in the function registry. Also, corresponding {{SqlFunction}} 
> instances are created based on {{DrillFuncHolder}} to be used in Calcite;
>  * When a query uses this UDF, Calcite validates that a UDF with the required 
> name, number of arguments, and argument types exists (for Drill, argument 
> types are not checked at this stage);
>  * Once Calcite has found the required {{SqlFunction}} instance, it uses 
> Drill to find the required {{DrillFuncHolder}}. All the work of determining 
> the most suitable function is done in {{FunctionResolver}} and in 
> {{TypeCastRules.getCost()}};
>  * At the execution stage, the {{DrillFuncHolder}} is found again using the 
> {{FunctionCall}} instance;
>  * {{DrillFuncHolder}} is used for code generation.
> Considering these steps, the first thing to be done for adding support for 
> vararg UDFs is updating logic in {{FunctionConverter}} to allow registering 
> vararg UDFs taking into account requirements declared above.
> Calcite uses {{SqlOperandTypeChecker}} to verify the number of arguments, so 
> Drill should provide its own implementation for vararg UDFs to make them 
> usable. To determine whether a UDF is vararg, a new {{isVarArg}} property 
> will be added to the {{FunctionTemplate}}.
> {{TypeCastRules.getCost()}} method should be updated to be able to find 
> vararg UDFs and prioritize regular UDFs.
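
The prioritization described above can be sketched in plain Java. This is only an illustration: `Candidate`, `VarargResolution`, and `VARARG_PENALTY` are hypothetical names, type-cast costs are ignored entirely, and the convention that a negative cost means "no match" is an assumption of the sketch rather than a statement about what `TypeCastRules.getCost()` does.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical description of a registered function; not a Drill class.
final class Candidate {
  final String label;
  final int fixedArgs;   // number of non-vararg parameters
  final boolean varArg;  // true if a trailing vararg parameter follows them

  Candidate(String label, int fixedArgs, boolean varArg) {
    this.label = label;
    this.fixedArgs = fixedArgs;
    this.varArg = varArg;
  }

  // A vararg function accepts any count at or above its fixed arity,
  // including the case where the vararg part is empty.
  boolean accepts(int argCount) {
    return varArg ? argCount >= fixedArgs : argCount == fixedArgs;
  }
}

final class VarargResolution {
  // Hypothetical penalty that makes a vararg match lose to a regular match.
  static final int VARARG_PENALTY = 1;

  private VarargResolution() {
  }

  // Returns -1 when the candidate cannot be called with argCount arguments.
  static int cost(Candidate candidate, int argCount) {
    if (!candidate.accepts(argCount)) {
      return -1;
    }
    return candidate.varArg ? VARARG_PENALTY : 0;
  }

  // Picks the matching candidate with the lowest cost, if any.
  static Optional<Candidate> resolve(List<Candidate> candidates, int argCount) {
    Candidate best = null;
    int bestCost = Integer.MAX_VALUE;
    for (Candidate candidate : candidates) {
      int cost = cost(candidate, argCount);
      if (cost >= 0 && cost < bestCost) {
        best = candidate;
        bestCost = cost;
      }
    }
    return Optional.ofNullable(best);
  }
}
```

With a two-argument regular function and a vararg function registered under the same name, a call with two arguments resolves to the regular one, while a call with three arguments falls through to the vararg one.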
> Code generation logic should be updated to handle vararg UDFs. Generated code 
> for a vararg argument will look like this:
> {code:java}
> NullableVarCharHolder[] inputs = new NullableVarCharHolder[3];
> inputs[0] = out14;
> inputs[1] = out19;
> inputs[2] = out24;
> {code}
> To create a custom vararg UDF, set the new {{isVarArg}} property to {{true}} 
> in {{FunctionTemplate}}.
> After that, declare the required vararg input as an array.
> Here is an example of a vararg UDF:
> {code:java}
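> import javax.inject.Inject;
>
> import io.netty.buffer.DrillBuf;
> import org.apache.drill.exec.expr.DrillSimpleFunc;
> import org.apache.drill.exec.expr.annotations.FunctionTemplate;
> import org.apache.drill.exec.expr.annotations.Output;
> import org.apache.drill.exec.expr.annotations.Param;
> import org.apache.drill.exec.expr.holders.VarCharHolder;
>
> @FunctionTemplate(name = "concat_varchar",
>                   isVarArg = true,
>                   scope = FunctionTemplate.FunctionScope.SIMPLE)
> public class VarCharConcatFunction implements DrillSimpleFunc {
>   // The vararg input is declared as an array of value holders.
>   @Param VarCharHolder[] inputs;
>   @Output VarCharHolder out;
>   @Inject DrillBuf buffer;
>
>   @Override
>   public void setup() {
>   }
>
>   @Override
>   public void eval() {
>     // Total byte length of all inputs.
>     int length = 0;
>     for (VarCharHolder input : inputs) {
>       length += input.end - input.start;
>     }
>
>     out.buffer = buffer = buffer.reallocIfNeeded(length);
>     out.start = out.end = 0;
>
>     // Copy each input into the output buffer in order.
>     for (VarCharHolder input : inputs) {
>       for (int id = input.start; id < input.end; id++) {
>         out.buffer.setByte(out.end++, input.buffer.getByte(id));
>       }
>     }
>   }
> }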
> {code}
> h2. Limitations connected with VarArg UDFs:
>  * The nulls handling specified in FunctionTemplate does not affect vararg 
> parameters, i.e. the user should add separate UDFs with non-nullable and 
> nullable value holder vararg fields;
>  * For value holder vararg fields, all vararg arguments must have the same 
> type, including nullability. If the vararg field is a FieldReader, all 
> responsibility for handling the types and nullability of the vararg inputs 
> is placed on the UDF implementation;
>  * Scalar replacement is not performed for vararg arguments;
>  * The UDF implementation should handle the case when the vararg field is 
> empty.
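
To make the first and last limitations more concrete, here is a plain-Java sketch of a concatenation that tolerates both an empty vararg array and unset (null) inputs. `TextHolder` is a hypothetical stand-in that only mimics the `isSet`/`start`/`end`/`buffer` layout of a nullable holder, using a `byte[]` instead of a Drill buffer; skipping null inputs rather than returning null is a design choice made for this example only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a nullable value holder; not a Drill class.
final class TextHolder {
  int isSet;      // 1 when a value is present, 0 when the input is null
  int start;
  int end;
  byte[] buffer;

  static TextHolder of(String value) {
    TextHolder holder = new TextHolder();
    holder.isSet = 1;
    holder.buffer = value.getBytes(StandardCharsets.UTF_8);
    holder.start = 0;
    holder.end = holder.buffer.length;
    return holder;
  }

  static TextHolder unset() {
    TextHolder holder = new TextHolder();
    holder.isSet = 0;
    holder.buffer = new byte[0];
    return holder;
  }
}

final class VarargConcatSketch {
  private VarargConcatSketch() {
  }

  // Concatenates all set inputs; an empty array yields an empty result.
  static byte[] concat(TextHolder[] inputs) {
    int length = 0;
    for (TextHolder input : inputs) {
      if (input.isSet == 1) {
        length += input.end - input.start;
      }
    }

    byte[] out = new byte[length];
    int position = 0;
    for (TextHolder input : inputs) {
      if (input.isSet != 1) {
        continue; // assumption of this sketch: null inputs are skipped
      }
      int size = input.end - input.start;
      System.arraycopy(input.buffer, input.start, out, position, size);
      position += size;
    }
    return out;
  }
}
```

In an actual UDF the same two checks (zero-length `inputs` and per-element `isSet`) would live in `eval()` of the variant declared with nullable holders.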
> *For documentation*
> New functions: collect_to_list, TBA.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
