[
https://issues.apache.org/jira/browse/CALCITE-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17320574#comment-17320574
]
Julian Hyde commented on CALCITE-4564:
--------------------------------------
Please review [PR 2395|https://github.com/apache/calcite/pull/2395].
I renamed {{interface UdfInitializer}} to {{interface FunctionContext}} but
otherwise the design is pretty much as described above.
> Initialization context for non-static user-defined functions (UDFs)
> -------------------------------------------------------------------
>
> Key: CALCITE-4564
> URL: https://issues.apache.org/jira/browse/CALCITE-4564
> Project: Calcite
> Issue Type: Bug
> Components: extensions
> Reporter: Julian Hyde
> Assignee: Julian Hyde
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.27.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> I propose to allow user-defined functions (UDFs) to read from an
> initialization context during construction. The initialization context would
> be a new Java {{interface UdfInitializer}} that provides, among other things,
> a type factory and the values of the arguments to the function call whose
> values are literals.
> The purpose of this feature is to allow functions to do more work at
> initialization time and less work on each invocation. Suppose I wanted to
> write a UDF {{regexMatch(pattern, string)}} that matches Java regular
> expressions. If {{pattern}} is a literal, I would like to create an instance
> of the function object that calls {{Pattern.compile(pattern)}} in its
> constructor and stores the resulting {{Pattern}} object as a field. Each
> invocation of the function can use that {{Pattern}} object, and does not have
> to pay the cost of compilation.
> In order to use this feature, a UDF class would have a public constructor
> with a single argument that is a {{UdfInitializer}}. The method that invokes
> the function, conventionally called {{eval}}, must be non-static.
> This feature is optional. A UDF that has a public constructor with zero
> arguments (which is the current contract for non-static UDFs) will continue
> to work. [class
> MyPlusFunction|https://github.com/apache/calcite/blob/4bc916619fd286b2c0cc4d5c653c96a68801d74e/core/src/test/java/org/apache/calcite/util/Smalls.java#L429]
> is an example of this kind of UDF.
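As an illustration of the {{regexMatch}} case described above, here is a minimal sketch of such a UDF. The nested {{InitContext}} interface and its two methods are hypothetical stand-ins for the proposed initialization context, not its real API; the actual interface is the one in the pull request.

```java
import java.util.regex.Pattern;

public class RegexMatchSketch {
  /** Hypothetical stand-in for the proposed initialization context. */
  public interface InitContext {
    /** Whether the argument at the given position is a literal (assumed method). */
    boolean isLiteral(int ordinal);

    /** The literal value of that argument as a string (assumed method). */
    String literalAsString(int ordinal);
  }

  /** UDF with a public one-argument constructor and a non-static eval method. */
  public static class RegexMatchFunction {
    /** Compiled once at construction if the pattern is a literal, else null. */
    private final Pattern compiled;

    public RegexMatchFunction(InitContext context) {
      this.compiled = context.isLiteral(0)
          ? Pattern.compile(context.literalAsString(0))
          : null;
    }

    /** Reuses the pre-compiled pattern when there is one; otherwise compiles per call. */
    public boolean eval(String pattern, String string) {
      Pattern p = compiled != null ? compiled : Pattern.compile(pattern);
      return p.matcher(string).matches();
    }
  }

  public static void main(String[] args) {
    // A context in which argument 0 is the literal 'a+b'.
    InitContext context = new InitContext() {
      @Override public boolean isLiteral(int ordinal) {
        return ordinal == 0;
      }

      @Override public String literalAsString(int ordinal) {
        return "a+b";
      }
    };
    RegexMatchFunction f = new RegexMatchFunction(context);
    System.out.println(f.eval("a+b", "aaab"));
    System.out.println(f.eval("a+b", "abc"));
  }
}
```

When the pattern is not a literal, the sketch falls back to compiling on every call, which is the cost a UDF pays today.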
> This feature would apply to all UDFs, including table functions (i.e. those
> whose arguments are tables or which return tables) and aggregate functions.
> The initialization context would not affect type derivation aspects of the
> function. The return type, operand types, and so forth will already have
> been derived at validate time, well before any code is generated or
> executed. If you want to control type derivation, you should create your own
> sub-class of {{SqlOperator}}, as today.
> There are some implementation challenges:
> * The code generator will need to generate an instance of {{UdfInitializer}}
> for each UDF call that occurs in the query. Some data structures that are
> readily available at validate time (e.g. {{RexCall}}) are not easily
> re-created at run time, so we should be conservative about what information
> is available via {{UdfInitializer}}.
> * The code generator must ensure that those instances are constructed exactly
> once during the execution of the query; those instances should not be
> variables in the {{execute}} method, but should instead be fields, or perhaps
> static fields, in the generated class.
> * This functionality needs to work through both the interpreter ({{Bindable}}
> convention) and generated code ({{Enumerable}} convention).