Darpan Lunagariya (e6data computing) created CALCITE-7631:
-------------------------------------------------------------
Summary: Introduce a composable RexImplementorTable SPI for
operator code generation
Key: CALCITE-7631
URL: https://issues.apache.org/jira/browse/CALCITE-7631
Project: Calcite
Issue Type: Improvement
Components: core
Affects Versions: 1.42.0
Reporter: Darpan Lunagariya (e6data computing)
h2. Problem
Enumerable code generation resolves the implementor for every operator through
the {{RexImpTable.INSTANCE}} singleton. There is exactly one extension hook:
{{get(SqlOperator)}} consults {{ImplementableFunction}} when the operator is a
{{SqlUserDefinedFunction}} (a function registered in a schema). For any *other*
operator (that is, a custom or dialect operator that an adapter registers
through a {{SqlOperatorTable}}) there is no way to supply a code-generation
implementor:
* the backing maps are {{private final ImmutableMap}};
* the {{Builder}} / {{AbstractBuilder}} and their {{define*}} methods are
private;
* every consumer references {{RexImpTable.INSTANCE}} directly, most importantly
{{RexToLixTranslator.visitCall}}, but also {{RexExecutorImpl}},
{{EnumerableAggregate}}, {{EnumerableMatch}}, {{EnumerableTableFunctionScan}}.
Practical consequences for such an operator:
* it throws {{"cannot translate call"}} during code generation; and
* it cannot be constant-folded, because {{ReduceExpressionsRule}} runs through
{{RexExecutorImpl}}, which compiles the whole batch and fails if a single
operator has no implementor.
h2. Why this matters
{{RexImpTable}} is the only registry of its kind that is a hard, non-composable
singleton. Every other piece of pluggable behaviour in Calcite is an interface
(or builder) with a default, obtained or composed at configuration time:
* operators — {{SqlOperatorTable}} + {{SqlOperatorTables.chain(...)}}
* type system — {{RelDataTypeSystem}} (+ {{RelDataTypeSystem.DEFAULT}})
* metadata — {{RelMetadataProvider}} / {{ChainedRelMetadataProvider}}
* cost — {{RelOptCostFactory}}
* constant executor — {{RexExecutor}} (set on the planner)
So an adapter can already _define_ its operators (compose a
{{SqlOperatorTable}}) and _validate_ them, but it cannot _generate code_ for
them. The validation half of "defining a function" is open; the code-generation
half is sealed. Closing that asymmetry is the goal.
h2. Proposal
Make the implementor table a first-class, composable SPI, the code-generation
counterpart of {{SqlOperatorTable}}, *without changing default behaviour*.
# Extract an interface {{RexImplementorTable}} with the existing lookups
({{get}} for scalar / aggregate / match / windowed-table-function operators).
{{RexImpTable}} becomes its default implementation; {{RexImpTable.INSTANCE}}
and a new {{RexImpTable.instance()}} both provide the default table.
# Add {{RexImplementorTables.chain(...)}} (mirroring
{{SqlOperatorTables.chain}}): it consults each table in turn and the first
non-null result wins, so tables earlier in the chain override later ones.
# Thread an injectable {{RexImplementorTable}} (defaulting to the built-ins)
through the code-generation entry points — {{RexToLixTranslator}} (new
overloads of {{translateProjects}} / {{translateCondition}}) and
{{RexExecutorImpl}} (for constant folding) — sourced the same way
{{conformance}} already travels into {{EnumerableRelImplementor}}.
An adapter then supplies implementors for its own operators by composing
{{RexImplementorTables.chain(myTable, RexImpTable.instance())}} — exactly
parallel to how it composes its {{SqlOperatorTable}} today.
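The lookup order described in item 2 can be illustrated with a minimal, self-contained sketch. It is not Calcite code: {{ImplementorTable}}, {{ImplementorTables}} and {{ChainSketch}} are hypothetical stand-ins, operators are keyed by name rather than by {{SqlOperator}}, and implementors are plain strings, so the example runs without Calcite on the classpath.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for the proposed SPI (not the Calcite interface). */
interface ImplementorTable {
  /** Returns the implementor for the operator, or null if this table has none. */
  String get(String operatorName);
}

/** Hypothetical factory mirroring the shape of the proposed chain(...) helper. */
final class ImplementorTables {
  private ImplementorTables() {}

  /** Consults each table in order; the first non-null result wins. */
  static ImplementorTable chain(ImplementorTable... tables) {
    final List<ImplementorTable> copy = List.of(tables);
    return operatorName -> {
      for (ImplementorTable table : copy) {
        String implementor = table.get(operatorName);
        if (implementor != null) {
          return implementor;
        }
      }
      // Miss: return null so an enclosing chain or the caller can decide.
      return null;
    };
  }
}

/** Sample tables; the operator names and implementor strings are made up. */
final class ChainSketch {
  private ChainSketch() {}

  static ImplementorTable builtIns() {
    Map<String, String> m = new HashMap<>();
    m.put("PLUS", "builtin-plus");
    m.put("UPPER", "builtin-upper");
    return m::get;
  }

  static ImplementorTable adapter() {
    Map<String, String> m = new HashMap<>();
    m.put("MY_OP", "adapter-my-op");
    m.put("UPPER", "adapter-upper"); // shadows the built-in when chained first
    return m::get;
  }
}
```

With {{ImplementorTables.chain(ChainSketch.adapter(), ChainSketch.builtIns())}}, a lookup of {{MY_OP}} resolves in the adapter table, {{UPPER}} resolves to the adapter's entry because that table comes first, {{PLUS}} falls through to the built-ins, and an unknown operator yields {{null}}.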
h3. Backward compatibility
* {{RexImpTable.INSTANCE}} and all existing public methods remain; the default
resolution path is unchanged.
* New table-carrying overloads are added; the older overloads are deprecated
and delegate to them.
* The match / windowed-table-function lookups change from "throw on miss" to
"return {{null}} on miss" so a chained table can fall through; call sites that
require an implementor preserve the same failure via an explicit check.
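The "return {{null}} on miss, check at the call site" pattern from the last bullet could look roughly like the sketch below. All names here ({{NullableLookup}}, {{RequiredImplementor}}, the {{TUMBLE}} entry) are hypothetical, and the exception type and message are illustrative rather than the ones Calcite throws today.

```java
import java.util.Map;

/** Hypothetical stand-in for a table whose lookups return null on a miss. */
interface NullableLookup {
  String get(String operatorName);
}

final class RequiredImplementor {
  private RequiredImplementor() {}

  /** Call-site check: turns a null (miss) back into a hard failure. */
  static String require(NullableLookup table, String operatorName) {
    String implementor = table.get(operatorName);
    if (implementor == null) {
      // Exception type and message are illustrative, not Calcite's actual ones.
      throw new IllegalStateException("no implementor for " + operatorName);
    }
    return implementor;
  }

  /** Sample table with a single made-up entry. */
  static NullableLookup sample() {
    Map<String, String> m = Map.of("TUMBLE", "builtin-tumble");
    return m::get;
  }
}
```

The table itself stays silent on a miss, which is what lets a chain fall through to the next table; only a caller that cannot proceed without an implementor goes through {{require}} and fails.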
h3. Example
{code:java}
RexImplementorTable table =
RexImplementorTables.chain(myAdapterImplementors, RexImpTable.instance());
// constant folding
planner.setExecutor(new RexExecutorImpl(dataContext, table));
{code}
h2. Scope / non-goals
* Enumerable-engine _execution_ of custom *aggregates* additionally needs the
table at planning time (the {{EnumerableAggregate}} constructor pre-checks
operator support), which can be a follow-up; the SPI itself already covers
aggregate implementors.
* The operator *catalog* ({{SqlLibrary}} / {{SqlLibraryOperators}}) is
unchanged. That is first-party content behind the already-open
{{SqlOperatorTable}} and is intentionally out of scope.