[
https://issues.apache.org/jira/browse/CALCITE-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Feng Zhu updated CALCITE-3224:
------------------------------
Description:
h3. *Background*
The current RexNode-to-Expression implementation relies on BlockBuilder's
incorrect “optimizations” to inline unsafe operations. As illustrated in
CALCITE-3173, when this cooperation breaks down in certain special cases, it
causes exceptions such as the NPEs reported in CALCITE-3142, CALCITE-3143, and
CALCITE-3150. Although these problems can be fixed within the current
framework with some effort, as in the PR for CALCITE-3142, the logic would
become more and more complex. To pursue a thorough and elegant solution, we
implement a new one. It also ensures correctness for non-optimized code.
h3. *Major Features*
* *Visitor Pattern*: Each RexNode is visited only once, in a bottom-up
way, rather than being visited recursively many times with different
NullAs settings.
* *Conditional Semantics*: Correctness is guaranteed naturally, even
without BlockBuilder’s “optimizations”. Each line of code generated for a
RexNode is null-safe.
* *Interface Compatibility*: The implementation only updates
_RexToLixTranslator_ and _RexImpTable_. Interfaces such as CallImplementor
remain unchanged.
h3. *Implementation*
For each RexNode, the visitor generally generates two declaration
statements, one for the value and one for its nullness. The code snippet
looks like:
{code:java}
{valueVariable} = {valueExpression}
{isNullVariable} = {isNullExpression}
{code}
The visitor’s result will be the variable pair (*_isNullVariable_*,
*_valueVariable_*).
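The scheme above can be illustrated with a minimal, self-contained sketch. It is not the code from the pull request: the node types, the {{Result}} class, the variable-naming scheme and the emitted statement text are all hypothetical stand-ins, chosen only to show a bottom-up visitor that visits each node once, emits a value declaration followed by an isNull declaration, and returns the variable pair.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are Calcite classes. */
final class PairVisitorSketch {
  /** Toy expression node standing in for a RexNode. */
  interface Node {}

  static final class InputRef implements Node {
    final String column;
    InputRef(String column) { this.column = column; }
  }

  static final class Literal implements Node {
    final int value;
    Literal(int value) { this.value = value; }
  }

  static final class Add implements Node {
    final Node left;
    final Node right;
    Add(Node left, Node right) { this.left = left; this.right = right; }
  }

  /** The visitor's result: the names of the two variables declared for a node. */
  static final class Result {
    final String isNullVariable;
    final String valueVariable;
    Result(String isNullVariable, String valueVariable) {
      this.isNullVariable = isNullVariable;
      this.valueVariable = valueVariable;
    }
  }

  /** Generated statements (as plain text), in emission order. */
  final List<String> statements = new ArrayList<>();
  private int counter;

  /** Visits each node exactly once; operands are visited before their parent. */
  Result visit(Node node) {
    if (node instanceof InputRef) {
      String base = fresh("input");
      return declare(base, "Integer", "row." + ((InputRef) node).column,
          base + "_value == null");
    }
    if (node instanceof Literal) {
      return declare(fresh("literal"), "int",
          String.valueOf(((Literal) node).value), "false");
    }
    if (node instanceof Add) {
      Add add = (Add) node;
      Result l = visit(add.left);
      Result r = visit(add.right);
      // The parent only needs the operands' variable pairs, never their expressions.
      String anyNull = l.isNullVariable + " || " + r.isNullVariable;
      return declare(fresh("binary_call"), "Integer",
          "(" + anyNull + ") ? null : " + l.valueVariable + " + " + r.valueVariable,
          anyNull);
    }
    throw new IllegalArgumentException("unsupported node: " + node);
  }

  private String fresh(String prefix) {
    return prefix + "_" + counter++;
  }

  /** Emits the value declaration, then the isNull declaration. */
  private Result declare(String base, String type, String valueExpr, String isNullExpr) {
    Result result = new Result(base + "_isNull", base + "_value");
    statements.add(type + " " + result.valueVariable + " = " + valueExpr + ";");
    statements.add("boolean " + result.isNullVariable + " = " + isNullExpr + ";");
    return result;
  }

  public static void main(String[] args) {
    PairVisitorSketch visitor = new PairVisitorSketch();
    visitor.visit(new Add(new InputRef("commission"), new Literal(10)));
    visitor.statements.forEach(System.out::println);
  }
}
```

Because every node hands back only variable names, the statements generated for a parent stay null-safe on their own and do not depend on any later inlining step.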
*Other changes:*
(1) Reimplement the different RexCall implementations (e.g., CastImplementor,
BinaryImplementor, etc.) as separate files, move them into the newly
created package _org.apache.calcite.adapter.enumerable.rex_, and organize them
in RexCallImpTable.
(2) Move some utility functions into EnumUtils.
h3. *Example Demonstration*
Take a simple test case as an example, in which the "commission" column is
nullable.
{code:java}
@Test public void testNPE() {
  CalciteAssert.hr()
      .query("select \"commission\" + 10 as s\n"
          + "from \"hr\".\"emps\"")
      .returns("S=1010\nS=510\nS=null\nS=260\n");
}
{code}
The codegen process and the non-optimized code are shown in the figure
below.
!codegen.png!
# When visiting *RexInputRef (commission)*, the visitor generates three lines
of code; the result is a pair of ParameterExpressions (*_input_isNull_*,
*_input_value_*).
# The visitor then visits *RexLiteral (10)* and generates two lines of code.
The result is (*_literal_isNull_*, *_literal_value_*).
# Finally, when visiting *RexCall(Add)*, the visitor uses (_*input_isNull*_,
_*input_value*_) and (_*literal_isNull*_, _*literal_value*_) to
implement the logic. It again generates two lines of code and returns
the variable pair.
In the end, the result Expression is constructed from
(_*binary_call_isNull*_, _*binary_call_value*_).
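
Since the figure is an attachment, the three steps can also be approximated in plain Java. The following is a hand-written sketch, not the generator's actual output: the class and method names are hypothetical, the exact statements in codegen.png may differ, and the input values 1000, 500, null and 250 are inferred by subtracting 10 from the expected results in the test case.

```java
import java.util.Arrays;
import java.util.List;

/** Hand-written illustration; not the output of Calcite's code generator. */
final class NullSafeAddSketch {
  /** Evaluates "commission" + 10 with one (isNull, value) pair per step. */
  static Integer commissionPlusTen(Integer commission) {
    // Step 1: RexInputRef(commission)
    final Integer input_value = commission;
    final boolean input_isNull = input_value == null;

    // Step 2: RexLiteral(10)
    final int literal_value = 10;
    final boolean literal_isNull = false;

    // Step 3: RexCall(Add); the addition is only evaluated when both
    // operands are known to be non-null, so no unboxing of null can occur.
    final Integer binary_call_value =
        (input_isNull || literal_isNull) ? null : input_value + literal_value;
    final boolean binary_call_isNull = binary_call_value == null;

    // The final result is built from the last variable pair.
    return binary_call_isNull ? null : binary_call_value;
  }

  public static void main(String[] args) {
    // Arrays.asList is used because it accepts null elements.
    List<Integer> commissions = Arrays.asList(1000, 500, null, 250);
    for (Integer commission : commissions) {
      System.out.println("S=" + commissionPlusTen(commission));
    }
  }
}
```

Each statement is safe when read in isolation, which is the property the description asks for: correctness does not rely on a later pass folding the null check and the addition into one expression.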
> New RexNode-to-Expression CodeGen Implementation
> ------------------------------------------------
>
> Key: CALCITE-3224
> URL: https://issues.apache.org/jira/browse/CALCITE-3224
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.20.0
> Reporter: Feng Zhu
> Assignee: Feng Zhu
> Priority: Critical
> Labels: pull-request-available
> Attachments: codegen.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)