[ 
https://issues.apache.org/jira/browse/CALCITE-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Zhu updated CALCITE-3173:
------------------------------
    Description: 
*Abstract:* Both RexImpTable and BlockBuilder have codegen issues, but it is 
interesting that they can work together well for most cases.

    We can illustrate the problem with a simple test case in JdbcTest, in which 
the "commission" column is nullable.
{code:java}
@Test public void testNullSafeCheck() {
    CalciteAssert.hr()
      .query("select \"commission\" + 10 as s from \"hr\".\"emps\"")
      .returns("S=1010\n"
               + "S=510\n"
               + "S=null\n"
               + "S=260\n");
}
{code}
    This test case can pass as the BlockBuilder is in default optimization 
mode. However, when we set it into un-optimization mode in _EnumerableCalc_, 
this test will fail due to NPE. The following picture demonstrates their 
differences.

!code.png!

*1.RexImpTable generates unsafe code*
     Before translating the RexCall (_*Add*_), RexImpTable firstly translate 
its operands with (nullAs=*IS_NULL*) [1] to detect whether it is null (i.e., 
{color:#ff0000}_inp4_unboxed_{color}). Then RexImpTable sets this operand's 
nullable in RexToLixTranslator as false [2]. After that, the operand will be 
translated again with *NOT_POSSIBLE* [3] to get the value (i.e., 
{color:#ff0000}_inp4_0_unboxed_{color}). In the end, the RexCall is implemented 
by NotNullImplementor.However, it is not safe to conduct operations in the 
second translation phase. 
 *2.BlockBuiler optimization's semantic issue buries NPE*
     BlockBuilder.optimize() changes the code semantic in this case. For 
conditional-like clauses (if...else, ?:, etc), InlineVariableVisitor will 
wrongly make variables inlined.

    In general, they can work together for most cases. However, when some 
special branch is triggered by query, the problem will be exposed. For example, 
the NewExpression (_new java.math.BigDecimal_) in CALCITE-3143 breaks the 
inline optimization phase.

 

*How to fix?*
     I have digged into this problem a couple of days and tried many approaches 
to fix it. But in this progress, I found the limitation in current 
implementation. 
     The whole recursive framework essentially conducts a sequential codegen 
beheavior, and may visit a RexNode again and again with different NullAs 
setting. Due to the limitation, it is difficult to meet null-safe codegen 
semantics. We can also find that there are many branches for special cases in 
current implementation. Even we can handle issues every time, the logic will 
becomes more and more complex  and unfriendly for maintenance.   

 

Therefore, I propose to re-consider this part, including several initial points.
 (1) {color:#ff0000}_Visitor Pattern_{color} (RexVisitor<Result>). 
Theoretically, RexNode can be translated into Expression by visiting the node 
only once. We can implement RexVisitor rather than current recursive 
translation.
 (2)The {color:#ff0000}Result{color} consists of three items (code: 
BuilderStatement, isNull: ParameterExpression, value: Expression).So it is easy 
to decide how  to implement a RexNode according to its children.

 

Please correct me if I make something wrong. Look forward to suggestions!

 

[1][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1062]

[2][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1064]

[3][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1113]

 

 

  was:
*Abstract:* Both RexImpTable and BlockBuilder have codegen issues, but it is 
interesting that they can work together well for most cases.

    We can illustrate the problem with a simple test case in JdbcTest, in which 
the "commission" column is nullable.
{code:java}
@Test public void testNullSafeCheck() {
    CalciteAssert.hr()
      .query("select \"commission\" + 10 as s from \"hr\".\"emps\"")
      .returns("S=1010\n"
               + "S=510\n"
               + "S=null\n"
               + "S=260\n");
}
{code}
    This test case can pass as the BlockBuilder is in default optimization 
mode. However, when we set it into un-optimization mode in _EnumerableCalc_, 
this test will fail due to NPE. The following picture demonstrates their 
differences.

!code.png!

*1.RexImpTable generates unsafe code*
     Before translating the RexCall (_*Add*_), RexImpTable firstly translate 
its operands with (nullAs=*IS_NULL*) [1] to detect whether it is null (i.e., 
{color:#ff0000}_inp4_unboxed_{color}). Then RexImpTable sets this operand's 
nullable in RexToLixTranslator as false [2]. After that, the operand will be 
translated again with *NOT_POSSIBLE* [3] to get the value (i.e., 
{color:#ff0000}_inp4_0_unboxed_{color}). In the end, the RexCall is implemented 
by NotNullImplementor.However, it is not safe to conduct operations in the 
second translation phase. 
 *2.BlockBuiler optimization's semantic issue buries NPE*
     BlockBuilder.optimize() changes the code semantic in this case. For 
conditional-like clauses (if...else, ?:, etc), InlineVariableVisitor will 
wrongly make variables inlined.

    In general, they can work together for most cases. However, when some 
special branch is triggered by query, the problem will be exposed. For example, 
the NewExpression in CALCITE-3143 breaks the optimization phase.

 

*How to fix?*
     I have digged into this problem a couple of days and tried many approaches 
to fix it. But in this progress, I found the limitation in current 
implementation. 
     The whole recursive framework essentially conducts a sequential codegen 
beheavior, and may visit a RexNode again and again with different NullAs 
setting. Due to the limitation, it is difficult to meet null-safe codegen 
semantics. We can also find that there are many branches for special cases in 
current implementation. Even we can handle issues every time, the logic will 
becomes more and more complex  and unfriendly for maintenance.   

 

Therefore, I propose to re-consider this part, including several initial points.
 (1) {color:#ff0000}_Visitor Pattern_{color} (RexVisitor<Result>). 
Theoretically, RexNode can be translated into Expression by visiting the node 
only once. We can implement RexVisitor rather than current recursive 
translation.
 (2)The {color:#ff0000}Result{color} consists of three items (code: 
BuilderStatement, isNull: ParameterExpression, value: Expression).So it is easy 
to decide how  to implement a RexNode according to its children.

 

Please correct me if I make something wrong. Look forward to suggestions!

 

[1][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1062]

[2][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1064]

[3][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1113]

 

 


> RexNode Code Generation Problem
> -------------------------------
>
>                 Key: CALCITE-3173
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3173
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: next
>            Reporter: Feng Zhu
>            Priority: Major
>         Attachments: code.png
>
>
> *Abstract:* Both RexImpTable and BlockBuilder have codegen issues, but it is 
> interesting that they can work together well for most cases.
>     We can illustrate the problem with a simple test case in JdbcTest, in 
> which the "commission" column is nullable.
> {code:java}
> @Test public void testNullSafeCheck() {
>     CalciteAssert.hr()
>       .query("select \"commission\" + 10 as s from \"hr\".\"emps\"")
>       .returns("S=1010\n"
>                + "S=510\n"
>                + "S=null\n"
>                + "S=260\n");
> }
> {code}
>     This test case can pass as the BlockBuilder is in default optimization 
> mode. However, when we set it into un-optimization mode in _EnumerableCalc_, 
> this test will fail due to NPE. The following picture demonstrates their 
> differences.
> !code.png!
> *1.RexImpTable generates unsafe code*
>      Before translating the RexCall (_*Add*_), RexImpTable firstly translate 
> its operands with (nullAs=*IS_NULL*) [1] to detect whether it is null (i.e., 
> {color:#ff0000}_inp4_unboxed_{color}). Then RexImpTable sets this operand's 
> nullable in RexToLixTranslator as false [2]. After that, the operand will be 
> translated again with *NOT_POSSIBLE* [3] to get the value (i.e., 
> {color:#ff0000}_inp4_0_unboxed_{color}). In the end, the RexCall is 
> implemented by NotNullImplementor.However, it is not safe to conduct 
> operations in the second translation phase. 
>  *2.BlockBuiler optimization's semantic issue buries NPE*
>      BlockBuilder.optimize() changes the code semantic in this case. For 
> conditional-like clauses (if...else, ?:, etc), InlineVariableVisitor will 
> wrongly make variables inlined.
>     In general, they can work together for most cases. However, when some 
> special branch is triggered by query, the problem will be exposed. For 
> example, the NewExpression (_new java.math.BigDecimal_) in CALCITE-3143 
> breaks the inline optimization phase.
>  
> *How to fix?*
>      I have digged into this problem a couple of days and tried many 
> approaches to fix it. But in this progress, I found the limitation in current 
> implementation. 
>      The whole recursive framework essentially conducts a sequential codegen 
> beheavior, and may visit a RexNode again and again with different NullAs 
> setting. Due to the limitation, it is difficult to meet null-safe codegen 
> semantics. We can also find that there are many branches for special cases in 
> current implementation. Even we can handle issues every time, the logic will 
> becomes more and more complex  and unfriendly for maintenance.   
>  
> Therefore, I propose to re-consider this part, including several initial 
> points.
>  (1) {color:#ff0000}_Visitor Pattern_{color} (RexVisitor<Result>). 
> Theoretically, RexNode can be translated into Expression by visiting the node 
> only once. We can implement RexVisitor rather than current recursive 
> translation.
>  (2)The {color:#ff0000}Result{color} consists of three items (code: 
> BuilderStatement, isNull: ParameterExpression, value: Expression).So it is 
> easy to decide how  to implement a RexNode according to its children.
>  
> Please correct me if I make something wrong. Look forward to suggestions!
>  
> [1][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1062]
> [2][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1064]
> [3][https://github.com/apache/calcite/blob/1748f0503e7b626a8d0165f1698adb8b61bbc31e/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L1113]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to