[jira] [Comment Edited] (CALCITE-2454) HepPlanner should not pick cached RelNode when type mismatch while digest matches

Julian Hyde (JIRA) Sun, 13 Jan 2019 15:54:55 -0800


    [ 
https://issues.apache.org/jira/browse/CALCITE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741684#comment-16741684
 ]


Julian Hyde edited comment on CALCITE-2454 at 1/13/19 11:53 PM:
----------------------------------------------------------------

I am +1 on on this change after some cleanup (see below); however, I think 
typeOmitted and equalSansNullability could use rework.

Review comments:
 * VolcanoPlanner.key() is obsolete
 * In Values.java, there are a couple of calls to pw.nest() that will always 
return false
 * Make RexCall.SIMPLE_OPS private and immutable (Guava's ImmutableEnumSet 
would do the trick)
 * In RexCall.equalSansNullability you need some javadoc, and also explain why 
it is better/different than RelDataTypeFactory.equalSansNullability; also 'c' 
is unused, which means you have a bug
 * Can you explain why RexLiteral.typeOmitted is necessary? It doesn't feel 
like a field, even though it is derived from other fields. I see it is used in 
some complex logic in RexCall.appendOperands which also calls the broken 
equalSansNullability method.
 * In RexLiteral.shouldIncludeType needs javadoc. Also, please improve the 
indent of the complex 'if' condition to improve readability.
 * It doesn't seem right to add a RexDigestIncludeType argument to printAsJava.
 * In SqlCollation.getCoercibilityDyadic, I agree with your changes. I'm not 
sure why the code used to do 'new'. Maybe someone was worried about mutability.
 * You are technically correct in moving slow tests such as LatticeTest and 
JdbcTest to the end of the list in CalciteSuite. However, in a multi-threaded 
environment that may make the whole suite take longer.

Can you clarify: when you omit charset and collation, are you using the SESSION 
defaults or the GLOBAL default? It's important that you use the GLOBAL default, 
because may need to compare digests that are created in sessions that have 
different default charsets.

I liked your description in your email; can you add that to javadoc somewhere:
{quote}The idea to approach the issue is to add data type to the RexLiteral
 representation.
 For instance, use 1:BIGINT instead of just 1. The downside is extra types
 might add extra verbosity.

So the idea is to hide types for "well known cases":
 1) Hide "NOT NULL" for not null literals
 2) Hide INTEGER, BOOLEAN, SYMBOL, TIME(0), TIMESTAMP(0), DATE(0) types
 3) Hide collation when it matches IMPLICIT/COERCIBLE
 4) Hide charset when it matches default
 5) Hide CHAR(xx) when literal length is equal to the precision of the type.
 In other words, use 'Bob' instead of 'Bob':CHAR(3)
 6) Hide BOOL for AND/OR arguments. In other words, AND(true, null) means
 null is BOOL.
 7) Hide types for literals in simple binary operations (e.g. +, -, *, /,
 comparison) when type of the other argument is clear.
 For instance: =(true. null) means null is BOOL. =($0, null) means the type
 of null matches the type of $0.
{quote}


was (Author: julianhyde):
Review comments:
* VolcanoPlanner.key() is obsolete
* In Values.java, there are a couple of calls to pw.nest() that will always 
return false
* Make RexCall.SIMPLE_OPS private and immutable (Guava's ImmutableEnumSet would 
do the trick)
* In RexCall.equalSansNullability you need some javadoc, and also explain why 
it is better/different than RelDataTypeFactory.equalSansNullability; also 'c' 
is unused, which means you have a bug
* Can you explain why RexLiteral.typeOmitted is necessary? It doesn't feel like 
a field, because it's not part of the identity of a literal, more like the 
context in which that literal is used. Since RexNodes (including RexLiterals) 
can be used multiple times, their identity must not contain their usage context.
* In RexLiteral, shouldIncludeType needs javadoc. Please improve the indent of 
the complex 'if' condition to improve readability.
* It doesn't seem right to add a RexDigestIncludeType argument to printAsJava.
* In SqlCollation.getCoercibilityDyadic, I agree with your changes. I'm not 
sure why the code used to do 'new'. Maybe someone was worried about mutability.
* You are technically correct in moving slow tests such as LatticeTest and 
JdbcTest to the end of the list in CalciteSuite. However, in a multi-threaded 
environment that may make the whole suite take longer.

Can you clarify: when you omit charset and collation, are you using the SESSION 
defaults or the GLOBAL default? It's important that you use the GLOBAL default, 
because may need to compare digests that are created in sessions that have 
dfiferent default charsets.

> HepPlanner should not pick cached RelNode when type mismatch while digest 
> matches
> ---------------------------------------------------------------------------------
>
>                 Key: CALCITE-2454
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2454
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.17.0
>            Reporter: Yuzhao Chen
>            Assignee: Julian Hyde
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: expr_types.png, project_with_literals_digest.png
>
>
> After CALCITE-2116, a RelNode will be reused based on digest, but now many 
> nodes's digests do not have type info, e.g. LogicalProject, this will affect 
> some rules when we wanna transform to a node with same digest and same type 
> (HepPlanner may pick out a RelNode with different type).
> So, we should add type info just like VolcanoPlanner.
> We encounter this problem with plan like below:
> {code:java}
> Union
> +-- project1(a: int, b:int, c: double)
> | +-- project2(a: bigint: b: int, c:double)
> +-- project3
> {code}
> We wanna cast a from bigint -> int so we can have a union with project1 and 
> project3, but with ProjectMergeRule, while merging project1 and project2, 
> HepPlanner will think project2 and project1 are identical and it gives back 
> project2. But actually we wanna pick project1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Comment Edited] (CALCITE-2454) HepPlanner should not pick cached RelNode when type mismatch while digest matches

Reply via email to