[ 
https://issues.apache.org/jira/browse/IMPALA-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690692#comment-16690692
 ] 

Paul Rogers edited comment on IMPALA-7855 at 11/17/18 8:45 PM:
---------------------------------------------------------------

A related issue occurs in {{ExprRewriterTest.TestToSql()}} in the CTAS test. 
(This test will be made into a separate method, {{TestCTASToSql()}}). When run 
with the "integrated rewrite" feature enabled, we get into this odd situation:

 * Analyze the {{CreateTableAsSelect}} statement. Create a temporary copy of 
the associated {{SELECT}} statement.
 * Rewrite the {{SELECT}} statement from {{SELECT 1 + 1}} (both {{TINYINT}}) to 
{{SELECT 2}} (as type {{SMALLINT}}.)
 * Perform column substitutions, reset and reanalyze. Because the value is 2, 
it takes the type TINYINT.
 * Create the base table expressions using the newly rewritten value 
({{TINYINT}}) though the result expression is still {{SMALLINT}}.
 * Use the base expressions from the above (type as {{TINY}}) to declare the 
target table column.
 * Now, try to map the result expression {{SMALLINT}} into the newly created 
table column {{TINYINT}}. Fails with a overflow error.

The problem here is in one, or both, of two places:

 * The constant folding rule changes the type of the rewritten expression to 
{{SMALLINT}} (a conservative guess), though the 
{{NumericLiteral.analyzeImpl()}} method will choose {{TINYINT}} for the same 
value.
 * The {{Expr.substituteImpl()}} method forces a re-analyze of numeric 
literals, losing the wider type created by the constant folding rule.

There is another inconsistency which has not (yet) manifested as a test failure 
(probably because we have no suitable tests):

 * After the above, the base table expression for a {{SELECT}} statement has 
one schema ({{TINYINT}}), the result expression has another ({{SMALLINT}}).

While the inconsistency in types may seem a minor issue, it does lead to 
analysis failures and does need to be addressed.

Perhaps two fixes are needed:

 * When rewriting a numeric literal in the constant folding rule, apply the 
rules from {{NumericLiteral}} to override the type guessed by the constant 
evaluation.
 * Modify the {{substituteImpl}} method to a) don't reset numeric literals, or, 
more generally, b) don't reset expressions that did not change (or their 
children did not change.)


was (Author: paul.rogers):
A related issue occurs in {{ExprRewriterTest.TestToSql()}} in the CTAS test. 
(This test will be made into a separate method, {{TestCTASToSql()}}). When run 
with the "integrated rewrite" feature enabled, we get into this odd situation:

* Analyze the CreateTableAsSelect statement. Create a temporary copy of the 
associated {{SELECT}} statement.
* Rewrite the {{SELECT}} statement from {{SELECT 1 + 1}} (both {{TINYINT}}) to 
{{SELECT 2}} (as type {{SMALLINT}}.)
* Perform column substitutions, reset and reanalyze. Because the value is 2, it 
takes the type TINYINT.
* Create the base table expressions using the newly rewritten value 
({{TINYINT}}) though the result expression is still {{SMALLINT}}.
* Use the base expressions from the above (type as {{TINY}}) to declare the 
target table column.
* Now, try to map the result expression {{SMALLINT}} into the newly created 
table column {{TINYINT}}. Fails with a overflow error.

The problem here is in one, or both, of two places:

* The constant folding rule changes the type of the rewritten expression to 
{{SMALLINT}} (a conservative guess), though the 
{{NumericLiteral.analyzeImpl()}} method will choose {{TINYINT}} for the same 
value.
* The {{Expr.trySubstitute()} method forces a re-analyze of numeric literals, 
losing the wider type created by the constant folding rule.

There is another inconsistency which has not (yet) manifested as a test failure 
(probably because we have no suitable tests):

* After the above, the base table expression for a {{SELECT}} statement has one 
schema ({{TINYINT}}), the result expression has another ({{SMALLINT}}).

While the inconsistency in types may seem a minor issue, it does lead to 
analysis failures and does need to be addressed.

Perhaps two fixes are needed:

* When rewriting a numeric literal in the constant folding rule, apply the 
rules from {{NumericLiteral}} to override the type guessed by the constant 
evaluation.
* Modify the {{trySubstitute}} method to a) don't reset numeric literals, or, 
more generally, b) don't reset expressions that did not change (or their 
children did not change.)

> Excessive type widening leads to unnecessary casts
> --------------------------------------------------
>
>                 Key: IMPALA-7855
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7855
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> When writing unit tests, created the following query:
> {code:sql}
> with
>   query1 (a, b) as (
>     select 1 + 1 + id, 2 + 3 + int_col from functional.alltypestiny)
> insert into functional.alltypestiny (id, int_col)
>   partition (month = 5, year = 2018)
>   select * from query1
> {code}
> The above fails with the following error:
> {noformat}
> ERROR: AnalysisException: Possible loss of precision for target table 
> 'functional.alltypestiny'.
> Expression 'query1.a' (type: BIGINT) would need to be cast to INT for column 
> 'id'
> {noformat}
> The following does work (for planning, may not actually execute):
> {code:sql}
> with
>   query1 (a, b) as (
>     select
>       cast(1 + 1 + id as int),
>       cast(2 + 3 + int_col as int)
>     from functional.alltypestiny)
> insert into functional.alltypestiny (id, int_col)
>   partition (month = 5, year = 2018)
>   select * from query1
> {code}
> What this says is the the planner selected type {{BIGINT}} for the 
> (rewritten) expression {{2 + id}} where {{id}} is of type {{INT}}. {{BIGINT}} 
> is a conservative guess: adding 2 to the largest {{INT}} could overflow and 
> require a {{BIGINT}}.
> Yet, for such a simple case, such aggressive type promotion may be overly 
> cautious.
> To verify that this is an issue, let's try something similar with Postgres to 
> see if it is as aggressive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to