[
https://issues.apache.org/jira/browse/IMPALA-7865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Rogers reassigned IMPALA-7865:
-----------------------------------
Assignee: (was: Paul Rogers)
> Repeated type widening of arithmetic expressions
> ------------------------------------------------
>
> Key: IMPALA-7865
> URL: https://issues.apache.org/jira/browse/IMPALA-7865
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 3.0
> Reporter: Paul Rogers
> Priority: Minor
>
> An issue related to IMPALA-7855 occurs in {{ExprRewriterTest.TestToSql()}} in
> the CTAS test. (This test will be made into a separate method,
> {{TestCTASToSql()}}.) When run with the "integrated rewrite" feature enabled,
> we get into this odd situation:
> * Analyze the {{CreateTableAsSelect}} statement. Create a temporary copy of
> the associated {{SELECT}} statement.
> * Rewrite the {{SELECT}} statement from {{SELECT 1 + 1}} (both {{TINYINT}},
> with {{SMALLINT}} for the {{+}} operation) to {{SELECT 2}} (as type
> {{TINYINT}}).
> * After constant folding, the rule checks the original type of the
> expression ({{SMALLINT}}) and casts the result ({{TINYINT}}) to the original
> type ({{SMALLINT}}) using an implicit cast.
> * Perform column substitutions, reset and reanalyze. This process discards
> implicit casts. Because the value is 2, it takes the type {{TINYINT}}.
> * Create the base table expressions using the newly rewritten value
> ({{TINYINT}}), though the result expression is still {{SMALLINT}}.
> * Use the base expressions from the above (typed as {{TINYINT}}) to declare
> the target table column.
> * Now, try to map the result expression ({{SMALLINT}}) into the newly created
> table column ({{TINYINT}}). This fails with an overflow error.
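
The end state of the steps above can be sketched as a small, self-contained Java model. This is not Impala code: the class CtasTypeMismatch, the enum IntType, and the helpers naturalType() and fitsWithoutOverflow() are hypothetical names used only for illustration. It assumes, as the description states, that a re-analyzed literal takes the smallest integer type that holds its value.

```java
// Simplified model only; none of these names exist in Impala.
class CtasTypeMismatch {
    // Integer types in order of increasing width.
    enum IntType { TINYINT, SMALLINT, INT, BIGINT }

    // Smallest integer type that can hold the value (assumed literal typing rule).
    static IntType naturalType(long v) {
        if (v >= Byte.MIN_VALUE && v <= Byte.MAX_VALUE) return IntType.TINYINT;
        if (v >= Short.MIN_VALUE && v <= Short.MAX_VALUE) return IntType.SMALLINT;
        if (v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE) return IntType.INT;
        return IntType.BIGINT;
    }

    // A value of type `from` is safe to store only if `to` is at least as wide.
    static boolean fitsWithoutOverflow(IntType from, IntType to) {
        return from.ordinal() <= to.ordinal();
    }

    public static void main(String[] args) {
        // Result expression keeps the type of the original `1 + 1`.
        IntType resultExprType = IntType.SMALLINT;
        // After reset and reanalysis, the folded literal 2 decides the column type.
        IntType columnType = naturalType(2);
        System.out.println("column type: " + columnType
            + ", result type: " + resultExprType
            + ", fits: " + fitsWithoutOverflow(resultExprType, columnType));
    }
}
```

In this model the column is declared from one copy of the expression and the inserted value is typed from another, so the check fails even though the value 2 itself would fit.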
> While IMPALA-7855 describes how types are widened unnecessarily due to a
> single expression, the problem here occurs over time, due to repeated
> analysis of the same numeric expression:
> * The analyzer implements a set of type propagation rules that generates a
> resulting type for arithmetic expressions that is wider than the types of the
> arguments. For example, for {{tinyint_col + 1}}, {{tinyint_col}} and {{1}} are
> {{TINYINT}}, but the result of the expression is promoted to {{SMALLINT}}.
> * The planner then sets the type of the constant (1 here) to {{SMALLINT}}.
> * Repeat the process on the next cycle. {{tinyint_col}} is {{TINYINT}},
> {{1}} is {{SMALLINT}}. Now the result of the expression is {{INT}} and {{1}}
> is retyped to be {{INT}}.
> * Repeat again and the expression (and constant) are promoted to {{BIGINT}}.
> Meanwhile, analysis has taken a clone of the expression with the old types.
> As a result, the types of columns in the result list for a SELECT statement
> can differ from the same columns recorded in the SELECT list.
> * After the above, the base table expression for a {{SELECT}} statement has
> one schema ({{TINYINT}}), the result expression has another ({{SMALLINT}}).
> While the inconsistency in types may seem a minor issue, it does lead to
> analysis failures and does need to be addressed.
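
The repeated widening described above can be reproduced with a toy model. The sketch below assumes a simplified promotion rule (the result of an addition is one step wider than its wider operand) and assumes the literal is retyped to the result type after each pass, as the description says. RepeatedWidening, IntType, addResultType() and analyzeRepeatedly() are hypothetical names, not Impala classes or methods.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model only; none of these names exist in Impala.
class RepeatedWidening {
    // Integer types in order of increasing width.
    enum IntType {
        TINYINT, SMALLINT, INT, BIGINT;

        // Next wider type; BIGINT is the widest and stays as is.
        IntType nextWider() {
            return this == BIGINT ? BIGINT : values()[ordinal() + 1];
        }

        static IntType wider(IntType a, IntType b) {
            return a.ordinal() >= b.ordinal() ? a : b;
        }
    }

    // Modeled rule: `lhs + rhs` is one step wider than the wider operand.
    static IntType addResultType(IntType lhs, IntType rhs) {
        return IntType.wider(lhs, rhs).nextWider();
    }

    // Analyzes `tinyint_col + 1` the given number of times and records
    // the result type of each pass.
    static List<IntType> analyzeRepeatedly(int passes) {
        IntType column = IntType.TINYINT;   // tinyint_col never changes type
        IntType literal = IntType.TINYINT;  // the literal 1 starts as TINYINT
        List<IntType> results = new ArrayList<>();
        for (int i = 0; i < passes; i++) {
            IntType result = addResultType(column, literal);
            results.add(result);
            literal = result;  // the literal is retyped to the result type
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(analyzeRepeatedly(3));  // prints [SMALLINT, INT, BIGINT]
    }
}
```

The column type stays fixed at TINYINT throughout; only the retyped literal drives the drift, which is why keeping the literal at its original type stops the widening in this model.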
> Perhaps two fixes are needed:
> * When rewriting a numeric literal in the constant folding rule, apply the
> rules from {{NumericLiteral}} to override the type guessed by the constant
> evaluation.
> * Modify the {{substituteImpl}} method so that it a) does not reset numeric
> literals, or, more generally, b) does not reset expressions that did not
> change (or whose children did not change).
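
Option b) of the second fix could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the actual {{substituteImpl}} code: PreservingSubstitution, Expr, and substitute() are hypothetical, expressions are matched by a plain label, and a null type stands for "must be reanalyzed".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified model only; none of these names exist in Impala.
class PreservingSubstitution {
    static final class Expr {
        final String label;         // e.g. "+", "a", "2"
        final String type;          // analyzed type, or null if reanalysis is needed
        final List<Expr> children;

        Expr(String label, String type, List<Expr> children) {
            this.label = label;
            this.type = type;
            this.children = children;
        }
    }

    // Replaces nodes whose label appears in `mapping`. A subtree in which
    // nothing was replaced is returned as the same instance, so its
    // analyzed type (including any cast already applied) is preserved.
    static Expr substitute(Expr e, Map<String, Expr> mapping) {
        Expr replacement = mapping.get(e.label);
        if (replacement != null) return replacement;

        boolean changed = false;
        List<Expr> newChildren = new ArrayList<>();
        for (Expr child : e.children) {
            Expr newChild = substitute(child, mapping);
            changed |= (newChild != child);
            newChildren.add(newChild);
        }
        if (!changed) return e;  // untouched: keep the node and its type
        // A child changed, so this node's type must be recomputed.
        return new Expr(e.label, null, newChildren);
    }
}
```

With an empty mapping, substitute() returns the original tree unchanged. When only one operand is replaced, the parent is reset but the untouched literal keeps its previously analyzed type.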
> Longer term, the implicit cast mechanism is overly fragile: we add it then
> discard it, resulting in subtle type inconsistencies.