[
https://issues.apache.org/jira/browse/IMPALA-13302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17877896#comment-17877896
]
Michael Smith commented on IMPALA-13302:
----------------------------------------
I think this is an interaction of two bugs:
# registerConjuncts can now call markConjunctAssigned on Exprs that weren't
first registered via registerConjunct with this Analyzer. In most cases this
isn't a problem because markConjunctAssigned essentially ignores a null ID.
This is primarily a problem if the Expr got an ID assigned in the first
analysis and reAnalyze then encounters a constant false expression: it skips
registering the Expr, but still marks it as assigned under its old ID. That
can lead to two Exprs having the same ID.
# It's pretty easy to construct an expression that is only partially rewritten
during the first pass, because several RewriteExprRules don't analyze new
conjuncts they create (leading to other rules skipping rewrites). This seems to
be the primary way we trigger bug (1), where a rewrite rule (examples:
ExtractCommonConjunctRule, NormalizeExprsRule) produces a constant false
conjunct but SimplifyConditionalsRule doesn't rewrite it because it wasn't
analyzed.
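To make bug (1) concrete, here is a minimal toy model of the ID collision. It is a sketch under assumptions, not Impala's code: ToyExpr, ToyAnalyzer and the sawConstantFalse flag are hypothetical stand-ins, and only the method names registerConjunct, registerConjuncts and markConjunctAssigned are borrowed from the description above.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: ToyExpr and ToyAnalyzer are hypothetical stand-ins,
// not Impala's Expr and Analyzer classes.
final class ExprIdCollisionSketch {
  static final class ToyExpr {
    final String sql;
    Integer id;  // null until some analyzer registers the conjunct
    ToyExpr(String sql) { this.sql = sql; }
  }

  static final class ToyAnalyzer {
    private int nextId = 0;  // every new analyzer restarts counting from 0
    final Set<Integer> assignedIds = new HashSet<>();

    void registerConjunct(ToyExpr e) { e.id = nextId++; }

    void markConjunctAssigned(ToyExpr e) {
      if (e.id == null) return;  // a null ID is ignored
      assignedIds.add(e.id);
    }

    boolean isAssigned(ToyExpr e) {
      return e.id != null && assignedIds.contains(e.id);
    }

    // Assumed simplification: once a constant false conjunct has been seen,
    // conjuncts are marked assigned without being registered first.
    void registerConjuncts(List<ToyExpr> conjuncts, boolean sawConstantFalse) {
      for (ToyExpr e : conjuncts) {
        if (sawConstantFalse) {
          markConjunctAssigned(e);
        } else {
          registerConjunct(e);
        }
      }
    }
  }

  static boolean unrelatedExprLooksAssigned() {
    // First analysis: the constant false is not visible yet, so the
    // conjunct is registered normally and receives ID 0.
    ToyExpr stale = new ToyExpr("t1.id = 1");
    new ToyAnalyzer().registerConjuncts(Collections.singletonList(stale), false);

    // reAnalyze: a fresh analyzer sees the constant false, skips
    // registration, and marks the conjunct assigned under its old ID 0.
    ToyAnalyzer second = new ToyAnalyzer();
    second.registerConjuncts(Collections.singletonList(stale), true);

    // An unrelated conjunct registered afterwards also receives ID 0.
    ToyExpr joinPredicate = new ToyExpr("t1.id = t2.id");
    second.registerConjuncts(Collections.singletonList(joinPredicate), false);

    // The join predicate now looks assigned although nothing assigned it.
    return second.isAssigned(joinPredicate);
  }

  public static void main(String[] args) {
    System.out.println(unrelatedExprLooksAssigned());
  }
}
```

In this model the collision only needs two ingredients: an Expr that carries an ID from an earlier analyzer, and a code path that marks it assigned without re-registering it.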
I'm going to file a separate bug for (2) as it has other implications
(incomplete rewrites).
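As a rough illustration of (2), the toy pipeline below shows how one rule's unanalyzed output can make a later rule skip its rewrite. Everything here is an assumption made for illustration: ToyPredicate, the string-based rules and the analyzeRuleOutput flag are hypothetical and only loosely mimic what ExtractCommonConjunctRule and SimplifyConditionalsRule do.

```java
// Toy model only: predicates are plain strings with an "analyzed" flag.
// None of this is Impala's actual rewrite code.
final class PartialRewriteSketch {
  static final class ToyPredicate {
    final String sql;
    final boolean analyzed;
    ToyPredicate(String sql, boolean analyzed) {
      this.sql = sql;
      this.analyzed = analyzed;
    }
  }

  // Stand-in for a rule like ExtractCommonConjunctRule: turns "(X) OR (X)"
  // into "X" and returns the new predicate without analyzing it.
  static ToyPredicate extractCommonConjunct(ToyPredicate p) {
    String[] sides = p.sql.split(" OR ");
    if (sides.length != 2 || !sides[0].equals(sides[1])) return p;
    String inner = sides[0].substring(1, sides[0].length() - 1);  // drop parentheses
    return new ToyPredicate(inner, false);
  }

  // Stand-in for a rule like SimplifyConditionalsRule: folds "X AND false"
  // to "false", but only for analyzed input.
  static ToyPredicate simplifyConditionals(ToyPredicate p) {
    if (!p.analyzed) return p;  // unanalyzed input is skipped
    if (p.sql.endsWith(" AND false")) return new ToyPredicate("false", true);
    return p;
  }

  static ToyPredicate analyze(ToyPredicate p) {
    return new ToyPredicate(p.sql, true);
  }

  // One rewrite pass; analyzeRuleOutput toggles whether the first rule's
  // result is analyzed before the next rule runs.
  static ToyPredicate rewriteOnce(ToyPredicate p, boolean analyzeRuleOutput) {
    ToyPredicate out = extractCommonConjunct(p);
    if (analyzeRuleOutput) out = analyze(out);
    return simplifyConditionals(out);
  }

  public static void main(String[] args) {
    ToyPredicate input = new ToyPredicate(
        "(t1.id = 1 AND false) OR (t1.id = 1 AND false)", true);
    System.out.println(rewriteOnce(input, false).sql);  // constant false survives the pass
    System.out.println(rewriteOnce(input, true).sql);   // folded in the same pass
  }
}
```

Without the analyze step the constant false conjunct survives the first pass and would only be folded by a later pass, which matches the partial rewrite described above.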
> Some ExprRewriteRule results are not analyzed, leading to unmaterialized
> slots from reAnalyze
> ---------------------------------------------------------------------------------------------
>
> Key: IMPALA-13302
> URL: https://issues.apache.org/jira/browse/IMPALA-13302
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 4.3.0
> Reporter: Michael Smith
> Assignee: Michael Smith
> Priority: Critical
>
> IMPALA-12164 skipped registering conjuncts that the analyzer expects to
> remove because an earlier conjunct evaluates to constant False. However some
> ExprRewriteRules don't analyze the predicates they produce, which can lead to
> those conjuncts not actually being removed until a reAnalyze phase.
> reAnalyze uses a new Analyzer (with new GlobalState), so it restarts counting
> Expr IDs from 0. That can lead to re-using an old Expr ID and marking it as
> assigned. When a new Expr then gets the same ID, slot materialization is
> skipped for it, which can cause problems later (for example if that Expr is
> part of a hash join).
> Some example queries:
> {code}
> WITH v AS (SELECT 1 FROM functional.alltypestiny t1
>   JOIN functional.alltypestiny t2 ON t1.id = t2.id)
> SELECT 1
> FROM functional.alltypestiny t1
> WHERE ((t1.id = 1 and false) or (t1.id = 1 and false))
>   AND t1.id = 1 AND t1.id = 1
> UNION ALL
> SELECT 1
> FROM functional.alltypestiny t1
> WHERE ((t1.id = 1 and false) or (t1.id = 1 and false))
>   AND t1.id = 1 AND t1.id = 1
> UNION ALL SELECT 1 FROM v
> UNION ALL SELECT 1 FROM v;
> {code}
> (already fixed via IMPALA-13203) and
> {code}
> WITH v as (SELECT 1 FROM functional.alltypes t1
>   JOIN functional.alltypes t2 ON t1.id = t2.id)
> SELECT 1 FROM functional.alltypes t1
> WHERE t1.id = 1 AND t1.id = 1 AND t1.id = 1 AND false
> UNION ALL
> SELECT 1 FROM functional.alltypes t1
> WHERE t1.id = 1 AND false
> UNION ALL SELECT 1 FROM v
> UNION ALL SELECT 1 FROM v;
> {code}