[
https://issues.apache.org/jira/browse/IMPALA-13302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17879172#comment-17879172
]
Quanlong Huang commented on IMPALA-13302:
-----------------------------------------
I was confused about how Exprs sharing the same ids could lead to this bug,
since I had seen valid queries in which different conjuncts use the same id.
I was even more confused when I realized that GlobalState.assignedConjuncts is
a set backed by an IdentityHashMap:
{code:java}
public Set<ExprId> assignedConjuncts =
    Collections.newSetFromMap(new IdentityHashMap<ExprId, Boolean>());
{code}
[https://github.com/apache/impala/blob/77a87bb10/fe/src/main/java/org/apache/impala/analysis/Analyzer.java#L502-L503]
So even if Exprs have the same int id values, they should be treated as
distinct entries in GlobalState.assignedConjuncts as long as they have
different ExprId {*}objects{*}.
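A minimal sketch of that difference, using a hypothetical FakeId class as a stand-in for ExprId (the assumption here is that ExprId compares by its int value in equals()/hashCode()):

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

public class IdentitySetDemo {
  // Hypothetical stand-in for ExprId: equality is based on the int value only.
  static final class FakeId {
    final int id;
    FakeId(int id) { this.id = id; }
    @Override public boolean equals(Object o) {
      return o instanceof FakeId && ((FakeId) o).id == id;
    }
    @Override public int hashCode() { return Integer.hashCode(id); }
  }

  // Same construction as GlobalState.assignedConjuncts: membership uses ==.
  static Set<FakeId> newIdentitySet() {
    return Collections.newSetFromMap(new IdentityHashMap<FakeId, Boolean>());
  }

  public static void main(String[] args) {
    FakeId a = new FakeId(0);
    FakeId b = new FakeId(0);  // same int value, different object

    Set<FakeId> identitySet = newIdentitySet();
    identitySet.add(a);
    System.out.println(identitySet.contains(a));  // true: same object
    System.out.println(identitySet.contains(b));  // false: different object

    Set<FakeId> hashSet = new HashSet<>();
    hashSet.add(a);
    System.out.println(hashSet.contains(b));      // true: equals() matches
  }
}
```

With the identity-backed set, an id counts as assigned only if that exact object was added; with a HashSet, any object carrying the same int value counts.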
Later I realized that the set is replaced (maybe unintentionally) with a HashSet:
{code:java}
public void setAssignedConjuncts(Set<ExprId> assigned) {
  globalState_.assignedConjuncts = Sets.newHashSet(assigned);
}
{code}
[https://github.com/apache/impala/blob/77a87bb10/fe/src/main/java/org/apache/impala/analysis/Analyzer.java#L3373]
I think this causes Analyzer#getUnassignedConjuncts() to work incorrectly.
Fixing this also fixes the bug, e.g. with this commit:
[https://github.com/stiga-huang/impala/commit/1f3ac6e8d28224d84ce687610f4eea9c67842cec]
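To illustrate the suspected effect, here is a small sketch of copying an identity-backed set in two ways. FakeId and both copy helpers are hypothetical names for illustration. `new HashSet<>(assigned)` is used in place of Guava's Sets.newHashSet(assigned) to keep the example dependency-free, and the identity-preserving copy is only one possible approach, not necessarily what the linked commit does:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

public class CopyAssignedSetDemo {
  // Hypothetical stand-in for ExprId: equality is based on the int value only.
  static final class FakeId {
    final int id;
    FakeId(int id) { this.id = id; }
    @Override public boolean equals(Object o) {
      return o instanceof FakeId && ((FakeId) o).id == id;
    }
    @Override public int hashCode() { return Integer.hashCode(id); }
  }

  static Set<FakeId> newIdentitySet() {
    return Collections.newSetFromMap(new IdentityHashMap<FakeId, Boolean>());
  }

  // Copy that silently switches membership checks from == to equals().
  static Set<FakeId> copyAsHashSet(Set<FakeId> assigned) {
    return new HashSet<>(assigned);
  }

  // Copy that keeps identity semantics.
  static Set<FakeId> copyAsIdentitySet(Set<FakeId> assigned) {
    Set<FakeId> copy = newIdentitySet();
    copy.addAll(assigned);
    return copy;
  }

  public static void main(String[] args) {
    FakeId assignedId = new FakeId(0);
    FakeId unassignedId = new FakeId(0);  // same int value, never assigned

    Set<FakeId> assigned = newIdentitySet();
    assigned.add(assignedId);

    // After the HashSet copy, the never-assigned id looks assigned.
    System.out.println(copyAsHashSet(assigned).contains(unassignedId));      // true
    // After the identity copy, it is still reported as unassigned.
    System.out.println(copyAsIdentitySet(assigned).contains(unassignedId));  // false
  }
}
```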
> Some ExprRewriteRule results are not analyzed, leading to unmaterialized
> slots from reAnalyze
> ---------------------------------------------------------------------------------------------
>
> Key: IMPALA-13302
> URL: https://issues.apache.org/jira/browse/IMPALA-13302
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 4.3.0
> Reporter: Michael Smith
> Assignee: Michael Smith
> Priority: Critical
>
> IMPALA-12164 skipped registering conjuncts that the analyzer expects to
> remove because an earlier conjunct evaluates to constant False. However some
> ExprRewriteRules don't analyze the predicates they produce, which can lead to
> those conjuncts not actually being removed until a reAnalyze phase.
> reAnalyze uses a new Analyzer (with new GlobalState); it restarts counting
> Expr IDs from 0. That can lead to re-using the same Expr ID and marking it as
> assigned. Then when a new Expr gets the same ID, it will skip materializing
> slots, which can cause problems later (like if that Expr is part of a hash
> join).
> Some example queries:
> {code}
> WITH v AS (SELECT 1 FROM functional.alltypestiny t1
>   JOIN functional.alltypestiny t2 ON t1.id = t2.id)
> SELECT 1
> FROM functional.alltypestiny t1
> WHERE ((t1.id = 1 and false) or (t1.id = 1 and false))
>   AND t1.id = 1 AND t1.id = 1
> UNION ALL
> SELECT 1
> FROM functional.alltypestiny t1
> WHERE ((t1.id = 1 and false) or (t1.id = 1 and false))
>   AND t1.id = 1 AND t1.id = 1
> UNION ALL SELECT 1 FROM v
> UNION ALL SELECT 1 FROM v;
> {code}
> (already fixed via IMPALA-13203) and
> {code}
> WITH v as (SELECT 1 FROM functional.alltypes t1
>   JOIN functional.alltypes t2 ON t1.id = t2.id)
> SELECT 1 FROM functional.alltypes t1
> WHERE t1.id = 1 AND t1.id = 1 AND t1.id = 1 AND false
> UNION ALL
> SELECT 1 FROM functional.alltypes t1
> WHERE t1.id = 1 AND false
> UNION ALL SELECT 1 FROM v
> UNION ALL SELECT 1 FROM v;
> {code}