[ https://issues.apache.org/jira/browse/IMPALA-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870884#comment-17870884 ]
ASF subversion and git services commented on IMPALA-12800:
----------------------------------------------------------
Commit 0918147e69b1625e76886506b065c310c7cc52d1 in impala's branch
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0918147e6 ]
IMPALA-13270: Fix IllegalStateException on runtime filter
IMPALA-12800 improved ExprSubstitutionMap to use a HashMap for lookups.
Some methods in ExprSubstitutionMap guard against duplicate entries, but
creating a map and adding entries to it do not, so some code paths ended
up adding repeated expressions to the map. In practice, only the first
entry added would be matched. IMPALA-12800 started removing duplicate
entries from the map to reduce memory use, but missed that one caller,
RuntimeFilterGenerator, expected the map size to exactly match the size
of the input expression list.
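As a rough illustration of the lookup behaviour described above, the
following minimal sketch keeps insertion-ordered lists alongside a HashMap
index, so a repeated left-hand expression is skipped and the first mapping
wins. It is not Impala's code: the class SubstitutionMapSketch and its
methods are hypothetical, and plain strings stand in for Expr objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for ExprSubstitutionMap; not Impala's actual class. */
final class SubstitutionMapSketch {
  // Parallel lists preserve insertion order, as a list-backed map would.
  private final List<String> lhs = new ArrayList<>();
  private final List<String> rhs = new ArrayList<>();
  // Hash index from an lhs expression to its rhs, used for O(1) lookups.
  private final Map<String, String> index = new HashMap<>();

  /** Adds a mapping unless lhsExpr is already present; returns true if added. */
  boolean put(String lhsExpr, String rhsExpr) {
    if (index.putIfAbsent(lhsExpr, rhsExpr) != null) {
      return false; // duplicate lhs: the first mapping stays in effect
    }
    lhs.add(lhsExpr);
    rhs.add(rhsExpr);
    return true;
  }

  /** Returns the rhs mapped to lhsExpr, or null if there is none. */
  String get(String lhsExpr) {
    return index.get(lhsExpr);
  }

  /** Number of distinct lhs expressions, not the number of put() calls. */
  int size() {
    return lhs.size();
  }

  public static void main(String[] args) {
    SubstitutionMapSketch smap = new SubstitutionMapSketch();
    smap.put("a + 1", "x");
    smap.put("a + 1", "y"); // repeated expression: ignored
    System.out.println(smap.get("a + 1") + " " + smap.size()); // prints "x 1"
  }
}
```

Because duplicates are dropped on insertion, size() can be smaller than
the number of expressions a caller passed in, which is the situation the
runtime filter precondition ran into.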
Fixes the IllegalStateException caused by runtime filters where the same
expression is repeated multiple times by changing the precondition to
verify that each SlotRef has a mapping added. It doesn't verify the
final size, because SlotRefs may be repeated and the map will avoid
adding duplicates.
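The difference between the two preconditions can be sketched as follows.
This is only an illustration under stated assumptions: the class
FilterPreconditionSketch, its method names, and the "base." prefix are
hypothetical, strings stand in for SlotRefs, and the checks throw
IllegalStateException directly rather than going through whatever
precondition helper the real code uses.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the precondition change; not Impala's code. */
final class FilterPreconditionSketch {
  /** Builds a map from slot names to substitutes; repeated slots are added once. */
  static Map<String, String> buildMap(List<String> slotRefs) {
    Map<String, String> smap = new HashMap<>();
    for (String slot : slotRefs) {
      smap.putIfAbsent(slot, "base." + slot);
    }
    return smap;
  }

  /** Old-style check: fails whenever the input list repeats an expression. */
  static void checkSize(List<String> slotRefs, Map<String, String> smap) {
    if (smap.size() != slotRefs.size()) {
      throw new IllegalStateException(
          "expected " + slotRefs.size() + " entries, found " + smap.size());
    }
  }

  /** New-style check: every slot must have a mapping; the size is not compared. */
  static void checkEachMapped(List<String> slotRefs, Map<String, String> smap) {
    for (String slot : slotRefs) {
      if (!smap.containsKey(slot)) {
        throw new IllegalStateException("no mapping for " + slot);
      }
    }
  }

  public static void main(String[] args) {
    List<String> slotRefs = Arrays.asList("t.id", "t.id", "t.name");
    Map<String, String> smap = buildMap(slotRefs);
    checkEachMapped(slotRefs, smap); // passes: both distinct slots are mapped
    try {
      checkSize(slotRefs, smap); // 2 entries vs. 3 inputs
    } catch (IllegalStateException e) {
      System.out.println("size check failed: " + e.getMessage());
    }
  }
}
```

With the repeated "t.id" input, the per-slot check succeeds while the
size comparison throws, mirroring the failure mode the commit describes.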
Removes trim method - added in IMPALA-12800 - as it no longer provides
any benefit when performing lookups with a HashMap, and may actually do
more work during the trim. test_query_compilation.py continues to pass,
and I see no discernible difference in "Single node plan" time; both are
30-40ms on my machine.
Adds a test case that failed with the old precondition. IDE-assisted
search did not find any other cases where ExprSubstitutionMap#size is
compared against a non-zero value.
Change-Id: I23c7bcf33e5185f10a6ae475debb8ab70a2ec5eb
Reviewed-on: http://gerrit.cloudera.org:8080/21638
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
> Queries with many nested inline views see performance issues with
> ExprSubstitutionMap
> -------------------------------------------------------------------------------------
>
> Key: IMPALA-12800
> URL: https://issues.apache.org/jira/browse/IMPALA-12800
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 4.3.0
> Reporter: Joe McDonnell
> Assignee: Michael Smith
> Priority: Critical
> Fix For: Impala 4.5.0
>
> Attachments: impala12800repro.sql, impala12800schema.sql,
> long_query_jstacks.tar.gz
>
>
> A user running a query with many layers of inline views saw a large amount of
> time spent in analysis.
>
> {noformat}
> - Authorization finished (ranger): 7s518ms (13.134ms)
> - Value transfer graph computed: 7s760ms (241.953ms)
> - Single node plan created: 2m47s (2m39s)
> - Distributed plan created: 2m47s (7.430ms)
> - Lineage info computed: 2m47s (39.017ms)
> - Planning finished: 2m47s (672.518ms){noformat}
> In reproducing it locally, we found that most of the stacks end up in
> ExprSubstitutionMap.
>
> Here are the main stacks seen while running jstack every 3 seconds during a
> 75 second execution:
> Location 1: (ExprSubstitutionMap::compose -> contains -> indexOf -> Expr
> equals) (4 samples)
> {noformat}
> java.lang.Thread.State: RUNNABLE
> at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
> at java.util.ArrayList.indexOf(ArrayList.java:323)
> at java.util.ArrayList.contains(ArrayList.java:306)
> at org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:120){noformat}
> Location 2: (ExprSubstitutionMap::compose -> verify -> Expr equals) (9
> samples)
> {noformat}
> java.lang.Thread.State: RUNNABLE
> at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
> at org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173)
> at org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:126){noformat}
> Location 3: (ExprSubstitutionMap::combine -> verify -> Expr equals) (5
> samples)
> {noformat}
> java.lang.Thread.State: RUNNABLE
> at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
> at org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173)
> at org.apache.impala.analysis.ExprSubstitutionMap.combine(ExprSubstitutionMap.java:143){noformat}
> Location 4: (TupleIsNullPredicate.wrapExprs -> Analyzer.isTrueWithNullSlots
> -> FeSupport.EvalPredicate -> Thrift serialization) (4 samples)
> {noformat}
> java.lang.Thread.State: RUNNABLE
> at java.lang.StringCoding.encode(StringCoding.java:364)
> at java.lang.String.getBytes(String.java:941)
> at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:227)
> at org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:532)
> at org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:467)
> at org.apache.impala.thrift.TClientRequest.write(TClientRequest.java:394)
> at org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:3034)
> at org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:2709)
> at org.apache.impala.thrift.TQueryCtx.write(TQueryCtx.java:2400)
> at org.apache.thrift.TSerializer.serialize(TSerializer.java:84)
> at org.apache.impala.service.FeSupport.EvalExprWithoutRowBounded(FeSupport.java:206)
> at org.apache.impala.service.FeSupport.EvalExprWithoutRow(FeSupport.java:194)
> at org.apache.impala.service.FeSupport.EvalPredicate(FeSupport.java:275)
> at org.apache.impala.analysis.Analyzer.isTrueWithNullSlots(Analyzer.java:2888)
> at org.apache.impala.analysis.TupleIsNullPredicate.requiresNullWrapping(TupleIsNullPredicate.java:181)
> at org.apache.impala.analysis.TupleIsNullPredicate.wrapExpr(TupleIsNullPredicate.java:147)
> at org.apache.impala.analysis.TupleIsNullPredicate.wrapExprs(TupleIsNullPredicate.java:136){noformat}