[jira] [Commented] (IMPALA-12800) Queries with many nested inline views see performance issues with ExprSubstitutionMap
[ https://issues.apache.org/jira/browse/IMPALA-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848462#comment-17848462 ] ASF subversion and git services commented on IMPALA-12800: -- Commit ae6846b1cd039b2cd6f8753ce3ff810c5b2d3ce3 in impala's branch refs/heads/master from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=ae6846b1c ] IMPALA-12800: Skip O(n^2) ExprSubstitutionMap::verify() for release builds ExprSubstitutionMap::compose() and combine() call verify() to check the new ExprSubstitutionMap for duplicates. This algorithm is O(n^2) and can add significant overhead to SQLs with a large number of expressions or inline views. This changes verify() to skip the check for release builds (keeping it for debug builds). In a query with 20+ layers of inline views and thousands of expressions, turning off the verify() call cuts the execution time from 51 minutes to 18 minutes. This doesn't fully solve slowness in ExprSubstitutionMap. Further improvement would require Expr to support hash-based algorithms, which is a much larger change. Testing: - Manual performance comparison with/without the verify() call Change-Id: Ieeacfec6a5b487076ce5b19747319630616411f0 Reviewed-on: http://gerrit.cloudera.org:8080/21444 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins > Queries with many nested inline views see performance issues with > ExprSubstitutionMap > - > > Key: IMPALA-12800 > URL: https://issues.apache.org/jira/browse/IMPALA-12800 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 4.3.0 >Reporter: Joe McDonnell >Priority: Critical > Attachments: impala12800repro.sql, impala12800schema.sql, > long_query_jstacks.tar.gz > > > A user running a query with many layers of inline views saw a large amount of > time spent in analysis. > > {noformat} > - Authorization finished (ranger): 7s518ms (13.134ms) > - Value transfer graph computed: 7s760ms (241.953ms) > - Single node plan created: 2m47s (2m39s) > - Distributed plan created: 2m47s (7.430ms) > - Lineage info computed: 2m47s (39.017ms) > - Planning finished: 2m47s (672.518ms){noformat} > In reproducing it locally, we found that most of the stacks end up in > ExprSubstitutionMap. > > Here are the main stacks seen while running jstack every 3 seconds during a > 75 second execution: > Location 1: (ExprSubstitutionMap::compose -> contains -> indexOf -> Expr > equals) (4 samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at org.apache.impala.analysis.Expr.equals(Expr.java:1008) > at java.util.ArrayList.indexOf(ArrayList.java:323) > at java.util.ArrayList.contains(ArrayList.java:306) > at > org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:120){noformat} > Location 2: (ExprSubstitutionMap::compose -> verify -> Expr equals) (9 > samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at org.apache.impala.analysis.Expr.equals(Expr.java:1008) > at > org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173) > at > org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:126){noformat} > Location 3: (ExprSubstitutionMap::combine -> verify -> Expr equals) (5 > samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at org.apache.impala.analysis.Expr.equals(Expr.java:1008) > at > org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173) > at > org.apache.impala.analysis.ExprSubstitutionMap.combine(ExprSubstitutionMap.java:143){noformat} > Location 4: (TupleIsNullPredicate.wrapExprs -> Analyzer.isTrueWithNullSlots > -> FeSupport.EvalPredicate -> Thrift serialization) (4 samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at java.lang.StringCoding.encode(StringCoding.java:364) > at java.lang.String.getBytes(String.java:941) > at > org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:227) > at > org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:532) > at > org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:467) > at org.apache.impala.thrift.TClientRequest.write(TClientRequest.java:394) > at > org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:3034) > at > org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:2709) > at org.apache.impala.thrift.TQueryCtx.write(TQueryCtx.java:2400) > at org.apache.thrift.TSerializer.serialize(TSerializer.java:84) > at >
[jira] [Commented] (IMPALA-12800) Queries with many nested inline views see performance issues with ExprSubstitutionMap
[ https://issues.apache.org/jira/browse/IMPALA-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815894#comment-17815894 ] Joe McDonnell commented on IMPALA-12800: Attached a SQL that reproduces the issue. First run impala12800schema.sql, then run impala12800repro.sql. On my machine, impala12800repro.sql takes about 70-80 seconds. > Queries with many nested inline views see performance issues with > ExprSubstitutionMap > - > > Key: IMPALA-12800 > URL: https://issues.apache.org/jira/browse/IMPALA-12800 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 4.3.0 >Reporter: Joe McDonnell >Priority: Critical > Attachments: impala12800repro.sql, impala12800schema.sql, > long_query_jstacks.tar.gz > > > A user running a query with many layers of inline views saw a large amount of > time spent in analysis. > > {noformat} > - Authorization finished (ranger): 7s518ms (13.134ms) > - Value transfer graph computed: 7s760ms (241.953ms) > - Single node plan created: 2m47s (2m39s) > - Distributed plan created: 2m47s (7.430ms) > - Lineage info computed: 2m47s (39.017ms) > - Planning finished: 2m47s (672.518ms){noformat} > In reproducing it locally, we found that most of the stacks end up in > ExprSubstitutionMap. > > Here are the main stacks seen while running jstack every 3 seconds during a > 75 second execution: > Location 1: (ExprSubstitutionMap::compose -> contains -> indexOf -> Expr > equals) (4 samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at org.apache.impala.analysis.Expr.equals(Expr.java:1008) > at java.util.ArrayList.indexOf(ArrayList.java:323) > at java.util.ArrayList.contains(ArrayList.java:306) > at > org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:120){noformat} > Location 2: (ExprSubstitutionMap::compose -> verify -> Expr equals) (9 > samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at org.apache.impala.analysis.Expr.equals(Expr.java:1008) > at > org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173) > at > org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:126){noformat} > Location 3: (ExprSubstitutionMap::combine -> verify -> Expr equals) (5 > samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at org.apache.impala.analysis.Expr.equals(Expr.java:1008) > at > org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173) > at > org.apache.impala.analysis.ExprSubstitutionMap.combine(ExprSubstitutionMap.java:143){noformat} > Location 4: (TupleIsNullPredicate.wrapExprs -> Analyzer.isTrueWithNullSlots > -> FeSupport.EvalPredicate -> Thrift serialization) (4 samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at java.lang.StringCoding.encode(StringCoding.java:364) > at java.lang.String.getBytes(String.java:941) > at > org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:227) > at > org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:532) > at > org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:467) > at org.apache.impala.thrift.TClientRequest.write(TClientRequest.java:394) > at > org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:3034) > at > org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:2709) > at org.apache.impala.thrift.TQueryCtx.write(TQueryCtx.java:2400) > at org.apache.thrift.TSerializer.serialize(TSerializer.java:84) > at > org.apache.impala.service.FeSupport.EvalExprWithoutRowBounded(FeSupport.java:206) > at > org.apache.impala.service.FeSupport.EvalExprWithoutRow(FeSupport.java:194) > at org.apache.impala.service.FeSupport.EvalPredicate(FeSupport.java:275) > at > org.apache.impala.analysis.Analyzer.isTrueWithNullSlots(Analyzer.java:2888) > at > org.apache.impala.analysis.TupleIsNullPredicate.requiresNullWrapping(TupleIsNullPredicate.java:181) > at > org.apache.impala.analysis.TupleIsNullPredicate.wrapExpr(TupleIsNullPredicate.java:147) > at > org.apache.impala.analysis.TupleIsNullPredicate.wrapExprs(TupleIsNullPredicate.java:136){noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-12800) Queries with many nested inline views see performance issues with ExprSubstitutionMap
[ https://issues.apache.org/jira/browse/IMPALA-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815838#comment-17815838 ] Joe McDonnell commented on IMPALA-12800: ExprSubstitutionMap::compose's call to contain is O(N) in the size of the ExprSubstitutionMap. ExprSubstitutionMap::verify is checking for duplicated with an O(N^2) algorithm. See https://github.com/apache/impala/blob/9baf790606073d88c3a2fd431110812140df0cb7/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java#L170-L182 Expr doesn't support hashing right now, so hash-based algorithms aren't an option without more work. It is possible that we could run verify() only for debug builds. Removing just the verify() call reduced the analysis time by 1/2 for this case. I'm working on providing a SQL that reproduces this. > Queries with many nested inline views see performance issues with > ExprSubstitutionMap > - > > Key: IMPALA-12800 > URL: https://issues.apache.org/jira/browse/IMPALA-12800 > Project: IMPALA > Issue Type: Improvement > Components: Frontend >Affects Versions: Impala 4.3.0 >Reporter: Joe McDonnell >Priority: Critical > > A user running a query with many layers of inline views saw a large amount of > time spent in analysis. > > {noformat} > - Authorization finished (ranger): 7s518ms (13.134ms) > - Value transfer graph computed: 7s760ms (241.953ms) > - Single node plan created: 2m47s (2m39s) > - Distributed plan created: 2m47s (7.430ms) > - Lineage info computed: 2m47s (39.017ms) > - Planning finished: 2m47s (672.518ms){noformat} > In reproducing it locally, we found that most of the stacks end up in > ExprSubstitutionMap. > > Here are the main stacks seen while running jstack every 3 seconds during a > 75 second execution: > Location 1: (ExprSubstitutionMap::compose -> contains -> indexOf -> Expr > equals) (4 samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at org.apache.impala.analysis.Expr.equals(Expr.java:1008) > at java.util.ArrayList.indexOf(ArrayList.java:323) > at java.util.ArrayList.contains(ArrayList.java:306) > at > org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:120){noformat} > Location 2: (ExprSubstitutionMap::compose -> verify -> Expr equals) (9 > samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at org.apache.impala.analysis.Expr.equals(Expr.java:1008) > at > org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173) > at > org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:126){noformat} > Location 3: (ExprSubstitutionMap::combine -> verify -> Expr equals) (5 > samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at org.apache.impala.analysis.Expr.equals(Expr.java:1008) > at > org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173) > at > org.apache.impala.analysis.ExprSubstitutionMap.combine(ExprSubstitutionMap.java:143){noformat} > Location 4: (TupleIsNullPredicate.wrapExprs -> Analyzer.isTrueWithNullSlots > -> FeSupport.EvalPredicate -> Thrift serialization) (4 samples) > {noformat} > java.lang.Thread.State: RUNNABLE > at java.lang.StringCoding.encode(StringCoding.java:364) > at java.lang.String.getBytes(String.java:941) > at > org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:227) > at > org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:532) > at > org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:467) > at org.apache.impala.thrift.TClientRequest.write(TClientRequest.java:394) > at > org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:3034) > at > org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:2709) > at org.apache.impala.thrift.TQueryCtx.write(TQueryCtx.java:2400) > at org.apache.thrift.TSerializer.serialize(TSerializer.java:84) > at > org.apache.impala.service.FeSupport.EvalExprWithoutRowBounded(FeSupport.java:206) > at > org.apache.impala.service.FeSupport.EvalExprWithoutRow(FeSupport.java:194) > at org.apache.impala.service.FeSupport.EvalPredicate(FeSupport.java:275) > at > org.apache.impala.analysis.Analyzer.isTrueWithNullSlots(Analyzer.java:2888) > at > org.apache.impala.analysis.TupleIsNullPredicate.requiresNullWrapping(TupleIsNullPredicate.java:181) > at > org.apache.impala.analysis.TupleIsNullPredicate.wrapExpr(TupleIsNullPredicate.java:147) > at > org.apache.impala.analysis.TupleIsNullPredicate.wrapExprs(TupleIsNullPredicate.java:136){noformat} -- This message was sent by