[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..

IMPALA-9358: Query slowdown with inline views and hundreds of columns

IMPALA-8386 introduced an expensive precondition check using the function
ExprSubstitutionMap.checkComposedFrom(). This check has significant
performance impact on statements that contain inline views with hundreds
of columns. Most of the cost is in the get() calls used to find
expressions in the local substitution map.

The fix is to add a getWithHint() call that uses the current loop index as a
starting point to search for expressions. This leverages the fact that
expressions have identical positions in both substitution maps in most
common cases.

A more generic approach would be to accelerate expression equality search
using hash functions but that would be a much riskier fix and Impala
currently lacks the infrasturucture to so.

Testing:
Performance testing with a query with 1000 expressions of the
following form:
  with a as (select c1 c1, c1 c2, c1 c3, ... from t)
  select c1, c2, c3, ... from a;

repro query went from 12 sec to 1 sec.
There was no noticeable time spent in the precondition now.

Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Reviewed-on: http://gerrit.cloudera.org:8080/15157
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
1 file changed, 16 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 5
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 4
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 08 Feb 2020 02:39:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5305/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 4
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 07 Feb 2020 21:47:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 4
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 07 Feb 2020 21:47:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-07 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 3: Code-Review+2

Carrying forward the +1 votes.


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 3
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 07 Feb 2020 21:47:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5158/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 3
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 07 Feb 2020 19:02:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-07 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 3: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 3
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 07 Feb 2020 18:41:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-07 Thread Kurt Deschler (Code Review)
Hello Quanlong Huang, Anurag Mantripragada, Vihang Karajgaonkar, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15157

to look at the new patch set (#3).

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..

IMPALA-9358: Query slowdown with inline views and hundreds of columns

IMPALA-8386 introduced an expensive precondition check using the function
ExprSubstitutionMap.checkComposedFrom(). This check has significant
performance impact on statements that contain inline views with hundreds
of columns. Most of the cost is in the get() calls used to find
expressions in the local substitution map.

The fix is to add a getWithHint() call that uses the current loop index as a
starting point to search for expressions. This leverages the fact that
expressions have identical positions in both substitution maps in most
common cases.

A more generic approach would be to accelerate expression equality search
using hash functions but that would be a much riskier fix and Impala
currently lacks the infrasturucture to so.

Testing:
Performance testing with a query with 1000 expressions of the
following form:
  with a as (select c1 c1, c1 c2, c1 c3, ... from t)
  select c1, c2, c3, ... from a;

repro query went from 12 sec to 1 sec.
There was no noticeable time spent in the precondition now.

Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
---
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
1 file changed, 16 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/15157/3
--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 3
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-05 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 2: Code-Review+1

patch looks good to me. Looks like Quanlong has some comments too. I will give 
a +1 and let him promote it to a +2 once he is okay with the patch.


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 2
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 06 Feb 2020 07:08:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-05 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 2: Code-Review+1

(3 comments)

Nice find! The fix looks good to me.

http://gerrit.cloudera.org:8080/#/c/15157/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15157/1//COMMIT_MSG@25
PS1, Line 25: Performance testing with a query with 1000 expressions of the
> The repro query went from 12 sec to 1 sec. There was no noticeable time spe
Could you please add the performance gain to the commit message?


http://gerrit.cloudera.org:8080/#/c/15157/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15157/2//COMMIT_MSG@15
PS2, Line 15: get_hint
It's "getWithHint" now.


http://gerrit.cloudera.org:8080/#/c/15157/2/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
File fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java:

http://gerrit.cloudera.org:8080/#/c/15157/2/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@80
PS2, Line 80: (
nit: need a space after "if"



--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 2
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 06 Feb 2020 06:50:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-05 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 2: Code-Review+1

LGTM! I ran the FE tests successfully.


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 2
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 06 Feb 2020 04:25:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5132/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 2
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 06 Feb 2020 01:27:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-05 Thread Kurt Deschler (Code Review)
Kurt Deschler has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
File fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java:

http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@76
PS1, Line 76: getWithH
> nit, we generally stick to using camelcase for method names. May be rename
Done


http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@77
PS1, Line 77: for (int i = hint; i < lhs_.size(); ++i) {
:   if (lhsExpr.equals(lhs_.get(i))) return rhs_.get(i);
: }
> The expression is typically found on the first iteration of the first loop.
Done


http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@81
PS1, Line 81: for (int i = 0; i < hint; ++i) {
> this can be in the same line as 80 within {} to match the coding style in t
Done



--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 2
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 06 Feb 2020 00:44:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-05 Thread Kurt Deschler (Code Review)
Hello Anurag Mantripragada, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15157

to look at the new patch set (#2).

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..

IMPALA-9358: Query slowdown with inline views and hundreds of columns

IMPALA-8386 introduced an expensive precondition check using the function
ExprSubstitutionMap.checkComposedFrom(). This check has significant
performance impact on statements that contain inline views with hundreds
of columns. Most of the cost is in the get() calls used to find
expressions in the local substitution map.

The fix is to add a get_hint() call that uses the current loop index as a
starting point to search for expressions. This leverages the fact that
expressions have identical positions in both substitution maps in most
common cases.

A more generic approach would be to accelerate expression equality search
using hash functions but that would be a much riskier fix and Impala
currently lacks the infrasturucture to so.

Testing:
Performance testing with a query with 1000 expressions of the
following form:
  with a as (select c1 c1, c1 c2, c1 c3, ... from t)
  select c1, c2, c3, ... from a;

Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
---
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
1 file changed, 16 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/15157/2
--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 2
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-05 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 1:

(2 comments)

Nice find! The patch looks good to me. Left some nits related to code-styling.

http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
File fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java:

http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@76
PS1, Line 76: get_hint
nit, we generally stick to using camelcase for method names. May be rename to 
getUsingHint or getWithHint()?


http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@81
PS1, Line 81:   hint = lhs_.size();
this can be in the same line as 80 within {} to match the coding style in the 
rest of the code.



--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 06 Feb 2020 00:26:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-05 Thread Kurt Deschler (Code Review)
Kurt Deschler has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 1:

> Patch Set 1:
>
> (2 comments)
>
> Thanks for making this quick-fix. Comments are more for my understanding than 
> about the code change.
> This change seems small enough but did you have a chance to run any EE/FE 
> tests that touch this path as a sanity check that nothing went there?

I only used the repro testcase to verify correctness. Will run some FE tests 
now.


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 05 Feb 2020 14:00:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-05 Thread Kurt Deschler (Code Review)
Kurt Deschler has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 1:

(2 comments)

> Patch Set 1:
>
> (2 comments)
>
> Thanks for making this quick-fix. Comments are more for my understanding than 
> about the code change.
> This change seems small enough but did you have a chance to run any EE/FE 
> tests that touch this path as a sanity check that nothing went there?

http://gerrit.cloudera.org:8080/#/c/15157/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15157/1//COMMIT_MSG@25
PS1, Line 25: Performance testing with a query with 1000 expressions of the
> Not a blocking comment, feel free to skip it but would it be possible to in
The repro query went from 12 sec to 1 sec. There was no noticeable time spend 
in the precondition now.


http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
File fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java:

http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@77
PS1, Line 77: for (int i = hint; i < lhs_.size(); ++i) {
:   if (lhsExpr.equals(lhs_.get(i))) return rhs_.get(i);
: }
> Just trying to understand this. Is it safe to assume that most equality sea
The expression is typically found on the first iteration of the first loop.



--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 05 Feb 2020 13:57:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-04 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 1:

(2 comments)

Thanks for making this quick-fix. Comments are more for my understanding than 
about the code change.
This change seems small enough but did you have a chance to run any EE/FE tests 
that touch this path as a sanity check that nothing went there?

http://gerrit.cloudera.org:8080/#/c/15157/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15157/1//COMMIT_MSG@25
PS1, Line 25: Performance testing with a query with 1000 expressions of the
Not a blocking comment, feel free to skip it but would it be possible to 
include the performance improvement you observed here?


http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
File fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java:

http://gerrit.cloudera.org:8080/#/c/15157/1/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@77
PS1, Line 77: for (int i = hint; i < lhs_.size(); ++i) {
:   if (lhsExpr.equals(lhs_.get(i))) return rhs_.get(i);
: }
Just trying to understand this. Is it safe to assume that most equality 
searches end in this for loop?



--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 05 Feb 2020 03:55:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15157 )

Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5124/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 05 Feb 2020 01:26:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9358: Query slowdown with inline views and hundreds of columns

2020-02-04 Thread Kurt Deschler (Code Review)
Kurt Deschler has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15157


Change subject: IMPALA-9358: Query slowdown with inline views and hundreds of 
columns
..

IMPALA-9358: Query slowdown with inline views and hundreds of columns

IMPALA-8386 introduced an expensive precondition check using the function
ExprSubstitutionMap.checkComposedFrom(). This check has significant
performance impact on statements that contain inline views with hundreds
of columns. Most of the cost is in the get() calls used to find
expressions in the local substitution map.

The fix is to add a get_hint() call that uses the current loop index as a
starting point to search for expressions. This leverages the fact that
expressions have identical positions in both substitution maps in most
common cases.

A more generic approach would be to accelerate expression equality search
using hash functions but that would be a much riskier fix and Impala
currently lacks the infrasturucture to so.

Testing:
Performance testing with a query with 1000 expressions of the
following form:
  with a as (select c1 c1, c1 c2, c1 c3, ... from t)
  select c1, c2, c3, ... from a;

Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
---
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
1 file changed, 17 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/15157/1
--
To view, visit http://gerrit.cloudera.org:8080/15157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I77423d9c10e1edbb505cb210b5c072281b5d7cfc
Gerrit-Change-Number: 15157
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler