[Impala-ASF-CR] Fix formatting of HBaseScanNode explain output.

2020-04-17 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15749


Change subject: Fix formatting of HBaseScanNode explain output.
..

Fix formatting of HBaseScanNode explain output.

In the case with more than one hbase predicate the indentation level
wasn't correctly formatted in the explain string.

Instead of:

|  |  13:SCAN HBASE [default.dimension d]
|  | hbase filters:
|  | d:foo EQUAL '1'
|  | d:bar EQUAL '2'
|  | d:baz EQUAL '3'
|  | predicate:

This was produced:

|  | 13:SCAN HBASE [default.dimension d]
|  | hbase filters:
   d:foo EQUAL '1'
   d:bar EQUAL '2'
   d:baz EQUAL '3'
|  | predicate:

Change-Id: I30fad791408a1f7e35e9b3f2e6cb4958952dd567
---
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
2 files changed, 19 insertions(+), 14 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/15749/1
--
To view, visit http://gerrit.cloudera.org:8080/15749
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I30fad791408a1f7e35e9b3f2e6cb4958952dd567
Gerrit-Change-Number: 15749
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-9727: Fix HBaseScanNode explain formatting

2020-05-05 Thread Shant Hovsepian (Code Review)
Hello Quanlong Huang, Thomas Tauber-Marshall, Fang-Yu Rao, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15749

to look at the new patch set (#2).

Change subject: IMPALA-9727: Fix HBaseScanNode explain formatting
..

IMPALA-9727: Fix HBaseScanNode explain formatting

In the case with more than one hbase predicate the indentation level
wasn't correctly formatted in the explain string.

Instead of:

|  |  13:SCAN HBASE [default.dimension d]
|  | hbase filters:
|  | d:foo EQUAL '1'
|  | d:bar EQUAL '2'
|  | d:baz EQUAL '3'
|  | predicate:

This was produced:

|  | 13:SCAN HBASE [default.dimension d]
|  | hbase filters:
   d:foo EQUAL '1'
   d:bar EQUAL '2'
   d:baz EQUAL '3'
|  | predicate:

Change-Id: I30fad791408a1f7e35e9b3f2e6cb4958952dd567
---
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
2 files changed, 18 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/15749/2
--
To view, visit http://gerrit.cloudera.org:8080/15749
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I30fad791408a1f7e35e9b3f2e6cb4958952dd567
Gerrit-Change-Number: 15749
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-9727: Fix HBaseScanNode explain formatting

2020-05-05 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15749 )

Change subject: IMPALA-9727: Fix HBaseScanNode explain formatting
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15749/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15749/1//COMMIT_MSG@7
PS1, Line 7: Fix formatting of HBaseScanNode explain output.
> Since this is not an urgent hot-fix, could you create a JIRA for this?
Done


http://gerrit.cloudera.org:8080/#/c/15749/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
File fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java:

http://gerrit.cloudera.org:8080/#/c/15749/1/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java@591
PS1, Line 591: + "'" + filter.filter_constant + "'"
> nit: could you move this to the previous line?
Agreed will do, but this was the formatted enforced by clang-format.



--
To view, visit http://gerrit.cloudera.org:8080/15749
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30fad791408a1f7e35e9b3f2e6cb4958952dd567
Gerrit-Change-Number: 15749
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 05 May 2020 22:40:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10099: Push down DISTINCT in Set operations

2020-08-30 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16350 )

Change subject: IMPALA-10099: Push down DISTINCT in Set operations
..


Patch Set 4:

> Patch Set 3:
>
> (1 comment)

Good point. The functional tests were given an Analysis Error for the pushdown 
in those cases, though added planner tests for it as well.


--
To view, visit http://gerrit.cloudera.org:8080/16350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
Gerrit-Change-Number: 16350
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sun, 30 Aug 2020 06:37:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10099: Push down DISTINCT in Set operations

2020-08-30 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16350

to look at the new patch set (#4).

Change subject: IMPALA-10099: Push down DISTINCT in Set operations
..

IMPALA-10099: Push down DISTINCT in Set operations

INTERSECT/EXCEPT are not duplicate preserving operations. The distinct
aggregations can happen in each operand, the leftmost operand only, or
after all the operands in a separate aggregation step. Except for a
couple special cases we would use the last strategy most often.

This change pushes the distinct aggregation down to the leftmost operand
in cases where there are no analytic functions, or when a distinct or
grouping operation already eliminates duplicates.

In general DISTINCT placement such as in this case should be done
throughout the entire plan tree in a cost based manner as described in
IMPALA-5260

Testing:
 * TpcdsPlannerTest
 * PlannerTest
 * TPC-DS 30TB Perf run for any affected queries
   - Q14-1 180s -> 150s
   - Q14-2 109s -> 90s
   - Q8 no significant change
 * SetOperation Planner Tests
 * Analyzer tests
 * Tpcds Functional Workload

Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
---
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/setoperation-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
6 files changed, 2,049 insertions(+), 1,806 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/16350/4
--
To view, visit http://gerrit.cloudera.org:8080/16350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
Gerrit-Change-Number: 16350
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9: Code-Review+1

(1 comment)

LGTM nice little fix.

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@461
PS7, Line 461: timestamp_col <= '2010-12-01';
> Done
Done



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 21:39:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 12:

> Patch Set 12: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6368/

Flaky test?

shell.test_shell_interactive.TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt[table_format_and_file_extension:
 ('textfile', '.txt') | protocol: hs2] (from pytest)

E   TIMEOUT: Timeout exceeded.
E   
E   version: 3.3
E   command: 
/home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell
E   args: 
['/home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell', 
'--protocol=hs2', '-ilocalhost:21050']
E   searcher: 
E   buffer (last 100 chars): ' default> select 2;\r\n^C\r\n[localhost:21050] 
default> '
E   before (last 100 chars): ' default> select 2;\r\n^C\r\n[localhost:21050] 
default> '
E   after: 
E   match: None
E   match_index: None
E   exitstatus: None
E   flag_eof: False
E   pid: 12993
E   child_fd: 24
E   closed: False
E   timeout: 30
E   delimiter: 
E   logfile: None
E   logfile_read: None
E   logfile_send: None
E   maxread: 2000
E   ignorecase: False
E   searchwindowsize: None
E   delaybeforesend: 0.05
E   delayafterclose: 0.1
E   delayafterterminate: 0.1


--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 12
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 03:07:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9979: part 2: partitioned top-n

2020-09-10 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16242 )

Change subject: IMPALA-9979: part 2: partitioned top-n
..


Patch Set 17:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/16242/13//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16242/13//COMMIT_MSG@27
PS13, Line 27:   be converted to limits.
I think row_number is more commonly used in queries. Maybe a more generic name 
would help. I'm seen some literature use RANKING functions for row_number, 
rank, dense_rank, and REPORTING functions for sum,count,avg. Not sure how 
common that convention is though.


http://gerrit.cloudera.org:8080/#/c/16242/13//COMMIT_MSG@68
PS13, Line 68:  the
Might want to have JIRA for NTILE as well for queries like 95th percentile and 
above. Though a TopN wouldn't work in that case it would just be a filter 
pushdown.


http://gerrit.cloudera.org:8080/#/c/16242/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16242/16//COMMIT_MSG@28
PS16, Line 28: they c
enabled


http://gerrit.cloudera.org:8080/#/c/16242/13/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

http://gerrit.cloudera.org:8080/#/c/16242/13/be/src/exec/exec-node.cc@188
PS13, Line 188: PARTITIONED_TOPN
Random observation, N is usually the size of the whole set, K is usually some 
smaller subset. It's interesting we ended up with the nomenclature of TOPN 
instead of TOPK. Not worth changing it.


http://gerrit.cloudera.org:8080/#/c/16242/16/be/src/exec/topn-node.cc
File be/src/exec/topn-node.cc:

http://gerrit.cloudera.org:8080/#/c/16242/16/be/src/exec/topn-node.cc@197
PS16, Line 197: one
one one


http://gerrit.cloudera.org:8080/#/c/16242/16/fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
File fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java:

http://gerrit.cloudera.org:8080/#/c/16242/16/fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java@103
PS16, Line 103: logic
logical



--
To view, visit http://gerrit.cloudera.org:8080/16242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5
Gerrit-Change-Number: 16242
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 Sep 2020 04:07:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9979: part 2: partitioned top-n

2020-09-11 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16242 )

Change subject: IMPALA-9979: part 2: partitioned top-n
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16242/17/testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q67.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q67.test:

http://gerrit.cloudera.org:8080/#/c/16242/17/testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q67.test@69
PS17, Line 69: 100
The implicit type cast is missing for the literal.



--
To view, visit http://gerrit.cloudera.org:8080/16242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5
Gerrit-Change-Number: 16242
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 11 Sep 2020 16:48:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8291: Show constraints in DESCRIBE FORMATTED

2020-09-12 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16428 )

Change subject: IMPALA-8291: Show constraints in DESCRIBE FORMATTED
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16428/1/fe/src/compat-hive-3/java/org/apache/impala/compat/HiveMetadataFormatUtils.java
File 
fe/src/compat-hive-3/java/org/apache/impala/compat/HiveMetadataFormatUtils.java:

http://gerrit.cloudera.org:8080/#/c/16428/1/fe/src/compat-hive-3/java/org/apache/impala/compat/HiveMetadataFormatUtils.java@118
PS1, Line 118: if (!isFormatted) {
 :   for (int i = 0; i < fields.length; i++) {
 : Object value = StringEscapeUtils.escapeJava(fields[i]);
 : if (value != null) {
 :   tableInfo.append(value);
 : }
 : tableInfo.append((i == fields.length - 1) ? LINE_DELIM : 
FIELD_DELIM);
 :   }
> Looks like these are dead codes since isFormatted is always true. Are there
Yes it's just copied from Hive where it looks like dead code as well 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java#L582

I'll remove it and add a comment.


http://gerrit.cloudera.org:8080/#/c/16428/1/fe/src/compat-hive-3/java/org/apache/impala/compat/HiveMetadataFormatUtils.java@423
PS1, Line 423: .intern()
> nit: unneccessary intern
Yup, just copied from Hive like that 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/MetaDataFormatUtils.java#L187

What do you think is it better to keep it matching Hive or update it?



--
To view, visit http://gerrit.cloudera.org:8080/16428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I676b69c465c46491f870d7fdc894e7474c030356
Gerrit-Change-Number: 16428
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Sat, 12 Sep 2020 18:19:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10167: Docs typo for DEFAULT TRANSACTIONAL TYPE

2020-09-14 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16447 )

Change subject: IMPALA-10167: Docs typo for DEFAULT_TRANSACTIONAL_TYPE
..


Patch Set 1:

Small docs typo fix.


--
To view, visit http://gerrit.cloudera.org:8080/16447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1d9c985c8c202c8716bb5102a23c1e70cf6b68c
Gerrit-Change-Number: 16447
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 14 Sep 2020 16:00:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10167: Docs typo for DEFAULT TRANSACTIONAL TYPE

2020-09-14 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16447


Change subject: IMPALA-10167: Docs typo for DEFAULT_TRANSACTIONAL_TYPE
..

IMPALA-10167: Docs typo for DEFAULT_TRANSACTIONAL_TYPE

Fix for typo referring to DEFAULT_TRANSACTIONAL_TYPE as
DEFAULT_TRANSACTION_TYPE.

Change-Id: Ib1d9c985c8c202c8716bb5102a23c1e70cf6b68c
---
M docs/topics/impala_default_transactional_type.xml
1 file changed, 3 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/16447/1
--
To view, visit http://gerrit.cloudera.org:8080/16447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib1d9c985c8c202c8716bb5102a23c1e70cf6b68c
Gerrit-Change-Number: 16447
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@103
PS10, Line 103: // Adjust an ndv of zero to 1 if stats indicate there are null 
values.
> When the numDistinctValues > 0, such adjustment is not performed. I wonder
Yes the intent of this patch per earlier notes is to only address the =0 cases 
as the the general fix is a bit involved at the moment.



--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 10
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 17:28:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8291: Show constraints in DESCRIBE FORMATTED

2020-09-08 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16428


Change subject: IMPALA-8291: Show constraints in DESCRIBE FORMATTED
..

IMPALA-8291: Show constraints in DESCRIBE FORMATTED

Support for displaying primary and foreign key constraints in describe
formatted output. The output attempts to be as close to Hive's
implementation as possible.

Also includes constraint definitions for the TPC-DS test workload.

Testing:
  * Fresh load of testdata
  * Metadata query tests comparing the output between Impala and Hive

Change-Id: I676b69c465c46491f870d7fdc894e7474c030356
---
M 
fe/src/compat-hive-3/java/org/apache/impala/compat/HiveMetadataFormatUtils.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M testdata/datasets/tpcds/tpcds_schema_template.sql
M tests/metadata/test_metadata_query_statements.py
5 files changed, 784 insertions(+), 533 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16428/1
--
To view, visit http://gerrit.cloudera.org:8080/16428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I676b69c465c46491f870d7fdc894e7474c030356
Gerrit-Change-Number: 16428
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 


[Impala-ASF-CR] IMPALA-8291: Show constraints in DESCRIBE FORMATTED

2020-09-08 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16428 )

Change subject: IMPALA-8291: Show constraints in DESCRIBE FORMATTED
..


Patch Set 1:

Hi guys I tried to follow the standard of copying Hive code into the Metadata 
format utils. Let me know if the thought process has changed around this.

@David could you sanity check the tpc-ds constraints? I had to rearrange the 
order of loading tables to satisfy the constraint definitions.

Still doing another pass on tests.


--
To view, visit http://gerrit.cloudera.org:8080/16428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I676b69c465c46491f870d7fdc894e7474c030356
Gerrit-Change-Number: 16428
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 08 Sep 2020 20:50:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16349

to look at the new patch set (#11).

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix just handles the case where a column's cardinality is zero
however it's nullable and we have null stats to indicate there are null
values, therefore we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
- Q4 172s -> 80s
- Q11 103s -> 77s
  * CardinalityTest
  * TpcdsPlannerTest

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
6 files changed, 795 insertions(+), 784 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/11
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 11
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 11:

(10 comments)

Yeah I guess some of those comments in the tests came from different patches, 
all cleaned up now.

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@98
PS10, Line 98: adjustNumD
> Maybe renamed as getNumDistinctValuesAdjusted().
made it adjustNumDistinctValues, since this is a private method want to avoid 
the get/set verbs as not to conflate with standard conventions for public 
methods.


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
File fe/src/main/java/org/apache/impala/catalog/ColumnStats.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@191
PS10, Line 191:
> nit: seems like a move of the method in the module.
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java@211
PS10, Line 211: verifySelectCol("nullrows", "null_str",
> This comment can be removed.
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@182
PS10, Line 182: // NDV(blanks) = 1, add 1 for nulls
  : // Bug: See IMPALA-7310, IMPALA-8094
  : //verifyNdvStmt("SELECT blanks FROM functional.nullrows", 
2);
> Seems like these lines can be removed.
Actually no this is a different issue, just adjusted the references.


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@132
PS10, Line 132:  group_str h
> This comment is not accurate.
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@136
PS10, Line 136: null_str is al
> Seems like the reference to c is not right here.
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@138
PS10, Line 138: i
> same here.
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140
PS10, Line 140: (g
> same
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140
PS10, Line 140: i
> same here
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@182
PS10, Line 182: = 1
> Maybe as // NDV(id) = 26, ndv(null_str) = 1, NDV(id)*ndv(null_str) = 26.
Done



--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 11
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:11:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9974: Join elimination based on referential integrity.

2020-10-06 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16182


Change subject: IMPALA-9974: Join elimination based on referential integrity.
..

IMPALA-9974: Join elimination based on referential integrity.

Use defined referential dependency constraints during planning.

  1. For join ndv calculations involving a single column primary key we
 now use the table's row count instead of the column's ndv stats
 which are approximate and incorrect at times.
  2. Convert LEFT SEMI JOIN to INNER JOIN if the inner table's primary
 key is used as the join key.
  3. Convert LEFT OUTER JOIN to INNER JOIN if the join key uses a
 fk/pk relationship, the FK is not nullable.
  4. Prune INNER JOINS on fk/pk conditions where the join doesn't affect
 the output cardinality and there are no predicates or non-key
 columns in use on the inner side.
  5. For a confirmed fk/pk join we include a IS NOT NULL predicate on
 the foreign key columns.

Plan optimizations can be disabled with the query option 
DISABLE_INFORMATIONAL_CONSTRAINTS

Testing:
  - New constraint rewrite planner test
  - TPC-DS functional tests
  - TPC-DS Planner test changes

Some TPC-DS 30TB Performance Improvements:
  - Q72 74.3s -> 24.8s
  - Q82 13.5s -> 2.2s
  - Q84 19.33s -> 6.07s
  - Q85 29s -> 6.33s

Change-Id: Idd07d45e419175af7d62f6035ba51be841ebc074
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
A fe/src/main/java/org/apache/impala/analysis/TableConstraintRewriter.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/SqlConstraints.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/bin/compute-table-stats.sh
M testdata/data/child_table.txt
M testdata/data/parent_table.txt
M testdata/data/parent_table_2.txt
M testdata/datasets/functional/functional_schema_template.sql
A 
testdata/workloads/functional-planner/queries/PlannerTest/constraint-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q03.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q18.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q22.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q27.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q28.test
M 

[Impala-ASF-CR] IMPALA-6671: [WIP] Skip locked tables from topic updates

2020-10-07 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16549 )

Change subject: IMPALA-6671: [WIP] Skip locked tables from topic updates
..


Patch Set 1:

(6 comments)

Thanks for working on this Vihang, it will be a life saver.

http://gerrit.cloudera.org:8080/#/c/16549/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16549/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1294
PS1, Line 1294: true
A freeform boolean is hard to read, best to add an inline comment , true /* 
tryLock */ or make it an Enum.


http://gerrit.cloudera.org:8080/#/c/16549/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1316
PS1, Line 1316: topicUpdateTblLockMaxWaitTimeMs
I could be misunderstanding the code, but isn't 7200 seconds too long to wait 
before skipping the update?


http://gerrit.cloudera.org:8080/#/c/16549/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1325
PS1, Line 1325:   return;
I hope the updateTblPendingVersion() rarely exceeds the number of attempts, but 
if it does throw an exception here, wouldn't it be better to try something else.


http://gerrit.cloudera.org:8080/#/c/16549/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1352
PS1, Line 1352:   if (attempt >= 
Table.NUMBER_OF_ATTEMPTS_TO_UPDATE_VERSION) {
  : throw new CatalogException("Exhausted max number of 
attempts "
  : + Table.NUMBER_OF_ATTEMPTS_TO_UPDATE_VERSION
  : + " to update the pending table version");
  :   }
Wouldn't we want to try something more graceful in this case. How would the end 
user recover from this exception.


http://gerrit.cloudera.org:8080/#/c/16549/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/16549/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2662
PS1, Line 2662: LOG.info(String
  : .format(
  : "%s the pending catalog version of table %s from %s 
to %s%s.",
  : (success ? "Bumped up" : "Attempt to bump up"), 
getFullName(),
  : expectedPendingVersion,
  : newCatalogVersion, success ? "" : " failed"));
LOG Debug? Isn't this too frequent.


http://gerrit.cloudera.org:8080/#/c/16549/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2677
PS1, Line 2677: getAndSet
Why set it to -1? I haven't been able to convince myself that setting it -1 
wouldn't prevent another caller from setting the catalog version to something 
older than intended.



--
To view, visit http://gerrit.cloudera.org:8080/16549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic657b96edbcdc94c6b906e7ca59291f4e4715655
Gerrit-Change-Number: 16549
Gerrit-PatchSet: 1
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 07 Oct 2020 15:31:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10262: RPM/DEB Packaging Support

2020-10-19 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16612


Change subject: IMPALA-10262: RPM/DEB Packaging Support
..

IMPALA-10262: RPM/DEB Packaging Support

cmake -DBUILD_PACKAGES=ON
make package

Change-Id: I535e24e151371f5f594c7a8a6d2d585d18cb694f
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/service/CMakeLists.txt
3 files changed, 87 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/16612/1
--
To view, visit http://gerrit.cloudera.org:8080/16612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I535e24e151371f5f594c7a8a6d2d585d18cb694f
Gerrit-Change-Number: 16612
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 


[Impala-ASF-CR] (WIP) IMPALA-10262: RPM/DEB Packaging Support

2020-10-19 Thread Shant Hovsepian (Code Review)
Hello Quanlong Huang, Grant Henke, Tim Armstrong, Joe McDonnell, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16612

to look at the new patch set (#2).

Change subject: (WIP) IMPALA-10262: RPM/DEB Packaging Support
..

(WIP) IMPALA-10262: RPM/DEB Packaging Support

cmake -DBUILD_PACKAGES=ON
make package

Change-Id: I535e24e151371f5f594c7a8a6d2d585d18cb694f
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/service/CMakeLists.txt
3 files changed, 87 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/16612/2
--
To view, visit http://gerrit.cloudera.org:8080/16612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I535e24e151371f5f594c7a8a6d2d585d18cb694f
Gerrit-Change-Number: 16612
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10262: RPM/DEB Packaging Support

2020-10-19 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16612 )

Change subject: IMPALA-10262: RPM/DEB Packaging Support
..


Patch Set 1:

Work in progress, just wanted to see if anyone has any objections otherwise 
I'll complete the work.

Installing to /opt/impala seems reasonable or is /usr better?

TODO:
Test that the packages are runnable.
Test RPM
Make version info a dependency in cmake.
When disabled make sure the build isn't slowed down due to the extra "install" 
step that is now available.


--
To view, visit http://gerrit.cloudera.org:8080/16612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I535e24e151371f5f594c7a8a6d2d585d18cb694f
Gerrit-Change-Number: 16612
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 19 Oct 2020 20:16:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-23 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16349

to look at the new patch set (#3).

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix just handles the case where a column's cardinality is zero
however it's nullable and we have null stats to indicate there are null
values, therefore we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
- Q4 172s -> 80s
- Q11 103s -> 77s
  * CardinalityTest

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
6 files changed, 766 insertions(+), 766 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/3
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 3
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10099: Push down DISTINCT for SetOperations

2020-08-23 Thread Shant Hovsepian (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16350

to look at the new patch set (#2).

Change subject: IMPALA-10099: Push down DISTINCT for SetOperations
..

IMPALA-10099: Push down DISTINCT for SetOperations

INTERSECT/EXCEPT are not duplicate preserving operations. The distinct
aggregations can happen in each operand, the leftmost operand only, or
after all the operands in a separate aggregation step.

This change pushes the distinct aggregation down to the leftmost operand
in cases where there are no analytic functions, or when a distinct or
grouping operation already eliminates duplicates.

In general DISTINCT placement such as in this case should be done
throughout the entire plan tree in a cost based manner as described in
IMPALA-5260

Testing:
 * TpcdsPlannerTest
 * PlannerTest
 * TPC-DS 30TB Perf run for any affected queries
   - Q14-1 180s -> 150s
   - Q14-2 109s -> 90s
   - Q8 no significant change
 * SetOperation Planner Tests
 * Analyzer tests
 * Tpcds Functional Workload

Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
---
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/setoperation-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
6 files changed, 1,805 insertions(+), 1,806 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/16350/2
--
To view, visit http://gerrit.cloudera.org:8080/16350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
Gerrit-Change-Number: 16350
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10095: Include query plan tests for all of TPC-DS

2020-08-25 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16345

to look at the new patch set (#7).

Change subject: IMPALA-10095: Include query plan tests for all of TPC-DS
..

IMPALA-10095: Include query plan tests for all of TPC-DS

Added TpcdsPlannerTest to include each TPC-DS query as a separate plan
test file. Removed the previous tpcds-all test file.

This means when running only PlannerTest no TPC-DS plans are checked,
however as part of a full frontend test run the TpcdsPlannerTest will be
included.

Runs with cardinality and resource checks, as well as using parquet
tables to include predicate pushdowns.

Change-Id: Ibaf40d8b783be1dc7b62ba3269feb034cb8047da
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
A fe/src/test/java/org/apache/impala/planner/TpcdsPlannerTest.java
D testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q02.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q03.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q12.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q15.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q18.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q20.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q21.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q22.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23b.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q26.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q27.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q28.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q31.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q32.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q35a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q36.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q37.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q38.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q40.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q41.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q43.test
A 

[Impala-ASF-CR] IMPALA-10095: Include query plan tests for all of TPC-DS

2020-08-25 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16345

to look at the new patch set (#6).

Change subject: IMPALA-10095: Include query plan tests for all of TPC-DS
..

IMPALA-10095: Include query plan tests for all of TPC-DS

Added TpcdsPlannerTest to include each TPC-DS query as a separate plan
test file. Removed the previous tpcds-all test file.

This means when running only PlannerTest no TPC-DS plans are checked,
however as part of a full frontend test run the TpcdsPlannerTest will be
included.

Runs with cardinality and resource checks, as well as using parquet
tables to include predicate pushdowns.

Change-Id: Ibaf40d8b783be1dc7b62ba3269feb034cb8047da
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
A fe/src/test/java/org/apache/impala/planner/TpcdsPlannerTest.java
D testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q02.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q03.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q12.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q15.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q18.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q20.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q21.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q22.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23b.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q26.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q27.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q28.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q31.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q32.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q35a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q36.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q37.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q38.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q40.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q41.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q43.test
A 

[Impala-ASF-CR] IMPALA-10095: Include query plan tests for all of TPC-DS

2020-08-25 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16345 )

Change subject: IMPALA-10095: Include query plan tests for all of TPC-DS
..


Patch Set 6:

Too many parallel CR had some things sneak into a git add that shouldn't have 
made it just yet.


--
To view, visit http://gerrit.cloudera.org:8080/16345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibaf40d8b783be1dc7b62ba3269feb034cb8047da
Gerrit-Change-Number: 16345
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 25 Aug 2020 23:26:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10095: Include query plan tests for all of TPC-DS

2020-08-26 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16345

to look at the new patch set (#8).

Change subject: IMPALA-10095: Include query plan tests for all of TPC-DS
..

IMPALA-10095: Include query plan tests for all of TPC-DS

Added TpcdsPlannerTest to include each TPC-DS query as a separate plan
test file. Removed the previous tpcds-all test file.

This means when running only PlannerTest no TPC-DS plans are checked,
however as part of a full frontend test run the TpcdsPlannerTest will be
included.

Runs with cardinality and resource checks, as well as using parquet
tables to include predicate pushdowns.

Change-Id: Ibaf40d8b783be1dc7b62ba3269feb034cb8047da
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
A fe/src/test/java/org/apache/impala/planner/TpcdsPlannerTest.java
D testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q02.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q03.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q12.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q15.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q18.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q20.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q21.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q22.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23b.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q26.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q27.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q28.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q31.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q32.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q35a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q36.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q37.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q38.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q40.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q41.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q43.test
A 

[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-28 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG@7
PS7, Line 7: 10064
> Oops ! Changed it to 10064.
Done


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@45
PS7, Line 45: ) {
> Right..not needed.  I had a version earlier using the arguments then forgot
Done


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@65
PS7, Line 65: BinaryPredicate.IS_RANGE_PREDICATE.apply(bp);
:   if (!isRangeOp && !(bp.getOp() == 
BinaryPredicate.Operator.EQ)) continue;
:   SlotRef slotRef = bp.getBoundSlot();
:   if (slotRef == null || !bp.getChild(1).isConsta
> Makes sense to create a static api for this.  I followed the com.google.com
I was thinking just a simple method like BinaryPredicate::isEquivalence() but 
this works too. Thanks for the cleanup!


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@164
PS7, Line 164: &
> Done
Done


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@180
PS7, Line 180:
 :
 :
> Done
Done



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 8
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 07:25:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-28 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16349

to look at the new patch set (#8).

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix just handles the case where a column's cardinality is zero
however it's nullable and we have null stats to indicate there are null
values, therefore we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
- Q4 172s -> 80s
- Q11 103s -> 77s
  * CardinalityTest
  * TpcdsPlannerTest

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
M testdata/workloads/functional-query/queries/QueryTest/compute-stats-date.test
9 files changed, 789 insertions(+), 782 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/8
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 8
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-28 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 8:

This is getting a little uglier now. Turns out pushing the check down to the 
stats level exposed holes in the logic elsewhere. Nulls are included in the 
column ndv stats is it is a Boolean type and if the column is a partitioning 
column with a null value.

I'd be happy to keep this change in SlotRef where it's only affecting 
selectivity calculations or as in this updated patch for the partitioning case 
bypass the null adjustments.


--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 8
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 28 Aug 2020 07:02:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10110: bloom filter target fpp query option

2020-08-28 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16377 )

Change subject: IMPALA-10110: bloom filter target fpp query option
..


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16377/4/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/16377/4/be/src/service/query-options.h@50
PS4, Line 50: RUNTIME_FILTER_ERR
> It's a bit opaque either way, but consistency makes sense. Made the change.
Consistently opaque!



--
To view, visit http://gerrit.cloudera.org:8080/16377
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb123a0ea1e0e95d95df9837c1f0222fd60361f3
Gerrit-Change-Number: 16377
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 28 Aug 2020 17:50:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-28 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 8:

(1 comment)

> Patch Set 8:
>
> I wonder if the holes exposed by the current change can be remedied, for the 
> time being, by suppling a checkNull argument default to false which will 
> bypass the new logic. We call the new method with checkNull set to true in 
> SlotRef only. In the long run, we can fix these other issues.
>
> I am OK with reverting the logic to your previous version too if the chance 
> to hit the same problem for other expressions is low.

Yeah having a check in slotRef to then just call a special method in stats is 
just not leaking the abstraction even more. I'd rather than just keep the 
change localized in slotref.

http://gerrit.cloudera.org:8080/#/c/16349/8/testdata/workloads/functional-query/queries/QueryTest/compute-stats-date.test
File 
testdata/workloads/functional-query/queries/QueryTest/compute-stats-date.test:

http://gerrit.cloudera.org:8080/#/c/16349/8/testdata/workloads/functional-query/queries/QueryTest/compute-stats-date.test@32
PS8, Line 32: 'd','DATE',1,22,4,4,-1,-1
> Is this change right? I thought the column stats themselves were not meant
Yeah this was originally why we only had the correction in the slotref so it 
benefits selectivity but nothing else.

Having the correction in the stats class will show up everywhere stats are used.



--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 8
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 29 Aug 2020 00:40:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-29 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16349

to look at the new patch set (#9).

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix just handles the case where a column's cardinality is zero
however it's nullable and we have null stats to indicate there are null
values, therefore we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
- Q4 172s -> 80s
- Q11 103s -> 77s
  * CardinalityTest
  * TpcdsPlannerTest

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
7 files changed, 787 insertions(+), 775 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/9
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 9
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-19 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 5:

(2 comments)

> Patch Set 4:
>
> Agree with Aman's comments about the planner tests and the changes from the 
> official versions...

Added comments on the official variants there are only two at this point. I 
have kept the unofficial variants we used in the past along with the official 
versions just for the sake of catching an potential regressions.

Filed IMPALA-10095 for the planner tests.

http://gerrit.cloudera.org:8080/#/c/16280/4/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test
File testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test:

http://gerrit.cloudera.org:8080/#/c/16280/4/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test@33
PS4, Line 33:   select  sum(sales)
> The TPCDS query is written as two queries in one file. We separate it out l
Done


http://gerrit.cloudera.org:8080/#/c/16280/4/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test
File testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test:

http://gerrit.cloudera.org:8080/#/c/16280/4/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test@45
PS4, Line 45: where i_color = 'peach'
> Same as above, the TPC-DS specification has it written as two semi colon se
Done



--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 5
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 19 Aug 2020 14:28:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-19 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Fang-Yu Rao, David Rorke, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16280

to look at the new patch set (#5).

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..

IMPALA-10034: Add remaining TPC-DS queries to workload.

Include remaining TPC-DS queries to the testdata workload definition.

Q8 and Q38 were using non standard variants, those have been
replaced by the official query versions. Q35 is using an official
variant. Had to escape a table alias in Q90 as we treat 'AT' as a
reserved keyword.

Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
---
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q10a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q28.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q35a.test
D testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38-rewrite.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q44.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q49.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q66.test
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q8.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q87.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q90.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q93.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
18 files changed, 1,250 insertions(+), 106 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/16280/5
--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 5
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-21 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Fang-Yu Rao, David Rorke, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16280

to look at the new patch set (#7).

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..

IMPALA-10034: Add remaining TPC-DS queries to workload.

Include remaining TPC-DS queries to the testdata workload definition.

Q8 and Q38 were using non standard variants, those have been
replaced by the official query versions. Q35 is using an official
variant. Had to escape a table alias in Q90 as we treat 'AT' as a
reserved keyword.

Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
---
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q10a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q28.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q35a.test
D testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38-rewrite.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q44.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q49.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q66.test
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q8.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q87.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q90.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q93.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
18 files changed, 1,252 insertions(+), 105 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/16280/7
--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 7
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-21 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16349

to look at the new patch set (#2).

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix just handles the case where a column's cardinality is zero
however it's nullable and we have null stats to indicate there are null
values, therefore we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
- Q4 172s -> 80s
- Q11 103s -> 77s

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
6 files changed, 766 insertions(+), 766 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/2
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10099: Push down DISTINCT for SetOperations

2020-08-21 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16350


Change subject: IMPALA-10099: Push down DISTINCT for SetOperations
..

IMPALA-10099: Push down DISTINCT for SetOperations

Testing:
 * TpcdsPlannerTest
 * TPC-DS 30TB Perf run for any affected queries
   - Q14-1 180s -> 150s
   - Q14-2 109s -> 90s
   - Q8 no significant change
 * SetOperation Planner Tests
 * Analyzer tests
 * Tpcds Functional Workload

Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
---
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/setoperation-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
5 files changed, 1,787 insertions(+), 1,788 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/16350/1
--
To view, visit http://gerrit.cloudera.org:8080/16350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
Gerrit-Change-Number: 16350
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 


[Impala-ASF-CR] IMPALA-10095: Include query plan tests for all of TPC-DS

2020-08-20 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16345


Change subject: IMPALA-10095: Include query plan tests for all of TPC-DS
..

IMPALA-10095: Include query plan tests for all of TPC-DS

Added TpcdsPlannerTest to include each TPC-DS query as a separate plan
test file. Removed the previous tpcds-all test file.

This means when running only PlannerTest no TPC-DS plans are checked,
however as part of a full frontend test run the TpcdsPlannerTest will be
included.

Runs with cardinality and resource checks, as well as using parquet
tables to include predicate pushdowns.

Change-Id: Ibaf40d8b783be1dc7b62ba3269feb034cb8047da
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
A fe/src/test/java/org/apache/impala/planner/TpcdsPlannerTest.java
D testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q02.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q03.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q12.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q15.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q18.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q20.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q21.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q22.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23b.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q26.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q27.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q28.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q31.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q32.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q35a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q36.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q37.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q38.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q40.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q41.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q43.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q44.test
A 

[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-21 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 1:

This is a bit of a lazy fix, handling only the case where ndv is 0 but there 
are null stats.

I tried updating the cardinality with null info in all cases created some big 
plan changes across the board, to do this change justice we'd need to fix 
cardinality and selectivity calculation to factor out the number of null values 
in cases where a predicate is null filtering. The IS NOT NULL predicate does 
this explicitly but we not other predicates.

I welcome people's input on prioritizing the general fix, just makes for LOTS 
of plan diffs to review. If we're okay with just this partial fix I'll make a 
separate JIRA for it.

Also still expecting more test changes, having trouble with loading data with 
Tez and memory consumption as well as some more Hbase issues.


--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 21 Aug 2020 06:08:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7782: fix constant NOT IN subqueries that can return 0 rows

2020-08-19 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16338 )

Change subject: IMPALA-7782: fix constant NOT IN subqueries that can return 0 
rows
..


Patch Set 1: Code-Review+1

Nice catch.


--
To view, visit http://gerrit.cloudera.org:8080/16338
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I66c726f0f66ce2f609e6ba44057191f5929a67fc
Gerrit-Change-Number: 16338
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 20 Aug 2020 01:10:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-20 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Fang-Yu Rao, David Rorke, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16280

to look at the new patch set (#6).

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..

IMPALA-10034: Add remaining TPC-DS queries to workload.

Include remaining TPC-DS queries to the testdata workload definition.

Q8 and Q38 were using non standard variants, those have been
replaced by the official query versions. Q35 is using an official
variant. Had to escape a table alias in Q90 as we treat 'AT' as a
reserved keyword.

Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
---
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q10a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q28.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q35a.test
D testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38-rewrite.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q44.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q49.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q66.test
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q8.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q87.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q90.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q93.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
18 files changed, 1,250 insertions(+), 106 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/16280/6
--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-21 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16349


Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix just handles the case where a column's cardinality is zero
however it's nullable and we have null stats to indicate there are null
values, therefore we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
- Q4 172s -> 80s
- Q11 103s -> 77s

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
4 files changed, 751 insertions(+), 731 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/1
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 


[Impala-ASF-CR] IMPALA-10099: Push down DISTINCT in Set operations

2020-08-26 Thread Shant Hovsepian (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16350

to look at the new patch set (#3).

Change subject: IMPALA-10099: Push down DISTINCT in Set operations
..

IMPALA-10099: Push down DISTINCT in Set operations

INTERSECT/EXCEPT are not duplicate preserving operations. The distinct
aggregations can happen in each operand, the leftmost operand only, or
after all the operands in a separate aggregation step. Except for a
couple special cases we would use the last strategy most often.

This change pushes the distinct aggregation down to the leftmost operand
in cases where there are no analytic functions, or when a distinct or
grouping operation already eliminates duplicates.

In general DISTINCT placement such as in this case should be done
throughout the entire plan tree in a cost based manner as described in
IMPALA-5260

Testing:
 * TpcdsPlannerTest
 * PlannerTest
 * TPC-DS 30TB Perf run for any affected queries
   - Q14-1 180s -> 150s
   - Q14-2 109s -> 90s
   - Q8 no significant change
 * SetOperation Planner Tests
 * Analyzer tests
 * Tpcds Functional Workload

Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
---
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/setoperation-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
6 files changed, 1,806 insertions(+), 1,806 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/16350/3
--
To view, visit http://gerrit.cloudera.org:8080/16350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
Gerrit-Change-Number: 16350
Gerrit-PatchSet: 3
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-27 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16349

to look at the new patch set (#5).

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix just handles the case where a column's cardinality is zero
however it's nullable and we have null stats to indicate there are null
values, therefore we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
- Q4 172s -> 80s
- Q11 103s -> 77s
  * CardinalityTest
  * TpcdsPlannerTest

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
7 files changed, 780 insertions(+), 774 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/5
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 5
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10046: Support constant propagation for eligible range predicates

2020-08-27 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10046: Support constant propagation for eligible range 
predicates
..


Patch Set 7:

(7 comments)

Nice addition.

http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16346/7//COMMIT_MSG@7
PS7, Line 7: 10046
Wrong JIRA?


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@45
PS7, Line 45: List conjuncts, BitSet candidates
Are these constructor arguments required? Since you keep some state in this 
class it might be cleaner to initialize an instance just for specific arguments 
here and then now require arguments for the methods below.


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@65
PS7, Line 65: (bp.getOp() == BinaryPredicate.Operator.LE ||
:   bp.getOp() == BinaryPredicate.Operator.LT ||
:   bp.getOp() == BinaryPredicate.Operator.GE ||
:   bp.getOp() == BinaryPredicate.Operator.GT);
*Suggestion* If you want to make  this check a static public method in 
BinaryPredicate it would be nice. You could clean up some code in HdfsScanNode 
that is similar for MinMax Predicates.


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@164
PS7, Line 164:
This can be static if you choose.


http://gerrit.cloudera.org:8080/#/c/16346/7/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@180
PS7, Line 180: } catch (Exception e) {
 :   throw new IllegalStateException("Failed analysis on new 
predicate", e);
 : }
Not sure that this is a very useful exception. You could do:
 newPred.analyzeNoThrow(analyzer)


http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/datasets/functional/schema_constraints.csv@22
PS7, Line 22: table_format:parquet
Usually these small test table are text types, any particular reason for making 
it parquet?


http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@461
PS7, Line 461: timestamp_col <= '2010-12-01';
There is a NormalizeBinaryPredicate rule that should rewrite "const  
" into "  const" but I couldn't tell if it runs before or 
after propagateConstants() might be good to test that case as well.



--
To view, visit http://gerrit.cloudera.org:8080/16346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 7
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Aug 2020 23:35:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10099: Push down DISTINCT in Set operations

2020-08-27 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16350 )

Change subject: IMPALA-10099: Push down DISTINCT in Set operations
..


Patch Set 3:

Small optimization for the set operation rewrites.


--
To view, visit http://gerrit.cloudera.org:8080/16350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
Gerrit-Change-Number: 16350
Gerrit-PatchSet: 3
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Aug 2020 23:37:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-27 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16349

to look at the new patch set (#6).

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix just handles the case where a column's cardinality is zero
however it's nullable and we have null stats to indicate there are null
values, therefore we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
- Q4 172s -> 80s
- Q11 103s -> 77s
  * CardinalityTest
  * TpcdsPlannerTest

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
7 files changed, 779 insertions(+), 776 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/6
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-27 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16349/5/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
File fe/src/main/java/org/apache/impala/catalog/ColumnStats.java:

http://gerrit.cloudera.org:8080/#/c/16349/5/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@210
PS5, Line 210: public long getNumDistinctValues()
> Nice!
thanks!


http://gerrit.cloudera.org:8080/#/c/16349/5/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@214
PS5, Line 214: }
> I talked to you directly about this
I agree, the logic is messy like this. Will fix this part.

More reason to handle the general case and do the adjustment during updates to 
the stats.



--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Aug 2020 21:08:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10110: allow setting bloom filter fpp

2020-08-27 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16377 )

Change subject: IMPALA-10110: allow setting bloom filter fpp
..


Patch Set 4: Code-Review+1

(2 comments)

Non critical suggestions.
Thanks for the fix!

http://gerrit.cloudera.org:8080/#/c/16377/4/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/16377/4/be/src/service/query-options.h@50
PS4, Line 50: RUNTIME_FILTER_FPP
I don't feel strongly about this so either way. "FPP" is kind of an opaque 
technical term. Since this might be user facing and we've got a flag already 
called "max_filter_error_rate" which is wordier as a query option but 
incorporating FILTER_ERROR_RATE instead of FPP might make the option more self 
documenting.


http://gerrit.cloudera.org:8080/#/c/16377/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/16377/4/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@146
PS4, Line 146: max_filter_error_rate
Would you mind updating the flag doc to mention the query option as an override?
https://github.com/apache/impala/blob/03f2b559c31af7fc11165cf3b00876900e234663/be/src/runtime/runtime-filter-bank.cc#L56



--
To view, visit http://gerrit.cloudera.org:8080/16377
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb123a0ea1e0e95d95df9837c1f0222fd60361f3
Gerrit-Change-Number: 16377
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Thu, 27 Aug 2020 22:24:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10095: Include query plan tests for all of TPC-DS

2020-08-24 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16345

to look at the new patch set (#4).

Change subject: IMPALA-10095: Include query plan tests for all of TPC-DS
..

IMPALA-10095: Include query plan tests for all of TPC-DS

Added TpcdsPlannerTest to include each TPC-DS query as a separate plan
test file. Removed the previous tpcds-all test file.

This means when running only PlannerTest no TPC-DS plans are checked,
however as part of a full frontend test run the TpcdsPlannerTest will be
included.

Runs with cardinality and resource checks, as well as using parquet
tables to include predicate pushdowns.

Change-Id: Ibaf40d8b783be1dc7b62ba3269feb034cb8047da
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
A fe/src/test/java/org/apache/impala/planner/TpcdsPlannerTest.java
D testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q02.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q03.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q12.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q15.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q18.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q20.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q21.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q22.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23b.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q26.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q27.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q28.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q31.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q32.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q35a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q36.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q37.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q38.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q40.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q41.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q43.test
A 

[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-24 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16349

to look at the new patch set (#4).

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix just handles the case where a column's cardinality is zero
however it's nullable and we have null stats to indicate there are null
values, therefore we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
- Q4 172s -> 80s
- Q11 103s -> 77s
  * CardinalityTest
  * TpcdsPlannerTest

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
6 files changed, 766 insertions(+), 766 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/4
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-24 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 4:

(2 comments)

> Patch Set 3:
>
> (1 comment)
>
> I think this makes sense to me - the only question really would be whether 
> this should be broader and increase the NDV in all cases where we know there 
> are nulls. I think fixing this really bad case first is OK though.

Yes I tried the general fix but got some bad plans. The extra change given like 
booleans, we know how many NULL there are in the stats predicates with value 
checks that definitely filter out NULLs need should:
1. Not include the +1 for null in the selectivity.
2. Remove the null value count from cardinality.

That's just a much larger change in terms of plan diffs and corner cases, so 
wanted to get this worst case "base case" in sooner.

http://gerrit.cloudera.org:8080/#/c/16349/3/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/16349/3/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@96
PS3, Line 96: if (numDistinctValues_ == 0 && desc_.getIsNullable() && 
desc_.getStats() != null
> Just wonder if the new logic here should be moved to ::getNumDistinctValues
I was thinking about that but we'd want to check the slot nullability which is 
awkward to do from with the stats object. I'm going to leave it in place like 
this for now.


http://gerrit.cloudera.org:8080/#/c/16349/3/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@97
PS3, Line 97: && (desc_.getStats().hasNulls() ||
> It might be safer to assume that there are nulls in the absence of null sta
Yes it would be rare at the least but should be an easy check.



--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 25 Aug 2020 02:49:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10095: Include query plan tests for all of TPC-DS

2020-08-24 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16345 )

Change subject: IMPALA-10095: Include query plan tests for all of TPC-DS
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16345/3/testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test:

http://gerrit.cloudera.org:8080/#/c/16345/3/testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test@1
PS3, Line 1: # TPCDS-Q6
> TPCDS
Done


http://gerrit.cloudera.org:8080/#/c/16345/3/testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test:

http://gerrit.cloudera.org:8080/#/c/16345/3/testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test@1
PS3, Line 1:
> a instead of -1
Done



--
To view, visit http://gerrit.cloudera.org:8080/16345
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibaf40d8b783be1dc7b62ba3269feb034cb8047da
Gerrit-Change-Number: 16345
Gerrit-PatchSet: 5
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 25 Aug 2020 00:50:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10095: Include query plan tests for all of TPC-DS

2020-08-24 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16345

to look at the new patch set (#5).

Change subject: IMPALA-10095: Include query plan tests for all of TPC-DS
..

IMPALA-10095: Include query plan tests for all of TPC-DS

Added TpcdsPlannerTest to include each TPC-DS query as a separate plan
test file. Removed the previous tpcds-all test file.

This means when running only PlannerTest no TPC-DS plans are checked,
however as part of a full frontend test run the TpcdsPlannerTest will be
included.

Runs with cardinality and resource checks, as well as using parquet
tables to include predicate pushdowns.

Change-Id: Ibaf40d8b783be1dc7b62ba3269feb034cb8047da
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
A fe/src/test/java/org/apache/impala/planner/TpcdsPlannerTest.java
D testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q02.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q03.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q12.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q15.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q18.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q20.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q21.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q22.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23b.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q26.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q27.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q28.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q30.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q31.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q32.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q35a.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q36.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q37.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q38.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39a.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39b.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q40.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q41.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
A testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q43.test
A 

[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-29 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16349

to look at the new patch set (#10).

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix just handles the case where a column's cardinality is zero
however it's nullable and we have null stats to indicate there are null
values, therefore we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
- Q4 172s -> 80s
- Q11 103s -> 77s
  * CardinalityTest
  * TpcdsPlannerTest

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
7 files changed, 787 insertions(+), 775 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/10
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 10
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10112: Remove FpRateTooHigh() check for blom filter

2020-09-23 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16499 )

Change subject: IMPALA-10112: Remove FpRateTooHigh() check for blom filter
..


Patch Set 1:

(4 comments)

Thanks for implementing and testing this out Riza!

http://gerrit.cloudera.org:8080/#/c/16499/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16499/1//COMMIT_MSG@24
PS1, Line 24: inacurate
nit: inaccurate


http://gerrit.cloudera.org:8080/#/c/16499/1//COMMIT_MSG@9
PS1, Line 9: This patch remove FpRateTooHigh() check for bloom filter that can
   : disable filter if the observed false-positive probability (FPP) 
rate is
   : higher than FLAGS_max_filter_error_rate. Such filter with high FPP 
rate
   : is still worth to evaluate for several reasons:
   :
   : 1. Partition filters are probably still worth evaluating even if 
there
   :are false positives, because it's cheap and eliminating a 
partition
   :is still beneficial.
   : 2. Runtime filters are dynamically disabled on the scan side if 
they are
   :ineffective. An always true filter is also still being 
evaluated and
   :not entirely free.
   : 3. The disabling is fairly unlikely to kick in for partitioned 
joins
   :because it's only applied to a small subset of the filter, 
before the
   :Or() operation.
   : 4. FpRateTooHigh() use num_build_rows to approximate actual FPP 
rate of
   :resulting filter. This can be inacurate because it does not take
   :account of duplicate values of the filter key on the build side.
   :
   : This patch also remove some tests in test_runtime_filters.py that 
check
   : cancellation of filters having high FPP rate.
nit: The grammar here is a little hard to parse, might want to run through it 
again or pass it through a grammar checker?


http://gerrit.cloudera.org:8080/#/c/16499/1//COMMIT_MSG@33
PS1, Line 33: little to no performance regression
If you have the performance numbers handy it would be good to include them in 
the commit message to aid any future readers; however it's not critical if 
rerunning the benchmark is required.


http://gerrit.cloudera.org:8080/#/c/16499/1/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/16499/1/be/src/exec/partitioned-hash-join-builder.cc@a941
PS1, Line 941:
Is the always_true_filter dead code now or is it used elsewhere?



--
To view, visit http://gerrit.cloudera.org:8080/16499
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9f8f40764b4f6664cc81b0da428afea8e3588d4
Gerrit-Change-Number: 16499
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 24 Sep 2020 03:36:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10185 Use bool stats for selectivity calculations.

2020-09-24 Thread Shant Hovsepian (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16462

to look at the new patch set (#2).

Change subject: IMPALA-10185 Use bool stats for selectivity calculations.
..

IMPALA-10185 Use bool stats for selectivity calculations.

Factor in numTrues and numFalses stats when computing selectivity for
boolean columns.

Testing:
  * New test method in ExprCardinalityTest

Change-Id: I95c1c7c915bf6bca13fe006c0531c33988187d12
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
2 files changed, 39 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/16462/2
--
To view, visit http://gerrit.cloudera.org:8080/16462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I95c1c7c915bf6bca13fe006c0531c33988187d12
Gerrit-Change-Number: 16462
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 


[Impala-ASF-CR] IMPALA-10185 Use bool stats for selectivity calculations.

2020-09-24 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16462 )

Change subject: IMPALA-10185 Use bool stats for selectivity calculations.
..


Patch Set 2:

Just a small change nothing urgent.


--
To view, visit http://gerrit.cloudera.org:8080/16462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95c1c7c915bf6bca13fe006c0531c33988187d12
Gerrit-Change-Number: 16462
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 24 Sep 2020 06:23:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5022 part 1: Implement core functions of outer join simplification

2020-09-22 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022 part 1: Implement core functions of outer join 
simplification
..


Patch Set 22: Code-Review+1

Xianqing, very nice! This is a very helpful patch.


--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e
Gerrit-Change-Number: 16266
Gerrit-PatchSet: 22
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 22 Sep 2020 12:24:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10112: Remove FpRateTooHigh() check for bloom filter

2020-09-25 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16499 )

Change subject: IMPALA-10112: Remove FpRateTooHigh() check for bloom filter
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/16499
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9f8f40764b4f6664cc81b0da428afea8e3588d4
Gerrit-Change-Number: 16499
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 25 Sep 2020 22:25:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8291: Show constraints in DESCRIBE FORMATTED

2020-09-14 Thread Shant Hovsepian (Code Review)
Hello Quanlong Huang, Vihang Karajgaonkar, David Rorke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16428

to look at the new patch set (#2).

Change subject: IMPALA-8291: Show constraints in DESCRIBE FORMATTED
..

IMPALA-8291: Show constraints in DESCRIBE FORMATTED

Support for displaying primary and foreign key constraints in describe
formatted output. The output attempts to be as close to Hive's
implementation as possible.

Also includes constraint definitions for the TPC-DS test workload.

Testing:
  * Fresh load of testdata
  * Metadata query tests comparing the output between Impala and Hive

Change-Id: I676b69c465c46491f870d7fdc894e7474c030356
---
M 
fe/src/compat-hive-3/java/org/apache/impala/compat/HiveMetadataFormatUtils.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M testdata/datasets/tpcds/tpcds_schema_template.sql
M tests/metadata/test_metadata_query_statements.py
5 files changed, 788 insertions(+), 533 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16428/2
--
To view, visit http://gerrit.cloudera.org:8080/16428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I676b69c465c46491f870d7fdc894e7474c030356
Gerrit-Change-Number: 16428
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9979: part 2: partitioned top-n

2020-09-14 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16242 )

Change subject: IMPALA-9979: part 2: partitioned top-n
..


Patch Set 17: Code-Review+1

(2 comments)

I don't feel as strongly about refactoring TopNNode in the be as other 
reviewers.

This is a very thorough implementation, bravo!

http://gerrit.cloudera.org:8080/#/c/16242/17/be/src/exec/topn-node.cc
File be/src/exec/topn-node.cc:

http://gerrit.cloudera.org:8080/#/c/16242/17/be/src/exec/topn-node.cc@541
PS17, Line 541:   sort(sorted_heaps.begin(), sorted_heaps.end(),
  :   [](const PartitionHeapMap::iterator& left,
  :   const PartitionHeapMap::iterator& right) {
  : int64_t left_discarded = 
left->second->num_tuples_discarded();
  : int64_t right_discarded = 
right->second->num_tuples_discarded();
  : if (left_discarded != right_discarded) {
  :   return left_discarded < right_discarded;
  : }
  : return left->second->num_tuples_added_since_eviction() <
  : right->second->num_tuples_added_since_eviction();
  :   }
Nice, very nice.


http://gerrit.cloudera.org:8080/#/c/16242/17/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
File fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java:

http://gerrit.cloudera.org:8080/#/c/16242/17/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java@82
PS17, Line 82: public static String RANK = "rank";
 :   public static String DENSERANK = "dense_rank";
 :   public static String ROWNUMBER = "row_number";
nit: a bit of an eye soar, would you mind moving the public members to the 
begining or end of the block.



--
To view, visit http://gerrit.cloudera.org:8080/16242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5
Gerrit-Change-Number: 16242
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 15 Sep 2020 01:57:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7097 Print EC info in the query plan and profile

2020-10-26 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16587 )

Change subject: IMPALA-7097 Print EC info in the query plan and profile
..


Patch Set 7: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/16587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ea378914624a714fde820d290b3b9c43325c6a1
Gerrit-Change-Number: 16587
Gerrit-PatchSet: 7
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Comment-Date: Mon, 26 Oct 2020 16:33:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.

2020-07-16 Thread Shant Hovsepian (Code Review)
Hello David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16123

to look at the new patch set (#6).

Change subject: IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.
..

IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.

INTERSECT and EXCEPT set operations are implemented as rewrites to
joins. Currently only the DISTINCT qualified operators are implemented,
not ALL qualified. The operator MINUS is supported as an alias for
EXCEPT.

We mimic Hive's non-standard implementation which treats all operators
with the same precedence, as opposed to the SQL Standard of giving
INTERSECT higher precedence.

A new class SetOperationStmt was created to encompass the previous
UnionStmt behavior. UnionStmt is preserved as a special case of union
only operands to ensure compatibility with previous union planning
behavior.

Tests:
* Added parser and analyzer tests.
* Ensured no test failures or plan changes for union tests.
* Added TPC-DS queries 14,38,87 to functional and planner tests.
* Added functional tests test_intersect test_except

Change-Id: I5be46f824217218146ad48b30767af0fc7edbc0f
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
A fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/analysis/ValuesStmt.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
A testdata/workloads/functional-query/queries/QueryTest/except.test
A testdata/workloads/functional-query/queries/QueryTest/intersect.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q14-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q14-2.test
A testdata/workloads/tpcds/queries/tpcds-q14-1.test
A testdata/workloads/tpcds/queries/tpcds-q14-2.test
A testdata/workloads/tpcds/queries/tpcds-q38.test
A testdata/workloads/tpcds/queries/tpcds-q87.test
M testdata/workloads/tpch_nested/tpch_nested_core.csv
M tests/query_test/test_queries.py
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
29 files changed, 3,725 insertions(+), 790 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/16123/6
--
To view, visit http://gerrit.cloudera.org:8080/16123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5be46f824217218146ad48b30767af0fc7edbc0f
Gerrit-Change-Number: 16123
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR](asf-site) Add link to slack channel on community

2020-07-19 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16024 )

Change subject: Add link to slack channel on community
..


Patch Set 1: Code-Review+1

> Patch Set 1:
>
> Could we just use the-asf.slack.com and create an #impala channel? It looks 
> like that space is already active with other projects like Beam and I think 
> the space is blessed by ASF infra.

The current slack channel has a decent number of people in it. Might not be 
worth the switch over to the-asf unless there is some kind of official policy?


--
To view, visit http://gerrit.cloudera.org:8080/16024
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc05665607cc35a967c01660b6f4890ddd7e6a40
Gerrit-Change-Number: 16024
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sun, 19 Jul 2020 12:04:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-03 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16280


Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..

IMPALA-10034: Add remaining TPC-DS queries to workload.

Include remaining TPC-DS queries to the testdata workload definition.

Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
---
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q28.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q35.test
D testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38-rewrite.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q44.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q49.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q66.test
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q8.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q87.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q90.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q93.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
17 files changed, 1,249 insertions(+), 104 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/16280/1
--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-03 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 1:

Adding the missing tpc-ds queries.

Easy spot to compare the query against 
https://github.com/cwida/tpcds-result-reproduction

Assuming decimalv2 will be the default soon, so only added templates for those 
queries. Also there are slight variances from the answer sets due to decimal 
rounding, I assume this is expected behavior but wouldn't mind a second set of 
eyes.


--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 02:23:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-03 Thread Shant Hovsepian (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16280

to look at the new patch set (#2).

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..

IMPALA-10034: Add remaining TPC-DS queries to workload.

Include remaining TPC-DS queries to the testdata workload definition.

Q8 and Q38 were using non standard variants, those have been
replaced by the official query versions. Q35 is using an official
variant. Had to escape a table alias in Q90 as we treat 'AT' as a
reserved keyword.

Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
---
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q28.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q35.test
D testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38-rewrite.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q44.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q49.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q66.test
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q8.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q87.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q90.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q93.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
17 files changed, 1,249 insertions(+), 104 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/16280/2
--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-06 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Fang-Yu Rao, David Rorke, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16280

to look at the new patch set (#4).

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..

IMPALA-10034: Add remaining TPC-DS queries to workload.

Include remaining TPC-DS queries to the testdata workload definition.

Q8 and Q38 were using non standard variants, those have been
replaced by the official query versions. Q35 is using an official
variant. Had to escape a table alias in Q90 as we treat 'AT' as a
reserved keyword.

Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
---
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q28.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q35.test
D testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38-rewrite.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q44.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q49.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q66.test
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q8.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q87.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q90.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q93.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
17 files changed, 1,248 insertions(+), 104 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/16280/4
--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5022: Outer join simplification

2020-08-04 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022: Outer join simplification
..


Patch Set 5:

(10 comments)

Hi Xianqing, thank you so much for this contribution! I'll need to do another 
pass and go over the tests but here are some initial comments.

http://gerrit.cloudera.org:8080/#/c/16266/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16266/5//COMMIT_MSG@17
PS5, Line 17: one null rejecting condition on the inner table.
Consider adding a query option like DISABLE_OUTER_TO_INNER_REWRITE so disable 
this optimization if needed as runtime.


http://gerrit.cloudera.org:8080/#/c/16266/5//COMMIT_MSG@34
PS5, Line 34: * Ran the full set of verifications in Impala Public Jenkins
Please try out TPC-DS Q49 there the LOJ queries in there should be rewritten.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3217
PS5, Line 3217: getWhereClauseConjuncts(
Technically this would include having clause conjuncts as well, so might be 
misleading to name this function getWhereClauseConuncts.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3227
PS5, Line 3227:   }
As a further optimization you could use getEquivClassesOnTuples() to also check 
for null filtering conditions that come as a result of a transitive 
relationship.

For example T1 LEFT OUTER JOIN T2 ON (T1.a = T2.a) JOIN T3 ON (T3.b=T2.b) WHERE 
T3.b > 10;


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3270
PS5, Line 3270: analyzeNoThrow
> For some common SQL functions,  we probably can directly test their existen
Agreed it would be good to have some static expressions that we know won't 
reject nulls for example.

col IS NULL
col1 IS DISTINCT FROM col2

for things like IN and COALESCE you would recursively check the children.

IF and CASE are trickier so you might want to call the BE or just skip those.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3427
PS5, Line 3427: // Recompute the graph since we may need to add 
value-transfer edges based on the
See later comment in Planner, but it might be better to return this and have 
the caller recompute the graph.


http://gerrit.cloudera.org:8080/#/c/16266/4/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/16266/4/fe/src/main/java/org/apache/impala/analysis/Expr.java@980
PS4, Line 980: his instanceof CompoundPredicate
 : && ((CompoundPredicate) this).getOp() == 
CompoundPredicate.Operator.OR
You could use Expr.IS_OR_PREDICATE(this) here.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Expr.java@978
PS5, Line 978:   public List getDisjunctiveConjuncts() {
There is something off about this interface. You assume the caller the first 
time this is called has verified that the predicate is an OR. For example is 
someone called this function with just a plan Expr then it would return the 
Expr back.

You might want to move the IS_OR_PREDICATE call from Analyzer.java#3245 into 
it's own wrapper function, which then calls this method.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@748
PS5, Line 748: // Transform outer join into inner join whenever possible
Might want to use some state in the analyzer to check if any Outer Joins exist 
in the query and only then call this function then. For example 
globalstate_.outerJountTupleIds.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@749
PS5, Line 749: analyzer.simplifyOuterJoins(selectStmt.getTableRefs());
It would be if you returned some indicator that the value transfer graph needs 
to be recomputed. Then recompute the graph here so you can make the time line 
event accordingly.

ctx_.getTimeline().markEvent("Recomputing value transfer graph")

Also if the SingleNodePlanner's valueTransferGraphNeedsUpdate_ was set to true 
you could likely reset it after you recompute the graph.



--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-07 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16280/4/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test
File testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test:

http://gerrit.cloudera.org:8080/#/c/16280/4/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test@33
PS4, Line 33:   select  sum(sales)
> Is there any specific reason why we have two tests for query 23 that are al
The TPCDS query is written as two queries in one file. We separate it out like 
this to make the test diffing easier. The slight difference is just how TPC-DS 
designed this workload.


http://gerrit.cloudera.org:8080/#/c/16280/4/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test
File testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test:

http://gerrit.cloudera.org:8080/#/c/16280/4/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test@45
PS4, Line 45: where i_color = 'peach'
> Is there any specific reason why we have two tests for query 24 that are al
Same as above, the TPC-DS specification has it written as two semi colon 
separated queries.



--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Aug 2020 22:25:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis

2020-06-30 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16112 )

Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
..


Patch Set 10: Code-Review+1

(2 comments)

LGTM, nice approach of keeping the groupingExprs_ in the select stmt.

http://gerrit.cloudera.org:8080/#/c/16112/10/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/16112/10/fe/src/main/cup/sql-parser.cup@3076
PS10, Line 3076: // * GROUP BY a, b, c WITH CUBE - non-standard CUBE syntax 
supported by some systems
Add example of GROUP BY GROUPING SETS ?


http://gerrit.cloudera.org:8080/#/c/16112/10/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java
File fe/src/main/java/org/apache/impala/analysis/GroupByClause.java:

http://gerrit.cloudera.org:8080/#/c/16112/10/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@121
PS10, Line 121:   private void addGroupingID(long id, List groupingExprs) 
throws AnalysisException {
Nit but might be cleaner to use the JDK BitSet class especially down the line 
when filling out the multiagg info.



--
To view, visit http://gerrit.cloudera.org:8080/16112
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f
Gerrit-Change-Number: 16112
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 30 Jun 2020 16:52:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 12: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Jul 2020 04:37:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8954: Uncorrelated scalar subqueries in the select list

2020-07-04 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16007 )

Change subject: IMPALA-8954: Uncorrelated scalar subqueries in the select list
..


Patch Set 8:

(1 comment)

> Patch Set 7: Code-Review+1
>
> (1 comment)
>
> LGTM. One suggestion below for testing.

http://gerrit.cloudera.org:8080/#/c/16007/7/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test:

http://gerrit.cloudera.org:8080/#/c/16007/7/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test@9036
PS7, Line 9036: |--90:EXCHANGE [UNPARTITIONED]
> This query seems to be a special case where both left and right inputs of t
Good point added an end user sample query which illustrates this.



--
To view, visit http://gerrit.cloudera.org:8080/16007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
Gerrit-Change-Number: 16007
Gerrit-PatchSet: 8
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 04 Jul 2020 13:40:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8954: Uncorrelated scalar subqueries in the select list

2020-07-04 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Fang-Yu Rao, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16007

to look at the new patch set (#8).

Change subject: IMPALA-8954: Uncorrelated scalar subqueries in the select list
..

IMPALA-8954: Uncorrelated scalar subqueries in the select list

Extend StmtRewriter with the ability to rewrite scalar subqueries in the
select list into cross joins. Currently the subquery must pass plan-time
checks to determine that it returns a single row which may miss cases
that may be valid at runtime or with more complex evaluation of the
predicate expressions in the planner. Support for correlated subqueries
will be a follow on change.

Testing:
  * Added new analyzer tests, updated previous subquery tests
  * test_queries.py::TestQueries::test_subquery
  * Added test_tpcds_q9 to e2e and planner tests

Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
---
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M testdata/datasets/tpcds/tpcds_schema_template.sql
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
M testdata/workloads/tpcds/queries/count.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test
A testdata/workloads/tpcds/queries/tpcds-q9.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
13 files changed, 1,572 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/16007/8
--
To view, visit http://gerrit.cloudera.org:8080/16007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
Gerrit-Change-Number: 16007
Gerrit-PatchSet: 8
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-06 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 7:

(3 comments)

Grouping Sets are usually useful when saved as temp or logical views, then 
sliced and diced to get to the right level.

Any potential edge case with saving a grouping set query as a logical view?

http://gerrit.cloudera.org:8080/#/c/16140/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16140/7//COMMIT_MSG@28
PS7, Line 28:  27 and 36 sho
https://github.com/cwida/tpcds-result-reproduction/blob/master/answer_sets_nulls_last/36.ans

https://github.com/cwida/tpcds-result-reproduction/blob/master/answer_sets_nulls_last/27.ans


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@394
PS7, Line 394: casingthe
nit: casing__space__the


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
File fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java:

http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java@150
PS7, Line 150:* analysis.
Mention in the comment that it's used for implementing grouping() as an example?



--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 04:59:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8954: Uncorrelated scalar subqueries in the select list

2020-07-04 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Fang-Yu Rao, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16007

to look at the new patch set (#9).

Change subject: IMPALA-8954: Uncorrelated scalar subqueries in the select list
..

IMPALA-8954: Uncorrelated scalar subqueries in the select list

Extend StmtRewriter with the ability to rewrite scalar subqueries in the
select list into cross joins. Currently the subquery must pass plan-time
checks to determine that it returns a single row which may miss cases
that may be valid at runtime or with more complex evaluation of the
predicate expressions in the planner. Support for correlated subqueries
will be a follow on change.

Testing:
  * Added new analyzer tests, updated previous subquery tests
  * test_queries.py::TestQueries::test_subquery
  * Added test_tpcds_q9 to e2e and planner tests

Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
---
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M testdata/datasets/tpcds/tpcds_schema_template.sql
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
M testdata/workloads/tpcds/queries/count.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test
A testdata/workloads/tpcds/queries/tpcds-q9.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
13 files changed, 1,572 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/16007/9
--
To view, visit http://gerrit.cloudera.org:8080/16007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
Gerrit-Change-Number: 16007
Gerrit-PatchSet: 9
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8954: Uncorrelated scalar subqueries in the select list

2020-07-04 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16007 )

Change subject: IMPALA-8954: Uncorrelated scalar subqueries in the select list
..


Patch Set 8:

(11 comments)

Addressed some comments.

http://gerrit.cloudera.org:8080/#/c/16007/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16007/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1113
PS3, Line 1113:
> nit: implemented
Done


http://gerrit.cloudera.org:8080/#/c/16007/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1117
PS3, Line 1117:  *
> Is this comment applicable for this patch ? Looks like this patch is suppor
Done


http://gerrit.cloudera.org:8080/#/c/16007/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1135
PS3, Line 1135:  *
> nit: spelling 'gaurantee'
Done


http://gerrit.cloudera.org:8080/#/c/16007/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1230
PS3, Line 1230:   // rewrite to a LOJ.
> If the outer query has 2 tables joined in the FROM clause, then we add this
Done


http://gerrit.cloudera.org:8080/#/c/16007/5/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16007/5/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1114
PS5, Line 1114: implemeted
> nit: implemented
Done


http://gerrit.cloudera.org:8080/#/c/16007/5/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1126
PS5, Line 1126: flatten
> nit: flattened
Done


http://gerrit.cloudera.org:8080/#/c/16007/5/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1136
PS5, Line 1136: gaurantee
> nit: guarantee
Done


http://gerrit.cloudera.org:8080/#/c/16007/7/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test:

http://gerrit.cloudera.org:8080/#/c/16007/7/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test@9036
PS7, Line 9036: |--90:EXCHANGE [UNPARTITIONED]
> On a related note, we probably *should* be doing a broadcast when the right
Ack


http://gerrit.cloudera.org:8080/#/c/16007/3/testdata/workloads/functional-query/queries/QueryTest/subquery.test
File testdata/workloads/functional-query/queries/QueryTest/subquery.test:

http://gerrit.cloudera.org:8080/#/c/16007/3/testdata/workloads/functional-query/queries/QueryTest/subquery.test@1044
PS3, Line 1044: select id, 1+(select min(id) from functional.alltypessmall)
> Would be useful to also add the Explain plan for some of these tests either
Done


http://gerrit.cloudera.org:8080/#/c/16007/3/testdata/workloads/functional-query/queries/QueryTest/subquery.test@1086
PS3, Line 1086:
> Not sure what this means - without grouping?
Meant that the parent query had aggregation, and the outer and inner 
aggregations weren't interfering. This is a good case for subquery coalescing / 
merging down the line.


http://gerrit.cloudera.org:8080/#/c/16007/3/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test
File testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test:

http://gerrit.cloudera.org:8080/#/c/16007/3/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test@3
PS3, Line 3: select case when (select count(*)
> Cool that  this query is now supported. Can we also add the plan for this u
Done



--
To view, visit http://gerrit.cloudera.org:8080/16007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
Gerrit-Change-Number: 16007
Gerrit-PatchSet: 8
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 04 Jul 2020 19:33:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8984: Uncorrelated scalar subqueries in the select list

2020-07-03 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Fang-Yu Rao, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16007

to look at the new patch set (#7).

Change subject: IMPALA-8984: Uncorrelated scalar subqueries in the select list
..

IMPALA-8984: Uncorrelated scalar subqueries in the select list

Extend StmtRewriter with the ability to rewrite scalar subqueries in the
select list into cross joins. Currently the subquery must pass plan-time
checks to determine that it returns a single row which may miss cases
that may be valid at runtime or with more complex evaluation of the
predicate expressions in the planner. Support for correlated subqueries
will be a follow on change.

Testing:
  * Added new analyzer tests, updated previous subquery tests
  * test_queries.py::TestQueries::test_subquery
  * Added test_tpcds_q9 to e2e and planner tests

Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
---
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M testdata/datasets/tpcds/tpcds_schema_template.sql
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
M testdata/workloads/tpcds/queries/count.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test
A testdata/workloads/tpcds/queries/tpcds-q9.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
12 files changed, 1,438 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/16007/7
--
To view, visit http://gerrit.cloudera.org:8080/16007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
Gerrit-Change-Number: 16007
Gerrit-PatchSet: 7
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8984: Uncorrelated scalar subqueries in the select list

2020-07-03 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16007

to look at the new patch set (#6).

Change subject: IMPALA-8984: Uncorrelated scalar subqueries in the select list
..

IMPALA-8984: Uncorrelated scalar subqueries in the select list

Extend StmtRewriter with the ability to rewrite scalar subqueries in the
select list into cross joins. Currently the subquery must pass plan-time
checks to determine that it returns a single row which may miss cases
that may be valid at runtime or with more complex evaluation of the
predicate expressions in the planner. Support for correlated subqueries
will be a follow on change.

Testing:
  * Added new analyzer tests, updated previous subquery tests
  * test_queries.py::TestQueries::test_subquery
  * Added test_tpcds_q9 to e2e and planner tests

Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
---
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M testdata/datasets/tpcds/tpcds_schema_template.sql
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
M testdata/workloads/tpcds/queries/count.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test
A testdata/workloads/tpcds/queries/tpcds-q9.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
12 files changed, 1,438 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/16007/6
--
To view, visit http://gerrit.cloudera.org:8080/16007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
Gerrit-Change-Number: 16007
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8984: Uncorrelated scalar subqueries in the select list

2020-07-03 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16007 )

Change subject: IMPALA-8984: Uncorrelated scalar subqueries in the select list
..


Patch Set 6:

Fangu-Yu, FYI tweaked the reason table schema to be consistent with references 
to the column from other tables i.e. *_reason_sk


--
To view, visit http://gerrit.cloudera.org:8080/16007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
Gerrit-Change-Number: 16007
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Jul 2020 22:14:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis

2020-07-03 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16112 )

Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
..


Patch Set 13: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16112/10/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java
File fe/src/main/java/org/apache/impala/analysis/GroupByClause.java:

http://gerrit.cloudera.org:8080/#/c/16112/10/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@121
PS10, Line 121:   private void addGroupingID(long id, List groupingExprs) 
throws AnalysisException {
> It's probably easier to leave as-is, if only because enumerating classes fo
Ack



--
To view, visit http://gerrit.cloudera.org:8080/16112
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f
Gerrit-Change-Number: 16112
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Jul 2020 22:04:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.

2020-07-04 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16052

to look at the new patch set (#5).

Change subject: IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.
..

IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.

Support rewriting subqueries in the HAVING clause by nesting the
aggregation query and pulling up the subquery predicates into the outer
WHERE clause.

Testing:
  * New analyzer tests
  * New functional subquery tests
  * Added Q23, Q24 and Q44 to the tpcds workload
  * Ran subquery rewrite tests

Change-Id: I124a58a09a1a47e1222a22d84b54fe7d07844461
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
A testdata/workloads/tpcds/queries/tpcds-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-q44.test
M tests/query_test/test_tpcds_queries.py
12 files changed, 1,192 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/16052/5
--
To view, visit http://gerrit.cloudera.org:8080/16052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I124a58a09a1a47e1222a22d84b54fe7d07844461
Gerrit-Change-Number: 16052
Gerrit-PatchSet: 5
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.

2020-07-04 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16052

to look at the new patch set (#4).

Change subject: IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.
..

IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.

Support rewriting subqueries in the HAVING clause by nesting the
aggregation query and pulling up the subquery predicates into the outer
WHERE clause.

Testing:
  * New analyzer tests
  * New functional subquery tests
  * Added Q23, Q24 and Q44 to the tpcds workload
  * Ran subquery rewrite tests

Change-Id: I124a58a09a1a47e1222a22d84b54fe7d07844461
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
A testdata/workloads/tpcds/queries/tpcds-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-q44.test
M tests/query_test/test_tpcds_queries.py
12 files changed, 1,192 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/16052/4
--
To view, visit http://gerrit.cloudera.org:8080/16052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I124a58a09a1a47e1222a22d84b54fe7d07844461
Gerrit-Change-Number: 16052
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT

2020-07-12 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16171 )

Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
..


Patch Set 5: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/16171
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6
Gerrit-Change-Number: 16171
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 13 Jul 2020 01:39:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-891: INTERSECT and EXCEPT Support

2020-07-01 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16123


Change subject: IMPALA-891: INTERSECT and EXCEPT Support
..

IMPALA-891: INTERSECT and EXCEPT Support

[WIP]
Lots of code cleanup and intersect/except testing.
TPC-DS queries with intersect / except pass
Union tests pass.
TODO: separate methods between UnionStmt & SetOperationStmt
TODO: ValuesStmt in intersect / except

Change-Id: I5be46f824217218146ad48b30767af0fc7edbc0f
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
A fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/analysis/ValuesStmt.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q14a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q14b.test
A testdata/workloads/tpcds/queries/tpcds-q14a.test
A testdata/workloads/tpcds/queries/tpcds-q14b.test
A testdata/workloads/tpcds/queries/tpcds-q38.test
A testdata/workloads/tpcds/queries/tpcds-q87.test
M tests/query_test/test_tpcds_queries.py
22 files changed, 2,243 insertions(+), 589 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/16123/1
--
To view, visit http://gerrit.cloudera.org:8080/16123
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5be46f824217218146ad48b30767af0fc7edbc0f
Gerrit-Change-Number: 16123
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5226: handle single subquery in or predicate

2020-07-08 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16152 )

Change subject: IMPALA-5226: handle single subquery in or predicate
..


Patch Set 6: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16152/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16152/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@604
PS6, Line 604: We could implement by replacing or augment the
nit: having trouble parsing this sentence.


http://gerrit.cloudera.org:8080/#/c/16152/6/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test:

http://gerrit.cloudera.org:8080/#/c/16152/6/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test@3504
PS6, Line 3504: # Basic IN subquery in disjunct.
I'm pretty sure the behavior is correct, but would be nice to have a test where 
the subquery returns NULL.

a = 1 or b in (select NULL), the in predicate should return NULL then the IS 
NOT NULL predicate that is added will return false. In the or we'll then have a 
= 1 or false instead of a = 1 or null. That should be correct, I think the NOT 
IN case is where this breaks but we don't handle that rewrite.


http://gerrit.cloudera.org:8080/#/c/16152/6/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q45.test
File testdata/workloads/tpcds/queries/tpcds-decimal_v2-q45.test:

http://gerrit.cloudera.org:8080/#/c/16152/6/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q45.test@2
PS6, Line 2:  QUERY: TPCDS-Q45
This doesn't look like the qualification query from tpc-ds for q45.

The parameters are on page 115 of 
http://www.tpc.org/tpc_documents_current_versions/pdf/tpc-ds_v2.13.0.pdf

https://github.com/cwida/tpcds-result-reproduction/blob/master/query_qualification/45.sql



--
To view, visit http://gerrit.cloudera.org:8080/16152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64588992901afd7cd885419a0b7f949b0b174976
Gerrit-Change-Number: 16152
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Jul 2020 22:55:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8984: Uncorrelated scalar subqueries in the select list

2020-06-25 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16007

to look at the new patch set (#4).

Change subject: IMPALA-8984: Uncorrelated scalar subqueries in the select list
..

IMPALA-8984: Uncorrelated scalar subqueries in the select list

Extend StmtRewriter with the ability to rewrite scalar subqueries in the
select list into cross joins. Currently the subquery must pass plan-time
checks to determine that it returns a single row which may miss cases
that may be valid at runtime or with more complex evaluation of the
predicate expressions in the planner. Support for correlated subqueries
will be a follow on change.

With this change Q9 of TPC-DS is supported, we now load the 'reasons'
table as part of the TPC-DS workload for use by Q9.

Testing:
  * Added new analyzer tests, updated previous subquery tests
  * test_queries.py::TestQueries::test_subquery

Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
---
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M testdata/datasets/tpcds/tpcds_schema_template.sql
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
M testdata/workloads/tpcds/queries/count.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test
A testdata/workloads/tpcds/queries/tpcds-q9.test
M tests/query_test/test_tpcds_queries.py
11 files changed, 1,455 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/16007/4
--
To view, visit http://gerrit.cloudera.org:8080/16007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
Gerrit-Change-Number: 16007
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8984: Uncorrelated scalar subqueries in the select list

2020-06-25 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16007 )

Change subject: IMPALA-8984: Uncorrelated scalar subqueries in the select list
..


Patch Set 4:

(9 comments)

Some more tests and review comments addressed. Still want to get a good test 
run out of jenkins.

http://gerrit.cloudera.org:8080/#/c/16007/3/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/16007/3/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@292
PS3, Line 292: "Invariant violated: Only subqueries that are 
guaranteed to return a "
> nit:  "guaranteed"
Done


http://gerrit.cloudera.org:8080/#/c/16007/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16007/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@937
PS3, Line 937:  * supported in the FROM clause, WHERE clause and SELECT 
list. The rewrite is
> Update this comment for SELECT clause.
Done


http://gerrit.cloudera.org:8080/#/c/16007/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1137
PS3, Line 1137:  *returned per group so a run time cardinality check 
must be applied. An exception
> nit: duplicate 'primary'
Done


http://gerrit.cloudera.org:8080/#/c/16007/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16007/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1117
PS4, Line 1117:  *
Done


http://gerrit.cloudera.org:8080/#/c/16007/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1117
PS4, Line 1117:  *
Done


http://gerrit.cloudera.org:8080/#/c/16007/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1135
PS4, Line 1135:  *
Done


http://gerrit.cloudera.org:8080/#/c/16007/4/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1230
PS4, Line 1230:   // rewrite to a LOJ.
I added a test for this. I know it feels weird but since the slotref for the 
subquery is marked as materialized and the other join queries get bound by the 
USING/ON clause, nothing explodes. Since there are scalar subqueries, the only 
weird situation is if the cardinality of all the joins where equal to 1 then it 
might get reordered but the results would still be correct.

If it were a correlated subquery then we've need to handle things more 
carefully, but that's for a later commit.


http://gerrit.cloudera.org:8080/#/c/16007/4/testdata/workloads/functional-query/queries/QueryTest/subquery.test
File testdata/workloads/functional-query/queries/QueryTest/subquery.test:

http://gerrit.cloudera.org:8080/#/c/16007/4/testdata/workloads/functional-query/queries/QueryTest/subquery.test@1044
PS4, Line 1044: select id, 1+(select min(id) from functional.alltypessmall)
Added the tpc-ds query, it's a pretty complex plan.


http://gerrit.cloudera.org:8080/#/c/16007/4/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test
File testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test:

http://gerrit.cloudera.org:8080/#/c/16007/4/testdata/workloads/tpcds/queries/tpcds-decimal_v2-q9.test@3
PS4, Line 3: select case when (select count(*)
Done



--
To view, visit http://gerrit.cloudera.org:8080/16007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibcf55d26889aa01d69bb85f18c9241dda095fb66
Gerrit-Change-Number: 16007
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 25 Jun 2020 19:38:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9784: Non correlated subqueries in HAVING.

2020-06-25 Thread Shant Hovsepian (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16052

to look at the new patch set (#2).

Change subject: IMPALA-9784: Non correlated subqueries in HAVING.
..

IMPALA-9784: Non correlated subqueries in HAVING.

Support rewriting subqueries in the HAVING clause by nesting the
aggregation query and pulling up the subquery predicates into the outer
WHERE clause.

Testing:
  * New analyzer tests
  * New functional subquery tests
  * Added Q23, Q24 and Q44 to the tpcds workload

Change-Id: I124a58a09a1a47e1222a22d84b54fe7d07844461
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
A testdata/workloads/tpcds/queries/tpcds-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-q44.test
M tests/query_test/test_tpcds_queries.py
11 files changed, 1,123 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/16052/2
--
To view, visit http://gerrit.cloudera.org:8080/16052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I124a58a09a1a47e1222a22d84b54fe7d07844461
Gerrit-Change-Number: 16052
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9784: Non correlated subqueries in HAVING.

2020-06-25 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16052 )

Change subject: IMPALA-9784: Non correlated subqueries in HAVING.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16052/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16052/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@466
PS2, Line 466: // TODO: IMPALA-5100 to cover all cases, we do let 
through runtime scalars with
I relaxed some of these rules to let through subqueries such as (select 
count(a) from t group by b where b=1).
Referenced the jira to enhance the scalar subquery planner checks to handle 
more expression evaluation but for now thought the tradeoff was better to let 
these queries through wrapped in a CardinalityCheckNode.

There are case where two different runtime scalar subqueries in nested query 
blocks could run and have runtime errors that interfere with each other since 
we don't have independent execution, but I checked and many other database 
(hive, vertica, vectorwise) have this kind of behavior. It feels like a 
worthwhile trade off to allow more queries to run where some might have a 
runtime error in an off chance when we'd just otherwise not let the query run 
at all. Also it's needed to support Q44 from TPC-DS.



--
To view, visit http://gerrit.cloudera.org:8080/16052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I124a58a09a1a47e1222a22d84b54fe7d07844461
Gerrit-Change-Number: 16052
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 25 Jun 2020 19:52:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7020: fix costing of non-trivial CAST expressions

2020-06-24 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16073 )

Change subject: IMPALA-7020: fix costing of non-trivial CAST expressions
..


Patch Set 4: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16073/3/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

http://gerrit.cloudera.org:8080/#/c/16073/3/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@285
PS3, Line 285: (inType.isVarchar() || inType.getPrimitiveType() == 
PrimitiveType.STRING)) {
> I think that would be a separate case in SortInfo.getMaterializedSortExprs(
It is a weird pattern, I'd expect lpad or substr to be used more often, but I 
did some query searching and using casts to truncate like this does happen, 
probably some hold over from older SQL behavior. Not critical to address.



--
To view, visit http://gerrit.cloudera.org:8080/16073
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3f1a16fc45191a2eedf38cc243c70173d44806c6
Gerrit-Change-Number: 16073
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 25 Jun 2020 03:25:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9866: Query plan debug page stuck in fetch and render loop.

2020-06-19 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16092 )

Change subject: IMPALA-9866: Query plan debug page stuck in fetch and render 
loop.
..


Patch Set 1:

(2 comments)

Added a field to indicate query completion.

http://gerrit.cloudera.org:8080/#/c/16092/1/www/query_plan.tmpl
File www/query_plan.tmpl:

http://gerrit.cloudera.org:8080/#/c/16092/1/www/query_plan.tmpl@140
PS1, Line 140:   var summary = json["summary"];
> Not sure about this bit. Isn't this set once the query starts running, not
Yes in my testing I was using summary being filled to detect query completion, 
turns it the timing of fetching the response made it seem like this was a good 
indicator, but the summary is always.

Instead added an attribute to the json response to indicate if the query was 
still in flight.


http://gerrit.cloudera.org:8080/#/c/16092/1/www/query_plan.tmpl@229
PS1, Line 229: var intervalId = setInterval(refresh, 1000);
> Probably refreshing every 1s is overkill anyway, since the status report in
I upped it to 2s only because higher values made the animation of the plan 
metrics and color flow very choppy.



--
To view, visit http://gerrit.cloudera.org:8080/16092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
Gerrit-Change-Number: 16092
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 19 Jun 2020 21:59:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9866: Query plan debug page stuck in fetch and render loop.

2020-06-19 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16092 )

Change subject: IMPALA-9866: Query plan debug page stuck in fetch and render 
loop.
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16092/1/www/query_plan.tmpl
File www/query_plan.tmpl:

http://gerrit.cloudera.org:8080/#/c/16092/1/www/query_plan.tmpl@140
PS1, Line 140:   var summary = json["summary"];
> Not sure about this bit. Isn't this set once the query starts running, not
Done


http://gerrit.cloudera.org:8080/#/c/16092/1/www/query_plan.tmpl@229
PS1, Line 229: var intervalId = setInterval(refresh, 1000);
> I upped it to 2s only because higher values made the animation of the plan
Done



--
To view, visit http://gerrit.cloudera.org:8080/16092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
Gerrit-Change-Number: 16092
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 19 Jun 2020 22:00:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9866: Query plan debug page stuck in fetch and render loop.

2020-06-19 Thread Shant Hovsepian (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16092

to look at the new patch set (#2).

Change subject: IMPALA-9866: Query plan debug page stuck in fetch and render 
loop.
..

IMPALA-9866: Query plan debug page stuck in fetch and render loop.

Once a query is completed we stop fetching and rendering the plan. This
speeds interaction with large query plans in the web browser as well as
reduces some load on the query coordinator.

Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
---
M be/src/service/impala-http-handler.cc
M www/query_plan.tmpl
2 files changed, 8 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/16092/2
--
To view, visit http://gerrit.cloudera.org:8080/16092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
Gerrit-Change-Number: 16092
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9866: Query plan debug page stuck in fetch and render loop.

2020-06-17 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16092


Change subject: IMPALA-9866: Query plan debug page stuck in fetch and render 
loop.
..

IMPALA-9866: Query plan debug page stuck in fetch and render loop.

Once a query is completed we stop fetching and rendering the plan. This
speeds interaction with large query plans in the web browser as well as
reduces some load on the query coordinator.

Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
---
M www/query_plan.tmpl
1 file changed, 6 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/16092/1
--
To view, visit http://gerrit.cloudera.org:8080/16092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie1f233c90d5f221813833af2e29be2250936d442
Gerrit-Change-Number: 16092
Gerrit-PatchSet: 1
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.

2020-06-28 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16052

to look at the new patch set (#3).

Change subject: IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.
..

IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.

Support rewriting subqueries in the HAVING clause by nesting the
aggregation query and pulling up the subquery predicates into the outer
WHERE clause.

Testing:
  * New analyzer tests
  * New functional subquery tests
  * Added Q23, Q24 and Q44 to the tpcds workload
  * Ran subquery rewrite tests

Change-Id: I124a58a09a1a47e1222a22d84b54fe7d07844461
---
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/subquery.test
A testdata/workloads/tpcds/queries/tpcds-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-q44.test
M tests/query_test/test_tpcds_queries.py
12 files changed, 1,194 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/16052/3
--
To view, visit http://gerrit.cloudera.org:8080/16052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I124a58a09a1a47e1222a22d84b54fe7d07844461
Gerrit-Change-Number: 16052
Gerrit-PatchSet: 3
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.

2020-06-28 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16052 )

Change subject: IMPALA-9784, IMPALA-9905: Uncorrelated subqueries in HAVING.
..


Patch Set 3:

(4 comments)

CR comments addressed, still working on clean jenkins runs.

http://gerrit.cloudera.org:8080/#/c/16052/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16052/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@465
PS2, Line 465: // TODO: Remove this when independent subquery 
evaluation is implemented.
> Can we add a simple planner test for this rewrite? With the cardinality che
Done


http://gerrit.cloudera.org:8080/#/c/16052/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@466
PS2, Line 466: // TODO: IMPALA-5100 to cover all cases, we do let 
through runtime scalars with
> It might be helpful to file a separate JIRA with an example of this query,
IMPALA-1285 already existed to cover this case, however the way the JIRA is 
worded in such way to cover the case of detecting these situations at plan 
time. I had linked this JIRA with IMPALA-5100 which is the blanket issue for 
runtime scalar checks.

Adding IMPALA-9905 to cover the case of queries that can now run after this 
change.


http://gerrit.cloudera.org:8080/#/c/16052/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@469
PS2, Line 469: presence
> nit: presence
Done


http://gerrit.cloudera.org:8080/#/c/16052/2/testdata/workloads/functional-query/queries/QueryTest/subquery.test
File testdata/workloads/functional-query/queries/QueryTest/subquery.test:

http://gerrit.cloudera.org:8080/#/c/16052/2/testdata/workloads/functional-query/queries/QueryTest/subquery.test@1218
PS2, Line 1218:   from alltypestiny group by id
> nit: remove the functional database name here and below - the test framewor
Done



--
To view, visit http://gerrit.cloudera.org:8080/16052
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I124a58a09a1a47e1222a22d84b54fe7d07844461
Gerrit-Change-Number: 16052
Gerrit-PatchSet: 3
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sun, 28 Jun 2020 18:16:27 +
Gerrit-HasComments: Yes


  1   2   >