[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16146 )

Change subject: IMPALA-9921: Change error messages in checking needsQuotes to 
TRACE level logs
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b
Gerrit-Change-Number: 16146
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 08 Jul 2020 04:40:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16146 )

Change subject: IMPALA-9921: Change error messages in checking needsQuotes to 
TRACE level logs
..

IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs

Impala planner uses the HiveLexer to check whether an ident needs to be
quoted in toSql results. However, HiveLexer will print error messages to
stderr which is redirected to impalad.ERROR, so they appear as ERROR
level logs. Actually, they just mean HiveLexer can't parse the ident so
they are not Hive keywords so don't need to be quoted. These error
messages don't mean anything wrong so shouldn't be ERROR level logs.

This patch overrides the HiveLexer used in ToSqlUtils to log the error
messages to TRACE level logs.

Tests
 * Manually verify the error messages don't appear in impalad.ERROR and
   are printed to TRACE level logs.

Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b
Reviewed-on: http://gerrit.cloudera.org:8080/16146
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
1 file changed, 15 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b
Gerrit-Change-Number: 16146
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 12: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Jul 2020 04:37:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16122 )

Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7
Gerrit-Change-Number: 16122
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 08 Jul 2020 04:07:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16122 )

Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
..

IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState

This is the final patch in the refactor of QuerySchedule for the
single admission controller work. It was kept separate for ease of
reviewing.

This patch renames QuerySchedule and its related classes
(FInstanceExecParams, BackendExecParams, and FragmentExecParams) to
use the name 'ScheduleState', reflecting the fact that they are used
as containers for intermediate info about a query during scheduling.

The messages that are included in the QuerySchedulePB struct retain
the 'ExecParams' name, reflecting the fact that they are used by the
coordinator to start execution.

Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7
Reviewed-on: http://gerrit.cloudera.org:8080/16122
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/scheduling/CMakeLists.txt
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
R be/src/scheduling/schedule-state.cc
R be/src/scheduling/schedule-state.h
M be/src/scheduling/scheduler-test-util.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
9 files changed, 631 insertions(+), 625 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7
Gerrit-Change-Number: 16122
Gerrit-PatchSet: 6
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[native-toolchain-CR] Remove colon from filename

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16139 )

Change subject: Remove colon from filename
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731f9b818376fcc57f9e47a2bd1bc26ca0b41dc5
Gerrit-Change-Number: 16139
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Jul 2020 04:01:43 +
Gerrit-HasComments: No


[native-toolchain-CR] Remove colon from filename

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16139 )

Change subject: Remove colon from filename
..

Remove colon from filename

Some OSes don't allow cloning the repo if it has a filename
with a colon on it.  See
https://github.com/cloudera/native-toolchain/issues/65
for context.

Change-Id: I731f9b818376fcc57f9e47a2bd1bc26ca0b41dc5
Reviewed-on: http://gerrit.cloudera.org:8080/16139
Reviewed-by: Laszlo Gaal 
Tested-by: Tim Armstrong 
---
R 
source/thrift/thrift-0.9.0-patches/0006-Revert-Thrift-p5_IMPALA-3159-Add-SAN-Wildcard-SSL-cert-client-support-in-python.patch
1 file changed, 0 insertions(+), 0 deletions(-)

Approvals:
  Laszlo Gaal: Looks good to me, approved
  Tim Armstrong: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I731f9b818376fcc57f9e47a2bd1bc26ca0b41dc5
Gerrit-Change-Number: 16139
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Remove colon from filename

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16139 )

Change subject: Remove colon from filename
..


Patch Set 2:

I'm not sure - these patches are usually manually generated so not sure what 
happened here.


--
To view, visit http://gerrit.cloudera.org:8080/16139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731f9b818376fcc57f9e47a2bd1bc26ca0b41dc5
Gerrit-Change-Number: 16139
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Jul 2020 04:02:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 12: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Jul 2020 02:56:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16103 )

Change subject: IMPALA-9294: Support DATE for min-max runtime filter
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6526/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c
Gerrit-Change-Number: 16103
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 08 Jul 2020 01:23:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter

2020-07-07 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/16103 )

Change subject: IMPALA-9294: Support DATE for min-max runtime filter
..

IMPALA-9294: Support DATE for min-max runtime filter

Implemented Date min-max filter and applied it to Kudu as other
min-max runtime filters.
Added new test cases for Date min-max filters.

Testing:
Passed all core tests.

Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/runtime/date-value.h
M be/src/util/min-max-filter-ir.cc
M be/src/util/min-max-filter-test.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M common/protobuf/common.proto
M common/thrift/Data.thrift
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test
M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
12 files changed, 295 insertions(+), 167 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/16103/8
--
To view, visit http://gerrit.cloudera.org:8080/16103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c
Gerrit-Change-Number: 16103
Gerrit-PatchSet: 8
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter

2020-07-07 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16103 )

Change subject: IMPALA-9294: Support DATE for min-max runtime filter
..


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test:

http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@168
PS3, Line 168: functional_kudu.date_tbl a
 : join [BROADCAST] functional_kudu.date_tbl b
> While its a little hard to tell and I think that we do have a test that tec
Will add a functional-query test as suggested


http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test:

http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@167
PS7, Line 167: # Query with date-typed join column
> I think this case can be left out - its only showing that we'll generate DA
will remove it


http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@206
PS7, Line 206: # Query with 3-way shuffer joins on date-typed join columns
> I think its reasonable to leave this case in as we don't already have a pla
Will update comments and change the type of one joined columns


http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@256
PS7, Line 256: # Query with 3-way join on date-typed join columns
> I don't think this case is testing anything different from the existing cas
will remove it


http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@312
PS7, Line 312: # Query with 3-way join on date-typed and int-typed join columns
> Same as above, I don't think this case is testing anything different from t
Will remove it.



--
To view, visit http://gerrit.cloudera.org:8080/16103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c
Gerrit-Change-Number: 16103
Gerrit-PatchSet: 7
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 08 Jul 2020 00:46:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16150 )

Change subject: IMPALA-3380: Add a timeout for statestore RPCs
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6525/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2
Gerrit-Change-Number: 16150
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 08 Jul 2020 00:04:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-07-07 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..


Patch Set 14:

> Uploaded patch set 14.


--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 14
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Jul 2020 00:01:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6524/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 14
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 23:55:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16150 )

Change subject: IMPALA-3380: Add a timeout for statestore RPCs
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6523/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2
Gerrit-Change-Number: 16150
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 23:47:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs

2020-07-07 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16150 )

Change subject: IMPALA-3380: Add a timeout for statestore RPCs
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16150/1/be/src/statestore/statestore-subscriber.cc
File be/src/statestore/statestore-subscriber.cc:

http://gerrit.cloudera.org:8080/#/c/16150/1/be/src/statestore/statestore-subscriber.cc@69
PS1, Line 69: DEFINE_int32(statestore_client_rpc_timeout_ms, 30, 
"(Advanced) The underlying TSocket "
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16150/1/tests/custom_cluster/test_rpc_timeout.py
File tests/custom_cluster/test_rpc_timeout.py:

http://gerrit.cloudera.org:8080/#/c/16150/1/tests/custom_cluster/test_rpc_timeout.py@230
PS1, Line 230: #
> flake8: E131 continuation line unaligned for hanging indent
Done



--
To view, visit http://gerrit.cloudera.org:8080/16150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2
Gerrit-Change-Number: 16150
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 23:43:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs

2020-07-07 Thread Sahil Takiar (Code Review)
Hello Thomas Tauber-Marshall, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16150

to look at the new patch set (#2).

Change subject: IMPALA-3380: Add a timeout for statestore RPCs
..

IMPALA-3380: Add a timeout for statestore RPCs

Adds the following startup flags for statestore subscribers:
'statestore_client_rpc_timeout_ms'. The timeout is set to 5 minutes by
default.

Testing:
* Adds some tests for catalog_client_rpc_timeout_ms that validate the
  timeout is used correctly, and that retries are triggered

Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2
---
M be/src/catalog/catalog-server.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/statestore/statestore-subscriber.cc
M tests/custom_cluster/test_rpc_timeout.py
5 files changed, 65 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/16150/2
--
To view, visit http://gerrit.cloudera.org:8080/16150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2
Gerrit-Change-Number: 16150
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16146 )

Change subject: IMPALA-9921: Change error messages in checking needsQuotes to 
TRACE level logs
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6105/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b
Gerrit-Change-Number: 16146
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 07 Jul 2020 23:36:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16146 )

Change subject: IMPALA-9921: Change error messages in checking needsQuotes to 
TRACE level logs
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b
Gerrit-Change-Number: 16146
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 07 Jul 2020 23:36:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-07-07 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#14). ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..

IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

This work addresses the current limitation in computing the total row
count for a Hive table in a scan. The row count can be incorrectly
computed as 0, even though there exists data in the Hive table. This
is the stats corruption at table level. Similar stats corruption
exists for a partition. The row count of a table or a partition
sometime can also be -1 which indicates a missing stats situation.

In the fix, as long as no partition in a Hive table exhibits any
missing or corrupt stats, the total row count for the table is computed
from the row counts in all partitions. Otherwise, Impala looks at
the table level stats particularly the table row count.

In addition, if the table stats is missing or corrupted, Impala
estimates a row count for the table, if feasible. This row count is
the sum of the row count from the partitions with good stats, and
an estimation of the number of rows in the partitions with missing or
corrupt stats. Such estimation also applies when some partition
has missing or corrupt stats.

One way to observe the fix is through the explain of queries scanning
Hive tables with missing or corrupted stats. The cardinality for any
full scan should be a positive value (i.e. the estimated row count),
instead of 'unavailable'.  At the beginning of the explain output,
that table is still listed in the WARNING section for potentially
corrupt table statistics.

Testing:
1. Ran unit tests with queries documented in the case against Hive
   tables with the following configrations:
   a. No stats corruption in any partitions
   b. Stats corruption in some partitions
   c. Stats corruption in all partitions
2. Added two new tests in test_compute_stats.py:
   a. test_corrupted_stats_in_partitioned_Hive_tables
   b. test_corrupted_stats_in_unpartitioned_Hive_tables
3. Fixed failures in corrupt-stats.test
4. Ran "core" test

Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/workloads/functional-query/queries/QueryTest/corrupt-stats.test
M tests/metadata/test_compute_stats.py
3 files changed, 182 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/16098/14
--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 14
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16150 )

Change subject: IMPALA-3380: Add a timeout for statestore RPCs
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16150/1/be/src/statestore/statestore-subscriber.cc
File be/src/statestore/statestore-subscriber.cc:

http://gerrit.cloudera.org:8080/#/c/16150/1/be/src/statestore/statestore-subscriber.cc@69
PS1, Line 69: DEFINE_int32(statestore_client_rpc_timeout_ms, 30, 
"(Advanced) The underlying TSocket "
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/16150/1/tests/custom_cluster/test_rpc_timeout.py
File tests/custom_cluster/test_rpc_timeout.py:

http://gerrit.cloudera.org:8080/#/c/16150/1/tests/custom_cluster/test_rpc_timeout.py@230
PS1, Line 230: #
flake8: E131 continuation line unaligned for hanging indent



--
To view, visit http://gerrit.cloudera.org:8080/16150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2
Gerrit-Change-Number: 16150
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 23:27:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs

2020-07-07 Thread Sahil Takiar (Code Review)
Sahil Takiar has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16150


Change subject: IMPALA-3380: Add a timeout for statestore RPCs
..

IMPALA-3380: Add a timeout for statestore RPCs

Adds the following startup flags for statestore subscribers:
'statestore_client_rpc_timeout_ms'. The timeout is set to 5 minutes by
default.

Testing:
* Adds some tests for catalog_client_rpc_timeout_ms that validate the
  timeout is used correctly, and that retries are triggered

Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2
---
M be/src/catalog/catalog-server.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/statestore/statestore-subscriber.cc
M tests/custom_cluster/test_rpc_timeout.py
5 files changed, 64 insertions(+), 20 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/16150/1
--
To view, visit http://gerrit.cloudera.org:8080/16150
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2
Gerrit-Change-Number: 16150
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 


[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16122 )

Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6104/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7
Gerrit-Change-Number: 16122
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 23:01:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState

2020-07-07 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16122 )

Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
..


Patch Set 4: Code-Review+2

carrying forward


--
To view, visit http://gerrit.cloudera.org:8080/16122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7
Gerrit-Change-Number: 16122
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 23:00:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16122 )

Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7
Gerrit-Change-Number: 16122
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 23:01:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6522/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 22:15:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9898: generate grouping set plans

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16128 )

Change subject: IMPALA-9898: generate grouping set plans
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6521/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16128
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197
Gerrit-Change-Number: 16128
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 22:14:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6520/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 22:03:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16134 )

Change subject: IMPALA-7923: DecimalValue should be marked as packed
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6102/


--
To view, visit http://gerrit.cloudera.org:8080/16134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed
Gerrit-Change-Number: 16134
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 22:04:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16140/12/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16140/12/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@324
PS12, Line 324: // all aggregation classes to the final output produced by 
the transposing aggregation.
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 21:50:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6103/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 21:49:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9898: generate grouping set plans

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16128 )

Change subject: IMPALA-9898: generate grouping set plans
..


Patch Set 11: Code-Review+1

rebased onto latest master


--
To view, visit http://gerrit.cloudera.org:8080/16128
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197
Gerrit-Change-Number: 16128
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 21:49:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 12: Code-Review+1

rebased onto latest master


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 21:49:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Tim Armstrong (Code Review)
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16140

to look at the new patch set (#12).

Change subject: IMPALA-9917: grouping() and grouping_id() support
..

IMPALA-9917: grouping() and grouping_id() support

Implements grouping() and grouping_id() builtins.
grouping_id() has both a no-arg version, which returns
a bit vector of all grouping exprs and a varargs version,
which returns a bit vector of the provided arguments.

Grouping is a keyword, so needs special handling in the
parser to be accepted as a function name.

These functions are implemented in the transpose agg
with a CASE expression similar to other aggregate functions,
but returning the grouping() or grouping_id() value for that
aggregation class instead of an aggregated value.

Testing:
* Added parser test for grouping keyword.
* Added analysis tests for the functions.
* Added basic planner test to show expressions generated
* Added some TPC-DS queries that use grouping() - queries
  80, 70 and 86 using reference .test files from Fang-Yu
  Rao. 27 and 36 were added with reference results from
  https://github.com/cwida/tpcds-result-reproduction
* Add targeted end-to-end tests.
* Added view compatibility test with Hive.

Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java
M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test
M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
19 files changed, 2,684 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/12
--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9898: generate grouping set plans

2020-07-07 Thread Tim Armstrong (Code Review)
Hello Aman Sinha, Quanlong Huang, Fang-Yu Rao, Qifan Chen, David Rorke, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16128

to look at the new patch set (#11).

Change subject: IMPALA-9898: generate grouping set plans
..

IMPALA-9898: generate grouping set plans

Integrates the parsing and analysis with plan generation.

Testing:
* Add analysis test to make sure we reject unsupported queries.
* Added targeted planner tests to ensure we generate the correct
  aggregation classes for a variety of cases.
* Add targeted end-to-end functional tests.

Added five TPC-DS queries that use ROLLUP, building on some work done
by Fang-Yu Rao. Some tweaks were required for these tests.
* Add an extra ORDER BY clause to q77 to make fully deterministic.
* Add backticks around `returns` to avoid reserved word.
* Add INTERVAL keyword to date/timestamp arithmetic.

We can run q80, too, but I haven't added or verified results yet -
that can be done in a follow-up.

Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197
---
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java
M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
A testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q18.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q22.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q5.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q67.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q77.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
22 files changed, 4,208 insertions(+), 66 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16128/11
--
To view, visit http://gerrit.cloudera.org:8080/16128
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197
Gerrit-Change-Number: 16128
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16140/11/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16140/11/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@324
PS11, Line 324: // all aggregation classes to the final output produced by 
the transposing aggregation.
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 21:35:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Tim Armstrong (Code Review)
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16140

to look at the new patch set (#11).

Change subject: IMPALA-9917: grouping() and grouping_id() support
..

IMPALA-9917: grouping() and grouping_id() support

Implements grouping() and grouping_id() builtins.
grouping_id() has both a no-arg version, which returns
a bit vector of all grouping exprs and a varargs version,
which returns a bit vector of the provided arguments.

Grouping is a keyword, so needs special handling in the
parser to be accepted as a function name.

These functions are implemented in the transpose agg
with a CASE expression similar to other aggregate functions,
but returning the grouping() or grouping_id() value for that
aggregation class instead of an aggregated value.

Testing:
* Added parser test for grouping keyword.
* Added analysis tests for the functions.
* Added basic planner test to show expressions generated
* Added some TPC-DS queries that use grouping() - queries
  80, 70 and 86 using reference .test files from Fang-Yu
  Rao. 27 and 36 were added with reference results from
  https://github.com/cwida/tpcds-result-reproduction
* Add targeted end-to-end tests.
* Added view compatibility test with Hive.

Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java
M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test
M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
19 files changed, 2,684 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/11
--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@309
PS10, Line 309:   aggInfos_.get(0).getResultSmap() : new 
ExprSubstitutionMap();
> Should this also clone() the smap like in the 'if' case ?  If so, this assi
I was trying to avoid the clone() when it wasn't necessary but was probably 
overthinking it :) - I doubt it makes a measurable difference.


http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@328
PS10, Line 328: // all aggregation classes to he final output produced by 
the transposing aggregation.
> nit: 'he'->'the'
Done



--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 21:34:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16148 )

Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6519/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61
Gerrit-Change-Number: 16148
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 21:16:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9834: De-flake TestQueryRetries on EC builds

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16149 )

Change subject: IMPALA-9834: De-flake TestQueryRetries on EC builds
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6518/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5c73c2cbd0ef369175856c41f36d4b0de4b8d71
Gerrit-Change-Number: 16149
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 21:14:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16112 )

Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
..

IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis

Supports a single ROLLUP, CUBE or GROUPING SETS clause in
GROUP BY.

Also adds non-standard "WITH ROLLUP" and "WITH CUBE" syntax
that is supported by some other SQL dialects.

This implements basic parsing and validation of the query,
then raises an AnalysisException to report that it is not
supported so that incorrect plans will not be generated.

This patch adds a GroupByClause to each SelectStmt that
contains info about the grouping sets and the original
GROUP BY list. The grouping exprs are still represented
as a List in SelectStmt. Most of the logic, including
statement and expr rewrites, can operate on this list of
expressions without requiring special handling for grouping
sets.

Testing:
* Add Parser test.
* Add toSql() test.
* Added analysis tests to check that analysis accepts or rejects
  queries correctly.

Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f
Reviewed-on: http://gerrit.cloudera.org:8080/16112
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/org/apache/impala/analysis/GroupByClause.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
10 files changed, 759 insertions(+), 23 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16112
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f
Gerrit-Change-Number: 16112
Gerrit-PatchSet: 18
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16112 )

Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
..


Patch Set 17: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16112
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f
Gerrit-Change-Number: 16112
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 20:49:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors

2020-07-07 Thread Sahil Takiar (Code Review)
Hello Thomas Tauber-Marshall, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16148

to look at the new patch set (#2).

Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
..

IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors

Fixed two warnings reported by TSAN. One of them is a data race on
exec_request_. The other is a lock-inversion warning.

The data race on exec_request_ is present because the same exec_request_
is used for the original and retried query. The issue is when a query is
retried it needs to set a new query id in the exec_request_, so retrying
a query requires mutating the exec_request_. However, even after a query
is retried the exec_request_ can still be accessed for the original
query. Specifically, an exec status report from a fragment can still be
propagated even after the query has been cancelled (e.g.
ControlService::ReportExecStatus can still access the original
exec_request_). IMPALA-9199 attempted to be smart about re-using the
TExecRequest for retried queries, but given the race conditions it seems
like a pre-mature optimization. We can re-visit IMPALA-9502 if we think
it actually makes a perf difference.

The lock-inversion warning is between the sharded locks in
ShardedQueryMap (specifically the QueryDriverMap query_driver_map_ in
ImpalaServer) and QueryDriver::client_request_state_lock_. The correct
lock ordering is to acquire the sharded lock in query_driver_map_ and then
to acquire the client_request_state_lock_.
QueryDriver::RetryQueryFromThread reversed the ordering. I wasn't able
to actually produce a deadlock, but fixing the ordering issue was
trivial.

Testing:
* Ran core tests
* Manually checked that the query retry tests are TSAN clean

Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
2 files changed, 20 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/16148/2
--
To view, visit http://gerrit.cloudera.org:8080/16148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61
Gerrit-Change-Number: 16148
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors

2020-07-07 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16148 )

Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.h
File be/src/runtime/query-driver.h:

http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.h@251
PS1, Line 251:   /// The TExecRequest for the retried query. Created in 
'CreateRetriedClientRequestState'.
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.cc
File be/src/runtime/query-driver.cc:

http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.cc@320
PS1, Line 320:   parent_server_, *session, retry_exec_request_.get(), 
request_state->parent_driver());
> line too long (91 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/16148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61
Gerrit-Change-Number: 16148
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 20:47:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9834: De-flake TestQueryRetries on EC builds

2020-07-07 Thread Sahil Takiar (Code Review)
Sahil Takiar has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16149


Change subject: IMPALA-9834: De-flake TestQueryRetries on EC builds
..

IMPALA-9834: De-flake TestQueryRetries on EC builds

This patch makes the following changes to de-flake TestQueryRetries on
EC builds.

It skips any test that scans tpch.lineitem (referred to as a
the _shuffle_heavy_query in TestQueryRetries). This query runs on three
instances during regular builds (HDFS, S3, etc.), but only two instances
on EC builds. This causes some non-deterministism during the test becase
killing an impalad in the mini-cluster won't necessarily cause a retry
to be triggered.

It bumps up the timeout used when waiting for a query to be retried.

It improves the assertion in __get_query_id_from_profile so that it
dumps the full profile when the assertion fails. This should help
debuggability of any test failures that fail in this assertion.

Testing:
* Ran TestQueryRetries locally

Change-Id: Id5c73c2cbd0ef369175856c41f36d4b0de4b8d71
---
M tests/custom_cluster/test_query_retries.py
1 file changed, 11 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16149/1
--
To view, visit http://gerrit.cloudera.org:8080/16149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id5c73c2cbd0ef369175856c41f36d4b0de4b8d71
Gerrit-Change-Number: 16149
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 


[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16148 )

Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6517/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61
Gerrit-Change-Number: 16148
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 20:22:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 10: Code-Review+1

(2 comments)

LGTM.

http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@309
PS10, Line 309:   aggInfos_.get(0).getResultSmap() : new 
ExprSubstitutionMap();
Should this also clone() the smap like in the 'if' case ?  If so, this 
assignment is common to both if-else and can be moved out.


http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@328
PS10, Line 328: // all aggregation classes to he final output produced by 
the transposing aggregation.
nit: 'he'->'the'



--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 20:02:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16148 )

Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.h
File be/src/runtime/query-driver.h:

http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.h@251
PS1, Line 251:   /// The TExecRequest for the retried query. Created in 
'CreateRetriedClientRequestState'.
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.cc
File be/src/runtime/query-driver.cc:

http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.cc@320
PS1, Line 320:   parent_server_, *session, retry_exec_request_.get(), 
request_state->parent_driver());
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/16148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61
Gerrit-Change-Number: 16148
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 19:55:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors

2020-07-07 Thread Sahil Takiar (Code Review)
Sahil Takiar has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16148


Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
..

IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors

Fixed two warnings reported by TSAN. One of them is a data race on
exec_request_. The other is a lock-inversion warning.

The data race on exec_request_ is present because the same exec_request_
is used for the original and retried query. The issue is when a query is
retried it needs to set a new query id in the exec_request_, so retrying
a query requires mutating the exec_request_. However, even after a query
is retried the exec_request_ can still be accessed for the original
query. Specifically, an exec status report from a fragment can still be
propagated even after the query has been cancelled (e.g.
ControlService::ReportExecStatus can still access the original
exec_request_). IMPALA-9199 attempted to be smart about re-using the
TExecRequest for retried queries, but given the race conditions it seems
like a pre-mature optimization. We can re-visit IMPALA-9502 if we think
it actually makes a perf difference.

The lock-inversion warning is between the sharded locks in
ShardedQueryMap (specifically the QueryDriverMap query_driver_map_ in
ImpalaServer) and QueryDriver::client_request_state_lock_. The correct
lock ordering is to acquire the sharded lock in query_driver_map_ and then
to acquire the client_request_state_lock_.
QueryDriver::RetryQueryFromThread reversed the ordering. I wasn't able
to actually produce a deadlock, but fixing the ordering issue was
trivial.

Testing:
* Ran core tests
* Manually checked that the query retry tests are TSAN clean

Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
2 files changed, 17 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/16148/1
--
To view, visit http://gerrit.cloudera.org:8080/16148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61
Gerrit-Change-Number: 16148
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 


[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()

2020-07-07 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16095 )

Change subject: IMPALA-9633: Implement ds_hll_union()
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16095/4/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/16095/4/be/src/exprs/aggregate-functions-ir.cc@1736
PS4, Line 1736: DS_SKETCH_CONFIG, DS_HLL_TYPE
Do these configs matter, or they are overwritten during deserialization? My 
understanding is that we should be able to union sketches with different 
configs. See 
https://datasketches.apache.org/docs/Architecture/SketchCriteria.html / 
Mergeable With Different Size Parameters


http://gerrit.cloudera.org:8080/#/c/16095/4/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
File fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java:

http://gerrit.cloudera.org:8080/#/c/16095/4/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java@1336
PS4, Line 1336: Lists.newArrayList(Type.STRING), Type.STRING, 
Type.STRING,
  : prefix + 
"14DsHllUnionInitEPN10impala_udf15FunctionContextEPNS1_9StringValE",
  : prefix + 
"16DsHllUnionUpdateEPN10impala_udf15FunctionContextERKNS1_9StringValEPS4_",
  : prefix + 
"15DsHllUnionMergeEPN10impala_udf15FunctionContextERKNS1_9StringValEPS4_",
  : prefix + 
"19DsHllUnionSerializeEPN10impala_udf15FunctionContextERKNS1_9StringValE",
  : prefix + 
"18DsHllUnionFinalizeEPN10impala_udf15FunctionContextERKNS1_9StringValE",
  : true, false, true));
nit: indentation



--
To view, visit http://gerrit.cloudera.org:8080/16095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09
Gerrit-Change-Number: 16095
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 19:46:20 +
Gerrit-HasComments: Yes


[native-toolchain-CR] Remove colon from filename

2020-07-07 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16139 )

Change subject: Remove colon from filename
..


Patch Set 1: Code-Review+2

LGTM.
Would there be an easy way to fix the process that emits such a filename? Not 
to stop this patch, but as a future improvement/safeguard against a 
reoccurrence.


--
To view, visit http://gerrit.cloudera.org:8080/16139
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731f9b818376fcc57f9e47a2bd1bc26ca0b41dc5
Gerrit-Change-Number: 16139
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Tue, 07 Jul 2020 19:30:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6516/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 19:27:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6515/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 19:11:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Tim Armstrong (Code Review)
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16140

to look at the new patch set (#10).

Change subject: IMPALA-9917: grouping() and grouping_id() support
..

IMPALA-9917: grouping() and grouping_id() support

Implements grouping() and grouping_id() builtins.
grouping_id() has both a no-arg version, which returns
a bit vector of all grouping exprs and a varargs version,
which returns a bit vector of the provided arguments.

Grouping is a keyword, so needs special handling in the
parser to be accepted as a function name.

These functions are implemented in the transpose agg
with a CASE expression similar to other aggregate functions,
but returning the grouping() or grouping_id() value for that
aggregation class instead of an aggregated value.

Testing:
* Added parser test for grouping keyword.
* Added analysis tests for the functions.
* Added basic planner test to show expressions generated
* Added some TPC-DS queries that use grouping() - queries
  80, 70 and 86 using reference .test files from Fang-Yu
  Rao. 27 and 36 were added with reference results from
  https://github.com/cwida/tpcds-result-reproduction
* Add targeted end-to-end tests.
* Added view compatibility test with Hive.

Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java
M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test
M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
19 files changed, 2,688 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/10
--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Tim Armstrong (Code Review)
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16140

to look at the new patch set (#9).

Change subject: IMPALA-9917: grouping() and grouping_id() support
..

IMPALA-9917: grouping() and grouping_id() support

Implements grouping() and grouping_id() builtins.
grouping_id() has both a no-arg version, which returns
a bit vector of all grouping exprs and a varargs version,
which returns a bit vector of the provided arguments.

Grouping is a keyword, so needs special handling in the
parser to be accepted as a function name.

These functions are implemented in the transpose agg
with a CASE expression similar to other aggregate functions,
but returning the grouping() or grouping_id() value for that
aggregation class instead of an aggregated value.

Testing:
* Added parser test for grouping keyword.
* Added analysis tests for the functions.
* Added basic planner test to show expressions generated
* Added some TPC-DS queries that use grouping() - queries
  80, 70 and 86 using reference .test files from Fang-Yu
  Rao. 27 and 36 should also be runnable but I did not
  have reference results readily available.
* Add targeted end-to-end tests.

Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java
M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test
M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
19 files changed, 2,688 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/9
--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 8:

I added a couple of basic view compatibility tests.


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 19:08:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 8:

Oh I missed the logical view comment, will think about that.


--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 18:45:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Tim Armstrong (Code Review)
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16140

to look at the new patch set (#8).

Change subject: IMPALA-9917: grouping() and grouping_id() support
..

IMPALA-9917: grouping() and grouping_id() support

Implements grouping() and grouping_id() builtins.
grouping_id() has both a no-arg version, which returns
a bit vector of all grouping exprs and a varargs version,
which returns a bit vector of the provided arguments.

Grouping is a keyword, so needs special handling in the
parser to be accepted as a function name.

These functions are implemented in the transpose agg
with a CASE expression similar to other aggregate functions,
but returning the grouping() or grouping_id() value for that
aggregation class instead of an aggregated value.

Testing:
* Added parser test for grouping keyword.
* Added analysis tests for the functions.
* Added basic planner test to show expressions generated
* Added some TPC-DS queries that use grouping() - queries
  80, 70 and 86 using reference .test files from Fang-Yu
  Rao. 27 and 36 should also be runnable but I did not
  have reference results readily available.
* Add targeted end-to-end tests.

Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java
M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
18 files changed, 2,659 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/8
--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 7:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/16140/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16140/7//COMMIT_MSG@28
PS7, Line 28:  27 and 36 sho
> https://github.com/cwida/tpcds-result-reproduction/blob/master/answer_sets_
Added the rollup variants of those two with the same parameters as the 
references results and compared to those reference results. They were the same 
up to rounding and formatting differences.


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@2
PS7, Line 2:
> nit: remove newline
Done


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@352
PS7, Line 352: groupingIdExprs.add(aggExpr);
> The groupingIdExprs list does not get used ?
Done


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@394
PS7, Line 394:   // special-casingthe degenerate case of these functions 
being used with a single
> nit: add space between 'casing', 'the'
Done


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@398
PS7, Line 398:   throw new AnalysisException("Aggregate function '" +
> It's probably ok to reject the query but strictly speaking, grouping() func
I looked at this again and it's not too hard to implement by adding smap 
entries, so I just went and did it - might as well be complete.


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@586
PS7, Line 586:   String name = aggExpr.getFnName().getFunction();
> 'name' variable does not get used.
Done


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@667
PS7, Line 667:* Return the return value for grouping() or grouping_id() for 
a particular aggregation
> nit: remove the second 'return'
Done


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
File fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java:

http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java@150
PS7, Line 150:* analysis.
> Mention in the comment that it's used for implementing grouping() as an exa
Done


http://gerrit.cloudera.org:8080/#/c/16140/7/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test:

http://gerrit.cloudera.org:8080/#/c/16140/7/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test@1169
PS7, Line 1169: # introduce NULLs.
> nit:  not clear what 'introduce nulls' means here.
I think this was a fat finger error. Removed it.



--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 18:44:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 11: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6100/


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 18:17:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter

2020-07-07 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16103 )

Change subject: IMPALA-9294: Support DATE for min-max runtime filter
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test:

http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@168
PS3, Line 168: functional_kudu.date_tbl a
 : join [BROADCAST] functional_kudu.date_tbl b
> It's working too. Will add a new test case.
While its a little hard to tell and I think that we do have a test that 
technically hits this path, it would be good to add a functional-query test 
that is specifically targeted at this, eg. in min_max_filters.test add a "Test 
case 5" with a query like one of the three-way shuffle joins that I'm asking 
you to remove from the functional-planner/min-max-runtime-filters.test


http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test:

http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@167
PS7, Line 167: # Query with date-typed join column
I think this case can be left out - its only showing that we'll generate DATE 
min-max filters now, but we already have tests for that in the functional-query 
tests and we don't already have planner tests for all of the other min-max 
filter types.


http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@206
PS7, Line 206: # Query with 3-way join on date-typed join columns
I think its reasonable to leave this case in as we don't already have a planner 
test for a three-way join with all Kudu tables, but lets rewrite this comment 
to say that's what we're testing, i.e. its not the fact that we have DATE 
columns here that makes this case different from the others (you might even 
change the type of one of the columns just for variety) its the fact that its a 
three way join with two different join nodes generating different runtime 
filters.


http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@256
PS7, Line 256: 00:SCAN KUDU [functional_kudu.date_tbl a]
I don't think this case is testing anything different from the existing case 
above labeled "Query with both Kudu and HDFS targets" and so it can be removed.


http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@312
PS7, Line 312:runtime filters: RF001[min_max] -> a.id_col, RF003[min_max] 
-> a.date_col
Same as above, I don't think this case is testing anything different from the 
existing case above labeled "Query with both Kudu and HDFS targets" and so it 
can be removed.



--
To view, visit http://gerrit.cloudera.org:8080/16103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c
Gerrit-Change-Number: 16103
Gerrit-PatchSet: 6
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 07 Jul 2020 18:07:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9898: generate grouping set plans

2020-07-07 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16128 )

Change subject: IMPALA-9898: generate grouping set plans
..


Patch Set 10:

(1 comment)

> Uploaded patch set 10.

http://gerrit.cloudera.org:8080/#/c/16128/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test:

http://gerrit.cloudera.org:8080/#/c/16128/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test@8835
PS6, Line 8835: 75.61K
> You're right about this - we apply CARDINALITY_FILTER in the planner tests
Good to know. Thanks!



--
To view, visit http://gerrit.cloudera.org:8080/16128
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197
Gerrit-Change-Number: 16128
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 18:03:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState

2020-07-07 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16122 )

Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7
Gerrit-Change-Number: 16122
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Jul 2020 17:16:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16134 )

Change subject: IMPALA-7923: DecimalValue should be marked as packed
..


Patch Set 3:

I was a little scared since Q1 is a decimal-intensive query, but seems to be 
fine when I tried to repro:

Report Generated on 2020-07-07
Run Description: "6c8a3dfc339e43a8992af2ff3429ba5940a061ec vs 
513c19bc0a750960b97f0d4cd14a9bdc8bbd2860"

Cluster Name: UNKNOWN
Lab Run Info: UNKNOWN
Impala Version:  impalad version 4.0.0-SNAPSHOT RELEASE ()
Baseline Impala Version: impalad version 4.0.0-SNAPSHOT RELEASE (2020-07-01)

+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 10.05   | +0.18% | 10.05  | +0.18% 
|
+--+---+-++++

+--+-+---++-++---++---++-+--+
| Workload | Query   | File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval |
+--+-+---++-++---++---++-+--+
| TPCH(30) | TPCH-Q1 | parquet / none / none | 10.05  | 10.03   |   +0.18%  
 |   1.16%   |   0.64%| 20|   -0.00%   | -0.04   | 0.61 |
+--+-+---++-++---++---++-+--+


--
To view, visit http://gerrit.cloudera.org:8080/16134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed
Gerrit-Change-Number: 16134
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 16:57:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16134 )

Change subject: IMPALA-7923: DecimalValue should be marked as packed
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed
Gerrit-Change-Number: 16134
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 16:57:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16134 )

Change subject: IMPALA-7923: DecimalValue should be marked as packed
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6102/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed
Gerrit-Change-Number: 16134
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 16:57:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16146 )

Change subject: IMPALA-9921: Change error messages in checking needsQuotes to 
TRACE level logs
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b
Gerrit-Change-Number: 16146
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 07 Jul 2020 15:42:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis

2020-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16112 )

Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
..


Patch Set 16:

Hit IMPALA-9923


--
To view, visit http://gerrit.cloudera.org:8080/16112
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f
Gerrit-Change-Number: 16112
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 15:38:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16112 )

Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
..


Patch Set 17:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6101/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16112
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f
Gerrit-Change-Number: 16112
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 15:36:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16112 )

Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
..


Patch Set 17: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16112
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f
Gerrit-Change-Number: 16112
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 15:36:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16095 )

Change subject: IMPALA-9633: Implement ds_hll_union()
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6514/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09
Gerrit-Change-Number: 16095
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 15:31:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6513/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Jul 2020 15:29:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-07-07 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..


Patch Set 11:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
File fe/src/main/java/org/apache/impala/catalog/FeFsTable.java:

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java@376
PS10, Line 376:
> Could you add a TODO in this method for IMPALA-9883?
Done


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2010
PS10, Line 2010: getNumFileDescriptors();
> Please change this to getNumFileDescriptors() after we revise its implement
Done


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@56
PS10, Line 56: import org.apache.thrift.TException;
> nit: unused import
Thanks for pointing these out! Done.


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@255
PS10, Line 255:   }
> nit: 4 spaces indent
Done


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@409
PS10, Line 409: public ImmutableList getFileDescriptors() {
  :   if (insertFds_.isEmpty()) return fds_;
  :   List ret = Lists.newArrayListWithCapacity(
  :   insertFds_.size() + deleteFds_.size());
  :   ret.addAll(insertFds_);
  :   ret.addAll(deleteFds_);
  :   return ImmutableList.copyOf(ret);
  : }
  :
  : @Override
  : public ImmutableList 
getInsertFileDescriptors() {
  :   return insertFds_;
  : }
  :
> Need to update these
Done


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java:

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@133
PS10, Line 133: !fileDescriptors_.isEmpty() ||
> nit: It's inefficient to construct the list for this. Maybe change to
Done


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@138
PS10, Line 138:
> nit: It's inefficient to construct the list for this. Maybe change to
Instead of the ternary operator, I rather just sum up the sizes to keep it more 
readable.


http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@45
PS9, Line 45: import org.apache.impala.analysis.NullLiteral;
> Looks like IntelliJ is more intelligent in this :D
Yeah :D It's a shame vscode fails on such a trivial task. But apart from this 
it's a great IDE.


http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1518
PS9, Line 1518:   feTable.getMetaStoreTable().getParameters()));
> PS10 looks good to me. I think the insert files and delete files are used s
I'm OK with this if you think it's good.

Yeah, I was also thinking about merging fileDescriptors_ and 
insertFileDescriptors_. My only concern is that it would be a bit weird for 
genDeleteDeltaPartition(), because it would put the delete delta descriptors 
into the regular descriptors.


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1536
PS10, Line 1536: hdfsTblRef.getDesc(), conjuncts, 
insertDeltaPartitions, hdfsTblRef,
> line too long (93 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
File fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java:

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java@497
PS10, Line 497: () {
> not *Fail* anymore
Done



[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()

2020-07-07 Thread Gabor Kaszab (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16095

to look at the new patch set (#3).

Change subject: IMPALA-9633: Implement ds_hll_union()
..

IMPALA-9633: Implement ds_hll_union()

This function receives a set of sketches produced by ds_hll_sketch()
and merges them into a single sketch.

An example usage is to create a sketch for each partition of a table,
write these sketches to a separate table and based on which partition
the user is interested of the relevant sketches can be union-ed
together to get an estimate. E.g.:
  SELECT
  ds_hll_estimate(ds_hll_union(sketch_col))
  FROM sketch_tbl
  WHERE partition_col=1 OR partition_col=5;

Testing:
  - Apart from the automated tests I added to this patch I also
tested ds_hll_union() on a bigger dataset to check that
serialization, deserialization and merging steps work well. I
took TPCH25.linelitem, created a number of sketches with grouping
by l_shipdate and called ds_hll_union() on those sketches.

Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09
---
M be/src/exprs/CMakeLists.txt
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
A be/src/exprs/datasketches-common.cc
A be/src/exprs/datasketches-common.h
M be/src/exprs/datasketches-functions-ir.cc
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test
8 files changed, 232 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/16095/3
--
To view, visit http://gerrit.cloudera.org:8080/16095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09
Gerrit-Change-Number: 16095
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16082/11/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/16082/11/tests/custom_cluster/test_local_catalog.py@30
PS11, Line 30: from tests.common.skip import (SkipIfHive2, SkipIfCatalogV2, 
SkipIfS3, SkipIfABFS,
flake8: F401 'tests.common.skip.SkipIfCatalogV2' imported but unused



--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Jul 2020 15:01:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-07-07 Thread Zoltan Borok-Nagy (Code Review)
Hello Aman Sinha, Quanlong Huang, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16082

to look at the new patch set (#11).

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..

IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive 
types)

Hive ACID supports row-level DELETE and UPDATE operations on a table.
It achieves it via assigning a unique row-id for each row, and
maintaining two sets of files in a table. The first set is in the
base/delta directories, they contain the INSERTed rows. The second set
of files are in the delete-delta directories, they contain the DELETEd
rows.

(UPDATE operations are implemented via DELETE+INSERT.)

In the filesystem it looks like e.g.:
 * full_acid/delta_001_001_/_0
 * full_acid/delta_002_002_/_0
 * full_acid/delete_delta_003_003_/_0

During scanning we need to return INSERTed rows minus DELETEd rows.
This patch implements it by creating an ANTI JOIN between the INSERT and
DELETE sets. It is a planner-only modification. Every HDFS SCAN
that scans full ACID tables (that also have deleted rows) are converted
to two HDFS SCANs, one for the INSERT deltas, and one for the DELETE
deltas. Then a LEFT ANTI HASH JOIN with BROADCAST distribution mode is
created above them.

Later we can add support for other distribution modes if the performance
requires it. E.g. if we have too many deleted rows then probably we are
better off with PARTITIONED distribution mode. We could estimate the
number of deleted rows by sampling the delete delta files.

The current patch only works for primitive types. I.e. we cannot select
nested data if the table has deleted rows.

Testing:
 * added planner test
 * added e2e tests

Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
---
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test
A testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test
M tests/custom_cluster/test_local_catalog.py
M tests/query_test/test_acid.py
27 files changed, 1,460 insertions(+), 146 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/16082/11
--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] POC: Remote codegen (Milestone 1)

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14735 )

Change subject: POC: Remote codegen (Milestone 1)
..


Patch Set 5:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/6512/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14735
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f
Gerrit-Change-Number: 14735
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 14:16:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()

2020-07-07 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16000 )

Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()
..

IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()

These functions can be used to get cardinality estimates of data
using HLL algorithm from Apache DataSketches. ds_hll_sketch()
receives a dataset, e.g. a column from a table, and returns a
serialized HLL sketch in string format. This can be written to a
table or be fed directly to ds_hll_estimate() that returns the
cardinality estimate for that sketch.

Comparing to ndv() these functions bring more flexibility as once we
fed data to the sketch it can be written to a table and next time we
can save scanning through the dataset and simply return the estimate
using the sketch. This doesn't come for free, however, as perfomance
measurements show that ndv() is 2x-3.5x faster than sketching. On the
other hand if we query the estimate from an existing sketch then the
runtime is negligible.
Another flexibility with these sketches is that they can be merged
together so e.g. if we had saved a sketch for each of the partitions
of a table then they can be combined with each other based on the
query without touching the actual data.
DataSketches HLL is sensitive for the order of the data fed to the
sketch and as a result running these algorithms in Impala gets
non-deterministic results within the error bounds of the algorithm.
In terms of correctness DataSketches HLL is most of the time in 2%
range from the correct result but there are occasional spikes where
the difference is bigger but never goes out of the range of 5%.
Even though the DataSketches HLL algorithm could be parameterized
currently this implementation hard-codes these parameters and use
HLL_4 and lg_k=12.

For more details about Apache DataSketches' HLL implementation see:
https://datasketches.apache.org/docs/HLL/HLL.html

Testing:
 - Added some tests running estimates for small datasets where the
   amount of data is small enough to get the correct results.
 - Ran manual tests on TPCH25.lineitem to compare perfomance with
   ndv(). Depending on data characteristics ndv() appears 2x-3.5x
   faster. The lower the cardinality of the dataset the bigger the
   difference between the 2 algorithms is.
 - Ran manual tests on TPCH25.lineitem and
   functional_parquet.alltypes to compare correctness with ndv(). See
   results above.

Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
Reviewed-on: http://gerrit.cloudera.org:8080/16000
Tested-by: Impala Public Jenkins 
Reviewed-by: Csaba Ringhofer 
---
M be/src/codegen/impala-ir.cc
M be/src/exprs/CMakeLists.txt
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
A be/src/exprs/datasketches-functions-ir.cc
A be/src/exprs/datasketches-functions.h
M be/src/exprs/datasketches-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M testdata/data/README
A testdata/data/hll_sketches_from_hive.parquet
A testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test
A tests/query_test/test_datasketches.py
17 files changed, 505 insertions(+), 8 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Csaba Ringhofer: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/16000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
Gerrit-Change-Number: 16000
Gerrit-PatchSet: 17
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()

2020-07-07 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16000 )

Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()
..


Patch Set 16: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
Gerrit-Change-Number: 16000
Gerrit-PatchSet: 16
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 14:08:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16000 )

Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()
..


Patch Set 16: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
Gerrit-Change-Number: 16000
Gerrit-PatchSet: 16
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 14:01:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] POC: Remote codegen (Milestone 1)

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14735 )

Change subject: POC: Remote codegen (Milestone 1)
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14735/5/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/14735/5/be/src/codegen/llvm-codegen.h@346
PS5, Line 346:   /// If 'async' is true, start executing 'FinalizeModule' in a 
separate thread and return.
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/14735/5/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

http://gerrit.cloudera.org:8080/#/c/14735/5/be/src/service/fe-support.cc@242
PS5, Line 242: status = codegen->FinalizeModule(false, false, nullptr); /// 
TODO: Check whether the arguments are ok.
line too long (106 > 90)


http://gerrit.cloudera.org:8080/#/c/14735/5/testdata/bin/server.py
File testdata/bin/server.py:

http://gerrit.cloudera.org:8080/#/c/14735/5/testdata/bin/server.py@207
PS5, Line 207: if __name__ == "__main__":
flake8: E305 expected 2 blank lines after class or function definition, found 1



--
To view, visit http://gerrit.cloudera.org:8080/14735
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f
Gerrit-Change-Number: 14735
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 13:56:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] POC: Remote codegen (Milestone 1)

2020-07-07 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/14735 )

Change subject: POC: Remote codegen (Milestone 1)
..

POC: Remote codegen (Milestone 1)

Optional remote codegen, set by a query option. Can be used together
with async.

The Impala executor does everything as usual until the point when it
would optimize and compile the LLVM module. Instead, it sends the LLVM
bitcode to the compile server which optimizes it and sends back an
object file (handled in memory, not written to disk). The executor
blocks until it receives the compiled object file.

Note that the customization of the LLVM module (replacing function
calls) is still done by the executor.

The server is a python script that calls llc to compile the code.
It will later be replaced probably by a special impalad that uses the
LLVM API to compile the code.

Communication is not secure.

Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f
---
A be/src/codegen/LLVMSmallVectorMemoryBuffer.h
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
A be/src/codegen/simple_message.h
M be/src/runtime/fragment-state.cc
M be/src/service/fe-support.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M testdata/bin/kill-all.sh
M testdata/bin/run-all.sh
A testdata/bin/server.py
13 files changed, 611 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/14735/5
--
To view, visit http://gerrit.cloudera.org:8080/14735
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f
Gerrit-Change-Number: 14735
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 13:08:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6100/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 11
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 13:08:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6511/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 10
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 11:56:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/exprs/timestamp-functions.cc
File be/src/exprs/timestamp-functions.cc:

http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/exprs/timestamp-functions.cc@154
PS7, Line 154: // The purpose of making this .cc only is to avoid moving the 
code of
> I'd rather say something to indicate that the purpose of making this .cc on
Done


http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/runtime/datetime-simple-date-format-parser.cc
File be/src/runtime/datetime-simple-date-format-parser.cc:

http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/runtime/datetime-simple-date-format-parser.cc@56
PS7, Line 56: Tokenize(_DATE_TIME_CTX[i], PARSE);
> It's no longer needed to give the 3rd parameter in case it's true, right? L
Done



--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 10
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 11:27:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps

2020-07-07 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/15866 )

Change subject: IMPALA-9531: Dropped support for dateless timestamps
..

IMPALA-9531: Dropped support for dateless timestamps

Removed the support for dateless timestamps.
During dateless timestamp casts if the format doesn't contain
date part we get an error during tokenization of the format.
If the input str doesn't contain a date part then we get null result.

Examples:
select cast('01:02:59' as timestamp);
This will come back as NULL value.

select to_timestamp('01:01:01', 'HH:mm:ss');
select cast('01:02:59' as timestamp format 'HH12:MI:SS');
select cast('12 AM' as timestamp FORMAT 'AM.HH12');
These will come back with a parsing errors.

Casting from a table will generate similar results.

Testing:
Modified the previous tests related to dateless timestamps.
Added test to read fromtables which are still containing dateless
timestamps and covered timestamp to string path when no date tokens
are requested in the output string.

Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
---
M be/src/benchmarks/convert-timestamp-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/date-parse-util.cc
M be/src/runtime/date-test.cc
M be/src/runtime/datetime-iso-sql-format-tokenizer.cc
M be/src/runtime/datetime-parser-common.cc
M be/src/runtime/datetime-parser-common.h
M be/src/runtime/datetime-simple-date-format-parser.cc
M be/src/runtime/datetime-simple-date-format-parser.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M bin/rat_exclude_files.txt
M common/function-registry/impala_functions.py
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/data/README
A testdata/data/dateless_timestamps.parq
A testdata/data/dateless_timestamps.txt
M testdata/data/lazy_timestamp.csv
M testdata/workloads/functional-query/queries/QueryTest/date.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test
A 
testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M 
testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test
M tests/data_errors/test_data_errors.py
M tests/query_test/test_cast_with_format.py
M tests/query_test/test_scanners.py
34 files changed, 277 insertions(+), 231 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/10
--
To view, visit http://gerrit.cloudera.org:8080/15866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322
Gerrit-Change-Number: 15866
Gerrit-PatchSet: 10
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16112 )

Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
..


Patch Set 16: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6097/


--
To view, visit http://gerrit.cloudera.org:8080/16112
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f
Gerrit-Change-Number: 16112
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 10:28:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed

2020-07-07 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16134 )

Change subject: IMPALA-7923: DecimalValue should be marked as packed
..


Patch Set 3:

I did the test on parquet/none/none, scale factor 15. Copied the results to the 
Jira issue.
TPCH-Q1 became a regression.


--
To view, visit http://gerrit.cloudera.org:8080/16134
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed
Gerrit-Change-Number: 16134
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 10:20:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)

2020-07-07 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16082 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified 
tables (primitive types)
..


Patch Set 10:

(9 comments)

Thanks for making the change! I'm still looking into the join related logic and 
tests.

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
File fe/src/main/java/org/apache/impala/catalog/FeFsTable.java:

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java@376
PS10, Line 376: getFilesSample
Could you add a TODO in this method for IMPALA-9883?


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2010
PS10, Line 2010: getFileDescriptors().size()
Please change this to getNumFileDescriptors() after we revise its 
implementation.


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@56
PS10, Line 56: import org.apache.orc.FileMetadata;
nit: unused import


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@255
PS10, Line 255:
nit: 4 spaces indent


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java:

http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@133
PS10, Line 133: !getFileDescriptors().isEmpty()
nit: It's inefficient to construct the list for this. Maybe change to

 return !fileDescriptors_.isEmpty() || !insertFileDescriptors_.isEmpty() || 
!deleteFileDescriptors_.isEmpty().


http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@138
PS10, Line 138: getFileDescriptors().size()
nit: It's inefficient to construct the list for this. Maybe change to

 return fileDescriptors_.isEmpty()? insertFileDescriptors_.size() + 
deleteFileDescriptors_.size() : fileDescriptors_.size();


http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@45
PS9, Line 45: import org.apache.impala.analysis.NullLiteral;
> Yeah, I use my IDE to do that for me. Interestingly, even if I delete this
Looks like IntelliJ is more intelligent in this :D


http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1518
PS9, Line 1518:   feTable.getMetaStoreTable().getParameters()));
> I implemented it in PS10, but I'm afraid that the change became a bit too i
PS10 looks good to me. I think the insert files and delete files are used 
separately so it makes sense to store them in different fileds. However, It'd 
be good to ask more feedbacks on this decision.

Maybe we can merge fileDescriptors_ and insertFileDescriptors_ to one field 
(since only one of them is valid at a time) to reduce the complexity.


http://gerrit.cloudera.org:8080/#/c/16082/10/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/16082/10/tests/custom_cluster/test_local_catalog.py@467
PS10, Line 467:   def test_full_acid_support(self):
Could you add a test here for reading functional_orc_def.alltypes_deleted_rows 
in LocalCatalog mode?



--
To view, visit http://gerrit.cloudera.org:8080/16082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Gerrit-Change-Number: 16082
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Jul 2020 09:59:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16000 )

Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()
..


Patch Set 15:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6510/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
Gerrit-Change-Number: 16000
Gerrit-PatchSet: 15
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 09:25:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16000 )

Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()
..


Patch Set 16:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6099/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
Gerrit-Change-Number: 16000
Gerrit-PatchSet: 16
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 08:57:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()

2020-07-07 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16000 )

Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()
..


Patch Set 15:

Apparently, test_ddl.py/test_functions_ddl failed with IllegalStateException 
because for unsupported functions there were some unset members.


--
To view, visit http://gerrit.cloudera.org:8080/16000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
Gerrit-Change-Number: 16000
Gerrit-PatchSet: 15
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 08:57:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()

2020-07-07 Thread Gabor Kaszab (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16000

to look at the new patch set (#15).

Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()
..

IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()

These functions can be used to get cardinality estimates of data
using HLL algorithm from Apache DataSketches. ds_hll_sketch()
receives a dataset, e.g. a column from a table, and returns a
serialized HLL sketch in string format. This can be written to a
table or be fed directly to ds_hll_estimate() that returns the
cardinality estimate for that sketch.

Comparing to ndv() these functions bring more flexibility as once we
fed data to the sketch it can be written to a table and next time we
can save scanning through the dataset and simply return the estimate
using the sketch. This doesn't come for free, however, as perfomance
measurements show that ndv() is 2x-3.5x faster than sketching. On the
other hand if we query the estimate from an existing sketch then the
runtime is negligible.
Another flexibility with these sketches is that they can be merged
together so e.g. if we had saved a sketch for each of the partitions
of a table then they can be combined with each other based on the
query without touching the actual data.
DataSketches HLL is sensitive for the order of the data fed to the
sketch and as a result running these algorithms in Impala gets
non-deterministic results within the error bounds of the algorithm.
In terms of correctness DataSketches HLL is most of the time in 2%
range from the correct result but there are occasional spikes where
the difference is bigger but never goes out of the range of 5%.
Even though the DataSketches HLL algorithm could be parameterized
currently this implementation hard-codes these parameters and use
HLL_4 and lg_k=12.

For more details about Apache DataSketches' HLL implementation see:
https://datasketches.apache.org/docs/HLL/HLL.html

Testing:
 - Added some tests running estimates for small datasets where the
   amount of data is small enough to get the correct results.
 - Ran manual tests on TPCH25.lineitem to compare perfomance with
   ndv(). Depending on data characteristics ndv() appears 2x-3.5x
   faster. The lower the cardinality of the dataset the bigger the
   difference between the 2 algorithms is.
 - Ran manual tests on TPCH25.lineitem and
   functional_parquet.alltypes to compare correctness with ndv(). See
   results above.

Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
---
M be/src/codegen/impala-ir.cc
M be/src/exprs/CMakeLists.txt
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
A be/src/exprs/datasketches-functions-ir.cc
A be/src/exprs/datasketches-functions.h
M be/src/exprs/datasketches-test.cc
M be/src/exprs/scalar-expr-evaluator.cc
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M testdata/data/README
A testdata/data/hll_sketches_from_hive.parquet
A testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test
A tests/query_test/test_datasketches.py
17 files changed, 505 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/16000/15
--
To view, visit http://gerrit.cloudera.org:8080/16000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
Gerrit-Change-Number: 16000
Gerrit-PatchSet: 15
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support

2020-07-07 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16140 )

Change subject: IMPALA-9917: grouping() and grouping_id() support
..


Patch Set 7:

(7 comments)

Mostly minor comments.

http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@2
PS7, Line 2:
nit: remove newline


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@352
PS7, Line 352: groupingIdExprs.add(aggExpr);
The groupingIdExprs list does not get used ?


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@394
PS7, Line 394:   // special-casingthe degenerate case of these functions 
being used with a single
nit: add space between 'casing', 'the'


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@398
PS7, Line 398:   throw new AnalysisException("Aggregate function '" +
It's probably ok to reject the query but strictly speaking, grouping() function 
can occur with just a simple GROUP BY also without grouping sets.  Hive 
actually supports it. All grouping() values show up as 0 in that case.


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@586
PS7, Line 586:   String name = aggExpr.getFnName().getFunction();
'name' variable does not get used.


http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@667
PS7, Line 667:* Return the return value for grouping() or grouping_id() for 
a particular aggregation
nit: remove the second 'return'


http://gerrit.cloudera.org:8080/#/c/16140/7/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test:

http://gerrit.cloudera.org:8080/#/c/16140/7/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test@1169
PS7, Line 1169: # introduce NULLs.
nit:  not clear what 'introduce nulls' means here.



--
To view, visit http://gerrit.cloudera.org:8080/16140
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6
Gerrit-Change-Number: 16140
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Jul 2020 07:02:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16000 )

Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()
..


Patch Set 14: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
Gerrit-Change-Number: 16000
Gerrit-PatchSet: 14
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 06:16:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()

2020-07-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16000 )

Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate()
..


Patch Set 14:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6098/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe
Gerrit-Change-Number: 16000
Gerrit-PatchSet: 14
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Jul 2020 06:16:39 +
Gerrit-HasComments: No