[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16278 )

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 05 Aug 2020 05:58:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16286 )

Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6229/


--
To view, visit http://gerrit.cloudera.org:8080/16286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08
Gerrit-Change-Number: 16286
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 05:29:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10047: Revert core piece of IMPALA-6984

2020-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16288 )

Change subject: IMPALA-10047: Revert core piece of IMPALA-6984
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16288
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
Gerrit-Change-Number: 16288
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 03:40:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16252 )

Change subject: IMPALA-9988 (part 2): Integrate ldap filters and 
impala.doas.user
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070
Gerrit-Change-Number: 16252
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 03:32:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10047: Revert core piece of IMPALA-6984

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16288 )

Change subject: IMPALA-10047: Revert core piece of IMPALA-6984
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6793/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16288
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
Gerrit-Change-Number: 16288
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 01:14:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] Add logging when query unregisters

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16285 )

Change subject: Add logging when query unregisters
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6228/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0
Gerrit-Change-Number: 16285
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 01:03:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] Add logging when query unregisters

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16285 )

Change subject: Add logging when query unregisters
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0
Gerrit-Change-Number: 16285
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 01:03:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16286 )

Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6229/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08
Gerrit-Change-Number: 16286
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 01:03:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16286 )

Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08
Gerrit-Change-Number: 16286
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 01:03:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] Add logging when query unregisters

2020-08-04 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16285 )

Change subject: Add logging when query unregisters
..


Patch Set 2: Code-Review+2

Carrying forward Tim's +2


--
To view, visit http://gerrit.cloudera.org:8080/16285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0
Gerrit-Change-Number: 16285
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 01:02:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] Add logging when query unregisters

2020-08-04 Thread Bikramjeet Vig (Code Review)
Hello Sahil Takiar, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16285

to look at the new patch set (#2).

Change subject: Add logging when query unregisters
..

Add logging when query unregisters

This adds a log line which is printed when a query is successfully
unregistered by the async unregister thread pool. Added only for
additional observability.

Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0
---
M be/src/service/impala-server.cc
1 file changed, 6 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/16285/2
--
To view, visit http://gerrit.cloudera.org:8080/16285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0
Gerrit-Change-Number: 16285
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16286 )

Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6792/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08
Gerrit-Change-Number: 16286
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 00:50:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] Add logging when query unregisteres

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16285 )

Change subject: Add logging when query unregisteres
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6791/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0
Gerrit-Change-Number: 16285
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 00:50:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10047: Revert core piece of IMPALA-6984

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16288 )

Change subject: IMPALA-10047: Revert core piece of IMPALA-6984
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6227/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16288
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
Gerrit-Change-Number: 16288
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 00:50:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10047: Revert core piece of IMPALA-6984

2020-08-04 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16288 )


Change subject: IMPALA-10047: Revert core piece of IMPALA-6984
..

IMPALA-10047: Revert core piece of IMPALA-6984

Performance testing on TPC-DS found a performance regression
on short queries due to delayed exec status reports. Further
testing traced this back to IMPALA-6984's behavior of
cancelling backends on EOS. The coordinator logs show the
CancelBackends() call intermittently taking 10 seconds due
to timing out in the RPC layer.

As a temporary workaround, this reverts the core part of
IMPALA-6984 that added that CancelBackends() call for EOS.
It leaves the rest of IMPALA-6984 intact, as other code has built
on top of it.

Testing:
 - Core job
 - Performance tests

Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
(cherry picked from commit b91f3c0e064d592f3cdf2a2e089ca6546133ba55)
---
M be/src/runtime/coordinator.cc
1 file changed, 1 insertion(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/16288/1
--
To view, visit http://gerrit.cloudera.org:8080/16288
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
Gerrit-Change-Number: 16288
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16278 )

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6226/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 05 Aug 2020 00:44:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node

2020-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16286 )

Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08
Gerrit-Change-Number: 16286
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 00:35:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] Add logging when query unregisteres

2020-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16285 )

Change subject: Add logging when query unregisteres
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16285/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16285/1//COMMIT_MSG@7
PS1, Line 7: Add logging when query unregisteres
nit: unregisters



--
To view, visit http://gerrit.cloudera.org:8080/16285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0
Gerrit-Change-Number: 16285
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 05 Aug 2020 00:33:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10037: Remove flaky test_mt_dop_scan_node

2020-08-04 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16286 )


Change subject: IMPALA-10037: Remove flaky test_mt_dop_scan_node
..

IMPALA-10037: Remove flaky test_mt_dop_scan_node

This test has inherent flakiness due to it relying on instances
fetching scan ranges from a shared queue. Therefore, this patch
removes the test since it was just a sanity check but its flakiness
outweighed its usefulness.
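
To illustrate the flakiness described above, here is a minimal Python sketch. It is not Impala code: the worker count, the number of ranges, and all names are assumptions made for illustration. Several workers drain one shared queue of scan ranges, so the number of ranges each worker ends up with depends on thread scheduling, while the total stays fixed.

```python
import queue
import threading


def run_scan(num_ranges=100, num_workers=4):
    """Hypothetical model: workers drain a shared queue of scan ranges."""
    ranges = queue.Queue()
    for i in range(num_ranges):
        ranges.put(i)

    counts = [0] * num_workers

    def worker(idx):
        while True:
            try:
                ranges.get_nowait()
            except queue.Empty:
                return
            counts[idx] += 1  # each slot is written by exactly one thread

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts


counts = run_scan()
# The total is deterministic; the per-worker split is not.
print(sum(counts), counts)
```

A test that asserts on the per-worker split (rather than on the total) can pass or fail from run to run, which is the kind of instability the commit message refers to.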

Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08
---
M tests/query_test/test_mt_dop.py
1 file changed, 1 insertion(+), 42 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/16286/1
--
To view, visit http://gerrit.cloudera.org:8080/16286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1625872189ea7ac2d4e4d035956f784b6e18eb08
Gerrit-Change-Number: 16286
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Add logging when query unregisteres

2020-08-04 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16285 )


Change subject: Add logging when query unregisteres
..

Add logging when query unregisteres

This adds a log line which is printed when a query is successfully
unregistered by the async unregister thread pool. Added only for
additional observability.

Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0
---
M be/src/service/impala-server.cc
1 file changed, 6 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/16285/1
--
To view, visit http://gerrit.cloudera.org:8080/16285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I09be63afbee6b338a952a9b12321e028be9d7cb0
Gerrit-Change-Number: 16285
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-04 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 2:

For the variants that are added, could you add a comment describing how each 
differs from the official version?
Also, we should add all the TPC-DS queries to the PlannerTest's tpcds-all.test 
so that we can track the Explains. Is there a separate JIRA for that?


--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 23:30:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16227 )

Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate
..

IMPALA-9984: Implement codegen for TupleIsNullPredicate

This commit implements proper codegen for TupleIsNullPredicate.

Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e
Reviewed-on: http://gerrit.cloudera.org:8080/16227
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exprs/CMakeLists.txt
A be/src/exprs/tuple-is-null-predicate-ir.cc
M be/src/exprs/tuple-is-null-predicate.cc
M be/src/exprs/tuple-is-null-predicate.h
6 files changed, 152 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e
Gerrit-Change-Number: 16227
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16227 )

Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e
Gerrit-Change-Number: 16227
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 23:10:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 2: Code-Review+1

LGTM. I can +2 but wanted to give others a chance to have a look.


--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 23:00:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries

2020-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16263 )

Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and 
libstdc++ binaries
..


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py
File docker/setup_build_context.py:

http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py@87
PS3, Line 87: .py
> I'm fine with excluding all the python files from this GCC directory. I don
+1, if we're copying them into the container, it's a mistake



--
To view, visit http://gerrit.cloudera.org:8080/16263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
Gerrit-Change-Number: 16263
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 22:58:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9909: Print body of http error code in Impala Shell.

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16269 )

Change subject: IMPALA-9909: Print body of http error code in Impala Shell.
..

IMPALA-9909: Print body of http error code in Impala Shell.

Make Impala Shell closer to Impyla by printing the body of any http
error code message received when using hs2-over-http. The common case is
that there is nothing in the body, in which case the behavior is
unchanged.
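
The behavior described above could be sketched roughly as follows. This is an illustration only, not the code in shell/ImpalaHttpClient.py: the function name `format_http_error` and the exact message layout are assumptions.

```python
def format_http_error(code, reason, body=None):
    """Build an error message, appending the response body only if non-empty."""
    message = "HTTP code {}: {}".format(code, reason)
    if body:
        body = body.strip()
    if body:
        # Only a non-blank body changes the message.
        message += " [{}]".format(body)
    return message


print(format_http_error(503, "Service Unavailable"))
print(format_http_error(401, "Unauthorized", "Invalid credentials\n"))
```

With an empty or missing body the message is the same as it would be without the body handling, which mirrors the "behavior is unchanged" case in the commit message.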

TESTING
 Added a test for the new functionality.
 Ran all end-to-end tests.

Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff
Reviewed-on: http://gerrit.cloudera.org:8080/16269
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M shell/ImpalaHttpClient.py
M tests/shell/test_shell_interactive.py
2 files changed, 90 insertions(+), 25 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff
Gerrit-Change-Number: 16269
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-9909: Print body of http error code in Impala Shell.

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16269 )

Change subject: IMPALA-9909: Print body of http error code in Impala Shell.
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff
Gerrit-Change-Number: 16269
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 04 Aug 2020 22:33:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16252 )

Change subject: IMPALA-9988 (part 2): Integrate ldap filters and 
impala.doas.user
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6225/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070
Gerrit-Change-Number: 16252
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 22:32:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16252 )

Change subject: IMPALA-9988 (part 2): Integrate ldap filters and 
impala.doas.user
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6790/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070
Gerrit-Change-Number: 16252
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 21:51:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries

2020-08-04 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16263 )

Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and 
libstdc++ binaries
..


Patch Set 3: Code-Review+1

(2 comments)

This makes sense to me. I can't think of any reason not to strip the debug 
symbols here, and it's great to save the space.

http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py
File docker/setup_build_context.py:

http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py@87
PS3, Line 87: .py
> Do we need to spell "-gdb.py" out here?
I'm fine with excluding all the python files from this GCC directory. I don't 
expect us to need any.


http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py@91
PS3, Line 91:   check_call([STRIP, "--strip-debug", libstdcpp_so, "-o",
:   os.path.join(LIB_DIR, os.path.basename(libstdcpp_so))])
Nit: I think it would be good to factor out this strip call into a function 
similar to symlink_file_into_dir().
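
A minimal sketch of what such a helper might look like, assuming the arguments quoted from line 91. The names `strip_debug_symbols` and `build_strip_command` are hypothetical, and the value of `STRIP` here is a placeholder; the real script may resolve the binary differently.

```python
import os
from subprocess import check_call

STRIP = "strip"  # assumed placeholder for the script's STRIP binary


def build_strip_command(src_file, dest_dir, strip_bin=STRIP):
    """Return the argv that strips debug symbols from src_file into dest_dir."""
    dest_file = os.path.join(dest_dir, os.path.basename(src_file))
    return [strip_bin, "--strip-debug", src_file, "-o", dest_file]


def strip_debug_symbols(src_file, dest_dir, strip_bin=STRIP):
    """Hypothetical helper: write a debug-stripped copy of src_file to dest_dir."""
    check_call(build_strip_command(src_file, dest_dir, strip_bin))
```

Splitting the command construction from the `check_call()` invocation keeps the path logic checkable without running the strip binary; callers would then use something like `strip_debug_symbols(libstdcpp_so, LIB_DIR)`.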



--
To view, visit http://gerrit.cloudera.org:8080/16263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
Gerrit-Change-Number: 16263
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 21:51:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16278 )

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6222/


--
To view, visit http://gerrit.cloudera.org:8080/16278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 04 Aug 2020 21:41:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user

2020-08-04 Thread Thomas Tauber-Marshall (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16252

to look at the new patch set (#2).

Change subject: IMPALA-9988 (part 2): Integrate ldap filters and 
impala.doas.user
..

IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user

This patch fixes the integration between LDAP filters and proxy
users by ensuring that the 'impala.doas.user' HS2 config option is
considered when applying filters. This requires deferring checking the
filters until the OpenSession() call.

This patch also introduces new flags --ldap_bind_dn and
--ldap_bind_password which must be specified in order to use LDAP
filters, unless the LDAP server is set up to allow anonymous binds.

These config options are modeled after equivalent options in Hue:
https://github.com/cloudera/hue/blob/master/desktop/conf.dist/hue.ini#L425

Testing:
- Added a test that uses the 'impala.doas.user' config with LDAP
  filters.

Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070
---
M be/src/rpc/authentication.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/ldap-util.cc
M be/src/util/ldap-util.h
M be/src/util/webserver.cc
M fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java
M fe/src/test/java/org/apache/impala/customcluster/LdapImpalaShellTest.java
M fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java
M fe/src/test/java/org/apache/impala/testutil/LdapUtil.java
11 files changed, 200 insertions(+), 54 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/16252/2
--
To view, visit http://gerrit.cloudera.org:8080/16252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070
Gerrit-Change-Number: 16252
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..


Patch Set 27:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6789/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 20:40:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-08-04 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#27). ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..

IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

This work addresses the current limitation in computing the total row
count for a Hive table in a scan. The row count can be incorrectly
computed as 0, even though there exists data in the Hive table. This
is the stats corruption at table level. Similar stats corruption
exists for a partition. The row count of a table or a partition
can sometimes also be -1, which indicates missing stats.

In the fix, as long as no partition in a Hive table exhibits any
missing or corrupt stats, the total row count for the table is computed
from the row counts in all partitions. Otherwise, Impala looks at
the table-level stats, particularly the table row count.

In addition, if the table stats are missing or corrupted, Impala
estimates a row count for the table, if feasible. This row count is
the sum of the row count from the partitions with good stats, and
an estimation of the number of rows in the partitions with missing or
corrupt stats. Such estimation also applies when some partition
has missing or corrupt stats.
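
The decision order described above can be sketched as follows. This is one
reading of the commit message, not the code in HdfsScanNode.java: the
Partition type, the function names, and the size-based estimate with a fixed
bytes_per_row are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Partition:
  num_rows: int    # -1 means the stats are missing
  size_bytes: int  # total size of the partition's data files


def has_bad_stats(num_rows: int, size_bytes: int) -> bool:
  """Missing (-1) or corrupt (0 rows even though data exists)."""
  return num_rows < 0 or (num_rows == 0 and size_bytes > 0)


def estimate_rows(size_bytes: int, bytes_per_row: float) -> int:
  # Assumption: a simple size-based guess; the patch may estimate differently.
  return max(1, int(size_bytes / bytes_per_row)) if size_bytes > 0 else 0


def table_row_count(table_num_rows: int, partitions: List[Partition],
                    bytes_per_row: float = 100.0) -> int:
  bad = [p for p in partitions if has_bad_stats(p.num_rows, p.size_bytes)]
  if not bad:
    # Every partition has usable stats: sum them.
    return sum(p.num_rows for p in partitions)
  total_bytes = sum(p.size_bytes for p in partitions)
  if not has_bad_stats(table_num_rows, total_bytes):
    # Fall back to the table-level row count.
    return table_num_rows
  # Table-level stats are unusable too: good partitions plus an estimate.
  good_rows = sum(p.num_rows for p in partitions if p not in bad)
  return good_rows + sum(estimate_rows(p.size_bytes, bytes_per_row) for p in bad)
```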

One way to observe the fix is through the explain of queries scanning
Hive tables with missing or corrupted stats. The cardinality for any
full scan should be a positive value (i.e. the estimated row count),
instead of 'unavailable'.  At the beginning of the explain output,
that table is still listed in the WARNING section for potentially
corrupt table statistics.

Testing:
1. Ran unit tests with queries documented in the case against Hive
   tables with the following configurations:
   a. No stats corruption in any partitions
   b. Stats corruption in some partitions
   c. Stats corruption in all partitions
2. Added two new tests in test_compute_stats.py:
   a. test_corrupted_stats_in_partitioned_Hive_tables
   b. test_corrupted_stats_in_unpartitioned_Hive_tables
3. Fixed failures in corrupt-stats.test
4. Ran "core" test

Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/corrupt-stats.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/metadata/test_compute_stats.py
M tests/metadata/test_explain.py
13 files changed, 236 insertions(+), 82 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/16098/27
--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5022: Outer join simplification

2020-08-04 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022: Outer join simplification
..


Patch Set 5:

(1 comment)

Add one more comment. Thanks!

http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3245
PS5, Line 3245: IS_OR_PREDICATE.appl
If e is a conjunct, I think we also need to subject it to the intersection test.



--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e
Gerrit-Change-Number: 16266
Gerrit-PatchSet: 5
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 04 Aug 2020 19:47:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5022: Outer join simplification

2020-08-04 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022: Outer join simplification
..


Patch Set 5:

(10 comments)

Hi Xianqing, thank you so much for this contribution! I'll need to do another 
pass and go over the tests but here are some initial comments.

http://gerrit.cloudera.org:8080/#/c/16266/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16266/5//COMMIT_MSG@17
PS5, Line 17: one null rejecting condition on the inner table.
Consider adding a query option like DISABLE_OUTER_TO_INNER_REWRITE to disable 
this optimization at runtime if needed.


http://gerrit.cloudera.org:8080/#/c/16266/5//COMMIT_MSG@34
PS5, Line 34: * Ran the full set of verifications in Impala Public Jenkins
Please try out TPC-DS Q49; the LOJ queries in there should be rewritten.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3217
PS5, Line 3217: getWhereClauseConjuncts(
Technically this would include HAVING clause conjuncts as well, so it might be 
misleading to name this function getWhereClauseConjuncts.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3227
PS5, Line 3227:   }
As a further optimization you could use getEquivClassesOnTuples() to also check 
for null filtering conditions that come as a result of a transitive 
relationship.

For example T1 LEFT OUTER JOIN T2 ON (T1.a = T2.a) JOIN T3 ON (T3.b=T2.b) WHERE 
T3.b > 10;


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3270
PS5, Line 3270: analyzeNoThrow
> For some common SQL functions,  we probably can directly test their existen
Agreed, it would be good to have some static expressions that we know won't 
reject nulls, for example:

col IS NULL
col1 IS DISTINCT FROM col2

For things like IN and COALESCE you would recursively check the children.

IF and CASE are trickier so you might want to call the BE or just skip those.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3427
PS5, Line 3427: // Recompute the graph since we may need to add 
value-transfer edges based on the
See later comment in Planner, but it might be better to return this and have 
the caller recompute the graph.


http://gerrit.cloudera.org:8080/#/c/16266/4/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/16266/4/fe/src/main/java/org/apache/impala/analysis/Expr.java@980
PS4, Line 980: his instanceof CompoundPredicate
 : && ((CompoundPredicate) this).getOp() == 
CompoundPredicate.Operator.OR
You could use Expr.IS_OR_PREDICATE(this) here.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Expr.java@978
PS5, Line 978:   public List getDisjunctiveConjuncts() {
There is something off about this interface. You assume that the caller has 
verified that the predicate is an OR the first time this is called. For example, 
if someone called this function with just a plain Expr, it would return the 
Expr back.

You might want to move the IS_OR_PREDICATE call from Analyzer.java#3245 into 
its own wrapper function, which then calls this method.
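
A minimal sketch of that wrapper shape, in Python rather than Java to keep it
short. Pred, Or, get_disjuncts() and _flatten_or() are hypothetical stand-ins,
and returning an empty list for a non-OR input is just one possible contract.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Pred:
  text: str  # a leaf predicate such as "t2.a > 10"


@dataclass(frozen=True)
class Or:
  left: "Expr"
  right: "Expr"


Expr = Union[Pred, Or]


def _flatten_or(expr: Expr) -> List[Expr]:
  """Recursive worker: collect the operands of nested OR nodes."""
  if isinstance(expr, Or):
    return _flatten_or(expr.left) + _flatten_or(expr.right)
  return [expr]


def get_disjuncts(expr: Expr) -> List[Expr]:
  """Public entry point: does the OR check itself, so callers cannot skip it."""
  if not isinstance(expr, Or):
    return []
  return _flatten_or(expr)
```

With the check inside the entry point, a plain predicate yields no disjuncts
instead of being echoed back as if it were one.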


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@748
PS5, Line 748: // Transform outer join into inner join whenever possible
You might want to use some state in the analyzer to check whether any outer 
joins exist in the query, and only then call this function. For example, 
globalState_.outerJoinedTupleIds.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@749
PS5, Line 749: analyzer.simplifyOuterJoins(selectStmt.getTableRefs());
It would be good if you returned some indicator that the value transfer graph 
needs to be recomputed. Then recompute the graph here so you can mark the 
timeline event accordingly.

ctx_.getTimeline().markEvent("Recomputing value transfer graph")

Also if the SingleNodePlanner's valueTransferGraphNeedsUpdate_ was set to true 
you could likely reset it after you recompute the graph.
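
The control flow suggested here could look roughly like the sketch below. It
is Python pseudocode for the idea only: the Analyzer stand-in, its counter,
and the list used as a timeline are assumptions, not the planner's real types.

```python
from typing import List


class Analyzer:
  """Stand-in for the real analyzer; tracks only what the sketch needs."""

  def __init__(self, convertible_outer_joins: int):
    self.convertible_outer_joins = convertible_outer_joins
    self.graph_recomputations = 0

  def simplify_outer_joins(self) -> bool:
    """Convert what can be converted and report whether anything changed."""
    changed = self.convertible_outer_joins > 0
    self.convertible_outer_joins = 0
    return changed

  def recompute_value_transfer_graph(self) -> None:
    self.graph_recomputations += 1


def plan(analyzer: Analyzer, timeline: List[str]) -> None:
  # The caller decides when to recompute, so it can also record the event.
  if analyzer.simplify_outer_joins():
    analyzer.recompute_value_transfer_graph()
    timeline.append("Recomputing value transfer graph")
```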



--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16227 )

Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e
Gerrit-Change-Number: 16227
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 18:00:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16227 )

Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6224/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e
Gerrit-Change-Number: 16227
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 18:00:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate

2020-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16227 )

Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e
Gerrit-Change-Number: 16227
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 17:59:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16227 )

Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6788/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e
Gerrit-Change-Number: 16227
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 17:58:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9984: Implement codegen for TupleIsNullPredicate

2020-08-04 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/16227 )

Change subject: IMPALA-9984: Implement codegen for TupleIsNullPredicate
..

IMPALA-9984: Implement codegen for TupleIsNullPredicate

This commit implements proper codegen for TupleIsNullPredicate.

Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exprs/CMakeLists.txt
A be/src/exprs/tuple-is-null-predicate-ir.cc
M be/src/exprs/tuple-is-null-predicate.cc
M be/src/exprs/tuple-is-null-predicate.h
6 files changed, 152 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/16227/4
--
To view, visit http://gerrit.cloudera.org:8080/16227
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I410aa7ec762ca16f455bd7da1dce763c1a7b156e
Gerrit-Change-Number: 16227
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9909: Print body of http error code in Impala Shell.

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16269 )

Change subject: IMPALA-9909: Print body of http error code in Impala Shell.
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6223/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff
Gerrit-Change-Number: 16269
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 04 Aug 2020 17:19:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9909: Print body of http error code in Impala Shell.

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16269 )

Change subject: IMPALA-9909: Print body of http error code in Impala Shell.
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff
Gerrit-Change-Number: 16269
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 04 Aug 2020 17:19:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5022: Outer join simplification

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022: Outer join simplification
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6221/


--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e
Gerrit-Change-Number: 16266
Gerrit-PatchSet: 5
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 04 Aug 2020 17:02:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9909: Print body of http error code in Impala Shell.

2020-08-04 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16269 )

Change subject: IMPALA-9909: Print body of http error code in Impala Shell.
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16269
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabc45eda0b87ca694b8359148cda6a7c1d5a8fff
Gerrit-Change-Number: 16269
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 04 Aug 2020 16:59:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write pages in GroupingAggregator

2020-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16240 )

Change subject: WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for 
large read/write pages in GroupingAggregator
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16240/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16240/5//COMMIT_MSG@17
PS5, Line 17: To be specific, we save extra reservation for writing a large 
page. It's
I'll need to look in more detail but I think the overall approach makes sense.


http://gerrit.cloudera.org:8080/#/c/16240/5//COMMIT_MSG@35
PS5, Line 35: This patch also fixes the wrong assumption that non-streaming
Maybe I missed something when I initially did this, but I didn't think we need 
to be able to fit all the hash tables in memory because we could repartition 
until we can fit a single partition in memory.

I think this change is probably fine anyway, to avoid repartitioning, because 
the increase in reservation is very small.



--
To view, visit http://gerrit.cloudera.org:8080/16240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3d9c3a2e7f0da60071b920dec979729e86459775
Gerrit-Change-Number: 16240
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 16:52:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16228 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 04 Aug 2020 16:46:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16278 )

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6222/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 04 Aug 2020 16:38:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6219/


--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 16:38:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16278 )

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6787/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 04 Aug 2020 15:42:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu client and libstdc++ binaries

2020-08-04 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16263 )

Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and 
libstdc++ binaries
..


Patch Set 3:

(2 comments)

Looks good to me.

http://gerrit.cloudera.org:8080/#/c/16263/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16263/3//COMMIT_MSG@9
PS3, Line 9: so
I just wonder whether some other .so files in the toolchain are worth stripping.

[11:30:03 qchen@qifan-10229: Impala] find . -name lib*so -exec file {} \; | 
grep "not stripped"
./toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/lib/libthriftz-0.11.0.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
not stripped
./toolchain/toolchain-packages-gcc7.5.0/thrift-0.11.0-p2/lib/libthrift-0.11.0.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
not stripped
./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-asserts-p2/lib/clang/5.0.1/lib/linux/libclang_rt.asan-x86_64.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
with debug_info, not stripped
./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-asserts-p2/lib/clang/5.0.1/lib/linux/libclang_rt.dyndd-x86_64.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically 
linked, with debug_info, not stripped
./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-asserts-p2/lib/clang/5.0.1/lib/linux/libclang_rt.ubsan_standalone-x86_64.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
with debug_info, not stripped
./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-asserts-p2/lib/clang/5.0.1/lib/linux/libclang_rt.ubsan_standalone_cxx-x86_64.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
with debug_info, not stripped
./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-p2/lib/clang/5.0.1/lib/linux/libclang_rt.asan-x86_64.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
with debug_info, not stripped
./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-p2/lib/clang/5.0.1/lib/linux/libclang_rt.dyndd-x86_64.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically 
linked, with debug_info, not stripped
./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-p2/lib/clang/5.0.1/lib/linux/libclang_rt.ubsan_standalone-x86_64.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
with debug_info, not stripped
./toolchain/toolchain-packages-gcc7.5.0/llvm-5.0.1-p2/lib/clang/5.0.1/lib/linux/libclang_rt.ubsan_standalone_cxx-x86_64.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
with debug_info, not stripped
./toolchain/toolchain-packages-gcc7.5.0/gdb-7.9.1-p1/lib/libinproctrace.so: ELF 
64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not 
stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libsureware.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
not stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libcswift.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
not stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/lib4758cca.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
not stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libaep.so: 
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not 
stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libcapi.so: 
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not 
stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libubsec.so: 
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not 
stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libatalla.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
not stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libpadlock.so:
 ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, 
not stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libnuron.so: 
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not 
stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libchil.so: 
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not 
stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libgmp.so: 
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not 
stripped
./toolchain/toolchain-packages-gcc7.5.0/openssl-1.0.2l/lib/engines/libgost.so: 
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), 

[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-04 Thread Joe McDonnell (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16278

to look at the new patch set (#2).

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..

IMPALA-10005: Fix Snappy decompression for non-block filesystems

Snappy-compressed text always uses THdfsCompression::SNAPPY_BLOCKED
type compression in the backend. However, for non-block filesystems,
the frontend is incorrectly passing THdfsCompression::SNAPPY instead.
On debug builds, this leads to a DCHECK when trying to read
Snappy-compressed text. On release builds, it fails to decompress
the data.

This fixes the frontend to always pass THdfsCompression::SNAPPY_BLOCKED
for Snappy-compressed text.
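
As a rough illustration of the rule described above (not the actual HdfsCompression.java code), the sketch below picks a compression type for a text file from its suffix and returns the blocked variant for Snappy whether or not the filesystem exposes blocks. The enum member names mirror the THdfsCompression values named in this message; the suffix table, the function name compression_for_text_file and the fs_has_blocks parameter are assumptions made for the example.

```python
from enum import Enum


class Compression(Enum):
    # Names mirror the THdfsCompression values mentioned in the commit message.
    NONE = "NONE"
    SNAPPY = "SNAPPY"
    SNAPPY_BLOCKED = "SNAPPY_BLOCKED"


# Hypothetical suffix table, for illustration only.
_TEXT_SUFFIXES = {".snappy": Compression.SNAPPY_BLOCKED}


def compression_for_text_file(file_name, fs_has_blocks):
    """Return the compression type to hand to the backend for a text file.

    fs_has_blocks is accepted only to show that the result must not depend
    on it: Snappy-compressed text maps to SNAPPY_BLOCKED on block and
    non-block filesystems alike.
    """
    lowered = file_name.lower()
    for suffix, codec in _TEXT_SUFFIXES.items():
        if lowered.endswith(suffix):
            return codec
    return Compression.NONE
```

The point of the sketch is only that the block and non-block code paths agree on the answer for Snappy text.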

This reworks query_test/test_compressed_formats.py to provide better
coverage:
 - Changed the RC and Seq test cases to verify that the file extension
   doesn't matter. Added Avro to this case as well.
 - Fixed the text case to use appropriate extensions (fixing IMPALA-9004)
 - Changed the utility function so it doesn't use Hive. This allows it
   to be enabled on non-HDFS filesystems like S3.
 - Changed the test to use unique_database and allow parallel execution.
 - Changed the test to run in the core job, so it now has coverage on
   the usual S3 test configuration. It is reasonably quick (1-2 minutes)
   and runs in parallel.

Testing:
 - Exhaustive job
 - Core s3 job
 - Changed the frontend to force it to use the code for non-block
   filesystems (i.e. the TFileSplitGeneratorSpec code) and
   verified that it is now able to read Snappy-compressed text.

Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
---
M fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java
M tests/query_test/test_compressed_formats.py
2 files changed, 132 insertions(+), 84 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/16278/2
--
To view, visit http://gerrit.cloudera.org:8080/16278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-5022: Outer join simplification

2020-08-04 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022: Outer join simplification
..


Patch Set 5:

(14 comments)

Thanks for the work.

http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3233
PS5, Line 3233: where clause
Suggest removing this to make the comment more precise.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3234
PS5, Line 3234: .
Suggest adding a comment here that describes the purpose of the method, such 
as: This method identifies null-rejecting predicates, which are required to 
convert an outer join to an inner join.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3240
PS5, Line 3240: contains
nit. containing


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3242
PS5, Line 3242: t1.v1
you mean t2.v2?


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3254
PS5, Line 3254: intersect.isEmpty()
For any disConjunct, when "ids intersect disConjunct != disConjunct", then 
disConjuncts should be skipped. The test here seems not sufficient.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3264
PS5, Line 3264: ULL input, eg, ISNULL(), IFNULL(), ZEROIFNULL().
We may need to reject UDFs, as these functions can maintain state that could 
allow them to return different outputs for the same input. That is, we 
cannot guarantee that such a UDF would not produce a NULL given a NULL input.
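
One way to picture the check being discussed is sketched below. It is not the Analyzer.java logic, only a model under stated assumptions: a predicate is a Python callable over a row dict, SQL NULL is represented by None, and is_null_rejecting is a hypothetical helper name. The predicate is evaluated with every column set to NULL and counts as null-rejecting only if the result is not TRUE. As the comment above points out, this reasoning only holds for deterministic functions.

```python
def is_null_rejecting(predicate, columns):
    """Return True if predicate cannot be TRUE when all columns are NULL.

    predicate takes a dict of column values and returns True, False, or
    None (standing in for SQL NULL). Only valid for deterministic predicates.
    """
    null_row = {column: None for column in columns}
    return predicate(null_row) is not True


def greater_than_ten(row):
    # Mimics SQL "v > 10": a comparison with NULL yields NULL.
    return None if row["v"] is None else row["v"] > 10


def zeroifnull_is_zero(row):
    # Mimics SQL "ZEROIFNULL(v) = 0": a NULL input is replaced by 0 first.
    value = 0 if row["v"] is None else row["v"]
    return value == 0
```

Here greater_than_ten is null-rejecting, while zeroifnull_is_zero is not, because it maps a NULL input to a TRUE result.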


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3270
PS5, Line 3270: analyzeNoThrow
For some common SQL functions, we can probably test for their presence directly 
and bypass the evaluation logic, assuming that evaluation at compile time is 
relatively expensive.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3278
PS5, Line 3278: if (!isTr
It is a good idea to add a comment here.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3289
PS5, Line 3289: ex);
We probably should return false here.


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3361
PS5, Line 3361: inner
null-filling table


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3362
PS5, Line 3362: inner
null-filling


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3364
PS5, Line 3364: inner
null-filling


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3365
PS5, Line 3365: null filtering
null-rejecting


http://gerrit.cloudera.org:8080/#/c/16266/5/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3375
PS5, Line 3375: case INNER_JOIN: {
  : break;
Probably can be moved to the last 'default' section (of switch).



--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e
Gerrit-Change-Number: 16266
Gerrit-PatchSet: 5
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 04 Aug 2020 15:10:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..


Patch Set 22:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6786/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16220
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 22
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 13:34:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-04 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#22). ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

IMPALA-9989 Improve admission control pool stats logging

This work addresses the current limitation in admission controller by
appending the last known memory consumption statistics about a pool or
a host to the existing memory exhaustion message. The message is
logged in impalad.INFO when a query is queued or timed out due to
memory pressure on the pool or on the host.

These new memory consumption statistics cover the following content:
  topN_query_stats ::=
    queries: a list of query Ids for up to 5 queries with top memory
             consumptions
    total_mem_consumed: total memory consumed by these topN queries
    percentage_mem_consumed_per_pool: total memory consumed divided
                                      by pool memory usage (if
                                      feasible to report)
  all_query_stats ::=
    min: the minimal memory consumption of all running queries
    max: the maximal memory consumption of all running queries
    total: the total memory consumption of all running queries
    average: the average memory consumption of all running queries
             (if feasible to report)

  pool_stats_per_host ::=
:  
  pool_stats::=
List of 

  host_stats_per_pool ::=
: 
  host_stats::=
List of 

  memory_consumption_statistics ::=
 | 

pool_stats describes memory consumption in all pools in a host
and is useful in analyzing memory exhaustion in that host.
host_stats describes the memory consumption for all hosts in a pool
and is useful in analyzing memory exhaustion in that pool.

Example of pool_stats_per_host:

   pool_name=root.queueD:
 topN_query_stats:
queries=[
   0003:0012,
   0003:0011
],
total_mem_consumed=18.00 MB
fraction_of_pool_total_mem=0.19
 all_query_stats:
num_running=20,
min=1.00 MB,
max=9.00 MB,
total_mem_consumed=95.00 MB,
average=4.75 MB

Example of host_stats_per_pool:

   host_name=host2:25000:
 topN_query_stats:
queries=[
   00020002:0001,
   00020002:0002,
   00020002:,
   00020002:0004
],
total_mem_consumed=55.00 MB

When a query request is queued due to memory exhaustion, the above
memory_consumption_statistics is logged when the logging is set
at level 2 or higher.

When a query request is timed out due to memory exhaustion, the above
memory_consumption_statistics is reported when the logging is set
at level 1 or higher.
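
The two statistics groups described above can be sketched roughly as follows. This is an illustration, not the admission-controller.cc implementation: the function names, the dict-based input (query id mapped to bytes consumed) and the output keys are assumptions chosen to resemble the example output.

```python
def top_n_query_stats(mem_by_query, pool_total_mem=None, n=5):
    """Summarize the n queries with the highest memory consumption."""
    top = sorted(mem_by_query.items(), key=lambda kv: kv[1], reverse=True)[:n]
    total = sum(mem for _, mem in top)
    stats = {
        "queries": [query_id for query_id, _ in top],
        "total_mem_consumed": total,
    }
    # The fraction is only reported when a non-zero pool total is known.
    if pool_total_mem:
        stats["fraction_of_pool_total_mem"] = total / pool_total_mem
    return stats


def all_query_stats(mem_by_query):
    """Summarize memory consumption across all running queries."""
    values = list(mem_by_query.values())
    stats = {"num_running": len(values)}
    if values:
        total = sum(values)
        stats.update(
            min=min(values),
            max=max(values),
            total_mem_consumed=total,
            average=total / len(values),
        )
    return stats
```

Leaving out the fraction and the average when they cannot be computed matches the "if feasible to report" wording in the description.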

Testing:
1. Added a new test TopNQueryCheck in admission-controller-test.cc to
simulate queries running in 4 pools in 3 hosts. This new test identifies
the following:
  a. Top 5 queries among 4 pools in host 0;
  b. Top 5 queries among 4 pools in host 1;
  c. Top 5 queries among 3 hosts for a pool.
2. Core tests.

Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
---
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/util/container-util.h
M common/thrift/StatestoreService.thrift
7 files changed, 828 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/22
--
To view, visit http://gerrit.cloudera.org:8080/16220
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 22
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 17:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6785/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 17
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 04 Aug 2020 13:08:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-04 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 17:

(4 comments)

Hi anjalinorwood, thanks for your review!

http://gerrit.cloudera.org:8080/#/c/16143/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16143/16//COMMIT_MSG@29
PS16, Line 29: We achieved this function by treating the iceberg table as normal
> Are there plans to support read of Iceberg table as a partitioned table? Th
Yes, you are right, treating it as a partitioned table would be useful for 
query planning. But we may not consider this in our first version, since it's 
hard to treat an Iceberg table as a partitioned HDFS table.
We will definitely do this in the next version, along with incremental stats 
computation, query plan optimization for Iceberg, and so on. This patch is just 
a simple version that scans Iceberg tables.


http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
File fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java:

http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java@80
PS16, Line 80: }
> The double negative is pretty hard to parse.
Done


http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@139
PS16, Line 139: if (!(predicate.getChild(0) instanceof SlotRef)) return 
false;
> Can the predicate be of the form: '10 = p1'? In that case, should there be
Yes, it can. I've already tested this by debugging: a query such as select * 
from table where 0=id also pushes the predicate down to Iceberg.
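
For readers wondering how a predicate written as '0 = id' can still pass a "first child is a column" check, one common approach is to normalize binary predicates so the column reference ends up on the left. The sketch below shows that idea only; it is an assumption about the general technique, not a description of what IcebergScanNode.java or Impala's analysis actually does, and Column and normalize_predicate are hypothetical names.

```python
from typing import NamedTuple


class Column(NamedTuple):
    """Hypothetical stand-in for a column (slot) reference."""
    name: str


# Operator to use after the two operands have been swapped.
_FLIPPED = {"=": "=", "!=": "!=", "<": ">", "<=": ">=", ">": "<", ">=": "<="}


def normalize_predicate(op, lhs, rhs):
    """Return (op, lhs, rhs) with the column on the left where possible."""
    if not isinstance(lhs, Column) and isinstance(rhs, Column):
        return _FLIPPED[op], rhs, lhs
    return op, lhs, rhs
```

With this normalization, '10 < p1' becomes 'p1 > 10', so a later check only has to look at the left operand.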


http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/16143/16/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@114
PS16, Line 114: if ("PARQUET".equalsIgnoreCase(format)) return 
TIcebergFileFormat.PARQUET;
> Rest of the code seems to support Iceberg ORC file format. This code does n
I supported the ORC format in the original version, but when I tested scanning 
an Iceberg table with ORC, I hit an exception: https://issues.apache.org/jira/browse/IMPALA-9967
So I removed it; more file formats may be supported in another patch.



--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 17
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 04 Aug 2020 12:40:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-04 Thread wangsheng (Code Review)
Hello Zoltan Borok-Nagy, Anonymous Coward (606), Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16143

to look at the new patch set (#17).

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..

IMPALA-9741: Support querying Iceberg table by impala

This patch mainly implements querying of Iceberg tables through Impala.
We can use the following SQL to create an external Iceberg table:
CREATE EXTERNAL TABLE default.iceberg_test (
level string,
event_time timestamp,
message string
)
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
Or specify just the table name and location, like this:
CREATE EXTERNAL TABLE default.iceberg_test
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
'iceberg_file_format' is the file format in Iceberg. Currently only
PARQUET is supported; other formats will be supported in the future.
If you don't specify this property in your SQL, the default file format
is PARQUET.

We achieved this by treating the Iceberg table as a normal
unpartitioned HDFS table. When querying an Iceberg table, we push down
partition column predicates to Iceberg to decide which data files
need to be scanned, and then transfer this information to the BE to
do the real scan operation.

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in test_scanners.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
---
M be/src/runtime/descriptors.cc
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M testdata/data/README
A testdata/data/iceberg_test/iceberg_non_partitioned/data/1-100-e1a80ed6-1064-494d-9cdd-c4a30c1ab8dc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/3-102-511427f2-85f0-43ae-9b39-a456f8dc57b6-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-103-00fc55e1-6ef7-4241-ace2-6d075b9737fc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-105-ef9e76d5-c060-4040-8aa1-b7c275610daa-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-106-c09c9c8d-9478-44f9-8501-f85f53112bc3-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-108-3b4f06ac-dca3-4f4e-be60-bf42d9927b5b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-110-1e653ccf-0963-4fb0-941c-32c9de13268b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-111-dfa70658-eb4b-4fa0-9ffa-b892cf90d6ac-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-113-2d16e751-e2a4-4856-ab89-145996e3815e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-114-0f710621-cbbf-4509-a93d-b58808978e2e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-116-0b666c79-53df-4507-906c-542e65a83443-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-118-1bc6bc6e-e061-4da3-9d1e-a427a306c471-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-119-ae7b2c67-1538-4429-8246-4998960e3817-0.parquet
A 

[Impala-ASF-CR] IMPALA-10018: Implement ds kll rank() function

2020-08-04 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16283 )

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..


Patch Set 2: Code-Review+1

LGTM!


--
To view, visit http://gerrit.cloudera.org:8080/16283
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 04 Aug 2020 12:12:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5022: Outer join simplification

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022: Outer join simplification
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6784/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e
Gerrit-Change-Number: 16266
Gerrit-PatchSet: 5
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 04 Aug 2020 12:10:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16228 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6783/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 04 Aug 2020 12:04:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5022: Outer join simplification

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022: Outer join simplification
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6221/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e
Gerrit-Change-Number: 16266
Gerrit-PatchSet: 5
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 04 Aug 2020 11:51:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP

2020-08-04 Thread Xianqing He (Code Review)
Xianqing He has abandoned this change. ( http://gerrit.cloudera.org:8080/15614 )

Change subject: WIP
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/15614
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I567bdcad0bcdbfeb539ed590e509533228cb528c
Gerrit-Change-Number: 15614
Gerrit-PatchSet: 5
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16228 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6220/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 04 Aug 2020 11:44:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Zoltan Borok-Nagy (Code Review)
Hello Aman Sinha, Gabor Kaszab, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16228

to look at the new patch set (#6).

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..

IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex 
types)

This implements scanning full ACID tables that contain complex types.
The same technique that we use for primitive types works here, i.e. we
add a LEFT ANTI JOIN on top of the Hdfs scan node in order to subtract
the deleted rows from the inserted rows.

However, there were some types of queries where we couldn't do that.
These are the queries that scan the nested collection items directly.

E.g.: SELECT item FROM complextypestbl.int_array;

The above query only creates a single tuple descriptor that holds the
collection items. Since this tuple descriptor is not at the table-level,
we cannot add slot references to the hidden ACID columns, which are at
the top level of the table schema.

To resolve this I added a statement rewriter that rewrites the above
statement to the following:

  SELECT item FROM complextypestbl $a$1, $a$1.int_array;

Now in this example we'll have two tuple descriptors, one for the
table-level, and one for the collection item. So we can add the ACID
slot refs to the table-level tuple descriptor. The rewrite is
implemented by the new AcidRewriter class.
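
The shape of the rewrite can be shown with a small string-level sketch. The real AcidRewriter works on analyzed table references rather than on text, so the code below is only an illustration under simplifying assumptions: the path has the form table.collection with an unqualified table name, and rewrite_collection_ref is a hypothetical helper name.

```python
def rewrite_collection_ref(path, alias="$a$1"):
    """Split 'table.collection' into a table ref plus a relative collection ref.

    Assumes an unqualified table name, so the first dot separates the table
    from the collection path.
    """
    table, _, collection = path.partition(".")
    if not collection:
        # Plain table reference: nothing to rewrite.
        return path
    return f"{table} {alias}, {alias}.{collection}"
```

Applied to complextypestbl.int_array, this yields the FROM clause shown above, which gives the planner a table-level tuple descriptor to attach the ACID slot refs to.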

Testing
* Added planner tests to PlannerTest/acid-scans.test
* E2E query tests to QueryTest/full-acid-complex-type-scans.test
* E2E tests for rowid-generation: QueryTest/full-acid-rowid.test

Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test
A testdata/workloads/functional-query/queries/QueryTest/full-acid-complex-type-scans.test
M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test
M tests/query_test/test_acid.py
13 files changed, 923 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16228/6
--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-5022: Outer join simplification

2020-08-04 Thread Xianqing He (Code Review)
Xianqing He has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022: Outer join simplification
..

IMPALA-5022: Outer join simplification

As a general rule, an outer join can be converted to an inner join if there is a
condition on the inner table that filters out non-matching rows. In a left outer
join, the right table is the inner table, while in a right outer join it is the
left table. In a full outer join, both tables are inner tables. Conditions that
are FALSE for nulls are referred to as null-rejecting conditions, and these are
the conditions that enable the outer-to-inner join conversion to be made.

An outer join can be converted to an inner join if the WHERE clause contains at
least one null-rejecting condition on the inner table.

For example,
1. A LEFT JOIN B ON A.id = B.id WHERE B.v > 10
= A INNER JOIN B ON A.id = B.id WHERE B.v > 10
2. A RIGHT JOIN B ON A.id = B.id WHERE A.v > 10
= A INNER JOIN B ON A.id = B.id WHERE A.v > 10
3. A FULL JOIN B ON A.id = B.id WHERE A.v > 10
= A LEFT JOIN B ON A.id = B.id WHERE A.v > 10
4. A FULL JOIN B ON A.id = B.id WHERE B.v > 10
= A RIGHT JOIN B ON A.id = B.id WHERE B.v > 10
5. A FULL JOIN B ON A.id = B.id WHERE A.v > 10 AND B.v > 10
= A INNER JOIN B ON A.id = B.id WHERE A.v > 10 AND B.v > 10
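
The five cases above follow a single rule, sketched below. This is a toy model rather than the planner code: join types are plain strings, and rejects_nulls_on is an assumed input naming which sides ('left' for A, 'right' for B) have a null-rejecting WHERE condition.

```python
def simplify_outer_join(join_type, rejects_nulls_on):
    """Return the simplified join type.

    join_type: 'LEFT', 'RIGHT', 'FULL' or 'INNER'.
    rejects_nulls_on: set containing 'left' and/or 'right', naming the sides
    that have a null-rejecting condition in the WHERE clause.
    """
    left = "left" in rejects_nulls_on
    right = "right" in rejects_nulls_on
    if join_type == "LEFT":
        # The right table is null-filled, so a condition on it is what matters.
        return "INNER" if right else "LEFT"
    if join_type == "RIGHT":
        return "INNER" if left else "RIGHT"
    if join_type == "FULL":
        if left and right:
            return "INNER"
        if left:
            return "LEFT"
        if right:
            return "RIGHT"
        return "FULL"
    return join_type
```

For a full outer join, a condition on one side only removes the rows where that side was null-filled, which is why the result is an outer join that preserves that side rather than an inner join.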

Tests:
* Updated the baseline plan tests
* Added some plan tests in outer-joins.test
* Ran the full set of verifications in Impala Public Jenkins

Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/nested-loop-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
21 files changed, 1,544 insertions(+), 967 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/16266/5
--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e
Gerrit-Change-Number: 16266
Gerrit-PatchSet: 5
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xianqing He 


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16228 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6218/


-- 
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 04 Aug 2020 11:35:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6219/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 2
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 11:32:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16228 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6782/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 04 Aug 2020 10:49:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16228 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6781/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 04 Aug 2020 10:42:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Zoltan Borok-Nagy (Code Review)
Hello Aman Sinha, Gabor Kaszab, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16228

to look at the new patch set (#5).

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..

IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex 
types)

This implements scanning full ACID tables that contain complex types.
The same technique that we use for primitive types works here too, i.e.
we add a LEFT ANTI JOIN on top of the Hdfs scan node in order to
subtract the deleted rows from the inserted rows.
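
As a rough illustration of the anti-join idea (a sketch only, not Impala code):
the rows read from the insert deltas are kept unless a row with the same row
identifier appears in the delete deltas. The key layout
(`original_txn`, `bucket`, `row_id`) and the helper name `left_anti_join` are
assumptions made for this example.

```python
def left_anti_join(inserted, deleted, key):
    """Keep every inserted row whose key has no match among the deleted rows."""
    deleted_keys = {key(row) for row in deleted}
    return [row for row in inserted if key(row) not in deleted_keys]


def acid_key(row):
    # Hypothetical row identifier; the real hidden ACID columns may differ.
    return (row["original_txn"], row["bucket"], row["row_id"])


inserted_rows = [
    {"original_txn": 1, "bucket": 0, "row_id": 0, "value": "a"},
    {"original_txn": 1, "bucket": 0, "row_id": 1, "value": "b"},
    {"original_txn": 2, "bucket": 0, "row_id": 0, "value": "c"},
]
deleted_rows = [
    {"original_txn": 1, "bucket": 0, "row_id": 1},
]

visible = left_anti_join(inserted_rows, deleted_rows, acid_key)
print([row["value"] for row in visible])
```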

However, there were some types of queries where we couldn't do that.
These are the queries that scan the nested collection items directly.

E.g.: SELECT item FROM complextypestbl.int_array;

The above query only creates a single tuple descriptor that holds the
collection items. Since this tuple descriptor is not at the table level,
we cannot add slot references to the hidden ACID columns, which are at
the top level of the table schema.

To resolve this I added a statement rewriter that rewrites the above
statement to the following:

  SELECT item FROM complextypestbl $a$1, $a$1.int_array;

Now in this example we'll have two tuple descriptors, one for the
table-level, and one for the collection item. So we can add the ACID
slot refs to the table-level tuple descriptor. The rewrite is
implemented by the new AcidRewriter class.
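
The shape of this rewrite can be sketched on plain strings. This is only an
illustration in Python: the actual AcidRewriter works on the analyzed
statement, not on text, and the function below (its name, its `alias_id`
parameter, and the assumption that the path is an unqualified
`table.collection` pair) is hypothetical.

```python
def split_collection_ref(path, alias_id=1):
    """Turn 'tbl.collection' into 'tbl $a$N, $a$N.collection'.

    Assumes `path` has the form '<table>.<collection path>' without a
    database prefix. Paths without a dot are returned unchanged.
    """
    table, dot, collection = path.partition(".")
    if not dot:
        return path
    alias = "$a${}".format(alias_id)
    return "{} {}, {}.{}".format(table, alias, alias, collection)


original_from = "complextypestbl.int_array"
print("SELECT item FROM " + split_collection_ref(original_from) + ";")
```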

Testing
* Added planner tests to PlannerTest/acid-scans.test
* E2E query tests to QueryTest/full-acid-complex-type-scans.test
* E2E tests for rowid-generation: QueryTest/full-acid-rowid.test

Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test
A 
testdata/workloads/functional-query/queries/QueryTest/full-acid-complex-type-scans.test
M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test
M tests/query_test/test_acid.py
12 files changed, 922 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16228/5
--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16228 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6218/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 04 Aug 2020 10:23:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16228 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..


Patch Set 5:

PS5 is a rebase.


--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 04 Aug 2020 10:23:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16228 )

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@385
PS2, Line 385: require
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1516
PS2, Line 1516: for (int i = 0; i < stmt.fromClause_.size(); ++i) {
  : TableRef tblRef = stmt.fromClause_.get(i);
> nit: you can iterate over fromClause_.getTableRefs() and then you can use a
splitCollectionRef() needs the index of the table ref.


http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1541
PS2, Line 1541: int tableRefIdx
> Instead of the index you can use the CollectionTableRef itself as a param.
Yeah, I need the index to modify the FromClause.


http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1556
PS2, Line 1556: Preconditions.checkSta
> Could you add a comment what is at position '0' here? (I guess in L1553 it'
Done


http://gerrit.cloudera.org:8080/#/c/16228/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1576
PS2, Line 1576:   return rawTblPath;
> Shouldn't this function belong to TableRef as a static member function?
Done


http://gerrit.cloudera.org:8080/#/c/16228/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/16228/3/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@1508
PS3, Line 1508:*   SELECT item FROM complextypestbl $a$1, $a$1.int_array;
> I need to understand the current complex types support (independent of ACID
Complex types are evaluated in a subplan, like the following:

 | 01:SUBPLAN
 | |  row-size=16B cardinality=25.68K
 | |
 | |--04:NESTED LOOP JOIN [CROSS JOIN]
 | |  |  row-size=16B cardinality=10
 | |  |
 | |  |--02:SINGULAR ROW SRC
 | |  |     row-size=12B cardinality=1
 | |  |
 | |  03:UNNEST [$a$1.int_array int_array]
 | |     row-size=0B cardinality=10
 | |
 | 00:SCAN HDFS [functional_orc_def.complextypestbl $a$1]
 |    HDFS partitions=1/1 files=2 size=4.04KB
 |    predicates: !empty($a$1.int_array)
 |    row-size=12B cardinality=2.57K

The left side of the SUBPLAN is the "input". The right side is the "subplan 
tree", which processes the rows from the "input" one by one, and the SUBPLAN 
emits the rows produced by the "subplan tree". So in this case the right side's 
SINGULAR ROW SRC node and UNNEST node are fed by the (nested) rows coming from 
SCAN HDFS. UNNEST creates a row for each collection item, SINGULAR ROW SRC just 
holds the current row, and the NESTED LOOP JOIN produces the unnested/flat 
rows.

So it won't do huge CROSS JOINs, but yeah, this rewrite definitely adds some 
overhead. But only to some types of queries, i.e. queries that only refer to 
the items of a collection. I think the majority of complex type queries are not 
like that, so they won't be affected.
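
The row flow described above can be mimicked with a small generator. This is a
simplified sketch in Python, not how the backend executes the plan; the row
layout (dicts with an `id` field and an `int_array` list) and the function name
`run_subplan` are assumptions for illustration.

```python
def run_subplan(scan_rows, collection_field):
    """Emit one flat row per collection item, one input row at a time."""
    for row in scan_rows:                      # rows coming from SCAN HDFS
        collection = row.get(collection_field) or []
        if not collection:                     # mirrors the !empty(...) predicate
            continue
        singular_row_src = [row]               # SINGULAR ROW SRC: the current row
        unnest = [{"item": item} for item in collection]  # UNNEST
        # NESTED LOOP JOIN [CROSS JOIN]: 1 row x N items, so never "huge".
        for outer in singular_row_src:
            for inner in unnest:
                flat = {k: v for k, v in outer.items() if k != collection_field}
                flat.update(inner)
                yield flat


rows = [
    {"id": 1, "int_array": [1, 2]},
    {"id": 2, "int_array": []},
    {"id": 3, "int_array": [7]},
]
flat_rows = list(run_subplan(rows, "int_array"))
print(flat_rows)
```

The cross join is always between a single row and its own collection, which is
why the cost grows with the total number of items rather than with the product
of two large inputs.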



--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 04 Aug 2020 10:17:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)

2020-08-04 Thread Zoltan Borok-Nagy (Code Review)
Hello Aman Sinha, Gabor Kaszab, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16228

to look at the new patch set (#4).

Change subject: IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified 
tables (complex types)
..

IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex 
types)

This implements scanning full ACID tables that contain complex types.
The same technique that we use for primitive types works here too, i.e.
we add a LEFT ANTI JOIN on top of the Hdfs scan node in order to
subtract the deleted rows from the inserted rows.

However, there were some types of queries where we couldn't do that.
These are the queries that scan the nested collection items directly.

E.g.: SELECT item FROM complextypestbl.int_array;

The above query only creates a single tuple descriptor that holds the
collection items. Since this tuple descriptor is not at the table level,
we cannot add slot references to the hidden ACID columns, which are at
the top level of the table schema.

To resolve this I added a statement rewriter that rewrites the above
statement to the following:

  SELECT item FROM complextypestbl $a$1, $a$1.int_array;

Now in this example we'll have two tuple descriptors, one for the
table-level, and one for the collection item. So we can add the ACID
slot refs to the table-level tuple descriptor. The rewrite is
implemented by the new AcidRewriter class.

Testing
* Added planner tests to PlannerTest/acid-scans.test
* E2E query tests to QueryTest/full-acid-complex-type-scans.test
* E2E tests for rowid-generation: QueryTest/full-acid-rowid.test

Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test
A 
testdata/workloads/functional-query/queries/QueryTest/full-acid-complex-type-scans.test
M testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test
M tests/query_test/test_acid.py
12 files changed, 924 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16228/4
--
To view, visit http://gerrit.cloudera.org:8080/16228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Gerrit-Change-Number: 16228
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16283 )

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6780/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16283
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 04 Aug 2020 10:05:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9963: Implement ds_kll_n() function

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16259 )

Change subject: IMPALA-9963: Implement ds_kll_n() function
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6779/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
Gerrit-Change-Number: 16259
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 04 Aug 2020 10:05:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-04 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16283 )

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16283/1/be/src/exprs/datasketches-functions.h
File be/src/exprs/datasketches-functions.h:

http://gerrit.cloudera.org:8080/#/c/16283/1/be/src/exprs/datasketches-functions.h@47
PS1, Line 47: not, the
> nit: missing comma
Thx, done.



--
To view, visit http://gerrit.cloudera.org:8080/16283
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 04 Aug 2020 09:37:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-04 Thread Gabor Kaszab (Code Review)
Hello Adam Tamas, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16283

to look at the new patch set (#2).

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..

IMPALA-10018: Implement ds_kll_rank() function

ds_kll_rank() receives two parameters: a STRING that represents a
serialized DataSketches KLL sketch and a float that provides a probing
value in the sketch.
It returns a DOUBLE in the range [0,1] that is the rank of the given
probing value. E.g. a return value of 0.2 means that the probing value
given as a parameter is greater than 20% of all the values in the
sketch. Note that this is an approximate calculation.
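
To make the meaning of the returned rank concrete, here is an exact
(non-sketch) version of the same quantity in Python. It is only an
illustration: ds_kll_rank() works on a serialized KLL sketch and returns an
approximation, whereas this hypothetical `exact_rank` helper scans the raw
values. Treating the rank as the fraction of values strictly less than the
probe is an assumption based on the description above.

```python
def exact_rank(values, probe):
    """Fraction of `values` strictly less than `probe`, in the range [0, 1]."""
    if not values:
        return None
    smaller = sum(1 for v in values if v < probe)
    return smaller / len(values)


data = list(range(1, 11))          # the values 1..10
print(exact_rank(data, 3))         # 2 of the 10 values are smaller than 3
```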

Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
---
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test
4 files changed, 76 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/16283/2
--
To view, visit http://gerrit.cloudera.org:8080/16283
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9963: Implement ds_kll_n() function

2020-08-04 Thread Gabor Kaszab (Code Review)
Hello Adam Tamas, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16259

to look at the new patch set (#4).

Change subject: IMPALA-9963: Implement ds_kll_n() function
..

IMPALA-9963: Implement ds_kll_n() function

This function receives a serialized Apache DataSketches KLL sketch
and returns how many input values were fed into this sketch.

Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
---
M be/src/exprs/datasketches-common.h
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test
5 files changed, 56 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/16259/4
--
To view, visit http://gerrit.cloudera.org:8080/16259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
Gerrit-Change-Number: 16259
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-04 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16283 )

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..


Patch Set 1:

(1 comment)

Hi Gabor,
Thank you for the good work with the KLL functions.
Apart from a general nit, it looks good to me.

http://gerrit.cloudera.org:8080/#/c/16283/1/be/src/exprs/datasketches-functions.h
File be/src/exprs/datasketches-functions.h:

http://gerrit.cloudera.org:8080/#/c/16283/1/be/src/exprs/datasketches-functions.h@47
PS1, Line 47: not then
nit: missing comma
As far as I see, it is missing in every comment where this sentence is used.



--
To view, visit http://gerrit.cloudera.org:8080/16283
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 04 Aug 2020 08:39:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16283 )

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6778/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16283
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 04 Aug 2020 08:19:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write pages in GroupingAggregator

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16240 )

Change subject: WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for 
large read/write pages in GroupingAggregator
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6777/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3d9c3a2e7f0da60071b920dec979729e86459775
Gerrit-Change-Number: 16240
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Aug 2020 08:15:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-04 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16283 )


Change subject: IMPALA-10018: Implement ds_kll_rank() function
..

IMPALA-10018: Implement ds_kll_rank() function

ds_kll_rank() receives two parameters: a STRING that represents a
serialized DataSketches KLL sketch and a float that provides a probing
value in the sketch.
It returns a DOUBLE in the range [0,1] that is the rank of the given
probing value. E.g. a return value of 0.2 means that the probing value
given as a parameter is greater than 20% of all the values in the
sketch. Note that this is an approximate calculation.

Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
---
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test
4 files changed, 71 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/16283/1
--
To view, visit http://gerrit.cloudera.org:8080/16283
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 


[Impala-ASF-CR] WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write pages in GroupingAggregator

2020-08-04 Thread Quanlong Huang (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16240

to look at the new patch set (#5).

Change subject: WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for 
large read/write pages in GroupingAggregator
..

WIP IMPALA-9955,IMPALA-9957: Fix not enough reservation for large read/write 
pages in GroupingAggregator

The minimum requirement for a spillable operator is ((min_buffers - 2) *
default_buffer_size) + 2 * max_row_size. In the min reservation, we only
reserve space for two large pages, one for reading, the other for
writing. However, to make the non-streaming GroupingAggregator work
correctly, we have to manage these extra reservations carefully, so that
it won't run out of the min reservation when it actually needs to spill
a large page, or when it actually needs to read a large page.

To be specific, we save extra reservation for writing a large page. It's
only used when we run out of unused reservation and fail to increase the
reservation to fit the large page. Currently there are two cases in
non-streaming GroupingAggregator. One case is when we start to spill a
partition and a serialize stream is needed to write some large pages.
The other case is when we have spilled all partitions in a repartition
process and need to write a large page to a spilled partition. Note that
each spilled partition in the repartition process still keeps the
default_page_size worth of reservation for writing a default page. We
can only restore the extra reservation when a partition is actually
writing a large page, and then reclaim it after the writing.

The same applies to the extra reservation for reading a large page. In
the repartition process, we may read large pages from the input stream
(from a previously spilled partition). When it needs to pin the current
large page, we restore the extra reservation, and then reclaim it when
the attached row batch is reset.

This patch also fixes the wrong assumption that non-streaming
GroupingAggregator only requires one buffer reservation for the hash
tables. The minimal spillable buffer size is 64KB, while the minimal
requirement of a non-streaming GroupingAggregator's hash tables is
num_buckets(1024) * bucket_size(16) * partition_fanout(16) = 256KB.
We should reserve more buffers when the spillable buffer size is small.
Fix some planner test failures due to this change.
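
The arithmetic in this message can be checked with a few lines of Python. This
is a sketch of the formulas as stated above, not code from the patch; the
function names and the example inputs (16 buffers, 512KB max row size) are
made up for illustration.

```python
KB = 1024


def min_spillable_reservation(min_buffers, default_buffer_size, max_row_size):
    """((min_buffers - 2) * default_buffer_size) + 2 * max_row_size, in bytes."""
    return (min_buffers - 2) * default_buffer_size + 2 * max_row_size


def hash_table_bytes(num_buckets=1024, bucket_size=16, partition_fanout=16):
    """Minimal hash table footprint of a non-streaming GroupingAggregator."""
    return num_buckets * bucket_size * partition_fanout


def buffers_for_hash_tables(buffer_size):
    """Number of buffers of `buffer_size` bytes needed to hold the hash tables."""
    return -(-hash_table_bytes() // buffer_size)  # ceiling division


print(hash_table_bytes() // KB, "KB for the hash tables")
print(buffers_for_hash_tables(64 * KB), "buffers at the minimal 64KB buffer size")
# Hypothetical inputs: 16 buffers of 64KB and a 512KB max row size.
print(min_spillable_reservation(16, 64 * KB, 512 * KB), "bytes min reservation")
```

With the minimal 64KB buffer size, a single buffer covers only a quarter of the
256KB the hash tables need, which is why more buffers have to be reserved when
the spillable buffer size is small.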

Tests:
 - Add tests in test_spilling.py to verify GroupingAggregator works in
   min reservation.

Change-Id: I3d9c3a2e7f0da60071b920dec979729e86459775
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/grouping-aggregator-ir.cc
M be/src/exec/grouping-aggregator-partition.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.cc
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-query/queries/QueryTest/spilling-large-rows.test
17 files changed, 486 insertions(+), 176 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16240/5
--
To view, visit http://gerrit.cloudera.org:8080/16240
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3d9c3a2e7f0da60071b920dec979729e86459775
Gerrit-Change-Number: 16240
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5022: Outer join simplification

2020-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022: Outer join simplification
..


Patch Set 4:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6217/


--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e
Gerrit-Change-Number: 16266
Gerrit-PatchSet: 4
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 04 Aug 2020 07:44:51 +
Gerrit-HasComments: No