[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8357/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 13 Mar 2021 07:04:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6965/ 
DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 13 Mar 2021 06:55:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 6: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 13 Mar 2021 06:53:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@123
PS5, Line 123:  // If maxAllowedScratchLimit < minMemReservat
> nit: Shouldn't this be maxAllowedScratchLimit < minMemReservationBytes ?
Thanks! Sorry for missing this.


http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@125
PS5, Line 125: If maxAllowedScratchLimit < maxMemReserva
> nit: Similar update needed here ?
Done




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 13 Mar 2021 06:44:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Riza Suminto (Code Review)
Hello Aman Sinha, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17166

to look at the new patch set (#6).

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..

IMPALA-10565: Adjust result spooling memory based on scratch_limit

IMPALA-9856 enables result spooling by default. Result spooling depends
on the ability to spill its entire BufferedTupleStream to disk once it
hits maximum memory reservation. However, if the query option
scratch_limit is set lower than max_spilled_result_spooling_mem, the
query might fail in the middle of execution due to insufficient scratch
space. This patch adds a planner change that considers the scratch_limit
query option and scratch_dirs when computing the resources used by result
spooling. The algorithm is as follows:

* If scratch_dirs is empty or scratch_limit < minMemReservationBytes
  required to use BufferedPlanRootSink, we set spool_query_results to
  false and fall back to BlockingPlanRootSink.

* If scratch_limit > minMemReservationBytes but still fairly low, we
  lower the max_result_spooling_mem (default is 100MB) and
  max_spilled_result_spooling_mem (default is 1GB) to fit scratch_limit.

* If scratch_limit > max_spilled_result_spooling_mem, do nothing.
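
The three cases can be sketched in Python as below. This is only an illustration of the decision described in the commit message, not the code in PlanRootSink.java. The function name, the SpoolingPlan type, the use of None for an unbounded scratch_limit, and the treatment of the exact boundary (scratch_limit equal to minMemReservationBytes keeps spooling enabled) are all assumptions. The extra maxRowBufferSize headroom discussed in the review comments is left out.

```python
from dataclasses import dataclass
from typing import Optional

MB = 1024 * 1024
GB = 1024 * MB


@dataclass
class SpoolingPlan:
    """Hypothetical container for the adjusted query options."""
    spool_query_results: bool
    max_result_spooling_mem: int
    max_spilled_result_spooling_mem: int


def plan_result_spooling(
    scratch_dirs_available: bool,
    scratch_limit: Optional[int],          # None = unbounded (assumption)
    min_mem_reservation_bytes: int,
    max_result_spooling_mem: int = 100 * MB,
    max_spilled_result_spooling_mem: int = 1 * GB,
) -> SpoolingPlan:
    # Case 1a: no scratch directories, so spilling is impossible.
    if not scratch_dirs_available:
        return SpoolingPlan(False, max_result_spooling_mem,
                            max_spilled_result_spooling_mem)
    # Unbounded scratch space: nothing to adjust.
    if scratch_limit is None:
        return SpoolingPlan(True, max_result_spooling_mem,
                            max_spilled_result_spooling_mem)
    # Case 1b: scratch_limit cannot cover the minimum reservation.
    if scratch_limit < min_mem_reservation_bytes:
        return SpoolingPlan(False, max_result_spooling_mem,
                            max_spilled_result_spooling_mem)
    # Cases 2 and 3: clamp both limits to scratch_limit. When scratch_limit
    # exceeds max_spilled_result_spooling_mem, min() leaves both unchanged.
    return SpoolingPlan(
        True,
        min(max_result_spooling_mem, scratch_limit),
        min(max_spilled_result_spooling_mem, scratch_limit),
    )
```

For example, with a hypothetical 4MB minimum reservation, a 500MB scratch_limit leaves max_result_spooling_mem at 100MB and lowers max_spilled_result_spooling_mem to 500MB.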

Testing:
- Add TestScratchLimit::test_result_spooling_and_varying_scratch_limit
- Verify that spool_query_results query option is disabled in
  TestScratchDir::test_no_dirs
- Pass exhaustive tests.

Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
---
M be/src/service/query-options-test.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test
M tests/custom_cluster/test_scratch_disk.py
M tests/query_test/test_scratch_limit.py
8 files changed, 143 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/17166/6

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 5: Code-Review+1

(2 comments)

Couple of nits. Rest of it LGTM.

http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@123
PS5, Line 123:  // If scratch_limit < maxAllowedScratchLimit,
nit: Shouldn't this be maxAllowedScratchLimit < minMemReservationBytes ?


http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@125
PS5, Line 125: If scratch_limit < maxAllowedScratchLimit
nit: Similar update needed here ?




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 13 Mar 2021 06:39:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17121 )

Change subject: IMPALA-7712: Support Google Cloud Storage
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6964/ 
DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Gerrit-Change-Number: 17121
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Sat, 13 Mar 2021 05:37:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17121 )

Change subject: IMPALA-7712: Support Google Cloud Storage
..


Patch Set 11: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6962/



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Gerrit-Change-Number: 17121
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Sat, 13 Mar 2021 05:29:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8355/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 13 Mar 2021 04:17:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17181 )

Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor 
membership.
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/8356/ : Initial code 
review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5
Gerrit-Change-Number: 17181
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Sat, 13 Mar 2021 04:10:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17181 )

Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor 
membership.
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/8354/ : Initial code 
review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5
Gerrit-Change-Number: 17181
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Sat, 13 Mar 2021 04:06:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.

2021-03-12 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17181 )

Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor 
membership.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17181/1/tests/hs2/test_hs2.py
File tests/hs2/test_hs2.py:

http://gerrit.cloudera.org:8080/#/c/17181/1/tests/hs2/test_hs2.py@746
PS1, Line 746:
> flake8: E501 line too long (101 > 90 characters)
Done




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5
Gerrit-Change-Number: 17181
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Sat, 13 Mar 2021 04:00:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.

2021-03-12 Thread Aman Sinha (Code Review)
Hello Thomas Tauber-Marshall, Kurt Deschler, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17181

to look at the new patch set (#2).

Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor 
membership.
..

IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.

This patch adds an interface to ImpalaServer to retrieve the
current executor membership snapshot from impalad. This involves
sending a thrift request to impalad and receiving a thrift
response. Refactored some code in exec-env into a separate
function in the impala namespace which makes it easier
to populate the needed information for an external frontend.

Testing:
 - Ran selected tests for sanity check (no impact is expected
   since this is adding a new interface):
   - Frontend tests (PlannerTest, CardinalityTest)
   - Backend tests under custom_cluster/test_executor_groups.py
 - Manually tested with external frontend to ensure it gets
   the executor membership snapshot

Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5
---
M be/src/runtime/exec-env.cc
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/scheduling/cluster-membership-mgr.h
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.h
M common/thrift/ImpalaService.thrift
M tests/hs2/test_hs2.py
7 files changed, 119 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/17181/2

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5
Gerrit-Change-Number: 17181
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@120
PS4, Line 120:
> ACK.
Done


http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@142
PS4, Line 142:   queryOptions.setSpool_query_results(false);
> Yeah, that will make the code cleaner.  Agree about the first point about u
Done




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 13 Mar 2021 03:59:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Riza Suminto (Code Review)
Hello Aman Sinha, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17166

to look at the new patch set (#5).

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..

IMPALA-10565: Adjust result spooling memory based on scratch_limit

IMPALA-9856 enables result spooling by default. Result spooling depends
on the ability to spill its entire BufferedTupleStream to disk once it
hits maximum memory reservation. However, if the query option
scratch_limit is set lower than max_spilled_result_spooling_mem, the
query might fail in the middle of execution due to insufficient scratch
space. This patch adds a planner change that considers the scratch_limit
query option and scratch_dirs when computing the resources used by result
spooling. The algorithm is as follows:

* If scratch_dirs is empty or scratch_limit < minMemReservationBytes
  required to use BufferedPlanRootSink, we set spool_query_results to
  false and fall back to BlockingPlanRootSink.

* If scratch_limit > minMemReservationBytes but still fairly low, we
  lower the max_result_spooling_mem (default is 100MB) and
  max_spilled_result_spooling_mem (default is 1GB) to fit scratch_limit.

* If scratch_limit > max_spilled_result_spooling_mem, do nothing.

Testing:
- Add TestScratchLimit::test_result_spooling_and_varying_scratch_limit
- Verify that spool_query_results query option is disabled in
  TestScratchDir::test_no_dirs
- Pass exhaustive tests.

Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
---
M be/src/service/query-options-test.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test
M tests/custom_cluster/test_scratch_disk.py
M tests/query_test/test_scratch_limit.py
8 files changed, 143 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/17166/5

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17181 )

Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor 
membership.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17181/1/tests/hs2/test_hs2.py
File tests/hs2/test_hs2.py:

http://gerrit.cloudera.org:8080/#/c/17181/1/tests/hs2/test_hs2.py@746
PS1, Line 746: e
flake8: E501 line too long (101 > 90 characters)




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5
Gerrit-Change-Number: 17181
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sat, 13 Mar 2021 03:55:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.

2021-03-12 Thread Aman Sinha (Code Review)
Aman Sinha has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17181 )


Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor 
membership.
..

IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.

This patch adds an interface to ImpalaServer to retrieve the
current executor membership snapshot from impalad. This involves
sending a thrift request to impalad and receiving a thrift
response. Refactored some code in exec-env into a separate
function in the impala namespace which makes it easier
to populate the needed information for an external frontend.

Testing:
 - Ran selected tests for sanity check (no impact is expected
   since this is adding a new interface):
   - Frontend tests (PlannerTest, CardinalityTest)
   - Backend tests under custom_cluster/test_executor_groups.py
 - Manually tested with external frontend to ensure it gets
   the executor membership snapshot

Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5
---
M be/src/runtime/exec-env.cc
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/scheduling/cluster-membership-mgr.h
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.h
M common/thrift/ImpalaService.thrift
M tests/hs2/test_hs2.py
7 files changed, 118 insertions(+), 38 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/17181/1

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5
Gerrit-Change-Number: 17181
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha 


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8353/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 13 Mar 2021 03:34:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6963/ 
DRY_RUN=true



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 13 Mar 2021 03:17:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-12 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
..


Patch Set 7:

> Patch Set 6: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6961/

Updated the patch to not allow enabling row-filtering but disabling column 
masking.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 13 Mar 2021 03:15:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-12 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16976

to look at the new patch set (#7).

Change subject: IMPALA-9234: Support Ranger row filtering policies
..

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added so that this experimental feature can be disabled. It defaults to
true, so the feature is enabled by default. Enabling row-filtering requires
--enable_column_masking=true since it depends on the column masking
implementation.
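
The flag dependency can be pictured with a small check like the one below. It is a hypothetical sketch, not the startup validation in the patch: the function name and the error text are made up for illustration, and only the rule itself (row filtering on requires column masking on) comes from the description above.

```python
def check_row_filtering_flags(enable_column_masking: bool,
                              enable_row_filtering: bool = True) -> None:
    """Reject the unsupported combination of the two feature flags.

    enable_row_filtering defaults to True, as stated in the commit message.
    The function name and error message are illustrative assumptions.
    """
    if enable_row_filtering and not enable_column_masking:
        raise ValueError(
            "enable_row_filtering=true requires enable_column_masking=true")
```

Disabling both flags, or disabling only row filtering, passes the check; enabling row filtering while column masking is off raises.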

Note that row filtering policies take effect before any column
masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.
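
As a rough picture of the rewrite, the sketch below builds the SQL text of such a table masking view. It is a string-level illustration only: the real change operates on analyzer objects (TableRef, InlineViewRef, SelectStmt), not on SQL strings. The function name, its parameters, and the choice to join several filters with AND are assumptions made for the example.

```python
from typing import Dict, List


def build_table_mask_view(table: str,
                          columns: List[str],
                          mask_exprs: Dict[str, str],
                          row_filters: List[str]) -> str:
    """Return the SELECT text of a hypothetical table masking view."""
    select_items = []
    for col in columns:
        if col in mask_exprs:
            # Column masking: wrap the column with its masking expression.
            select_items.append(f"{mask_exprs[col]} AS {col}")
        else:
            select_items.append(col)
    sql = f"SELECT {', '.join(select_items)} FROM {table}"
    if row_filters:
        # Row filtering: the filter expressions go into the WHERE clause.
        sql += " WHERE " + " AND ".join(f"({f})" for f in row_filters)
    return sql
```

Calling it with a table `db.t`, columns `id` and `name`, a mask on `name`, and a filter `id > 10` yields a view whose WHERE clause carries the filter while the select list carries the mask.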

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   will leak data so we forbid such kinds of queries until IMPALA-10484
   is resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test where column masking and row filtering policies both take
   effect.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/TableMask.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/AuthorizationUtil.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java
M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test
A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test
M tests/authorization/test_ranger.py
23 files changed, 935 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/7

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@142
PS4, Line 142:   maxMemReservationBytes = scratchLimit - 
maxRowBufferSize;
> If scratch_limit is unbounded, the maxMemReservationBytes calculation in li
Yeah, that will make the code cleaner. I agree with the first point about 
unbounded scratch_limit.



--
To view, visit http://gerrit.cloudera.org:8080/17166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 13 Mar 2021 02:05:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@120
PS4, Line 120: scratch_limit < minMemReservationBytes
> Update this and the one below to account for the extra maxRowBufferSize
ACK.


http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@142
PS4, Line 142:   maxMemReservationBytes = scratchLimit - 
maxRowBufferSize;
> For this adjustment for maxRowBufferSize, can we not just do it up front (o
If scratch_limit is unbounded, the maxMemReservationBytes calculation in line 
114 is OK. A little overspill will not fail the query.
By contrast, if scratch_limit is bounded, even a little overspill will 
terminate the query because scratch_limit is strictly enforced.

What if I tidy up the comparison a bit so that it looks simpler? We define

  long maxAllowedScratchLimit = scratchLimit - maxRowBufferSize;

The checks would then compare against maxAllowedScratchLimit instead of 
scratchLimit.
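
A minimal sketch of that proposal, for illustration only. The class and method names are hypothetical, and treating a negative scratch_limit as "unbounded" is an assumption of this sketch rather than something taken from PlanRootSink.java:

```java
/** Hypothetical sketch of the maxAllowedScratchLimit idea; not the actual planner code. */
final class ScratchCushionSketch {
  /**
   * Caps the maximum reservation so that one extra row buffer still fits under a
   * bounded scratch limit. Assumption: a negative scratchLimit means "unbounded".
   */
  static long capMaxReservation(
      long maxMemReservationBytes, long scratchLimit, long maxRowBufferSize) {
    if (scratchLimit < 0) {
      // Unbounded: a small overspill cannot fail the query, so no cushion is needed.
      return maxMemReservationBytes;
    }
    // Leave room for one row buffer of overspill below the strictly enforced limit.
    long maxAllowedScratchLimit = scratchLimit - maxRowBufferSize;
    if (maxAllowedScratchLimit < 0) {
      throw new IllegalArgumentException("scratch limit smaller than one row buffer");
    }
    return Math.min(maxMemReservationBytes, maxAllowedScratchLimit);
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    // A bounded limit of 64 MB with an 8 MB row buffer caps a 100 MB reservation at 56 MB.
    System.out.println(capMaxReservation(100 * mb, 64 * mb, 8 * mb) / mb);
  }
}
```

Every later comparison uses the single derived value, so the cushion is applied in one place.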



--
To view, visit http://gerrit.cloudera.org:8080/17166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 13 Mar 2021 01:47:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@120
PS4, Line 120: scratch_limit < minMemReservationBytes
Update this and the one below to account for the extra maxRowBufferSize


http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@142
PS4, Line 142:   maxMemReservationBytes = scratchLimit - 
maxRowBufferSize;
For this adjustment for maxRowBufferSize, can we not just do it up front (on 
line 114), since we know that maxMemReservationBytes should always be 
conservative enough to leave a cushion for maxRowBufferSize? It should 
simplify the logic and presumably not cause other side effects (unless I am 
missing something).



--
To view, visit http://gerrit.cloudera.org:8080/17166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 13 Mar 2021 01:14:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10551: Add result sink support for external frontends

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17144 )

Change subject: IMPALA-10551: Add result sink support for external frontends
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8352/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I024bf41d77bb81f1ab0debdbd31ec3687c83f072
Gerrit-Change-Number: 17144
Gerrit-PatchSet: 7
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Sat, 13 Mar 2021 00:02:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10551: Add result sink support for external frontends

2021-03-12 Thread John Sherman (Code Review)
John Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17144 )

Change subject: IMPALA-10551: Add result sink support for external frontends
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17144/6/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/17144/6/be/src/runtime/coordinator.cc@796
PS6, Line 796:   // All instances must have reported their final statuses before finalization, which is a
             :   // post-condition of Wait. Result sink file clean up is the responsibility of the
             :   // external frontend
> This is a copy/paste from FinalizeHdfsDml(), so the second sentence doesn't
Done


http://gerrit.cloudera.org:8080/#/c/17144/6/be/src/runtime/coordinator.cc@802
PS6, Line 802:   RETURN_IF_ERROR(UpdateExecState(Status::OK(), nullptr, 
FLAGS_hostname));
> If there is an error from execution, it would show up here and this would r
I agree that retry_failed_queries might be problematic and we might want to 
recommend that an external frontend not enable the feature.

If I am reading the coordinator.cc code correctly, though, we do not retry 
queries with a result sink. Are there other areas I should be concerned about? 
I see some usage in query-driver.cc


http://gerrit.cloudera.org:8080/#/c/17144/6/be/src/runtime/coordinator.cc@807
PS6, Line 807:
> Nit: This "0" is the table id. I'm guessing 0 is a special constant for the
I moved it to a named constant within this method since it is the only usage of 
it. I can also move it to DescriptorTbl if you prefer it there.


http://gerrit.cloudera.org:8080/#/c/17144/6/be/src/runtime/coordinator.cc@810
PS6, Line 810: result_sink_table_id, obj_p
> I think this should be "0" (i.e. the special table id used in the CreateHdf
I removed the table_id portion from this message.



--
To view, visit http://gerrit.cloudera.org:8080/17144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I024bf41d77bb81f1ab0debdbd31ec3687c83f072
Gerrit-Change-Number: 17144
Gerrit-PatchSet: 7
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 12 Mar 2021 23:43:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10551: Add result sink support for external frontends

2021-03-12 Thread John Sherman (Code Review)
Hello Aman Sinha, Thomas Tauber-Marshall, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17144

to look at the new patch set (#7).

Change subject: IMPALA-10551: Add result sink support for external frontends
..

IMPALA-10551: Add result sink support for external frontends

- The intended purpose of these changes is to allow external frontends
  to receive query results via files rather than streaming the results
  through the thrift interface.
- External frontends are expected to provide an FeFsTable implementation
  that describes the desired location to store results.
- External frontends are responsible for managing the files after the
  query is completed.
- Testing has been manual and through an implementation of an external
  frontend.

Change-Id: I024bf41d77bb81f1ab0debdbd31ec3687c83f072
Reviewed-by: Aman Sinha 
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/query-exec-params.cc
M be/src/runtime/query-exec-params.h
M common/thrift/DataSinks.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
9 files changed, 99 insertions(+), 12 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17144/7
--
To view, visit http://gerrit.cloudera.org:8080/17144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I024bf41d77bb81f1ab0debdbd31ec3687c83f072
Gerrit-Change-Number: 17144
Gerrit-PatchSet: 7
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17121 )

Change subject: IMPALA-7712: Support Google Cloud Storage
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17121
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Gerrit-Change-Number: 17121
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 12 Mar 2021 23:18:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17121 )

Change subject: IMPALA-7712: Support Google Cloud Storage
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6962/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17121
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Gerrit-Change-Number: 17121
Gerrit-PatchSet: 11
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 12 Mar 2021 23:18:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage

2021-03-12 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17121 )

Change subject: IMPALA-7712: Support Google Cloud Storage
..


Patch Set 10:

(1 comment)

> Patch Set 10: Code-Review+2
>
> (1 comment)
>
> Given the analysis in IMPALA-10563, it seems fine to disable those test cases 
> for now.
>
> See my note about IMPALA-10579. I think it is ok to include this partial fix, 
> as it seems better than what we have right now. If IMPALA-10579 was landing 
> very soon, I would be ok with removing this piece of the fix and relying on 
> IMPALA-10579.
>
> This change makes sense to me, and it is good to have the GCS support land.

Thanks for the review, Joe! IMPALA-10579 (https://gerrit.cloudera.org/c/17171/) 
will take some time to land. So let's have the conservative fix for GCS first.

http://gerrit.cloudera.org:8080/#/c/17121/10/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/17121/10/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@713
PS10, Line 713:   /**
              :    * Wrapper around FileSystem.listStatusIterator() to make sure the path exists.
              :    *
              :    * @throws FileNotFoundException if p does not exist
              :    * @throws IOException if any I/O error occurred
              :    */
              :   public static RemoteIterator listStatusIterator(FileSystem fs, Path p)
              :       throws IOException {
              :     RemoteIterator iterator = fs.listStatusIterator(p);
              :     // Some FileSystem implementations like GoogleHadoopFileSystem don't check
              :     // existence of the start path when creating the RemoteIterator. Instead, their
              :     // iterators throw the FileNotFoundException in the first call of hasNext() when
              :     // the start path doesn't exist. Here we call hasNext() to ensure start path exists.
              :     iterator.hasNext();
              :     return iterator;
> This code will be replaced by IMPALA-10579.
Yeah, exactly! For IMPALA-10579 (https://gerrit.cloudera.org/c/17171/), I plan 
to test the patch on Ozone, S3 and ABFS so it will take some time.

The changes in this patch are conservative, so we can be sure they won't 
impact other filesystems. (I have verified them on HDFS and GCS.)
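
The eager-check idea in the quoted wrapper can be illustrated without Hadoop. The sketch below is an assumption-laden stand-in: RemoteIter, lazyList and listChecked are hypothetical names, and an in-memory map plays the role of the filesystem. It only shows why calling hasNext() once moves the FileNotFoundException from iteration time to creation time.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the wrapper under review; it does not use the Hadoop API. */
final class EagerListingSketch {
  /** Simplified analogue of a remote iterator whose methods may throw IOException. */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Lazy listing: the path is only resolved on first use, as described for GCS. */
  static RemoteIter<String> lazyList(Map<String, List<String>> store, String path) {
    return new RemoteIter<String>() {
      private Iterator<String> delegate; // stays null until the first call

      private Iterator<String> resolve() throws FileNotFoundException {
        if (delegate == null) {
          List<String> children = store.get(path);
          if (children == null) throw new FileNotFoundException(path);
          delegate = children.iterator();
        }
        return delegate;
      }

      @Override public boolean hasNext() throws IOException { return resolve().hasNext(); }
      @Override public String next() throws IOException { return resolve().next(); }
    };
  }

  /** Wrapper: calls hasNext() once so a missing path fails here, not during iteration. */
  static RemoteIter<String> listChecked(Map<String, List<String>> store, String path)
      throws IOException {
    RemoteIter<String> it = lazyList(store, path);
    it.hasNext(); // does not consume an element; only forces path resolution
    return it;
  }

  public static void main(String[] args) throws IOException {
    Map<String, List<String>> store = new HashMap<>();
    store.put("/warehouse/t", Arrays.asList("a.parq", "b.parq"));
    RemoteIter<String> it = listChecked(store, "/warehouse/t");
    while (it.hasNext()) System.out.println(it.next());
    try {
      listChecked(store, "/missing");
    } catch (FileNotFoundException e) {
      System.out.println("missing path detected up front");
    }
  }
}
```

This relies on hasNext() being side-effect free with respect to the elements, which holds for the stand-in here.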



--
To view, visit http://gerrit.cloudera.org:8080/17121
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Gerrit-Change-Number: 17121
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 12 Mar 2021 23:17:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8351/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 12 Mar 2021 20:12:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 4:

(8 comments)

My exhaustive test run last night revealed that scratch_limit might still get 
violated if maxMemReservationBytes is equal to scratch_limit. This is because 
the content of SpillableRowBatchQueue can be slightly larger than 
maxMemReservationBytes when it decides to spill.

To account for that, I lowered the spooling mem config a little further here in 
Patch Set 4.

http://gerrit.cloudera.org:8080/#/c/17166/3/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/17166/3/be/src/service/query-options.cc@1104
PS3, Line 1104:   // max_spilled_result_spooling_mem (a value of 0 means memory 
is unbounded).
> I just figured out in ParseUtil::ParseMemSpec() that -1 for memory query op
Done


http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@77
PS3, Line 77:    * If SPOOL_QUERY_RESULTS is true, then the ResourceProfile sets a min/max reservation,
> Some of the method level comment should be updated to reflect the behavior
Done


http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@92
PS3, Line 92:
> nit: typo ?
Done


http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@110
PS3, Line 110:   long bufferSize = 
queryOptions.getDefault_spillable_buffer_size();
 :   long maxRowBufferSize = 
PlanNode.computeMaxSpillableBufferSize(
> It sounds like an existing bug.  If you can create a test case for it can y
I filed IMPALA-10583. Will work on that next.


http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@126
PS3, Line 126:
> Suggest rewording:  'to >=' minMemReservationBytes
Done


http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@126
PS3, Line 126:
> nit: 'increasing'
Done


http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@142
PS3, Line 142:   maxMemReservationBytes = scratchLimit - 
maxRowBufferSize;
> Would be useful to add a trace level log message here as well.
Done


http://gerrit.cloudera.org:8080/#/c/17166/3/testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test
File testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test:

http://gerrit.cloudera.org:8080/#/c/17166/3/testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test@2
PS3, Line 2:  QUERY
> Could you add 1 tests with empty scratch dirs ?
Since scratch_dirs is a backend flag, I piggybacked the test on 
TestScratchDir::test_no_dirs



--
To view, visit http://gerrit.cloudera.org:8080/17166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 12 Mar 2021 20:00:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit

2021-03-12 Thread Riza Suminto (Code Review)
Hello Aman Sinha, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17166

to look at the new patch set (#4).

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..

IMPALA-10565: Adjust result spooling memory based on scratch_limit

IMPALA-9856 enables result spooling by default. Result spooling depends
on the ability to spill its entire BufferedTupleStream to disk once it
hits maximum memory reservation. However, if the query option
scratch_limit is set lower than max_spilled_result_spooling_mem, the
query might fail in the middle of execution due to insufficient scratch
space. This patch changes the planner to consider the scratch_limit query
option and the scratch_dirs backend flag when computing the resources used
by result spooling. The algorithm is as follows:

* If scratch_dirs is empty or scratch_limit < minMemReservationBytes
  required to use BufferedPlanRootSink, we set spool_query_results to
  false and fall back to BlockingPlanRootSink.

* If scratch_limit > minMemReservationBytes but still fairly low, we
  lower max_result_spooling_mem (default is 100MB) and
  max_spilled_result_spooling_mem (default is 1GB) to fit scratch_limit.

* If scratch_limit > max_spilled_result_spooling_mem, do nothing.
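
The three cases above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PlanRootSink implementation: the class, the Result holder and the adjust() signature are hypothetical; a negative scratch_limit is assumed to mean "unbounded"; the boundary cases (equality) are resolved arbitrarily; and the sketch ignores the convention that 0 means unbounded for max_spilled_result_spooling_mem.

```java
/** Hypothetical sketch of the decision described in the commit message. */
final class SpoolingMemSketch {
  /** Outcome of the adjustment; field names mirror the query options for readability. */
  static final class Result {
    final boolean spoolQueryResults;
    final long maxResultSpoolingMem;
    final long maxSpilledResultSpoolingMem;

    Result(boolean spool, long maxMem, long maxSpilledMem) {
      this.spoolQueryResults = spool;
      this.maxResultSpoolingMem = maxMem;
      this.maxSpilledResultSpoolingMem = maxSpilledMem;
    }
  }

  static Result adjust(boolean hasScratchDirs, long scratchLimit,
      long minMemReservationBytes, long maxResultSpoolingMem,
      long maxSpilledResultSpoolingMem) {
    // Case 1a: nowhere to spill, so disable spooling (fall back to the blocking sink).
    if (!hasScratchDirs) {
      return new Result(false, maxResultSpoolingMem, maxSpilledResultSpoolingMem);
    }
    // Assumption of this sketch: a negative scratch limit means "unbounded".
    if (scratchLimit < 0) {
      return new Result(true, maxResultSpoolingMem, maxSpilledResultSpoolingMem);
    }
    // Case 1b: the limit cannot even cover the minimum reservation.
    if (scratchLimit < minMemReservationBytes) {
      return new Result(false, maxResultSpoolingMem, maxSpilledResultSpoolingMem);
    }
    // Case 3: the limit already covers the configured spill budget; change nothing.
    if (scratchLimit >= maxSpilledResultSpoolingMem) {
      return new Result(true, maxResultSpoolingMem, maxSpilledResultSpoolingMem);
    }
    // Case 2: shrink both budgets so that they fit under the scratch limit.
    long spilled = scratchLimit;
    long inMemory = Math.min(maxResultSpoolingMem, spilled);
    return new Result(true, inMemory, spilled);
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    Result r = adjust(true, 50 * mb, 4 * mb, 100 * mb, 1024 * mb);
    // With a 50 MB limit, both budgets shrink to 50 MB and spooling stays enabled.
    System.out.println(r.spoolQueryResults + " " + r.maxResultSpoolingMem / mb
        + " " + r.maxSpilledResultSpoolingMem / mb);
  }
}
```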

Testing:
- Add TestScratchLimit::test_result_spooling_and_varying_scratch_limit
- Verify that spool_query_results query option is disabled in
  TestScratchDir::test_no_dirs
- Pass exhaustive tests.

Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
---
M be/src/service/query-options-test.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test
M tests/custom_cluster/test_scratch_disk.py
M tests/query_test/test_scratch_limit.py
8 files changed, 140 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/17166/4
--
To view, visit http://gerrit.cloudera.org:8080/17166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6961/


--
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 12 Mar 2021 19:33:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17130 )

Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67
Gerrit-Change-Number: 17130
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 12 Mar 2021 19:28:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17130 )

Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
..

IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables

This patch adds support for CREATE TABLE AS SELECT statements
for Iceberg tables.

CTAS statements work as follows in Impala:

1. Analysis of the whole CTAS statement
2. Divide CTAS to CREATE stmt and INSERT stmt
3. Create temporary in-memory target table from the CREATE stmt
4. Analyse the INSERT statement by using the temporary target table
5. If everything is OK so far, create the target table
6. Execute the INSERT query

For Iceberg tables, the non-trivial part was creating the temporary
target table without actually creating it via the Iceberg API. I've created
a new class, 'IcebergCtasTarget', that mimics an FeIcebergTable. It can be
used with catalog V1 and V2 as well.
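
The ordering of the steps above, in particular analysing the INSERT against an in-memory stand-in before the real table exists, can be sketched as follows. Everything here is hypothetical (CtasFlowSketch, Table, the map used as a catalog); it does not model IcebergCtasTarget or any Impala or Iceberg API, and "analysis" is reduced to a column-count check.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the CTAS step ordering; not Impala's implementation. */
final class CtasFlowSketch {
  /** Minimal table: a name, column names, and rows. */
  static final class Table {
    final String name;
    final List<String> columns;
    final List<List<Object>> rows = new ArrayList<>();

    Table(String name, List<String> columns) {
      this.name = name;
      this.columns = columns;
    }
  }

  /** Stand-in for the catalog: only tables that were really created live here. */
  final Map<String, Table> catalog = new LinkedHashMap<>();

  void createTableAsSelect(String name, List<String> columns, List<List<Object>> selectRows) {
    // Steps 1-2: analyse the statement and split it into CREATE and INSERT parts.
    if (catalog.containsKey(name)) throw new IllegalStateException("table exists: " + name);
    // Step 3: build a temporary target in memory only; the catalog is not touched.
    Table tempTarget = new Table(name, columns);
    // Step 4: analyse the INSERT against the temporary target.
    for (List<Object> row : selectRows) {
      if (row.size() != tempTarget.columns.size()) {
        throw new IllegalStateException("column count mismatch for " + name);
      }
    }
    // Step 5: everything checked out, so create the real table.
    Table target = new Table(name, columns);
    catalog.put(name, target);
    // Step 6: execute the INSERT.
    target.rows.addAll(selectRows);
  }
}
```

Because the checks run against the temporary target, a failed analysis leaves no half-created table behind in the catalog.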

Testing
 * e2e CTAS tests in iceberg-ctas.test
 * SHOW CREATE TABLE stmts in show-create-table.test

Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67
Reviewed-on: http://gerrit.cloudera.org:8080/17130
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
A fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
A testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
M tests/metadata/test_show_create_table.py
M tests/query_test/test_iceberg.py
18 files changed, 686 insertions(+), 46 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67
Gerrit-Change-Number: 17130
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage

2021-03-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17121 )

Change subject: IMPALA-7712: Support Google Cloud Storage
..


Patch Set 10: Code-Review+2

(1 comment)

Given the analysis in IMPALA-10563, it seems fine to disable those test cases 
for now.

See my note about IMPALA-10579. I think it is ok to include this partial fix, 
as it seems better than what we have right now. If IMPALA-10579 was landing 
very soon, I would be ok with removing this piece of the fix and relying on 
IMPALA-10579.

This change makes sense to me, and it is good to have the GCS support land.

http://gerrit.cloudera.org:8080/#/c/17121/10/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/17121/10/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@713
PS10, Line 713:   /**
              :    * Wrapper around FileSystem.listStatusIterator() to make sure the path exists.
              :    *
              :    * @throws FileNotFoundException if p does not exist
              :    * @throws IOException if any I/O error occurred
              :    */
              :   public static RemoteIterator listStatusIterator(FileSystem fs, Path p)
              :       throws IOException {
              :     RemoteIterator iterator = fs.listStatusIterator(p);
              :     // Some FileSystem implementations like GoogleHadoopFileSystem don't check
              :     // existence of the start path when creating the RemoteIterator. Instead, their
              :     // iterators throw the FileNotFoundException in the first call of hasNext() when
              :     // the start path doesn't exist. Here we call hasNext() to ensure start path exists.
              :     iterator.hasNext();
              :     return iterator;
This code will be replaced by IMPALA-10579.

I'm guessing that the thought here is that this is better than what we have, 
and the fuller fix will come from IMPALA-10579.



--
To view, visit http://gerrit.cloudera.org:8080/17121
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Gerrit-Change-Number: 17121
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 12 Mar 2021 18:42:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17099 )

Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages 
that contain UTF-8 characters
..

IMPALA-10523: Fix impala-shell crash in printing error messages that contain 
UTF-8 characters

In Python2, print() converts all non-keyword arguments to strings like
str() does and writes them to the stream. str() on QueryStateException
returns its value (i.e. the error message), which could be of unicode type.
Python2 will implicitly encode it to str type using the default
encoding, 'ascii'. This can result in a UnicodeEncodeError when there
are non-ascii characters in the error message.

This patch explicitly encodes the error message using 'utf-8' encoding
if it is of unicode type and the shell is run in Python2.

Tests:
 - Add test in test_shell_interactive.py

Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Reviewed-on: http://gerrit.cloudera.org:8080/17099
Reviewed-by: Tamas Mate 
Reviewed-by: Laszlo Gaal 
Tested-by: Impala Public Jenkins 
---
M shell/impala_shell.py
M tests/shell/test_shell_interactive.py
2 files changed, 16 insertions(+), 1 deletion(-)

Approvals:
  Tamas Mate: Looks good to me, but someone else must approve
  Laszlo Gaal: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Gerrit-Change-Number: 17099
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 


[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17099 )

Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages 
that contain UTF-8 characters
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Gerrit-Change-Number: 17099
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 12 Mar 2021 18:19:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10367: Impala-shell internal error - UnboundLocalError, local variable 'retry_msg' referenced before assign

2021-03-12 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17172 )

Change subject: IMPALA-10367: Impala-shell internal error - UnboundLocalError, 
local variable 'retry_msg' referenced before assign
..


Patch Set 1: Code-Review+2

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/17172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I50a08a62a332de759022d0a4862e74f5a81945d9
Gerrit-Change-Number: 17172
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 12 Mar 2021 17:50:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10546: Add ImpalaServer interface to retrieve BackendConfig from impalad

2021-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17116 )

Change subject: IMPALA-10546: Add ImpalaServer interface to retrieve 
BackendConfig from impalad
..

IMPALA-10546: Add ImpalaServer interface to retrieve BackendConfig from impalad

This patch adds a new interface ImpalaServer::GetBackendConfig() that
returns the current TBackendGflags from impalad.

Testing:
Called new interface from external frontend. Verified that
TBackendGflags were populated correctly.

Reviewed-by: John Sherman 
Change-Id: I14a3cee29f1fc91f4431b7ea89053bb3fbfa5e69
Reviewed-on: http://gerrit.cloudera.org:8080/17116
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins 
---
M be/src/catalog/catalog.cc
M be/src/rpc/hs2-http-test.cc
M be/src/service/frontend.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.h
M be/src/util/backend-gflag-util.cc
M be/src/util/backend-gflag-util.h
M be/src/util/logging-support.cc
M common/thrift/ImpalaService.thrift
M tests/hs2/test_hs2.py
10 files changed, 64 insertions(+), 6 deletions(-)

Approvals:
  Thomas Tauber-Marshall: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I14a3cee29f1fc91f4431b7ea89053bb3fbfa5e69
Gerrit-Change-Number: 17116
Gerrit-PatchSet: 17
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements

2021-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17104 )

Change subject: IMPALA-10535: Add interface to ImpalaServer for execution of 
externally compiled statements
..

IMPALA-10535: Add interface to ImpalaServer for execution of externally 
compiled statements

The ExecutePlannedStatement interface allows an externally supplied
TExecRequest to be executed by impalad. The TExecRequest must be fully
populated and will be sent directly to the backend for execution.

The following fields in the TExecRequest are updated by the coordinator:
- Hostname
- KRPC address
- Local Timezone

In order to add the interface to ImpalaInternalService.thrift, several of
the thrift classes were moved to Query.thrift to avoid a circular
dependency with Frontend.thrift.

Added functionality to format and dump TExecRequest structures to the path
specified in the debug flag dump_exec_request_path.

A start timestamp field has been added to TExecRequest to represent the
interval in the query profile between when the request was sent by the
external frontend and handled by the backend.

A local timestamp field has been added to the Ping result struct to
return the current backend timestamp. This is used by the external
frontend to populate the start timestamp.

Also included is a change to avoid generating silent AnalysisExceptions
during table resolution.

Tested with TExecRequest structures populated by external frontend.
Local timezone change tested with the INT64 TIMESTAMP datatype.

Reviewed-by: John Sherman 
Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa
Reviewed-on: http://gerrit.cloudera.org:8080/17104
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Thomas Tauber-Marshall 
---
M be/generated-sources/gen-cpp/CMakeLists.txt
M be/src/rpc/hs2-http-test.cc
M be/src/runtime/debug-options.h
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M common/thrift/CMakeLists.txt
M common/thrift/Frontend.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A common/thrift/Query.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
21 files changed, 959 insertions(+), 760 deletions(-)

Approvals:
  Thomas Tauber-Marshall: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa
Gerrit-Change-Number: 17104
Gerrit-PatchSet: 14
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-10522: Support external use of frontend libraries

2021-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17115 )

Change subject: IMPALA-10522: Support external use of frontend libraries
..

IMPALA-10522: Support external use of frontend libraries

This patch enables the Impala frontend jar and dependent library
libfesupport.so to be used by an external Java frontend.

Calling FeSupport.setExternalFE() will cause external frontend
initialization mode to be used during FeSupport.loadLibrary(). This
mode builds upon logic that is used to initialize the frontend jar for
unit tests.

Initialization in external frontend mode differs as follows:

- Skip instantiating Frontend object and its dependents
- Skip loading libhdfs
- Skip starting JVM Pause monitor
- Disable Minidumper
- Initialize TimezoneDatabase for external frontends
- Disable redirect of stderr/stdout to libfesupport.so glog
- Log messages from libfesupport.so to stderr
- Use libfesupport.so for JNI symbol look up

Null checks were added in places where objects were assumed to be
instantiated but are now skipped during initialization.

Additional change:
1) Add libfesupport.lib path to JAVA_LIBRARY_PATH in test driver

Testing:
 - Initialized frontend jar from external frontend
 - Verified that frontend Java objects can be used externally without
   issues
 - Verified that exceptions thrown from Impala Java or libfesupport
   can be caught or propagated correctly by the external frontend
 - Manual verification of minicluster logs
 - Ran queries with external frontend

Co-authored-by: John Sherman 
Co-authored-by: Aman Sinha 

Change-Id: I4e3a84721ba196ec00773ce2923b19610b90edd9
Reviewed-on: http://gerrit.cloudera.org:8080/17115
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Thomas Tauber-Marshall 
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/lib-cache.cc
M be/src/runtime/lib-cache.h
M be/src/service/fe-support.cc
M be/src/util/jni-util.cc
M fe/src/main/java/org/apache/impala/service/FeSupport.java
M testdata/bin/run-hive-server.sh
12 files changed, 99 insertions(+), 46 deletions(-)

Approvals:
  Thomas Tauber-Marshall: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17115
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4e3a84721ba196ec00773ce2923b19610b90edd9
Gerrit-Change-Number: 17115
Gerrit-PatchSet: 10
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements

2021-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17104 )

Change subject: IMPALA-10535: Add interface to ImpalaServer for execution of 
externally compiled statements
..


Patch Set 13: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa
Gerrit-Change-Number: 17104
Gerrit-PatchSet: 13
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 12 Mar 2021 17:49:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10522: Support external use of frontend libraries

2021-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17115 )

Change subject: IMPALA-10522: Support external use of frontend libraries
..


Patch Set 9: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17115
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4e3a84721ba196ec00773ce2923b19610b90edd9
Gerrit-Change-Number: 17115
Gerrit-PatchSet: 9
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Fri, 12 Mar 2021 17:49:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements

2021-03-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has removed a vote on this change.

Change subject: IMPALA-10535: Add interface to ImpalaServer for execution of 
externally compiled statements
..


Removed Verified-1 by Impala Public Jenkins 
--
To view, visit http://gerrit.cloudera.org:8080/17104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa
Gerrit-Change-Number: 17104
Gerrit-PatchSet: 13
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-10494: Making use of the min/max column stats to improve min/max filters

2021-03-12 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17075 )

Change subject: IMPALA-10494: Making use of the min/max column stats to improve 
min/max filters
..


Patch Set 19:

(8 comments)

Did a quick walkthrough, will look into it in detail next week.

http://gerrit.cloudera.org:8080/#/c/17075/19//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17075/19//COMMIT_MSG@19
PS19, Line 19: show_column_minmax_stats
Do we need this query option? I mean if we have min/max stats then we'd 
probably want to show them.


http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/filter-context.cc
File be/src/exec/filter-context.cc:

http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/filter-context.cc@447
PS19, Line 447: (
nit: parentheses are not needed as the '.' member access takes precedence over 
the '&' address-of operator.


http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/filter-context.cc@492
PS19, Line 492: (
nit: parentheses not needed


http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/hdfs-scanner.h@348
PS19, Line 348: uint8_t enabled_for_rowgroup;
Why do we need this flag? If enabled_for_rowgroup is false, then the min/max 
filter is completely turned off, right? In that case we shouldn't even send 
them to the scanner, or am I missing something?


http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/parquet/hdfs-parquet-scanner.cc@662
PS19, Line 662: }
nit: EvaluateOverlapForRowGroup() is already quite long, maybe this code could 
go into a separate function.


http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/partitioned-hash-join-builder.cc@950
PS19, Line 950:
nit: indentation


http://gerrit.cloudera.org:8080/#/c/17075/19/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
File fe/src/main/java/org/apache/impala/catalog/ColumnStats.java:

http://gerrit.cloudera.org:8080/#/c/17075/19/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@50
PS19, Line 50: LOG
seems unused


http://gerrit.cloudera.org:8080/#/c/17075/19/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
File fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java:

http://gerrit.cloudera.org:8080/#/c/17075/19/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java@278
PS19, Line 278:*/
It would be good to handle Iceberg tables that use Parquet data files.



--
To view, visit http://gerrit.cloudera.org:8080/17075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08581b44419bb8da5940cbf98502132acd1c86df
Gerrit-Change-Number: 17075
Gerrit-PatchSet: 19
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 12 Mar 2021 17:02:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10520: Implement ds_theta_intersect() function

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17088 )

Change subject: IMPALA-10520: Implement ds_theta_intersect() function
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
Gerrit-Change-Number: 17088
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 12 Mar 2021 16:13:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10520: Implement ds_theta_intersect() function

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17088 )

Change subject: IMPALA-10520: Implement ds_theta_intersect() function
..

IMPALA-10520: Implement ds_theta_intersect() function

This function receives a set of serialized Apache DataSketches Theta
sketches produced by ds_theta_sketch() and intersects them into a
single sketch.

An example usage is to create a sketch for each partition of a table,
write these sketches to a separate table and intersect the sketches of
the partitions the user is interested in to get an estimate. E.g.:
  SELECT
  ds_theta_estimate(ds_theta_intersect(sketch_col))
  FROM sketch_tbl
  WHERE partition_col=1 OR partition_col=5;

Testing:
  - Apart from the automated tests I added to this patch I also
tested ds_theta_intersect() on a bigger dataset to check that
serialization, deserialization and merging steps work well. I
took TPCH25.lineitem, created a number of sketches with grouping
by l_shipdate and called ds_theta_intersect() on those sketches

Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
Reviewed-on: http://gerrit.cloudera.org:8080/17088
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
4 files changed, 182 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
Gerrit-Change-Number: 17088
Gerrit-PatchSet: 6
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch_limit

2021-03-12 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on 
scratch_limit
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@122
PS3, Line 122:   if (scratchLimit > -1) {
> Should this check be scratchLimit > 0 since -1 or 0 mean unbounded right ?
Ignore this comment. I was probably thinking about the memory setting (side 
effects of late night review).



--
To view, visit http://gerrit.cloudera.org:8080/17166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Gerrit-Change-Number: 17166
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 12 Mar 2021 15:37:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation

2021-03-12 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16842 )

Change subject: IMPALA-10377: Improve the accuracy of resource estimation
..

IMPALA-10377: Improve the accuracy of resource estimation

PlanNode does not consider some factors when estimating memory,
which causes a large error rate.

AggregationNode
1.MemoryEstimate = Ndv * (AvgRowSize + SizeOfBucket)
2.When estimating the Ndv of merge aggregation, Ndv should be
  divided only once.
3.If there is no grouping exprs, MemoryEstimate =
  MIN_PLAIN_AGG_MEM

SortNode
1.MemoryEstimate = Cardinality * AvgRowSize. Memory used when
  there is enough memory

HashJoinNode
1.MemoryEstimate= DataRows + Buckets + DuplicateNodes,
  DataRows = RightTableCardinality * AvgRowSize,
  Buckets= roundUpToPowerOf2(RightTableCardinality) *
   SizeOfBucket,
  DuplicateNodes = (RightTableCardinality - RightNdv) *
SizeOfDuplicateNode

KuduScanNode
1.MemoryEstimate = Columns * BytesPerColumn * MaxScannerThreads,
  where Columns are the columns scanned by the query, not all the
  columns of the table

UnitTest
1.CardinalityTest adds test cases to test memory estimation.
  Modify existing test cases related to memory estimation
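
The formulas above can be sketched as plain functions. This is an
illustration only, not the planner code: the function and parameter names are
hypothetical, and the bucket, duplicate-node and minimum sizes are passed in
by the caller because their real values are not given in this message.

```python
def round_up_to_power_of_2(n):
    """Smallest power of two that is >= n (returns 1 for n <= 1)."""
    if n <= 1:
        return 1
    return 1 << (n - 1).bit_length()


def aggregation_estimate(ndv, avg_row_size, bucket_size,
                         has_grouping=True, min_plain_agg_mem=0):
    # Without grouping exprs the estimate is the fixed minimum.
    if not has_grouping:
        return min_plain_agg_mem
    return ndv * (avg_row_size + bucket_size)


def sort_estimate(cardinality, avg_row_size):
    # Memory used when the whole input fits in memory.
    return cardinality * avg_row_size


def hash_join_estimate(right_cardinality, right_ndv, avg_row_size,
                       bucket_size, duplicate_node_size):
    data_rows = right_cardinality * avg_row_size
    buckets = round_up_to_power_of_2(right_cardinality) * bucket_size
    duplicate_nodes = (right_cardinality - right_ndv) * duplicate_node_size
    return data_rows + buckets + duplicate_nodes


def kudu_scan_estimate(scanned_columns, bytes_per_column, max_scanner_threads):
    # Only the columns scanned by the query are counted.
    return scanned_columns * bytes_per_column * max_scanner_threads
```

For example, with 1000 build-side rows of 10 bytes, 800 distinct keys and an
assumed 16 bytes per bucket and per duplicate node, the hash join estimate is
10000 + 1024 * 16 + 200 * 16 bytes.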

Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
Reviewed-on: http://gerrit.cloudera.org:8080/16842
Reviewed-by: Zoltan Borok-Nagy 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hash-table.h
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/preagg-bytes-limit.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/result-spooling.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-query-options.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q02.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q12.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q15.test
M 

[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17121 )

Change subject: IMPALA-7712: Support Google Cloud Storage
..


Patch Set 10: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17121
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Gerrit-Change-Number: 17121
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 12 Mar 2021 14:16:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6961/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc
Gerrit-Change-Number: 16976
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 12 Mar 2021 13:57:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16842 )

Change subject: IMPALA-10377: Improve the accuracy of resource estimation
..


Patch Set 22: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
Gerrit-Change-Number: 16842
Gerrit-PatchSet: 22
Gerrit-Owner: liuyao 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: liuyao 
Gerrit-Comment-Date: Fri, 12 Mar 2021 13:57:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8350/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 12 Mar 2021 13:54:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17130 )

Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67
Gerrit-Change-Number: 17130
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 12 Mar 2021 13:45:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17130 )

Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6960/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67
Gerrit-Change-Number: 17130
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 12 Mar 2021 13:45:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-03-12 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17081/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

http://gerrit.cloudera.org:8080/#/c/17081/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java@78
PS3, Line 78:
> Perhaps a better name could include the serviceID of the catalog instance s
I've added an abstract method getAcidUserId() to this class.

It actually revealed that we never actually logged the user by the Transaction 
object.



--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 12 Mar 2021 13:37:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-03-12 Thread Zoltan Borok-Nagy (Code Review)
Hello Vihang Karajgaonkar, Gabor Kaszab, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17081

to look at the new patch set (#4).

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..

IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

ALTER TABLE ADD PARTITION should bump the write id for ACID tables.
Both for INSERT-only and full ACID tables.

For transactional tables we are adding partitions in an ACID
transaction in the following sequence:

1. open transaction
2. allocate write id for table
3. add partitions to HMS table
4. commit transaction

However, please note that table metadata modifications are
independent of ACID transactions. I.e. if adding the partitions succeeds,
but we cannot commit the transaction, then the newly added
partitions won't get removed.

So why are we opening a txn then? We are doing it in order to bump
the write id in a best-effort way. This aids table metadata caching,
so by looking at the table write id we can determine if the cached
table metadata is up-to-date.
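
The four-step sequence and its best-effort nature can be illustrated with a
small sketch. This is not Impala's implementation: FakeMetastore and
add_partitions_acid are hypothetical names, and the in-memory class only
imitates the behaviour described above (partitions stay in place even when
the commit fails, and the write id is bumped only on a successful commit).

```python
class FakeMetastore:
    """Hypothetical in-memory stand-in for HMS; not a real client API."""

    def __init__(self, fail_commit=False):
        self.fail_commit = fail_commit
        self.next_txn_id = 1
        self.next_write_id = {}       # table -> next write id to hand out
        self.committed_write_id = {}  # table -> highest committed write id
        self.pending = {}             # txn id -> (table, write id)
        self.partitions = {}          # table -> list of partition names

    def open_txn(self):
        txn_id = self.next_txn_id
        self.next_txn_id += 1
        return txn_id

    def allocate_write_id(self, txn_id, table):
        write_id = self.next_write_id.get(table, 1)
        self.next_write_id[table] = write_id + 1
        self.pending[txn_id] = (table, write_id)
        return write_id

    def add_partitions(self, table, parts):
        # Metadata change: independent of the transaction outcome.
        self.partitions.setdefault(table, []).extend(parts)

    def commit_txn(self, txn_id):
        if self.fail_commit:
            raise RuntimeError("simulated commit failure")
        table, write_id = self.pending.pop(txn_id)
        self.committed_write_id[table] = write_id


def add_partitions_acid(hms, table, parts):
    """Returns True if the write id was bumped, False if the commit failed."""
    txn_id = hms.open_txn()                 # 1. open transaction
    hms.allocate_write_id(txn_id, table)    # 2. allocate write id for table
    hms.add_partitions(table, parts)        # 3. add partitions to HMS table
    try:
        hms.commit_txn(txn_id)              # 4. commit transaction
        return True
    except RuntimeError:
        # Best effort: the partitions added in step 3 are not rolled back.
        return False
```

In this sketch a failed commit leaves the partitions in place while the
committed write id stays unchanged, which is the situation the paragraph above
warns about.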

Testing:
 * added e2e test

Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
---
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Transaction.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/query_test/test_acid.py
6 files changed, 117 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/17081/4
--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17099 )

Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages 
that contain UTF-8 characters
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6959/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Gerrit-Change-Number: 17099
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 12 Mar 2021 12:31:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10558: Implement ds theta exclude() function

2021-03-12 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17153 )

Change subject: IMPALA-10558: Implement ds_theta_exclude() function
..


Patch Set 2:

(8 comments)

Thanks for these changes! I had some comments mostly nits and around test 
coverage.

http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc
File be/src/exprs/datasketches-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc@128
PS2, Line 128:   // a_not_b
nit: this comment is not needed as it doesn't give extra info


http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc@131
PS2, Line 131: datasketches::theta_sketch::unique_ptr first_sketch_ptr;
             :   if (!first_serialized_sketch.is_null && first_serialized_sketch.len > 0) {
             :     try {
             :       first_sketch_ptr = datasketches::theta_sketch::deserialize(
             :           (void*)first_serialized_sketch.ptr, first_serialized_sketch.len);
             :     } catch (const std::exception&) {
             :       LogSketchDeserializationError(ctx);
             :       return StringVal::null();
             :     }
             :   }
This part seems pretty identical to the section L141-150. Can you move it to a 
function to avoid code repetition?


http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc@155
PS2, Line 155: first_sketch_ptr.operator bool()
I'm not sure I understand the condition in this format :) Could you please 
explain what goes on here?


http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions.h
File be/src/exprs/datasketches-functions.h:

http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions.h@73
PS2, Line 73: 'serialized_sketch'
Could you mention both sketch params?


http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
File testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test:

http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@330
PS2, Line 330: for A is an empty sketch.
When A is empty and B is null.


http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@331
PS2, Line 331: select ds_theta_estimate(ds_theta_exclude(ds_theta_sketch(f2), 
null))
Could you please add another test where A is null and B is empty? (the opposite 
of this one)


http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@332
PS2, Line 332: from functional_parquet.emptytable;
Another test would be where A and B are both empty.


http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@379
PS2, Line 379: 
I miss a test where the result of an a-not-b is a non-empty sketch (where the 
estimate is greater than zero).



--
To view, visit http://gerrit.cloudera.org:8080/17153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05119fd8c652c07ff248a99e44b0da3541e46ca3
Gerrit-Change-Number: 17153
Gerrit-PatchSet: 2
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 12 Mar 2021 12:12:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation

2021-03-12 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16842 )

Change subject: IMPALA-10377: Improve the accuracy of resource estimation
..


Patch Set 22: Code-Review+2

Yeah, it probably was an intermittent infrastructure issue.


--
To view, visit http://gerrit.cloudera.org:8080/16842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
Gerrit-Change-Number: 16842
Gerrit-PatchSet: 22
Gerrit-Owner: liuyao 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: liuyao 
Gerrit-Comment-Date: Fri, 12 Mar 2021 12:03:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation

2021-03-12 Thread liuyao (Code Review)
liuyao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16842 )

Change subject: IMPALA-10377: Improve the accuracy of resource estimation
..


Patch Set 22:

The automated test failed, and it doesn't look like my code caused the 
failure. I used Gerrit to rebase my code and reran the test.


--
To view, visit http://gerrit.cloudera.org:8080/16842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
Gerrit-Change-Number: 16842
Gerrit-PatchSet: 22
Gerrit-Owner: liuyao 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: liuyao 
Gerrit-Comment-Date: Fri, 12 Mar 2021 11:32:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9470: Use Parquet Bloom filters - Part 1

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17026 )

Change subject: IMPALA-9470: Use Parquet Bloom filters - Part 1
..


Patch Set 17:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8349/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287
Gerrit-Change-Number: 17026
Gerrit-PatchSet: 17
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 12 Mar 2021 11:21:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations

2021-03-12 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17177 )

Change subject: IMPALA-10582: Fix wrong summary numbers in the webpage of 
catalogd operations
..


Patch Set 1: Code-Review+1

Hi Quanlong, nice catch. LGTM.


--
To view, visit http://gerrit.cloudera.org:8080/17177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c5361d981832d6f28db5f203a2c2538fe8ebb5e
Gerrit-Change-Number: 17177
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 12 Mar 2021 11:04:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9470: Use Parquet Bloom filters - Part 1

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17026 )

Change subject: IMPALA-9470: Use Parquet Bloom filters - Part 1
..


Patch Set 17:

(182 comments)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h
File be/src/thirdparty/xxhash/xxhash.h:

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@70
PS17, Line 70: 
https://fastcompression.blogspot.com/2019/03/presenting-xxh3.html?showComment=1552696407071#c3490092340461170735
line too long (112 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@92
PS17, Line 92:  *  
https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@113
PS17, Line 113: #  elif defined (__cplusplus) || (defined (__STDC_VERSION__) && 
(__STDC_VERSION__ >= 199901L) /* C99 */)
line too long (104 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@243
PS17, Line 243: #  define XXH3_64bits_reset_withSecret XXH_NAME2(XXH_NAMESPACE, 
XXH3_64bits_reset_withSecret)
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@253
PS17, Line 253: #  define XXH3_128bits_reset_withSeed XXH_NAME2(XXH_NAMESPACE, 
XXH3_128bits_reset_withSeed)
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@254
PS17, Line 254: #  define XXH3_128bits_reset_withSecret 
XXH_NAME2(XXH_NAMESPACE, XXH3_128bits_reset_withSecret)
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@270
PS17, Line 270: #define XXH_VERSION_NUMBER  (XXH_VERSION_MAJOR *100*100 + 
XXH_VERSION_MINOR *100 + XXH_VERSION_RELEASE)
line too long (103 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@429
PS17, Line 429:  * @param statePtr A pointer to an @ref XXH32_state_t allocated 
with @ref XXH32_createState().
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@441
PS17, Line 441: XXH_PUBLIC_API void XXH32_copyState(XXH32_state_t* dst_state, 
const XXH32_state_t* src_state);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@476
PS17, Line 476: XXH_PUBLIC_API XXH_errorcode XXH32_update (XXH32_state_t* 
statePtr, const void* input, size_t length);
line too long (102 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@628
PS17, Line 628: XXH_PUBLIC_API void XXH64_copyState(XXH64_state_t* dst_state, 
const XXH64_state_t* src_state);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@631
PS17, Line 631: XXH_PUBLIC_API XXH_errorcode XXH64_update (XXH64_state_t* 
statePtr, const void* input, size_t length);
line too long (102 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@700
PS17, Line 700: XXH_PUBLIC_API XXH64_hash_t XXH3_64bits_withSeed(const void* 
data, size_t len, XXH64_hash_t seed);
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@724
PS17, Line 724: XXH_PUBLIC_API XXH64_hash_t XXH3_64bits_withSecret(const void* 
data, size_t len, const void* secret, size_t secretSize);
line too long (120 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@743
PS17, Line 743: XXH_PUBLIC_API void XXH3_copyState(XXH3_state_t* dst_state, 
const XXH3_state_t* src_state);
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@756
PS17, Line 756: XXH_PUBLIC_API XXH_errorcode 
XXH3_64bits_reset_withSeed(XXH3_state_t* statePtr, XXH64_hash_t seed);
line too long (99 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@766
PS17, Line 766: XXH_PUBLIC_API XXH_errorcode 
XXH3_64bits_reset_withSecret(XXH3_state_t* statePtr, const void* secret, size_t 
secretSize);
line too long (121 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@768
PS17, Line 768: XXH_PUBLIC_API XXH_errorcode XXH3_64bits_update (XXH3_state_t* 
statePtr, const void* input, size_t length);
line too long (107 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@791
PS17, Line 791: XXH_PUBLIC_API XXH128_hash_t XXH3_128bits_withSeed(const void* 
data, size_t len, XXH64_hash_t seed);
line too long (100 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@792
PS17, Line 792: XXH_PUBLIC_API XXH128_hash_t XXH3_128bits_withSecret(const 
void* data, size_t len, const void* secret, size_t secretSize);
line too long (122 > 90)



[Impala-ASF-CR] IMPALA-9470: Use Parquet Bloom filters - Part 1

2021-03-12 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#17). ( 
http://gerrit.cloudera.org:8080/17026 )

Change subject: IMPALA-9470: Use Parquet Bloom filters - Part 1
..

IMPALA-9470: Use Parquet Bloom filters - Part 1

This change adds read support for Parquet Bloom filters for some types.
The supported Parquet type - Impala type pairs are the following:

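 ---------------------------------------
|Parquet type |  Impala type            |
|---------------------------------------|
|INT32        |  TINYINT, SMALLINT, INT |
|INT64        |  BIGINT                 |
|FLOAT        |  FLOAT                  |
|DOUBLE       |  DOUBLE                 |
|BYTE_ARRAY   |  STRING                 |
 ---------------------------------------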

If a Bloom filter is available for a column that is fully dictionary
encoded, the Bloom filter is not used as the dictionary can give exact
results in filtering.
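
The type pairs and the dictionary rule above can be summarised as a small
decision function. This is only an illustrative sketch, not the scanner code
from the patch: the names SUPPORTED_BLOOM_FILTER_TYPES and
should_use_bloom_filter, and the two boolean parameters, are assumptions made
for the example.

```python
# Parquet physical type -> Impala types for which Bloom filter reads are
# supported, as listed in the commit message.
SUPPORTED_BLOOM_FILTER_TYPES = {
    "INT32": {"TINYINT", "SMALLINT", "INT"},
    "INT64": {"BIGINT"},
    "FLOAT": {"FLOAT"},
    "DOUBLE": {"DOUBLE"},
    "BYTE_ARRAY": {"STRING"},
}


def should_use_bloom_filter(parquet_type, impala_type,
                            has_bloom_filter, fully_dict_encoded):
    """Hypothetical helper: decide whether a column's Bloom filter is consulted."""
    if not has_bloom_filter:
        return False
    if fully_dict_encoded:
        # The dictionary gives exact results, so the Bloom filter is skipped.
        return False
    supported = SUPPORTED_BLOOM_FILTER_TYPES.get(parquet_type, set())
    return impala_type in supported
```

For example, a BYTE_ARRAY column read as VARCHAR falls outside the table, so
the sketch returns False and no row groups would be discarded on its account.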

Testing:
  - Added tests/query_test/test_parquet_bloom_filter.py that tests that
    Parquet Bloom filtering works for the supported types and that we do
    not incorrectly discard row groups for the unsupported type VARCHAR.

Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287
---
M LICENSE.txt
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exprs/expr-value.h
M be/src/exprs/literal.cc
M be/src/exprs/literal.h
M be/src/kudu/util/block_bloom_filter.cc
M be/src/kudu/util/block_bloom_filter.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/thirdparty/xxhash/README.md
A be/src/thirdparty/xxhash/xxhash.h
M be/src/util/CMakeLists.txt
M be/src/util/bloom-filter.cc
M be/src/util/bloom-filter.h
A be/src/util/impala-bloom-filter-buffer-allocator.cc
A be/src/util/impala-bloom-filter-buffer-allocator.h
A be/src/util/parquet-bloom-filter.cc
A be/src/util/parquet-bloom-filter.h
M bin/rat_exclude_files.txt
M bin/run_clang_tidy.sh
M common/thrift/parquet.thrift
A testdata/data/parquet-bloom-filtering.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-bloom-filter.test
A tests/query_test/test_parquet_bloom_filter.py
27 files changed, 6,910 insertions(+), 132 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/17026/17
--
To view, visit http://gerrit.cloudera.org:8080/17026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287
Gerrit-Change-Number: 17026
Gerrit-PatchSet: 17
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10520: Implement ds theta intersect() function

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17088 )

Change subject: IMPALA-10520: Implement ds_theta_intersect() function
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6958/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
Gerrit-Change-Number: 17088
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 12 Mar 2021 10:29:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10520: Implement ds theta intersect() function

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17088 )

Change subject: IMPALA-10520: Implement ds_theta_intersect() function
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
Gerrit-Change-Number: 17088
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 12 Mar 2021 10:29:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10520: Implement ds theta intersect() function

2021-03-12 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17088 )

Change subject: IMPALA-10520: Implement ds_theta_intersect() function
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
Gerrit-Change-Number: 17088
Gerrit-PatchSet: 4
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 12 Mar 2021 10:28:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables

2021-03-12 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17130 )

Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67
Gerrit-Change-Number: 17130
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 12 Mar 2021 10:14:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters

2021-03-12 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17099 )

Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages 
that contain UTF-8 characters
..


Patch Set 3: Code-Review+2

LGTM; thanks for the fix, Quanlong!


--
To view, visit http://gerrit.cloudera.org:8080/17099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Gerrit-Change-Number: 17099
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 12 Mar 2021 10:02:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters

2021-03-12 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17099 )

Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages 
that contain UTF-8 characters
..


Patch Set 3: Code-Review+1

Hi Quanlong, thanks for adding the comment, LGTM!


--
To view, visit http://gerrit.cloudera.org:8080/17099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Gerrit-Change-Number: 17099
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 12 Mar 2021 08:46:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17099 )

Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages 
that contain UTF-8 characters
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8348/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Gerrit-Change-Number: 17099
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 12 Mar 2021 08:37:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17121 )

Change subject: IMPALA-7712: Support Google Cloud Storage
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6957/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17121
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Gerrit-Change-Number: 17121
Gerrit-PatchSet: 10
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 12 Mar 2021 08:36:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17177 )

Change subject: IMPALA-10582: Fix wrong summary numbers in the webpage of 
catalogd operations
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8347/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c5361d981832d6f28db5f203a2c2538fe8ebb5e
Gerrit-Change-Number: 17177
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 12 Mar 2021 08:20:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters

2021-03-12 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17099 )

Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages 
that contain UTF-8 characters
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17099/2/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/17099/2/shell/impala_shell.py@1321
PS2, Line 1321:   # Python2 will implicitly convert unicode to str when 
printing to stderr. It's done
> nit: could you add a short one line comment that explains this condition?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Gerrit-Change-Number: 17099
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Fri, 12 Mar 2021 08:17:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters

2021-03-12 Thread Quanlong Huang (Code Review)
Hello Tamas Mate, Laszlo Gaal, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17099

to look at the new patch set (#3).

Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages 
that contain UTF-8 characters
..

IMPALA-10523: Fix impala-shell crash in printing error messages that contain 
UTF-8 characters

In Python2, print() converts all non-keyword arguments to strings like
str() does and writes them to the stream. str() on QueryStateException
returns its value (i.e. the error message), which could be in unicode type.
Python2 will implicitly encode it to str type using the default
encoding, 'ascii'. This could result in UnicodeEncodeError when there
are non-ascii characters in the error message.

This patch explicitly encodes the error message using 'utf-8' encoding
if it's in unicode type and the shell is run in Python2.
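
A minimal sketch of the idea, not the code from the patch: the helper name
error_text_for_stderr is hypothetical, and the check is written so that the
same snippet also runs under Python 3, where the Python 2 `unicode` type does
not exist.

```python
import sys


def error_text_for_stderr(message):
    """Return a value that is safe to print to stderr under Python 2.

    Under Python 2, a unicode message is encoded explicitly as UTF-8 so that
    print does not fall back to the default 'ascii' codec. Under Python 3 the
    message is returned unchanged.
    """
    # `unicode` is only defined in Python 2; the version check short-circuits
    # so that Python 3 never evaluates the name.
    if sys.version_info[0] == 2 and isinstance(message, unicode):  # noqa: F821
        return message.encode("utf-8")
    return message
```

A caller would then write something like
print(error_text_for_stderr(msg), file=sys.stderr) in place of printing the
exception object directly.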

Tests:
 - Add test in test_shell_interactive.py

Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
---
M shell/impala_shell.py
M tests/shell/test_shell_interactive.py
2 files changed, 16 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17099/3
--
To view, visit http://gerrit.cloudera.org:8080/17099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Gerrit-Change-Number: 17099
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tamas Mate 


[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation

2021-03-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16842 )

Change subject: IMPALA-10377: Improve the accuracy of resource estimation
..


Patch Set 22:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6956/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
Gerrit-Change-Number: 16842
Gerrit-PatchSet: 22
Gerrit-Owner: liuyao 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: liuyao 
Gerrit-Comment-Date: Fri, 12 Mar 2021 08:09:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations

2021-03-12 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17177 )


Change subject: IMPALA-10582: Fix wrong summary numbers in the webpage of 
catalogd operations
..

IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations

Webpage of catalogd operations doesn't sum up requests correctly.
Instead, the current meaning is summing by tables. As the column name
is "Number of requests", we should sum up by requests.
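
The difference between the two ways of summing can be shown with a toy
example. This is one plausible reading of the description above, not the code
in catalog-server.cc: the summarize function and the shape of the input
(table name mapped to per-operation request counts) are assumptions.

```python
def summarize(per_table_ops):
    """per_table_ops: {table name: {operation name: request count}}.

    Returns two summaries per operation: the number of tables that have the
    operation in flight, and the total number of requests.
    """
    by_tables = {}
    by_requests = {}
    for ops in per_table_ops.values():
        for op, count in ops.items():
            by_tables[op] = by_tables.get(op, 0) + 1        # counts tables
            by_requests[op] = by_requests.get(op, 0) + count  # counts requests
    return by_tables, by_requests
```

With three INSERT requests on one table and two on another, the first summary
reports 2 while the second reports 5; only the second matches a column titled
"Number of requests".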

Tests:
 - Manually run test_concurrent_inserts and verify the number is
   correct.

Change-Id: I1c5361d981832d6f28db5f203a2c2538fe8ebb5e
---
M be/src/catalog/catalog-server.cc
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/17177/1
--
To view, visit http://gerrit.cloudera.org:8080/17177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1c5361d981832d6f28db5f203a2c2538fe8ebb5e
Gerrit-Change-Number: 17177
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang