[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Wed, 19 May 2021 05:52:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..

IMPALA-10678: Support custom SASL protocol name in Kudu client

This patch adds a configurable flag, kudu_sasl_protocol_name,
and calls the Kudu client API to set the SASL protocol name when
creating the Kudu client in the FE and BE.
It also upgrades the toolchain to pull in a new version of Kudu that
provides new Java/C++ client APIs for setting the SASL protocol name.

Testing:
 - Passed core run.

Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Reviewed-on: http://gerrit.cloudera.org:8080/17442
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/common/global-flags.cc
M be/src/exec/kudu-util.cc
M be/src/util/backend-gflag-util.cc
M bin/impala-config.sh
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
7 files changed, 15 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17442

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-10681: Improve join cardinality estimates

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17387 )

Change subject: IMPALA-10681: Improve join cardinality estimates
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8751/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17387

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc
Gerrit-Change-Number: 17387
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 19 May 2021 04:49:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10681: Improve join cardinality estimates

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17387 )

Change subject: IMPALA-10681: Improve join cardinality estimates
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8750/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17387

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc
Gerrit-Change-Number: 17387
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 19 May 2021 04:40:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10681: Improve join cardinality estimates

2021-05-18 Thread Aman Sinha (Code Review)
Hello Zoltan Borok-Nagy, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17387

to look at the new patch set (#3).

Change subject: IMPALA-10681: Improve join cardinality estimates
..

IMPALA-10681: Improve join cardinality estimates

During cardinality estimation for inner joins, if the join
conjunct involves a scan slot on the left side and a function
(e.g. MAX) on the right, we currently determine that the NDV
stats of either side are not useful and return the left side's
cardinality even though it may be a significant over-estimate.

In this patch, we handle join conjuncts of such types by
keeping them in an 'other' eligible conjuncts list as long as
the NDV for expressions on both sides of the join can be
reasonably estimated and the input cardinality is also available.
For example, if the conjunct is int_col = MAX(int_col) and the
right input does not have a group-by, the right NDV = 1 and
can be safely used. If it has a group-by and the group-by
columns already have associated NDVs, we can still estimate the
combined NDV. Other such examples exist. An auxiliary struct is
introduced to keep track of the NDV and row count.

Once these 'other' eligible conjuncts are populated, we do the
join cardinality estimation in a manner similar to the normal
join conjuncts by fetching the stats from the auxiliary struct.
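
As a rough illustration of the estimate described above (this is not the
code in JoinNode.java), the following Python sketch applies the textbook
equi-join formula |L| * |R| / max(NDV_L, NDV_R), with each NDV scaled down
by the ratio cardinality/num_rows as the comments in JoinNode.java mention.
The class and function names are hypothetical, and the clamping choices
and the sample numbers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NdvAndRowCount:
    """Hypothetical stand-in for the auxiliary struct: NDV and row count per side."""
    lhs_ndv: float
    rhs_ndv: float
    lhs_num_rows: float
    rhs_num_rows: float


def adjusted_ndv(ndv: float, cardinality: float, num_rows: float) -> float:
    """Scale the NDV down by cardinality/num_rows (assumed clamp to at least 1)."""
    if num_rows <= 0:
        return max(1.0, ndv)
    ratio = min(1.0, cardinality / num_rows)
    return max(1.0, ndv * ratio)


def estimate_inner_join_cardinality(
    stats: NdvAndRowCount, lhs_card: float, rhs_card: float
) -> int:
    """Textbook estimate: |L| * |R| / max(adjusted NDV of each side)."""
    lhs_ndv = adjusted_ndv(stats.lhs_ndv, lhs_card, stats.lhs_num_rows)
    rhs_ndv = adjusted_ndv(stats.rhs_ndv, rhs_card, stats.rhs_num_rows)
    return round(lhs_card * rhs_card / max(lhs_ndv, rhs_ndv))


# int_col = MAX(int_col) with no group-by on the right:
# the right side produces one row, so its NDV is 1.
stats = NdvAndRowCount(lhs_ndv=10, rhs_ndv=1, lhs_num_rows=7300, rhs_num_rows=1)
estimate = estimate_inner_join_cardinality(stats, lhs_card=7300, rhs_card=1)
```

With these made-up numbers the sketch yields 730 rows rather than the
left side's 7300.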

Testing:
 - Added new planner tests for inner join cardinality
 - Modified expected plans for certain tests including
   TPC-DS queries and ran end-to-end TPC-DS queries
 - Since TPC-DS plans are complex, I checked the cardinality
   changes for some of the hash joins but not the changes in the
   shape of a plan (e.g. whether the join order changed).

   TODO: We would want to run a performance test to validate
   the plan changes for TPC-DS at a sufficiently high scale factor.

Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc
---
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/partition-key-scans-default.test
M testdata/workloads/functional-planner/queries/PlannerTest/partition-key-scans.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
15 files changed, 3,681 insertions(+), 3,331 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/17387/3
--
To view, visit http://gerrit.cloudera.org:8080/17387

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc
Gerrit-Change-Number: 17387
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10681: Improve join cardinality estimates

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17387 )

Change subject: IMPALA-10681: Improve join cardinality estimates
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/17387/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/17387/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@425
PS2, Line 425:* that instead of the EqJoinConjunctScanSlots, it uses the {@link NdvAndRowCountStats} to
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17387/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@521
PS2, Line 521: if (lhsExpr instanceof AnalyticExpr || rhsExpr instanceof AnalyticExpr) return null;
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17387/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@522
PS2, Line 522: long lhsNdv = lhsScanSlot != null ? lhsScanSlot.getStats().getNumDistinctValues() :
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/17387/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@524
PS2, Line 524: long rhsNdv = rhsScanSlot != null ? rhsScanSlot.getStats().getNumDistinctValues() :
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/17387/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@528
PS2, Line 528: // In the following num rows assignment, if the underlying scan slot is not available
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17387/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@529
PS2, Line 529: // we cannot get the actual base table row count. In that case we approximate the row
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17387/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@530
PS2, Line 530: // count as just the lhs or rhs cardinality. Since the ratio of cardinality/num_rows
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/17387/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@531
PS2, Line 531: // is used to adjust (scale down) the NDV later (when computing join cardinality), it
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17387/2/fe/src/main/java/org/apache/impala/planner/JoinNode.java@538
PS2, Line 538: otherEqJoinConjuncts.add(new NdvAndRowCountStats(lhsNdv, rhsNdv, lhsNumRows, rhsNumRows));
line too long (98 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17387

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc
Gerrit-Change-Number: 17387
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 19 May 2021 04:20:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10681: Improve join cardinality estimates

2021-05-18 Thread Aman Sinha (Code Review)
Hello Zoltan Borok-Nagy, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17387

to look at the new patch set (#2).

Change subject: IMPALA-10681: Improve join cardinality estimates
..

IMPALA-10681: Improve join cardinality estimates

During cardinality estimation for inner joins, if the join
conjunct involves a scan slot on the left side and a function
(e.g. MAX) on the right, we currently determine that the NDV
stats of either side are not useful and return the left side's
cardinality even though it may be a significant over-estimate.

In this patch, we handle join conjuncts of such types by
keeping them in an 'other' eligible conjuncts list as long as
the NDV for expressions on both sides of the join can be
reasonably estimated and the input cardinality is also available.
For example, if the conjunct is int_col = MAX(int_col) and the
right input does not have a group-by, the right NDV = 1 and
can be safely used. If it has a group-by and the group-by
columns already have associated NDVs, we can still estimate the
combined NDV. Other such examples exist. An auxiliary struct is
introduced to keep track of the NDV and row count.

Once these 'other' eligible conjuncts are populated, we do the
join cardinality estimation in a manner similar to the normal
join conjuncts by fetching the stats from the auxiliary struct.

Testing:
 - Added new planner tests for inner join cardinality
 - Modified expected plans for certain tests including
   TPC-DS queries and ran end-to-end TPC-DS queries
 - Since TPC-DS plans are complex, I checked the cardinality
   changes for some of the hash joins but not the changes in the
   shape of a plan (e.g. whether the join order changed).

   TODO: We would want to run a performance test to validate
   the plan changes for TPC-DS at a sufficiently high scale factor.

Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc
---
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/partition-key-scans-default.test
M testdata/workloads/functional-planner/queries/PlannerTest/partition-key-scans.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
15 files changed, 3,675 insertions(+), 3,331 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/17387/2
--
To view, visit http://gerrit.cloudera.org:8080/17387

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc
Gerrit-Change-Number: 17387
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10502: Handle CREATE/DROP events correctly

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17308 )

Change subject: IMPALA-10502: Handle CREATE/DROP events correctly
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17308

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2c5e96b48abac015240f20295b3ec3b1d71f24a
Gerrit-Change-Number: 17308
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 19 May 2021 04:04:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17284 )

Change subject: IMPALA-10645: Log catalogd HMS API metrics
..

IMPALA-10645: Log catalogd HMS API metrics

Expose RPC duration, cache hit ratio, etc. for catalogd HMS APIs.
The metrics are currently only logged at debug level
when the catalogd starts an HMS endpoint. A follow-up
will be done separately to expose them in the debug UI.

This patch was originally contributed by Kishen Das.

Testing:
1. Deployed the catalogd's metastore server and made sure that
the metrics are logged in the catalogd.INFO logs.

Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287
Reviewed-on: http://gerrit.cloudera.org:8080/17284
Reviewed-by: Vihang Karajgaonkar 
Tested-by: Impala Public Jenkins 
---
M common/thrift/JniCatalog.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
A fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/monitor/CatalogMonitor.java
11 files changed, 437 insertions(+), 34 deletions(-)

Approvals:
  Vihang Karajgaonkar: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17284

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287
Gerrit-Change-Number: 17284
Gerrit-PatchSet: 12
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17284 )

Change subject: IMPALA-10645: Log catalogd HMS API metrics
..


Patch Set 11: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17284

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287
Gerrit-Change-Number: 17284
Gerrit-PatchSet: 11
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 19 May 2021 04:02:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-18 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..


Patch Set 6: Code-Review+2

Thanks for working through the issues.
I will verify and merge the change.


--
To view, visit http://gerrit.cloudera.org:8080/17455

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 6
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Wed, 19 May 2021 02:15:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17455

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 7
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Wed, 19 May 2021 02:16:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7163/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17455

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 7
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Wed, 19 May 2021 02:16:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8749/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17455

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 6
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Wed, 19 May 2021 02:06:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17427 )

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd response
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8748/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17427

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 19 May 2021 01:56:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7162/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17455

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 6
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Wed, 19 May 2021 01:44:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-18 Thread Yong Yang (Code Review)
Yong Yang has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..

IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

OSS is the object store in Alibaba Cloud, just like s3a,
and JindoFS is a gateway based on the Alibaba Cloud object store.
For more information about JindoFS, see:
https://www.alibabacloud.com/blog/introducing-jindofs-a-high-performance-data-lake-storage-solution_595600
Without this change, the Alibaba object store is treated as a local
disk and query performance is poor.
This change creates a dedicated queue for this kind of target,
which improves OSS scan performance.
I have tested it in our environment
and observed at least double the scan speed.

New flags:
 - num_oss_io_threads: Number of OSS/JindoFS I/O threads. Defaults to 16.
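
The idea of a dedicated queue can be sketched as follows. This is a Python
illustration only, not the C++ code in disk-io-mgr.cc. The URI schemes
"oss" and "jfs", the queue names, and all function names are assumptions
made for the example; only the default of 16 threads comes from the flag
description above.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict
from urllib.parse import urlparse

# Default taken from the flag description; the real flag lives in the backend.
NUM_OSS_IO_THREADS = 16

# Assumed URI schemes for OSS and JindoFS paths (not confirmed by this message).
OSS_SCHEMES = {"oss", "jfs"}


def pick_io_queue(path: str) -> str:
    """Route a file path to a named I/O queue based on its URI scheme."""
    scheme = urlparse(path).scheme.lower()
    if scheme in OSS_SCHEMES:
        return "oss"
    if scheme in ("", "file"):
        return "local-disk"
    return "other-remote"


def make_pools(num_other_threads: int = 1) -> Dict[str, ThreadPoolExecutor]:
    """Create one thread pool per queue; OSS gets its own dedicated pool."""
    return {
        "oss": ThreadPoolExecutor(max_workers=NUM_OSS_IO_THREADS),
        "local-disk": ThreadPoolExecutor(max_workers=num_other_threads),
        "other-remote": ThreadPoolExecutor(max_workers=num_other_threads),
    }


def submit_read(
    pools: Dict[str, ThreadPoolExecutor], path: str, read_fn: Callable[[str], object]
) -> Future:
    """Submit a read of `path` to the pool that owns its queue."""
    return pools[pick_io_queue(path)].submit(read_fn, path)
```

Keeping OSS reads off the local-disk queue means slow remote reads do not
compete with local scans for the same small set of threads.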

Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Signed-off-by: Yong Yang 
---
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/util/hdfs-util.cc
M be/src/util/hdfs-util.h
5 files changed, 25 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17455/6
--
To view, visit http://gerrit.cloudera.org:8080/17455

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 6
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-18 Thread Quanlong Huang (Code Review)
Hello Aman Sinha, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17427

to look at the new patch set (#4).

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd response
..

IMPALA-10702: Add warning logs for slow or large catalogd response

It'd be helpful to log the slow or large responses of catalogd when
debugging scalability issues. This patch adds these warning logs in
JniCatalog, where we serialize thrift responses. See some example
outputs in the jira description.

Responses that are larger than 50MB or take more than 60s to
finish will be logged with the request. Add flags for these two
thresholds in case users find the warnings too verbose and want to
increase the thresholds.
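
A minimal sketch of the threshold check, written in Python rather than the
Java of JniUtil.java. The function and parameter names are hypothetical,
the message format is made up, and treating 50MB as 50 * 1024 * 1024 bytes
is an assumption; only the 50MB and 60s defaults come from the description
above.

```python
from typing import Optional

# Defaults from the commit message; interpreting MB as MiB is an assumption.
DEFAULT_SIZE_THRESHOLD_BYTES = 50 * 1024 * 1024
DEFAULT_DURATION_THRESHOLD_MS = 60 * 1000


def response_warning(
    method: str,
    size_bytes: int,
    duration_ms: int,
    size_threshold_bytes: int = DEFAULT_SIZE_THRESHOLD_BYTES,
    duration_threshold_ms: int = DEFAULT_DURATION_THRESHOLD_MS,
) -> Optional[str]:
    """Return a warning message if the response is too large or too slow, else None."""
    reasons = []
    if size_bytes > size_threshold_bytes:
        reasons.append(f"size {size_bytes} bytes exceeds {size_threshold_bytes}")
    if duration_ms > duration_threshold_ms:
        reasons.append(f"duration {duration_ms} ms exceeds {duration_threshold_ms}")
    if not reasons:
        return None
    return f"{method}: " + "; ".join(reasons)
```

Raising either threshold argument suppresses the corresponding warning,
which is the role the new flags play for users who find the logs too
verbose.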

Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
---
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/common/JniUtil.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
5 files changed, 129 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/17427/4
--
To view, visit http://gerrit.cloudera.org:8080/17427

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10688: Implement ds_cpc_stringify() function

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17373 )

Change subject: IMPALA-10688: Implement ds_cpc_stringify() function
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8747/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17373

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d089bfada6bebd078d8f388d2e146c79e5285
Gerrit-Change-Number: 17373
Gerrit-PatchSet: 4
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 19 May 2021 01:08:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10433: Use Iceberg's fixed partition transforms

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17417 )

Change subject: IMPALA-10433: Use Iceberg's fixed partition transforms
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17417

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie67f41a53f21c7bdb8449ca0d27746158be7675a
Gerrit-Change-Number: 17417
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 19 May 2021 00:59:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10433: Use Iceberg's fixed partition transforms

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17417 )

Change subject: IMPALA-10433: Use Iceberg's fixed partition transforms
..

IMPALA-10433: Use Iceberg's fixed partition transforms

Because of an Iceberg bug, Impala didn't push predicates to
Iceberg for dates/timestamps when the predicate referred to a
value before the UNIX epoch.

https://github.com/apache/iceberg/pull/1981 fixed the Iceberg
bug, and Impala recently switched to an Iceberg version that has
the fix. Therefore this patch enables predicate pushdown for all
timestamp/date values.

The above Iceberg patch maintains backward compatibility with the
old, wrong behavior. Therefore we sometimes need to read one more
Iceberg partition than necessary.
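
The message does not spell out the bug. Assuming it was of the
round-toward-zero kind (a value just before the epoch landing in
partition 0 instead of -1), the effect, and why one extra partition may
have to be read, can be illustrated in Python. The function names are
hypothetical and this is not Iceberg's or Impala's code.

```python
from typing import List

HOURS_PER_DAY = 24


def day_ordinal_floor(hours_since_epoch: int) -> int:
    """Day partition using floor division (correct for values before the epoch)."""
    return hours_since_epoch // HOURS_PER_DAY


def day_ordinal_truncating(hours_since_epoch: int) -> int:
    """Day partition using division that rounds toward zero (assumed old behavior)."""
    if hours_since_epoch < 0:
        return -((-hours_since_epoch) // HOURS_PER_DAY)
    return hours_since_epoch // HOURS_PER_DAY


def candidate_partitions(hours_since_epoch: int) -> List[int]:
    """Partitions to read if data may have been written under either behavior."""
    return sorted(
        {day_ordinal_floor(hours_since_epoch), day_ordinal_truncating(hours_since_epoch)}
    )
```

For a value one hour before the epoch the two functions disagree, so both
partitions must be considered; for values at or after the epoch they agree
and nothing extra is read.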

Testing:
 * Updated current e2e tests

Change-Id: Ie67f41a53f21c7bdb8449ca0d27746158be7675a
Reviewed-on: http://gerrit.cloudera.org:8080/17417
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test
2 files changed, 10 insertions(+), 19 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie67f41a53f21c7bdb8449ca0d27746158be7675a
Gerrit-Change-Number: 17417
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10688: Implement ds_cpc_stringify() function

2021-05-18 Thread Fucun Chu (Code Review)
Fucun Chu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17373 )

Change subject: IMPALA-10688: Implement ds_cpc_stringify() function
..


Patch Set 4:

All tests run with the pre-review-test job passed; the failed test cases could 
not be reproduced. See: https://jenkins.impala.io/job/pre-review-test/948/. 
Could the gerrit-verify-dryrun job be re-run? Thanks.


--
To view, visit http://gerrit.cloudera.org:8080/17373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d089bfada6bebd078d8f388d2e146c79e5285
Gerrit-Change-Number: 17373
Gerrit-PatchSet: 4
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 19 May 2021 00:57:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10688: Implement ds_cpc_stringify() function

2021-05-18 Thread Fucun Chu (Code Review)
Fucun Chu has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/17373 )

Change subject: IMPALA-10688: Implement ds_cpc_stringify() function
..

IMPALA-10688: Implement ds_cpc_stringify() function

This function receives a string that is a serialized Apache
DataSketches CPC sketch and returns its stringified format.

The stringified format contains the following data and looks like this:

select ds_cpc_stringify(ds_cpc_sketch(float_col)) from
functional_parquet.alltypestiny;
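+--------------------------------------------+
| ds_cpc_stringify(ds_cpc_sketch(float_col)) |
+--------------------------------------------+
| ### CPC sketch summary:                    |
|    lg_k           : 11                     |
|    seed hash      : 93cc                   |
|    C              : 2                      |
|    flavor         : 1                      |
|    merged         : true                   |
|    intresting col : 0                      |
|    table entries  : 2                      |
|    window         : not allocated          |
| ### End sketch summary                     |
|                                            |
+--------------------------------------------+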

Change-Id: I8c9d089bfada6bebd078d8f388d2e146c79e5285
---
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-cpc.test
4 files changed, 59 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/17373/4
--
To view, visit http://gerrit.cloudera.org:8080/17373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c9d089bfada6bebd078d8f388d2e146c79e5285
Gerrit-Change-Number: 17373
Gerrit-PatchSet: 4
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..


Patch Set 29:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8746/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 29
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 19 May 2021 00:18:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-18 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#29). ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..

IMPALA-10650: Bailout min/max filters in hash join builder early

This change set addresses a weakness in populating min/max filters
in the hash join builder by periodically measuring the usefulness of
each filter and setting the 'always_true_' flag accordingly. Once the
flag is set to true, insertion into such a filter skips all the steps
from evaluating the value from a row to checking the value against
the min/max range. This optimization is LLVM-enabled.
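
The early bail-out can be pictured with the toy Python filter below. The
message does not say how usefulness is measured, so the sketch assumes a
filter is useless once its [min, max] range covers more than a threshold
fraction of the column's own range. The class, its parameters and the
threshold are hypothetical and are not the Impala implementation.

```python
class MinMaxFilter:
    """Toy min/max filter with an early bail-out flag (hypothetical design)."""

    def __init__(self, col_min, col_max, threshold=0.5, check_interval=1024):
        self.col_min = col_min  # column stats, assumed known up front
        self.col_max = col_max
        self.threshold = threshold  # covered fraction above which we bail out
        self.check_interval = check_interval  # inserts between usefulness checks
        self.min = None
        self.max = None
        self.always_true = False
        self._inserted = 0

    def insert(self, value):
        if self.always_true:
            return  # bail out: skip all range maintenance for this filter
        if self.min is None or value < self.min:
            self.min = value
        if self.max is None or value > self.max:
            self.max = value
        self._inserted += 1
        if self._inserted % self.check_interval == 0:
            self._check_usefulness()

    def _check_usefulness(self):
        col_width = self.col_max - self.col_min
        covered = 1.0 if col_width <= 0 else (self.max - self.min) / col_width
        if covered > self.threshold:
            self.always_true = True

    def may_contain(self, value):
        if self.always_true:
            return True
        if self.min is None:
            return False  # an empty filter matches nothing
        return self.min <= value <= self.max
```

Once `always_true` is set, later `insert()` calls return immediately, which is
the cost the change set aims to avoid paying per row.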

In addition, a new flag 'is_min_max_value_present' is added to
TRuntimeFilterTargetDesc to indicate whether the min/max column stats
are present in the query plan. The flag eliminates the need to check
the presence of min/max stats for every row in the back-end.

Early bail-out improves the HJ builder step in general. For example,
the step for join node #11 in TPCDS Q8 improves by 13%, and the step
for join node #8 in TPCDS Q16 improves by 3.2%.

The Insert() methods are optimized with branch prediction compiler
hints which yield the following improvement when tested with the
insertion of 1 randomly generated items.

  Small Integers: 7.0%
  Integers:   4.1%
  Big Integers:   4.3%
  Strings:5.6%
  Dates:  4.4%
  Timestamps:10.7%
  Decimals(4):   10.4%
  Decimals(8):9.1%

In addition, the min/max stats for pages are read in batches, with a
fast-track version for the column types int32_t, int64_t, float,
double and date, whose storage format is identical to Parquet's. For a
row group, the page locations are read only once, instead of once for
every page skipped, resulting in a 100x speedup when a subset of 199
pages is skipped.

Testing:
  1. Ran core test successfully;
  2. Ran TPCDS performance tests.

Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/filter-context.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
M be/src/exec/parquet/parquet-column-stats.inline.h
M be/src/exec/parquet/parquet-common.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/runtime/runtime-filter-ir.cc
M be/src/util/min-max-filter-ir.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/util/TColumnValueUtil.java
17 files changed, 979 insertions(+), 297 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17295/29
--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 29
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-18 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..


Patch Set 28:

(1 comment)

Addressed Riza's comment and fixed one FE null pointer exception.

http://gerrit.cloudera.org:8080/#/c/17295/28/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17295/28/be/src/exec/partitioned-hash-join-builder.cc@335
PS28, Line 335:   if (filter_ctxs_.size() == 0) return;
> Looks like we can remove this branch?
Done



--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 28
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 23:56:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 18 May 2021 23:55:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7161/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 18 May 2021 23:55:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17284 )

Change subject: IMPALA-10645: Log catalogd HMS API metrics
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8745/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287
Gerrit-Change-Number: 17284
Gerrit-PatchSet: 11
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 22:33:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7157/


--
To view, visit http://gerrit.cloudera.org:8080/17442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 18 May 2021 22:26:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9770: [DOCS] Remove Sentry references in documentation

2021-05-18 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17469 )

Change subject: IMPALA-9770: [DOCS] Remove Sentry references in documentation
..


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/17469/1/docs/shared/impala_common.xml
File docs/shared/impala_common.xml:

http://gerrit.cloudera.org:8080/#/c/17469/1/docs/shared/impala_common.xml@4579
PS1, Line 4579: Impala now does not support privileges of DELETE,
              : UPDATE, and UPSERT operations.
I'm thinking this sentence is already covered in the sentence just above it.


http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_alter_database.xml
File docs/topics/impala_alter_database.xml:

http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_alter_database.xml@a68
PS1, Line 68:
I think we still support this variant. Fang-Yu, can you confirm?


http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_alter_table.xml
File docs/topics/impala_alter_table.xml:

http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_alter_table.xml@a74
PS1, Line 74:
I think we still support this variant. Fang-Yu, can you confirm?


http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_alter_table.xml@a322
PS1, Line 322:
Same as above


http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_alter_view.xml
File docs/topics/impala_alter_view.xml:

http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_alter_view.xml@a71
PS1, Line 71:
I think we still support this variant. Fang-Yu, can you confirm?


http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_authorization.xml
File docs/topics/impala_authorization.xml:

http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_authorization.xml@164
PS1, Line 164: before starting Impala cluster
I think we can omit this.


http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_authorization.xml@229
PS1, Line 229: fe/src/test/resources/
I don't think we should use this specific path. This is true for an Impala 
development environment, but it has little relationship to actual user 
deployments.


http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_authorization.xml@306
PS1, Line 306: The following examples show how to set up authorization to deal with various scenarios
             : and how to grant privileges on objects to groups of users via roles, but note that you
             : could also grant privileges on objects to a user or a group directly without involving a
             : role.
This is a very long sentence with a lot going on. Let's cut it down:

"The following examples show how to set up authorization to grant privileges on 
objects to groups of users via roles."


http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_grant.xml
File docs/topics/impala_grant.xml:

http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_grant.xml@101
PS1, Line 101: belonging to a group
Nit: I don't think this addition gets us much. The original phrase was clear 
enough.


http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_show.xml
File docs/topics/impala_show.xml:

http://gerrit.cloudera.org:8080/#/c/17469/1/docs/topics/impala_show.xml@38
PS1, Line 38: The following statements are supported in Impala through Ranger to
:   manage authorization.
Two things here:
1. This is garbled. I'm assuming this is intended to be similar to the "The 
following statements are supported only when Impala uses Ranger to manage 
authorization." from before.
2. Where should this statement go? It doesn't apply to the non-authorization 
show statements, so I feel it should move down a bit. I guess the other 
question is whether this statement can just be removed completely.



--
To view, visit http://gerrit.cloudera.org:8080/17469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4c5e9aa4d060ceaa426908a444d280a5564749d
Gerrit-Change-Number: 17469
Gerrit-PatchSet: 1
Gerrit-Owner: Shajini Thayasingh 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 18 May 2021 22:24:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10502: Handle CREATE/DROP events correctly

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17308 )

Change subject: IMPALA-10502: Handle CREATE/DROP events correctly
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7160/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17308
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2c5e96b48abac015240f20295b3ec3b1d71f24a
Gerrit-Change-Number: 17308
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 22:15:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17284 )

Change subject: IMPALA-10645: Log catalogd HMS API metrics
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7159/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287
Gerrit-Change-Number: 17284
Gerrit-PatchSet: 11
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 22:13:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics

2021-05-18 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17284 )

Change subject: IMPALA-10645: Log catalogd HMS API metrics
..


Patch Set 11: Code-Review+2

Rebased to latest master. There were only minor conflicts from the rebase. 
Carrying forward the +2 from Aman and Quanlong.


--
To view, visit http://gerrit.cloudera.org:8080/17284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287
Gerrit-Change-Number: 17284
Gerrit-PatchSet: 11
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 22:12:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10645: Log catalogd HMS API metrics

2021-05-18 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/17284 )

Change subject: IMPALA-10645: Log catalogd HMS API metrics
..

IMPALA-10645: Log catalogd HMS API metrics

Expose RPC duration, cache hit ratio, etc. for catalogd HMS APIs.
The metrics are currently only logged at debug level
when the catalogd starts an HMS endpoint. A follow-up
will be done separately to expose them in the debug UI.
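
To make the two named metrics concrete, here is a minimal Python sketch of
per-API bookkeeping for mean RPC duration and cache hit ratio. The class, its
methods and the example API name are hypothetical and are not the catalogd
implementation, which lives in the Java files listed below.

```python
from collections import defaultdict


class HmsApiMetrics:
    """Hypothetical per-API counters for duration and cache hits."""

    def __init__(self):
        self._calls = defaultdict(int)
        self._total_ms = defaultdict(float)
        self._cache_hits = defaultdict(int)

    def record(self, api_name, duration_ms, cache_hit):
        """Account for one completed call to the named API."""
        self._calls[api_name] += 1
        self._total_ms[api_name] += duration_ms
        if cache_hit:
            self._cache_hits[api_name] += 1

    def summary(self, api_name):
        """Return call count, mean duration and cache hit ratio for one API."""
        calls = self._calls.get(api_name, 0)
        if calls == 0:
            return {"calls": 0, "mean_ms": 0.0, "cache_hit_ratio": 0.0}
        return {
            "calls": calls,
            "mean_ms": self._total_ms[api_name] / calls,
            "cache_hit_ratio": self._cache_hits[api_name] / calls,
        }
```

A caller would invoke `record()` once per served request and log `summary()`
for each API name of interest.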

This patch was originally contributed by Kishen Das.

Testing:
1. Deployed the catalogd's metastore server and made sure that
the metrics are logged in the catalogd.INFO logs.

Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287
---
M common/thrift/JniCatalog.thrift
M common/thrift/metrics.json
M fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
A fe/src/main/java/org/apache/impala/catalog/metastore/HmsApiNameEnum.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
M 
fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/catalog/monitor/CatalogMonitor.java
11 files changed, 437 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/17284/11
--
To view, visit http://gerrit.cloudera.org:8080/17284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id41afe89bbe3395c158919bddd09f302c6752287
Gerrit-Change-Number: 17284
Gerrit-PatchSet: 11
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library

2021-05-18 Thread Amogh Margoor (Code Review)
Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17389 )

Change subject: IMPALA-10680: Replace StringToFloatInternal using 
fast_double_parser library
..


Patch Set 5:

Benchmarking:
Tested it on more than a million rows cast from string to double.
The results show that both algorithms took almost the same time:

W/O library: Fetched 1222386 row(s) in 32.10s
With library: Fetched 1222386 row(s) in 31.71s

The timing with the library is slightly better, by around half a second, but 
that could be noise.
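
For scale, the two timings above can be turned into a relative difference
with a few lines of Python. This is only arithmetic on the reported numbers,
not a new measurement; the variable names are illustrative.

```python
rows = 1_222_386
baseline_s = 32.10      # reported time without the library
with_library_s = 31.71  # reported time with the library

saved_s = baseline_s - with_library_s
relative_gain = saved_s / baseline_s

print(f"saved {saved_s:.2f}s ({relative_gain:.1%} of the baseline)")
print(f"throughput: {rows / baseline_s:,.0f} vs {rows / with_library_s:,.0f} rows/s")
```

The difference works out to roughly 1.2% of the baseline run time, which is
small enough that run-to-run variance could plausibly explain it.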


--
To view, visit http://gerrit.cloudera.org:8080/17389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141
Gerrit-Change-Number: 17389
Gerrit-PatchSet: 5
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 21:17:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17427 )

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd 
response
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 20:26:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10701: Switch to use TByteBuffer from thrift

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17428 )

Change subject: IMPALA-10701: Switch to use TByteBuffer from thrift
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7156/


--
To view, visit http://gerrit.cloudera.org:8080/17428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c7834253a16e440204264b0462a1590dea2463
Gerrit-Change-Number: 17428
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 18 May 2021 20:24:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10433: Use Iceberg's fixed partition transforms

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17417 )

Change subject: IMPALA-10433: Use Iceberg's fixed partition transforms
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7158/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie67f41a53f21c7bdb8449ca0d27746158be7675a
Gerrit-Change-Number: 17417
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 18 May 2021 19:03:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10433: Use Iceberg's fixed partition transforms

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17417 )

Change subject: IMPALA-10433: Use Iceberg's fixed partition transforms
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie67f41a53f21c7bdb8449ca0d27746158be7675a
Gerrit-Change-Number: 17417
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 18 May 2021 19:03:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..

IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

Built-in functions to compute the SHA-1 digest and the SHA-2 family of
digests have been added. SHA-2 support includes SHA224, SHA256,
SHA384 and SHA512. In FIPS mode, SHA1, SHA224 and SHA256 are
disabled and throw an error. The SHA2 functions also throw an error
for an unsupported bit length, i.e., any bit length other than 224,
256, 384 or 512.
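
The rules above can be sketched in Python with the standard hashlib module.
This only mirrors the behavior described in this message; the function names,
the `fips_mode` parameter and the hex-string return type are assumptions and
are not the signatures of the new Impala built-ins.

```python
import hashlib

# Supported SHA-2 variants, keyed by digest bit length.
_SHA2_BY_BITS = {
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512,
}
# Bit lengths the message says are disabled in FIPS mode.
_FIPS_DISABLED_BITS = {224, 256}


def sha1_hex(data: bytes, fips_mode: bool = False) -> str:
    """SHA-1 digest as hex; rejected when fips_mode is set."""
    if fips_mode:
        raise ValueError("SHA1 is disabled in FIPS mode")
    return hashlib.sha1(data).hexdigest()


def sha2_hex(data: bytes, bit_len: int, fips_mode: bool = False) -> str:
    """SHA-2 digest as hex for bit_len in {224, 256, 384, 512}."""
    if bit_len not in _SHA2_BY_BITS:
        raise ValueError(f"unsupported bit length: {bit_len}")
    if fips_mode and bit_len in _FIPS_DISABLED_BITS:
        raise ValueError(f"SHA{bit_len} is disabled in FIPS mode")
    return _SHA2_BY_BITS[bit_len](data).hexdigest()
```

For example, `sha2_hex(b"abc", 128)` raises because 128 is not a supported
bit length, and `sha2_hex(b"abc", 256, fips_mode=True)` raises because SHA256
is disabled in FIPS mode.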

Testing:
1. Added Unit test for expressions.
2. Added end-to-end test for new functions.

Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Reviewed-on: http://gerrit.cloudera.org:8080/17464
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exprs/expr-test.cc
M be/src/exprs/utility-functions-ir.cc
M be/src/exprs/utility-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
5 files changed, 181 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 6
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 5
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 18:48:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10433: Use Iceberg's fixed partition transforms

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17417 )

Change subject: IMPALA-10433: Use Iceberg's fixed partition transforms
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7153/


--
To view, visit http://gerrit.cloudera.org:8080/17417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie67f41a53f21c7bdb8449ca0d27746158be7675a
Gerrit-Change-Number: 17417
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 18 May 2021 18:41:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..


Patch Set 28:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8744/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 28
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 18:11:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-18 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..


Patch Set 28:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc@337
PS27, Line 337:   for (std::vector::const_iterator it = 
minmax_filter_ctxs_.begin();
> Sounds like a good idea.
Looks good, thanks!


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc@404
PS27, Line 404:   }
> Partial filter publishing with propagation to scan nodes may be a little bi
Makes sense. I suppose execution nodes consuming the minmax filters also still 
need to wait for the remaining filters to arrive before they start reading.


http://gerrit.cloudera.org:8080/#/c/17295/28/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17295/28/be/src/exec/partitioned-hash-join-builder.cc@335
PS28, Line 335:   if (filter_ctxs_.size() == 0) return;
Looks like we can remove this branch?
In case minmax_filter_ctxs_ is empty, the loop below will stop immediately.



--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 28
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 18:09:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-18 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#28). ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..

IMPALA-10650: Bailout min/max filters in hash join builder early

This change set addresses the weakness in populating min/max filters
in the hash join builder by periodically measuring the usefulness of
each filter and setting the 'always_true_' flag accordingly. Once set to
true, the insertion to such a filter completely skips the steps from
the evaluation of the value from a row to the verification of the
value in the min/max range. This optimization is LLVM-enabled.
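
The early bailout described above can be sketched roughly as follows.
This is only an illustration: SimpleMinMaxFilter, MaybeSetAlwaysTrue()
and the coverage threshold are hypothetical names and assumptions, not
the classes or the usefulness measure used in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical, simplified min/max filter; not Impala's MinMaxFilter.
class SimpleMinMaxFilter {
 public:
  // Inserts a value. Once the filter is marked always-true this is a
  // no-op, so the per-row min/max bookkeeping is skipped entirely.
  void Insert(int64_t v) {
    if (always_true_) return;
    if (v < min_) min_ = v;
    if (v > max_) max_ = v;
  }

  // Assumed usefulness check: if the filter already covers more than
  // 'threshold' (a fraction in [0, 1]) of the column range
  // [col_min, col_max], it can reject little, so mark it always-true.
  void MaybeSetAlwaysTrue(int64_t col_min, int64_t col_max, double threshold) {
    assert(col_min < col_max);
    if (always_true_ || min_ > max_) return;  // already disabled, or empty
    const double covered =
        (static_cast<double>(max_) - static_cast<double>(min_)) /
        (static_cast<double>(col_max) - static_cast<double>(col_min));
    if (covered > threshold) always_true_ = true;
  }

  bool always_true() const { return always_true_; }
  int64_t min() const { return min_; }
  int64_t max() const { return max_; }

 private:
  int64_t min_ = std::numeric_limits<int64_t>::max();
  int64_t max_ = std::numeric_limits<int64_t>::min();
  bool always_true_ = false;
};
```

A caller would invoke MaybeSetAlwaysTrue() every so many rows rather
than per row, so the check itself stays cheap.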

In addition, a new flag 'is_min_max_value_present' is added to
TRuntimeFilterTargetDesc to indicate whether the min/max column stats
are present in the query plan. The flag eliminates the need to check
the presence of min/max stats for every row in the back end.

Early bail out improves the HJ builder step in general. For example,
the step for join node #11 in TPCDS Q8 improves 13%, and the step
for join node #8 in TPCDS Q16 improves 3.2%.

The Insert() methods are optimized with branch prediction compiler
hints which yield the following improvement when tested with the
insertion of 1 randomly generated items.

  Small Integers:  7.0%
  Integers:        4.1%
  Big Integers:    4.3%
  Strings:         5.6%
  Dates:           4.4%
  Timestamps:     10.7%
  Decimals(4):    10.4%
  Decimals(8):     9.1%
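
As a rough illustration of what a branch prediction hint on an insert
path can look like, here is a minimal sketch. It assumes the GCC/Clang
builtin __builtin_expect; the SKETCH_UNLIKELY macro, Int32Range and
InsertHinted() are hypothetical names and not the code in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: GCC/Clang provide __builtin_expect. Other compilers fall
// back to the plain condition, so the hint is purely advisory.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(x) (__builtin_expect(!!(x), 0))
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

// Hypothetical min/max range for 32-bit integers.
struct Int32Range {
  int32_t min;
  int32_t max;
};

// Widens 'r' to include 'v'. After the first few rows most values fall
// inside the current range, so both updates are hinted as unlikely.
inline void InsertHinted(Int32Range* r, int32_t v) {
  assert(r != nullptr);
  if (SKETCH_UNLIKELY(v < r->min)) r->min = v;
  if (SKETCH_UNLIKELY(v > r->max)) r->max = v;
}
```

The hint only affects code layout; the result is the same with or
without it.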

In addition, the min/max stats for pages are read in batches with a
fast track version for column types of int32_t, int64_t, float,
double and date that have the same storage format as Parquet. For a
row group, the page locations are read only once, instead of once for
every page skipped, resulting in a 100x speedup when a subset of 199
pages is skipped.

Testing:
  1. Ran core test;
  2. Ran performance test (TBD).

Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/filter-context.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
M be/src/exec/parquet/parquet-column-stats.inline.h
M be/src/exec/parquet/parquet-common.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/runtime/runtime-filter-ir.cc
M be/src/util/min-max-filter-ir.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/util/TColumnValueUtil.java
17 files changed, 977 insertions(+), 297 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17295/28
--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 28
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-18 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..


Patch Set 27:

(14 comments)

Answered Riza's and Zoltan's review comments.

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.h
File be/src/exec/filter-context.h:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.h@155
PS27, Line 155:   static bool ShouldRejectFilterBasedOnColumnStats(
> nit: I'm OK with reformatting, but I'm not sure if it was intended
Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.cc
File be/src/exec/filter-context.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.cc@231
PS27, Line 231: example
> Could you please update this example?
Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@984
PS27, Line 984: scalar_reader->offset_index_
> nit: simply 'offset_index' from at L978?
Good catch!

Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@997
PS27, Line 997: DCHECK
> nit: DCHECK_GE could be used. It has the advantage that in case of failure
Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@1010
PS27, Line 1010: Expected
> Any reason why we don't want the error message to be logged here?
Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@1107
PS27, Line 1107:
> nit: probably unintended space
Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc
File be/src/exec/parquet/parquet-column-stats.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc@281
PS27, Line 281:   const int remainder = num_values % batch;
> nit: do we need to calculate remainder? In the second for-loop we could hav
Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc@321
PS27, Line 321:   const int remainder = num_values % batch;
> nit: same as above
Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h
File be/src/exec/parquet/parquet-column-stats.inline.h:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h@143
PS27, Line 143:   DCHECK(buffer.size() == sizeof(int32_t));
  :   DCHECK(parquet_type == parquet::Type::INT32);
> nit: DCHECK_EQ
Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h@159
PS27, Line 159:   DCHECK(buffer.size() == sizeof(int32_t));
  :   DCHECK(parquet_type == parquet::Type::INT32);
> nit: DCHECK_EQ
Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc@337
PS27, Line 337:   for (const FilterContext& ctx : filter_ctxs_) {
> I wonder if we can speed this up by iterating ONLY the minmax filters.
Sounds like a good idea.

A new vector minmax_filter_ctxs_ is added to cache the local min max filter 
contexts. An element is removed from it once it is set to AlwaysTrue, so it 
is no longer subject to the overlap check.
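
A minimal sketch of that pruning idea, assuming a vector of raw
pointers. SketchFilterCtx and PruneAlwaysTrue() are hypothetical
stand-ins, not the FilterContext type or the code in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a filter context.
struct SketchFilterCtx {
  int id;
  bool always_true;
};

// Drops entries that have become always-true so that later overlap
// checks only visit filters that are still enabled. Returns the number
// of entries removed.
inline std::size_t PruneAlwaysTrue(std::vector<SketchFilterCtx*>* active) {
  assert(active != nullptr);
  std::size_t removed = 0;
  for (auto it = active->begin(); it != active->end();) {
    if ((*it)->always_true) {
      it = active->erase(it);  // erase() returns the next valid iterator
      ++removed;
    } else {
      ++it;
    }
  }
  return removed;
}
```

Once the vector is empty, a loop over it ends immediately, so no
separate emptiness check is needed before iterating.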


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc@345
PS27, Line 345: not_useful = false;
> nit: I think it'd be a bit more readable if we decrease the negations, i.e.
Done


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc@404
PS27, Line 404: PublishRuntimeFilters(num_build_rows);
> It seems to me that PublishRuntimeFilters is only called here in FinalizeBu
Partial filter publishing with propagation to scan nodes may be a little bit 
complicated since it involves network traffic and context management. See 
PhjBuilder::FinalizeBuild().

With the work optimizing the insertion to an already disabled filter, and the 
work to only iterate over enabled filters for overlap checking, it looks like 
we can live with the current publishing strategy.


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/util/min-max-filter-ir.cc
File be/src/util/min-max-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/util/min-max-filter-ir.cc@114
PS27, Line 114: predicion
> nit: prediction
Done



--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295

[Impala-ASF-CR] IMPALA-10489: Implement JWT support

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17435 )

Change subject: IMPALA-10489: Implement JWT support
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8742/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c
Gerrit-Change-Number: 17435
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 18 May 2021 16:57:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10485: Support Iceberg field-id based column resolution in the ORC scanner

2021-05-18 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17398 )

Change subject: IMPALA-10485: Support Iceberg field-id based column resolution 
in the ORC scanner
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17398/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17398/1//COMMIT_MSG@10
PS1, Line 10: field-id
> Could you add that this becomes the default for Iceberg tables as well?
Done


http://gerrit.cloudera.org:8080/#/c/17398/1/be/src/exec/orc-metadata-utils.h
File be/src/exec/orc-metadata-utils.h:

http://gerrit.cloudera.org:8080/#/c/17398/1/be/src/exec/orc-metadata-utils.h@62
PS1, Line 62:   enum SchemaResolutionStrategy {
: POSITION,
: ICEBERG_FIELD_ID
:   };
> Do you think this would worth to be generalized with TParquetFallbackSchema
Yeah, I renamed TParquetFallbackSchemaResolution to TSchemaResolutionStrategy 
and use it in both scanners.



--
To view, visit http://gerrit.cloudera.org:8080/17398
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2b1abcc25ad2268aa96dff032328e8951dbfb9d
Gerrit-Change-Number: 17398
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 18 May 2021 17:04:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10485: Support Iceberg field-id based column resolution in the ORC scanner

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17398 )

Change subject: IMPALA-10485: Support Iceberg field-id based column resolution 
in the ORC scanner
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8743/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17398
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2b1abcc25ad2268aa96dff032328e8951dbfb9d
Gerrit-Change-Number: 17398
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 18 May 2021 17:25:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10485: Support Iceberg field-id based column resolution in the ORC scanner

2021-05-18 Thread Zoltan Borok-Nagy (Code Review)
Hello Tamas Mate, Gabor Kaszab, wangsheng, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17398

to look at the new patch set (#2).

Change subject: IMPALA-10485: Support Iceberg field-id based column resolution 
in the ORC scanner
..

IMPALA-10485: Support Iceberg field-id based column resolution in the ORC 
scanner

Currently the ORC scanner only supports position-based column
resolution. This patch adds Iceberg field-id based column resolution
which will be the default for Iceberg tables. It is needed to support
schema evolution in the future, i.e. ALTER TABLE DROP/RENAME COLUMNS.
(The Parquet scanner already supports Iceberg field-id based column
resolution)
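
To illustrate why field-id based resolution matters for schema
evolution, here is a small sketch. FileColumn, SketchResolution and
ResolveColumn() are hypothetical and do not reflect the ORC scanner's
actual data structures.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of a column as stored in a data file.
struct FileColumn {
  int field_id;
  std::string name;
};

enum class SketchResolution { kPosition, kFieldId };

// Returns the index of the file column matching a table column, or -1
// if there is none. 'table_pos' is the column's position in the table
// schema; 'table_field_id' is its field id.
inline int ResolveColumn(const std::vector<FileColumn>& file_cols,
    SketchResolution strategy, int table_pos, int table_field_id) {
  assert(table_pos >= 0);
  const int num_cols = static_cast<int>(file_cols.size());
  if (strategy == SketchResolution::kPosition) {
    return table_pos < num_cols ? table_pos : -1;
  }
  for (int i = 0; i < num_cols; ++i) {
    if (file_cols[i].field_id == table_field_id) return i;
  }
  return -1;
}
```

Suppose a file was written with columns (1, a), (2, b), (3, c) and
column a is later dropped from the table. Table column b is now at
position 0 but keeps field id 2, so position-based resolution picks
file column a, while field-id based resolution still finds b.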

Testing
 * added e2e test 'iceberg-orc-field-id.test' by copying the contents of
   nested-types-scanner-basic,
   nested-types-scanner-array-materialization,
   nested-types-scanner-position,
   nested-types-scanner-maps,
   and executing the queries on an Iceberg table with ORC data files

Change-Id: Ia2b1abcc25ad2268aa96dff032328e8951dbfb9d
---
M be/src/exec/orc-metadata-utils.cc
M be/src/exec/orc-metadata-utils.h
M be/src/exec/parquet/parquet-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M bin/impala-config.sh
M common/thrift/Query.thrift
M testdata/data/README
A 
testdata/data/iceberg_test/hadoop_catalog/ice/complextypestbl_iceberg_orc/data/0-0-boroknagyz_20210331133358_b718b2ff-9f49-4056-a5ed-0d37ec144fff-job_16171873329050_0002-1.orc
A 
testdata/data/iceberg_test/hadoop_catalog/ice/complextypestbl_iceberg_orc/data/1-0-boroknagyz_20210331133358_b718b2ff-9f49-4056-a5ed-0d37ec144fff-job_16171873329050_0002-1.orc
A 
testdata/data/iceberg_test/hadoop_catalog/ice/complextypestbl_iceberg_orc/metadata/46b4a907-2ff3-4799-ba4a-074d04734265-m0.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/complextypestbl_iceberg_orc/metadata/snap-8747481058330439933-1-46b4a907-2ff3-4799-ba4a-074d04734265.avro
A 
testdata/data/iceberg_test/hadoop_catalog/ice/complextypestbl_iceberg_orc/metadata/v1.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/complextypestbl_iceberg_orc/metadata/v2.metadata.json
A 
testdata/data/iceberg_test/hadoop_catalog/ice/complextypestbl_iceberg_orc/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A 
testdata/workloads/functional-query/queries/QueryTest/iceberg-orc-field-id.test
M tests/query_test/test_iceberg.py
22 files changed, 2,888 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/17398/2
--
To view, visit http://gerrit.cloudera.org:8080/17398
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia2b1abcc25ad2268aa96dff032328e8951dbfb9d
Gerrit-Change-Number: 17398
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-18 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..


Patch Set 5:

(5 comments)

Thanks for the changes, I think this is close to being done.

http://gerrit.cloudera.org:8080/#/c/17455/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17455/5//COMMIT_MSG@7
PS5, Line 7: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
Please can you reformat the commit message so that the maximum line length is 
72 chars.


http://gerrit.cloudera.org:8080/#/c/17455/5//COMMIT_MSG@9
PS5, Line 9: OSS is the object store in ali cloud, just like s3a, and jindofs 
is a gateway based on Ali cloud object store.
I think it would be clearer to say "Alibaba" rather than "Ali".


http://gerrit.cloudera.org:8080/#/c/17455/5//COMMIT_MSG@10
PS5, Line 10: The following is about the JindoFS, 
https://github.com/aliyun/alibabacloud-jindofs.
Thanks for the link.
For me that page is hard to understand (due to my limitations), maybe add an 
additional link for English-only speakers like 
https://www.alibabacloud.com/blog/introducing-jindofs-a-high-performance-data-lake-storage-solution_595600


http://gerrit.cloudera.org:8080/#/c/17455/5//COMMIT_MSG@11
PS5, Line 11: If ali object store would be treated as local disk without this 
change, the query performance is not good. This change would create a dedicate 
queue for this kind of target, and improved the OSS scan performance.
Nit: "dedicated" is clearer.


http://gerrit.cloudera.org:8080/#/c/17455/5//COMMIT_MSG@13
PS5, Line 13:
Please add a note

New flags:
 - num_oss_io_threads: Number of OSS/JindoFS I/O threads. Defaults to 16.



--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 5
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Tue, 18 May 2021 16:52:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10489: Implement JWT support

2021-05-18 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17435 )

Change subject: IMPALA-10489: Implement JWT support
..

IMPALA-10489: Implement JWT support

This patch added JWT support with the following functionality:
 * Load and parse JWKS from pre-installed JSON file.
 * Read the JWT token from the HTTP Header.
 * Verify the JWT's signature with public key in JWKS.
 * Get the username out of the payload of JWT token.

We use the third-party library jwt-cpp to verify JWT tokens. jwt-cpp is
a header-only C++ library. It was added to native-toolchain.
This patch modified bootstrap_toolchain.py to download jwt-cpp from
the toolchain s3 bucket, and modified makefiles to add jwt-cpp/include
to the include path.

Added BE unit-tests for loading JWKS file and verifying JWT token.
Also added FE custom cluster test for JWT authentication.

Testing:
 - Passed core run.

Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c
---
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/rpc/authentication.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/transport/THttpServer.cpp
M be/src/transport/THttpServer.h
M be/src/util/CMakeLists.txt
A be/src/util/jwt-util-test.cc
A be/src/util/jwt-util.cc
A be/src/util/jwt-util.h
M be/src/util/webserver.cc
M be/src/util/webserver.h
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M bin/rat_exclude_files.txt
A cmake_modules/FindJwtCpp.cmake
M common/thrift/generate_error_codes.py
M common/thrift/metrics.json
M fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java
M fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java
A testdata/jwt/jwks_rs256.json
22 files changed, 1,478 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/17435/5
--
To view, visit http://gerrit.cloudera.org:8080/17435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c
Gerrit-Change-Number: 17435
Gerrit-PatchSet: 5
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7157/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 18 May 2021 16:31:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 18 May 2021 16:31:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17389 )

Change subject: IMPALA-10680: Replace StringToFloatInternal using 
fast_double_parser library
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8741/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141
Gerrit-Change-Number: 17389
Gerrit-PatchSet: 5
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 15:27:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-18 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..


Patch Set 27:

(12 comments)

Found a few nits, but looks good overall.

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.h
File be/src/exec/filter-context.h:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.h@155
PS27, Line 155:   static bool ShouldRejectFilterBasedOnColumnStats(
nit: I'm OK with reformatting, but I'm not sure if it was intended


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.cc
File be/src/exec/filter-context.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/filter-context.cc@231
PS27, Line 231: example
Could you please update this example?


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc
File be/src/exec/parquet/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@984
PS27, Line 984: scalar_reader->offset_index_
nit: simply 'offset_index' from at L978?


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@997
PS27, Line 997: DCHECK
nit: DCHECK_GE could be used. It has the advantage that in case of failure it 
prints the actual values.


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@1010
PS27, Line 1010: Expected
Any reason why we don't want the error message to be logged here?


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/hdfs-parquet-scanner.cc@1107
PS27, Line 1107:
nit: probably unintended space


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc
File be/src/exec/parquet/parquet-column-stats.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc@281
PS27, Line 281:   const int remainder = num_values % batch;
nit: do we need to calculate remainder? In the second for-loop we could have

 for (int i = pos; i < num_values; ++i)

Or, use pos itself:

 for (; pos < num_values; ++pos)
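
For illustration, the overall loop shape could look like the sketch
below. BatchedMin() and the batch size of 8 are assumptions made for
the example, not the code under review.

```cpp
#include <cassert>
#include <cstdint>

// Computes the minimum of 'values' in fixed-size batches. The tail is
// handled by simply continuing from 'pos', so no remainder is computed.
inline int32_t BatchedMin(const int32_t* values, int num_values) {
  assert(values != nullptr && num_values > 0);
  constexpr int kBatch = 8;  // assumed batch size
  int32_t result = values[0];
  int pos = 0;
  // Full batches: the fixed trip count makes the inner loop easy to unroll.
  for (; pos + kBatch <= num_values; pos += kBatch) {
    for (int j = 0; j < kBatch; ++j) {
      if (values[pos + j] < result) result = values[pos + j];
    }
  }
  // Tail: pick up where the batched loop stopped.
  for (; pos < num_values; ++pos) {
    if (values[pos] < result) result = values[pos];
  }
  return result;
}
```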


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.cc@321
PS27, Line 321:   const int remainder = num_values % batch;
nit: same as above


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h
File be/src/exec/parquet/parquet-column-stats.inline.h:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h@143
PS27, Line 143:   DCHECK(buffer.size() == sizeof(int32_t));
  :   DCHECK(parquet_type == parquet::Type::INT32);
nit: DCHECK_EQ


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/parquet/parquet-column-stats.inline.h@159
PS27, Line 159:   DCHECK(buffer.size() == sizeof(int32_t));
  :   DCHECK(parquet_type == parquet::Type::INT32);
nit: DCHECK_EQ


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc@345
PS27, Line 345: not_useful = false;
nit: I think it'd be a bit more readable if we decrease the negations, i.e. 
only call the variable 'useful'.


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/util/min-max-filter-ir.cc
File be/src/util/min-max-filter-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/util/min-max-filter-ir.cc@114
PS27, Line 114: predicion
nit: prediction



--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 15:23:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library

2021-05-18 Thread Amogh Margoor (Code Review)
Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17389 )

Change subject: IMPALA-10680: Replace StringToFloatInternal using 
fast_double_parser library
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17389/4/be/src/util/string-parser.h
File be/src/util/string-parser.h:

http://gerrit.cloudera.org:8080/#/c/17389/4/be/src/util/string-parser.h@495
PS4, Line 495: // Library function doesn't handle leading 0s like 
'1','-.9','-001'
> What do you mean by 'doesn't handle'? Falls back to strtod(), returns nullp
It would return nullptr. I have now added that to the comment. Exhaustive tests 
for all these cases already exist in string-parser-test.cc, and these cases were 
identified through them.



--
To view, visit http://gerrit.cloudera.org:8080/17389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141
Gerrit-Change-Number: 17389
Gerrit-PatchSet: 5
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 15:05:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library

2021-05-18 Thread Amogh Margoor (Code Review)
Amogh Margoor has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17389 )

Change subject: IMPALA-10680: Replace StringToFloatInternal using 
fast_double_parser library
..

IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library

StringToFloatInternal is used to parse a string into a float. It had
logic to ensure it is faster than standard functions like strtod in many
cases, but it was not as accurate. We are replacing it with a third-party
library named fast_double_parser, which is fast and doesn't
sacrifice accuracy for speed.

Testing:
1. Added test to check for accuracy improvement.
2. Ran existing Backend tests for correctness.

Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141
---
M be/src/exprs/expr-test.cc
M be/src/util/string-parser-test.cc
M be/src/util/string-parser.h
M testdata/workloads/functional-query/queries/QueryTest/values.test
4 files changed, 73 insertions(+), 66 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/17389/5
--
To view, visit http://gerrit.cloudera.org:8080/17389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141
Gerrit-Change-Number: 17389
Gerrit-PatchSet: 5
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10701: Switch to use TByteBuffer from thrift

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17428 )

Change subject: IMPALA-10701: Switch to use TByteBuffer from thrift
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7156/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c7834253a16e440204264b0462a1590dea2463
Gerrit-Change-Number: 17428
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 18 May 2021 14:20:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17427 )

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd 
response
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7155/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 14:19:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 18 May 2021 13:54:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..

IMPALA-10704: Fix retried query id not being unregistered when retry fails

When query retry fails in RetryQueryFromThread(), the retried query id
may not be unregistered if the failure happens before we store the
retry_request_state. In this case, QueryDriver::Unregister() has no way
to get the retried query id so it's not deleted. Note that the retried
query id is registered in RetryQueryFromThread() so should be deleted
later. This finally results in a leak in the query driver map, where
queries in it are shown as in-flight queries.

test_retry_query_result_cacheing_failed and
test_retry_query_set_query_in_flight_failed (added in IMPALA-10413)
assert one in-flight query at the end. This is satisfied by the leak.
Instead, we should verify no running queries at the end.

This patch adds a new field in QueryDriver to remember the registered
retry query id as a backup way for getting it when query retry fails
before we store the ClientRequestState of the retried query (so
retried_client_request_state_ is null).

Tests:
 - Run test_retry_query_result_cacheing_failed and
   test_retry_query_set_query_in_flight_failed 100 times.
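
The leak and the backup field described above can be modelled with a small, self-contained sketch. It is not the Impala code: DriverModel, RequestState, registered_retry_id and the set standing in for the query driver map are all hypothetical names chosen for illustration, and the real QueryDriver has far more state.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_set>

using QueryId = std::uint64_t;

// Minimal stand-in for the state of a retried query.
struct RequestState {
  QueryId id;
};

// Simplified model of a query driver (all names hypothetical).
class DriverModel {
 public:
  explicit DriverModel(QueryId original_id) : original_id_(original_id) {}

  // Registers the retry id, then optionally fails before the retried
  // request state is stored, which is the window the commit describes.
  bool Retry(std::unordered_set<QueryId>& in_flight, QueryId retry_id,
             bool fail_early) {
    assert(retry_id != original_id_);
    in_flight.insert(retry_id);
    registered_retry_id_ = retry_id;  // backup copy of the id
    has_registered_retry_id_ = true;
    if (fail_early) return false;     // retried_state_ stays null
    retried_state_.reset(new RequestState{retry_id});
    return true;
  }

  // Removes both ids. Without the backup id, the early-failure case
  // would leave retry_id in the set forever.
  void Unregister(std::unordered_set<QueryId>& in_flight) {
    in_flight.erase(original_id_);
    if (retried_state_) {
      in_flight.erase(retried_state_->id);
    } else if (has_registered_retry_id_) {
      in_flight.erase(registered_retry_id_);
    }
  }

 private:
  QueryId original_id_;
  std::unique_ptr<RequestState> retried_state_;
  QueryId registered_retry_id_ = 0;
  bool has_registered_retry_id_ = false;
};
```

With fail_early set, Unregister() still empties the set in this model, which mirrors the "no running queries at the end" check the tests should make.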

Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Reviewed-on: http://gerrit.cloudera.org:8080/17465
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M tests/custom_cluster/test_query_retries.py
3 files changed, 36 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 5
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 12:48:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7154/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 5
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 12:48:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-18 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..


Patch Set 4: Code-Review+2

Thanks for applying the changes, it looks great!


--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 4
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 12:48:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10433: Use Iceberg's fixed partition transforms

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17417 )

Change subject: IMPALA-10433: Use Iceberg's fixed partition transforms
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie67f41a53f21c7bdb8449ca0d27746158be7675a
Gerrit-Change-Number: 17417
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 18 May 2021 12:41:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10433: Use Iceberg's fixed partition transforms

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17417 )

Change subject: IMPALA-10433: Use Iceberg's fixed partition transforms
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7153/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie67f41a53f21c7bdb8449ca0d27746158be7675a
Gerrit-Change-Number: 17417
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 18 May 2021 12:41:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10433: Use Iceberg's fixed partition transforms

2021-05-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17417 )

Change subject: IMPALA-10433: Use Iceberg's fixed partition transforms
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie67f41a53f21c7bdb8449ca0d27746158be7675a
Gerrit-Change-Number: 17417
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 18 May 2021 11:29:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types

2021-05-18 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17026 )

Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most 
common types
..


Patch Set 30: Code-Review+1

(3 comments)

I can give +2 once these comments are resolved.

A note about AVX2: I am ok with committing it as it is and cleaning it up in a 
later commit, as a lot of cleanup should be done in other parts related to AVX2.

http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/util/parquet-bloom-filter-avx2.cc
File be/src/util/parquet-bloom-filter-avx2.cc:

http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/util/parquet-bloom-filter-avx2.cc@17
PS30, Line 17:
Can you add a comment to the files that are mainly copied from Kudu code about 
their source?


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/util/parquet-bloom-filter.cc
File be/src/util/parquet-bloom-filter.cc:

http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/util/parquet-bloom-filter.cc@37
PS30, Line 37: DEFINE_bool(disable_parquetbloomfilter_avx2, false,
Note that AVX2 support became required recently in x86_64:
https://gerrit.cloudera.org/#/c/17406/


http://gerrit.cloudera.org:8080/#/c/17026/30/testdata/workloads/functional-query/queries/QueryTest/parquet-bloom-filter.test
File 
testdata/workloads/functional-query/queries/QueryTest/parquet-bloom-filter.test:

http://gerrit.cloudera.org:8080/#/c/17026/30/testdata/workloads/functional-query/queries/QueryTest/parquet-bloom-filter.test@9
PS30, Line 9: select int8_col from parquet_bloom_filter where int8_col = 1;
Can you add some tests with more predicates? 
HdfsParquetScanner::CreateColIdx2EqConjunctMap() has quite complex logic and 
some issues may not come up if there is only a single eq predicate. Some 
examples:
- other predicates on a column besides =
- eq predicates on more than one column



--
To view, visit http://gerrit.cloudera.org:8080/17026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287
Gerrit-Change-Number: 17026
Gerrit-PatchSet: 30
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 08:13:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 18 May 2021 07:57:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7152/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 18 May 2021 07:57:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-18 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17427 )

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd 
response
..


Patch Set 3:

> Patch Set 3: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7151/

Failed by IMPALA-10704


--
To view, visit http://gerrit.cloudera.org:8080/17427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 07:56:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-18 Thread Xianqing He (Code Review)
Xianqing He has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/17465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 18 May 2021 06:39:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-18 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 18 May 2021 06:37:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-18 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17427 )

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd 
response
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7151/


--
To view, visit http://gerrit.cloudera.org:8080/17427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 06:28:30 +
Gerrit-HasComments: No