[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8740/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 18 May 2021 03:19:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-17 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17442

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 18 May 2021 03:11:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-17 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/17442

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 18 May 2021 03:06:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 2:

Rebased to fix some merge conflicts.


--
To view, visit http://gerrit.cloudera.org:8080/17442

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 18 May 2021 02:59:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..

IMPALA-10678: Support custom SASL protocol name in Kudu client

This patch adds a configurable flag, kudu_sasl_protocol_name, and calls
the Kudu client API to set the SASL protocol name when creating a Kudu
client in the FE and BE.
It also upgrades the toolchain to pull in a new version of Kudu that
provides new Java/C++ client APIs for setting the SASL protocol name.
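
As a rough illustration of the flag-to-client wiring described above, here is a minimal C++ sketch. SketchClientBuilder and ApplySaslProtocolName are hypothetical stand-ins, not the real Kudu client API or Impala code, and the "kudu" default is an assumption about the stock protocol name.

```cpp
#include <string>

// Hypothetical stand-in for a Kudu client builder; not the real Kudu API.
struct SketchClientBuilder {
  std::string sasl_protocol_name = "kudu";  // assumed stock default

  SketchClientBuilder& set_sasl_protocol_name(const std::string& name) {
    sasl_protocol_name = name;
    return *this;
  }
};

// Applies the value of a flag such as kudu_sasl_protocol_name to the builder.
// An empty flag value leaves the builder's default untouched (an assumption
// made for this sketch, not confirmed by the patch).
void ApplySaslProtocolName(const std::string& flag_value,
                           SketchClientBuilder* builder) {
  if (!flag_value.empty()) builder->set_sasl_protocol_name(flag_value);
}
```

The same flag value would have to reach both the backend and the frontend, which is presumably why the patch also touches BackendGflags.thrift and BackendConfig.java.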

Testing:
 - Passed core run.

Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
---
M be/src/common/global-flags.cc
M be/src/exec/kudu-util.cc
M be/src/util/backend-gflag-util.cc
M bin/impala-config.sh
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
7 files changed, 15 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/17442/2
--
To view, visit http://gerrit.cloudera.org:8080/17442

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-17 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..


Patch Set 27:

(2 comments)

Hi Qifan,
I wonder if we can improve the minmax filter performance from the build side.
I have the following questions and comments.

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc@337
PS27, Line 337:   for (const FilterContext& ctx : filter_ctxs_) {
I wonder if we can speed this up by iterating ONLY over the minmax filters.
Maybe copy references to the minmax filters into a separate vector?

This function seems to be called frequently, on every PhjBuilder::AddBatch.
I imagine that if minmax filters are enabled, only half of the filter_ctxs_
elements are actually minmax filters.
We can also pop a filter out of the vector once it is deemed not useful,
thereby speeding up the next iteration.
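
A minimal sketch of this suggestion, assuming a simplified stand-in for FilterContext. FilterCtx, CollectMinMaxFilters and ProcessBatch are hypothetical names for illustration only, not code from the patch.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified stand-in for Impala's FilterContext.
struct FilterCtx {
  bool is_min_max = false;   // true for min/max filters, false otherwise
  bool always_true = false;  // set once the filter is deemed not useful
  int batches_seen = 0;      // counts how many batches visited this filter
};

// Collects pointers to the min/max filters once, e.g. at builder setup.
std::vector<FilterCtx*> CollectMinMaxFilters(std::vector<FilterCtx>& all) {
  std::vector<FilterCtx*> out;
  for (FilterCtx& ctx : all) {
    if (ctx.is_min_max) out.push_back(&ctx);
  }
  return out;
}

// Per-batch work: visits only the still-active min/max filters, then removes
// the disabled ones so that later batches skip them entirely.
void ProcessBatch(std::vector<FilterCtx*>& active) {
  for (FilterCtx* ctx : active) ++ctx->batches_seen;
  active.erase(
      std::remove_if(active.begin(), active.end(),
                     [](const FilterCtx* c) { return c->always_true; }),
      active.end());
}
```

The pointers stay valid only as long as the underlying vector of contexts is not resized, which would need checking against how filter_ctxs_ is actually managed.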


http://gerrit.cloudera.org:8080/#/c/17295/27/be/src/exec/partitioned-hash-join-builder.cc@404
PS27, Line 404: PublishRuntimeFilters(num_build_rows);
It seems to me that PublishRuntimeFilters is only called here in FinalizeBuild 
(I assume near the end of the build process).
Since a minmax filter can be quickly disabled after reading a few early
RowBatches, shall we consider publishing them as soon as possible?
Say, immediately publish a disabled minmax filter from 
PhjBuilder::DetermineUsefulnessForMinmaxFilters()?



--
To view, visit http://gerrit.cloudera.org:8080/17295

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 02:52:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10485: Support Iceberg field-id based column resolution in the ORC scanner

2021-05-17 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17398 )

Change subject: IMPALA-10485: Support Iceberg field-id based column resolution 
in the ORC scanner
..


Patch Set 1: Code-Review+2

Sorry for the late reply, I have been quite busy recently. Thanks for the new
feature, this patch LGTM.


--
To view, visit http://gerrit.cloudera.org:8080/17398

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2b1abcc25ad2268aa96dff032328e8951dbfb9d
Gerrit-Change-Number: 17398
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 18 May 2021 02:11:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..


Patch Set 27:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8739/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17295

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 18 May 2021 01:41:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10678: Support custom SASL protocol name in Kudu client

2021-05-17 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17442 )

Change subject: IMPALA-10678: Support custom SASL protocol name in Kudu client
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/17442

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fb0b50f5e42e8a720564e51ad6c6185b51e3647
Gerrit-Change-Number: 17442
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 18 May 2021 01:39:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17465/1/be/src/runtime/query-driver.h
File be/src/runtime/query-driver.h:

http://gerrit.cloudera.org:8080/#/c/17465/1/be/src/runtime/query-driver.h@265
PS1, Line 265: registered_retry_query_id_
> I think we don't need this because the default constructor of TUniqueId alr
Got it. Thanks



--
To view, visit http://gerrit.cloudera.org:8080/17465

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 18 May 2021 01:24:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-17 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#27). ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..

IMPALA-10650: Bailout min/max filters in hash join builder early

This change set addresses the weakness in populating min/max filters
in the hash join builder by periodically measuring the usefulness of
each filter and setting the 'always_true_' flag accordingly. Once the flag
is set to true, insertion into such a filter completely skips the steps from
the evaluation of the value from a row to the verification of the
value in the min/max range. This optimization is LLVM-enabled.
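
The early bail-out described above can be sketched roughly as follows. This is a hypothetical illustration, not Impala's MinMaxFilter: the class name, the int64-only values, and the usefulness criterion (fraction of an assumed column range covered by the filter) are all assumptions.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical min/max filter over int64 values.
class SketchMinMaxFilter {
 public:
  // col_min/col_max: assumed column-level stats known from the plan.
  // threshold: assumed fraction of the column range above which the filter
  // is treated as useless.
  SketchMinMaxFilter(int64_t col_min, int64_t col_max, double threshold)
      : col_min_(col_min), col_max_(col_max), threshold_(threshold) {}

  void Insert(int64_t v) {
    if (always_true_) return;  // early bail-out: skip all per-row work
    min_ = std::min(min_, v);
    max_ = std::max(max_, v);
  }

  // Meant to be called periodically (e.g. once per row batch), not per row.
  void UpdateUsefulness() {
    if (always_true_ || min_ > max_) return;  // already disabled, or empty
    double covered = static_cast<double>(max_) - static_cast<double>(min_) + 1.0;
    double total =
        static_cast<double>(col_max_) - static_cast<double>(col_min_) + 1.0;
    if (covered / total > threshold_) always_true_ = true;
  }

  bool always_true() const { return always_true_; }
  int64_t min() const { return min_; }
  int64_t max() const { return max_; }

 private:
  int64_t col_min_;
  int64_t col_max_;
  double threshold_;
  bool always_true_ = false;
  int64_t min_ = std::numeric_limits<int64_t>::max();
  int64_t max_ = std::numeric_limits<int64_t>::min();
};
```

Once always_true() is set, Insert() returns before touching the value, which is the per-row saving the commit message refers to.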

In addition, a new flag 'is_min_max_value_present' is added to
TRuntimeFilterTargetDesc to indicate whether the min/max column stats
are present in the query plan. The flag eliminates the need to check
for the presence of min/max stats for every row in the back-end.

Early bail out improves the HJ builder step in general. For example,
the step for join node #11 in TPCDS Q8 improves by 13%, and the step
for join node #8 in TPCDS Q16 improves by 3.2%.

The Insert() methods are optimized with branch prediction compiler
hints which yield the following improvement when tested with the
insertion of 1 randomly generated items.

  Small Integers:  7.0%
  Integers:        4.1%
  Big Integers:    4.3%
  Strings:         5.6%
  Dates:           4.4%
  Timestamps:     10.7%
  Decimals(4):    10.4%
  Decimals(8):     9.1%
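
For readers unfamiliar with branch prediction hints, here is a minimal sketch of the general technique using the GCC/Clang builtin __builtin_expect. The SKETCH_UNLIKELY macro and the IntRange type are hypothetical; which branches the patch actually annotates is not stated in this message.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical hint macro; falls back to a plain expression elsewhere.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

// Hypothetical min/max range tracker. After the first few rows, most values
// fall inside [min, max], so both updates are marked as the unlikely path.
struct IntRange {
  int64_t min = std::numeric_limits<int64_t>::max();
  int64_t max = std::numeric_limits<int64_t>::min();

  void Insert(int64_t v) {
    if (SKETCH_UNLIKELY(v < min)) min = v;
    if (SKETCH_UNLIKELY(v > max)) max = v;
  }
};
```

The hint only changes how the compiler lays out the code; the result of Insert() is the same with or without it.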

In addition, the min/max stats for pages are read in batches, with a
fast-track version for the column types int32_t, int64_t, float,
double and date, which have the same storage format as in Parquet. For a
row group, the page locations are read only once, instead of once for
every page skipped, resulting in a 100x speedup when a subset of 199
pages is skipped.

Testing:
  1. Ran core test;
  2. Ran performance test (TBD).

Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
M be/src/exec/parquet/parquet-column-stats.inline.h
M be/src/exec/parquet/parquet-common.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/runtime/runtime-filter-ir.cc
M be/src/util/min-max-filter-ir.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/util/TColumnValueUtil.java
18 files changed, 871 insertions(+), 259 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17295/27
--
To view, visit http://gerrit.cloudera.org:8080/17295

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-17 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17465/1/be/src/runtime/query-driver.h
File be/src/runtime/query-driver.h:

http://gerrit.cloudera.org:8080/#/c/17465/1/be/src/runtime/query-driver.h@265
PS1, Line 265: registered_retry_query_id_
> Should we set the initial value zero for this variable explicitly?
I think we don't need this because the default constructor of TUniqueId already 
does it: be/generated-sources/gen-cpp/Types_types.h

class TUniqueId {
 public:

  TUniqueId(const TUniqueId&);
  TUniqueId(TUniqueId&&);
  TUniqueId& operator=(const TUniqueId&);
  TUniqueId& operator=(TUniqueId&&);
  TUniqueId() : hi(0), lo(0) {  // <--- The default sets hi and lo to 0
  }
};
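
To illustrate the point, here is a small self-contained sketch. SketchUniqueId and SketchDriver are hypothetical stand-ins for the generated TUniqueId and for the class holding registered_retry_query_id_; the int64_t field types and the helper methods are assumptions for the example.

```cpp
#include <cstdint>

// Simplified stand-in for the Thrift-generated TUniqueId quoted above.
struct SketchUniqueId {
  SketchUniqueId() : hi(0), lo(0) {}
  int64_t hi;
  int64_t lo;
};

// Hypothetical holder with a member like registered_retry_query_id_.
class SketchDriver {
 public:
  // True once a non-zero id has been stored.
  bool HasRegisteredRetry() const {
    return registered_retry_query_id_.hi != 0 ||
           registered_retry_query_id_.lo != 0;
  }

  void RegisterRetry(int64_t hi, int64_t lo) {
    registered_retry_query_id_.hi = hi;
    registered_retry_query_id_.lo = lo;
  }

 private:
  // No explicit initializer: the member's default constructor zeroes it.
  SketchUniqueId registered_retry_query_id_;
};
```

Because the member has a user-provided default constructor, it is zeroed even when the enclosing class does not mention it in its own initializer list.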



--
To view, visit http://gerrit.cloudera.org:8080/17465

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 18 May 2021 01:02:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10502: Handle CREATE/DROP events correctly

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17308 )

Change subject: IMPALA-10502: Handle CREATE/DROP events correctly
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8738/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17308

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2c5e96b48abac015240f20295b3ec3b1d71f24a
Gerrit-Change-Number: 17308
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 00:56:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "Revert "IMPALA-10613: Standup HMS thrift server in Catalog""

2021-05-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17429 )

Change subject: Revert "Revert "IMPALA-10613: Standup HMS thrift server in 
Catalog""
..

Revert "Revert "IMPALA-10613: Standup HMS thrift server in Catalog""

This reverts commit 829d1a6ab4643b07877fb410971b67f1b1d1b045.

Additionally, this patch has a couple of addenda which are related
to the original change:
1. A bug fix for the original reverted commit, which uses
isSetGetFileMetadata instead of isGetFileMetadata
(see https://gerrit.cloudera.org/#/c/17330/)
2. A fix for intermittent failures in CatalogHmsFileMetadataTest,
caused by the catalogd HMS client's requirement that
"hive.metastore.execute.setugi" be set to false.

Change-Id: Icbe93f3ae4efd585d4b0092a9ac7081b0b2c1c44
Reviewed-on: http://gerrit.cloudera.org:8080/17429
Tested-by: Impala Public Jenkins 
Reviewed-by: Aman Sinha 
---
M be/src/catalog/catalog-server.cc
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java
A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java
A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/common/impala_test_suite.py
A tests/custom_cluster/test_metastore_service.py
24 files changed, 5,406 insertions(+), 22 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Aman Sinha: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/17429

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icbe93f3ae4efd585d4b0092a9ac7081b0b2c1c44
Gerrit-Change-Number: 17429
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10502: Handle CREATE/DROP events correctly

2021-05-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17308 )

Change subject: IMPALA-10502: Handle CREATE/DROP events correctly
..


Patch Set 6:

(24 comments)

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@a689
PS5, Line 689:
> Will removing these break third party extensions? If not, we can remove the
yes, you are right. Thanks for catching that!


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@874
PS5, Line 874: loadFileMetadataForPartitions(client, addedPartBuilders, 
/*isRefresh=*/false);
> Can we refactor this with the above method? The major logics are the same.
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@883
PS5, Line 883:
 :   private HdfsPartition.Builder
> nit: can fit into one line
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2436
PS5, Line 2436:  value.
> nit: I think we prefer ++i and some spaces
Done. I never realized that we prefer ++i vs. i++ in loops. A quick grep 
does indeed show that ++i has 319 instances and i++ has only 61. I did some
reading to find out whether this is more of a stylistic preference or
something more :)


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/TableLoader.java
File fe/src/main/java/org/apache/impala/catalog/TableLoader.java:

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@78
PS5, Line 78:   synchronized (metastoreAccessLock_) {
> hmm.. not related to this patch, I think we don't need this anymore. We hav
yeah, I agree. I filed IMPALA-10706


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/catalog/TableLoader.java@112
PS5, Line 112:   throw new TableLoadingExceptio
> I'm not clear on the purpose of this. Do we depend on the createEventId whe
The table has a new field, createEventId, which is used to track the event. When 
a table is created from Impala, we create an IncompleteTable which has the 
createEventId set to the corresponding CREATE_TABLE event id from the HMS. But it 
is possible that, by the time the table is loaded, the table was dropped and 
recreated outside Impala. The events processor doesn't update the createEventId 
of the table since this table is in an unloaded state. Hence we need to update 
the eventId to the latest CREATE_TABLE event id when we load the table, so that 
the drop and create table events which the events processor receives on this 
loaded table are ignored.

I will update the comment with more details to make it more readable.


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@744
PS5, Line 744:   LOG.debug("EventId: {} Table {} was not added since "
> nit: Log table name as well?
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@787
PS5, Line 787:   Table tblBefore = null;
> nit: could you move this to line 805 since it's only used there?
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@809
PS5, Line 809: origin
> nit: "original" makes more sense for me :)
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1305
PS5, Line 1305:* Alters an existing view's definition in the metastore. 
Throws an exception
> Can we refactor this with the above method? The major logics are the same.
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1388
PS5, Line 1388:* Updates table property 'impala.lastComputeStatsTime' for 
COMPUTE (INCREMENTAL) STATS,
> nit: redundant blank line
Done


http://gerrit.cloudera.org:8080/#/c/17308/5/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1773
PS5, Line 1773:   ((CreateDatabaseEvent) event).getDatabase());
  : } catch (MetastoreNotificationException e) {
  :   throw new CatalogException("Unable to create a metastore 
event ", e);
  : }
  :   }
  :
  :   /**
> Yeah, I think handling the IF 

[Impala-ASF-CR] IMPALA-10502: Handle CREATE/DROP events correctly

2021-05-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/17308 )

Change subject: IMPALA-10502: Handle CREATE/DROP events correctly
..

IMPALA-10502: Handle CREATE/DROP events correctly

The current way to detect self-events in case of CREATE/DROP events on
database, table and partition is problematic when the same object is
created and dropped repeatedly in quick succession. This happens mainly
due to a couple of reasons. For example, if we have a sequence of
CREATE_TABLE, DROP_TABLE, CREATE_TABLE ... on the same table, it is
possible that when the create table event is being processed, the table
is not present in the catalog because it was dropped recently. In such a
case, the events processor does not have enough state information in
catalogd to determine that this table has been dropped from
catalogd and the event should be ignored. Similarly, if a drop event
is being processed, it is possible that the table has been recreated
with the same name when the drop event is received. In such a case,
the events processor removes the table from catalogd.

This can cause problems for queries which expect the table to exist or
not exist. E.g., a create table query fails with a "table already exists"
error, or a drop table query fails with a "table does not exist" error.

In order to fix this issue, catalogd now keeps track of dropped objects
in a deleteLog, whose entries are garbage collected as the events come in.
Every time a database, table or partition is dropped, the deleteLog is
populated with the drop event id generated due to the drop
operation. This deleteLog is looked up when the event is received to
determine if the event can be ignored.
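
The deleteLog idea can be sketched as follows. The real class is Java (DeleteEventLog.java); this sketch uses C++ and hypothetical names (SketchDeleteLog, AddRemovedObject, WasRemovedAfter, GarbageCollect), and the exact lookup and garbage-collection rules are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical delete log keyed by the event id of the drop operation.
class SketchDeleteLog {
 public:
  // Records that object_key was dropped by the operation that generated
  // drop_event_id.
  void AddRemovedObject(int64_t drop_event_id, const std::string& object_key) {
    log_[drop_event_id] = object_key;
  }

  // True if the object was dropped by an event newer than event_id, in which
  // case an incoming event with that id is stale and can be ignored.
  bool WasRemovedAfter(int64_t event_id, const std::string& object_key) const {
    for (auto it = log_.upper_bound(event_id); it != log_.end(); ++it) {
      if (it->second == object_key) return true;
    }
    return false;
  }

  // Drops entries whose event id is <= processed_event_id, since no older
  // events can arrive after that point (assumed ordering guarantee).
  void GarbageCollect(int64_t processed_event_id) {
    log_.erase(log_.begin(), log_.upper_bound(processed_event_id));
  }

  std::size_t Size() const { return log_.size(); }

 private:
  std::map<int64_t, std::string> log_;
};
```

For example, if a table is dropped with event id 12, a CREATE_TABLE event with id 10 for the same table that arrives later would be recognized as stale and skipped.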

Testing:
1. Added a new test which loops to generate create/drop events for
database, table and partitions.
2. Added new metrics which the test verifies to ensure that events
don't create or drop the object.

Change-Id: Ia2c5e96b48abac015240f20295b3ec3b1d71f24a
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
A fe/src/main/java/org/apache/impala/catalog/events/DeleteEventLog.java
A fe/src/main/java/org/apache/impala/catalog/events/EventFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/InFlightEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/test/java/org/apache/impala/catalog/events/EventsProcessorStressTest.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/custom_cluster/test_event_processing.py
21 files changed, 2,136 insertions(+), 853 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/17308/6
--
To view, visit http://gerrit.cloudera.org:8080/17308

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia2c5e96b48abac015240f20295b3ec3b1d71f24a
Gerrit-Change-Number: 17308
Gerrit-PatchSet: 6
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sourabh Goyal 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10701: Switch to use TByteBuffer from thrift

2021-05-17 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17428 )

Change subject: IMPALA-10701: Switch to use TByteBuffer from thrift
..


Patch Set 3:

> Patch Set 3: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7133/

The build failed because of IMPALA-10704.


--
To view, visit http://gerrit.cloudera.org:8080/17428

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0c7834253a16e440204264b0462a1590dea2463
Gerrit-Change-Number: 17428
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 18 May 2021 00:32:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17427 )

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd 
response
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7151/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17427

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 00:31:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17427 )

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd 
response
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 00:31:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "Revert "IMPALA-10613: Standup HMS thrift server in Catalog""

2021-05-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17429 )

Change subject: Revert "Revert "IMPALA-10613: Standup HMS thrift server in 
Catalog""
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbe93f3ae4efd585d4b0092a9ac7081b0b2c1c44
Gerrit-Change-Number: 17429
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 18 May 2021 00:28:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "Revert "IMPALA-10613: Standup HMS thrift server in Catalog""

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17429 )

Change subject: Revert "Revert "IMPALA-10613: Standup HMS thrift server in 
Catalog""
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbe93f3ae4efd585d4b0092a9ac7081b0b2c1c44
Gerrit-Change-Number: 17429
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 17 May 2021 23:46:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 5
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Mon, 17 May 2021 23:23:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9770: [DOCS] Remove Sentry references in documentation

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17469 )

Change subject: IMPALA-9770: [DOCS] Remove Sentry references in documentation
..


Patch Set 1: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/631/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/17469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4c5e9aa4d060ceaa426908a444d280a5564749d
Gerrit-Change-Number: 17469
Gerrit-PatchSet: 1
Gerrit-Owner: Shajini Thayasingh 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 May 2021 22:44:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9770: [DOCS] Remove Sentry references in documentation

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17469 )

Change subject: IMPALA-9770: [DOCS] Remove Sentry references in documentation
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/631/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/17469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4c5e9aa4d060ceaa426908a444d280a5564749d
Gerrit-Change-Number: 17469
Gerrit-PatchSet: 1
Gerrit-Owner: Shajini Thayasingh 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 May 2021 22:37:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9770: [DOCS] Remove Sentry references in documentation

2021-05-17 Thread Shajini Thayasingh (Code Review)
Shajini Thayasingh has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17469


Change subject: IMPALA-9770: [DOCS] Remove Sentry references in documentation
..

IMPALA-9770: [DOCS] Remove Sentry references in documentation

Updated all the associated topics.

Change-Id: Id4c5e9aa4d060ceaa426908a444d280a5564749d
---
M docs/shared/impala_common.xml
M docs/topics/impala_adls.xml
M docs/topics/impala_alter_database.xml
M docs/topics/impala_alter_table.xml
M docs/topics/impala_alter_view.xml
M docs/topics/impala_authorization.xml
M docs/topics/impala_create_role.xml
M docs/topics/impala_delegation.xml
M docs/topics/impala_drop_role.xml
M docs/topics/impala_grant.xml
M docs/topics/impala_insert.xml
M docs/topics/impala_invalidate_metadata.xml
M docs/topics/impala_kudu.xml
M docs/topics/impala_langref_unsupported.xml
M docs/topics/impala_ldap.xml
M docs/topics/impala_lineage.xml
M docs/topics/impala_logging.xml
M docs/topics/impala_refresh.xml
M docs/topics/impala_refresh_authorization.xml
M docs/topics/impala_revoke.xml
M docs/topics/impala_scaling_limits.xml
M docs/topics/impala_security.xml
M docs/topics/impala_security_files.xml
M docs/topics/impala_show.xml
M docs/topics/impala_ssl.xml
25 files changed, 170 insertions(+), 361 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/17469/1
--
To view, visit http://gerrit.cloudera.org:8080/17469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id4c5e9aa4d060ceaa426908a444d280a5564749d
Gerrit-Change-Number: 17469
Gerrit-PatchSet: 1
Gerrit-Owner: Shajini Thayasingh 


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 4
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Mon, 17 May 2021 22:31:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10676: Improve start/stop scripts for Hiveserver and Metastore

2021-05-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17340 )

Change subject: IMPALA-10676: Improve start/stop scripts for Hiveserver and 
Metastore
..

IMPALA-10676: Improve start/stop scripts for Hiveserver and Metastore

- Separate metastore and hiveserver starting in run-hive-server.sh
- Change kill-hive-server.sh to shut down Metastore and retry

Change-Id: Ie9208efdf49f383c5cfb10cd9881272847405a05
Reviewed-on: http://gerrit.cloudera.org:8080/17340
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
Reviewed-by: Vihang Karajgaonkar 
---
M testdata/bin/kill-hive-server.sh
M testdata/bin/run-hive-server.sh
2 files changed, 58 insertions(+), 10 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified
  Vihang Karajgaonkar: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/17340
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie9208efdf49f383c5cfb10cd9881272847405a05
Gerrit-Change-Number: 17340
Gerrit-PatchSet: 2
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..


Patch Set 26:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8737/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 26
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 May 2021 20:50:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10650: Bailout min/max filters in hash join builder early

2021-05-17 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#26). ( 
http://gerrit.cloudera.org:8080/17295 )

Change subject: IMPALA-10650: Bailout min/max filters in hash join builder early
..

IMPALA-10650: Bailout min/max filters in hash join builder early

This change set addresses a weakness in populating min/max filters
in the hash join builder by periodically measuring the usefulness of
each filter and setting the 'always_true_' flag accordingly. Once the
flag is set to true, insertion into such a filter skips every step from
evaluating the value from a row to checking the value against the
min/max range. This optimization is LLVM-enabled.
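
The early bail-out described above can be illustrated with a small sketch.
It is not the Impala implementation: the class name, the coverage-ratio
usefulness test, the threshold and the check interval are all assumptions
made for illustration, since the commit message does not say how usefulness
is measured.

```python
class MinMaxFilter:
    """Toy min/max filter that stops tracking values once judged useless."""

    def __init__(self, col_min, col_max, threshold=0.9, check_interval=1024):
        # col_min/col_max stand in for the column's min/max stats (assumed inputs).
        self.col_min = col_min
        self.col_max = col_max
        self.threshold = threshold            # assumed usefulness cut-off
        self.check_interval = check_interval  # assumed "periodic" interval
        self.min = None
        self.max = None
        self.always_true = False
        self.inserted = 0

    def insert(self, value):
        if self.always_true:
            # Early bail-out: skip all range bookkeeping for this filter.
            return
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)
        self.inserted += 1
        if self.inserted % self.check_interval == 0:
            self._check_usefulness()

    def _check_usefulness(self):
        # Assumed metric: a filter covering most of the column's value range
        # rejects almost nothing, so it is marked as always true.
        col_range = self.col_max - self.col_min
        if col_range <= 0:
            return
        covered = (self.max - self.min) / col_range
        if covered >= self.threshold:
            self.always_true = True
```

Once always_true is set, later insert() calls return immediately, which is
the cost the change aims to avoid paying per row.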

In addition, a new flag 'is_min_max_value_present' is added to
TRuntimeFilterTargetDesc to indicate whether the min/max column stats
are present in the query plan. The flag eliminates the need to check
for the presence of min/max stats for every row in the back-end.

Early bail-out improves the HJ builder step in general. For example,
the step for join node #11 in TPCDS Q8 improves by 13%, and the step
for join node #8 in TPCDS Q16 improves by 3.2%.

The Insert() methods are optimized with branch prediction compiler
hints which yield the following improvement when tested with the
insertion of 1 randomly generated items.

  Small Integers:  7.0%
  Integers:        4.1%
  Big Integers:    4.3%
  Strings:         5.6%
  Dates:           4.4%
  Timestamps:     10.7%
  Decimals(4):    10.4%
  Decimals(8):     9.1%

In addition, the min/max stats for pages are read in batches, with a
fast-track version for the column types int32_t, int64_t, float,
double and date, whose storage format is identical to Parquet's. For a
row group, the page locations are read only once, instead of once for
every page skipped.

Testing:
  1. Ran core test;
  2. Ran performance test (TBD).

Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
M be/src/exec/parquet/parquet-column-stats.inline.h
M be/src/exec/parquet/parquet-common.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/runtime/runtime-filter-ir.cc
M be/src/util/min-max-filter-ir.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/util/TColumnValueUtil.java
18 files changed, 875 insertions(+), 259 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17295/26
--
To view, visit http://gerrit.cloudera.org:8080/17295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I193646e7acfdd3023f7c947d8107da58a1f41183
Gerrit-Change-Number: 17295
Gerrit-PatchSet: 26
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..


Patch Set 4: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 4
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 May 2021 20:06:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8736/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 4
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 May 2021 19:17:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-17 Thread Amogh Margoor (Code Review)
Amogh Margoor has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..


Patch Set 4:

(2 comments)

Thanks for the review, Zoltan. Have done the required changes.

http://gerrit.cloudera.org:8080/#/c/17464/3/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/17464/3/be/src/exprs/expr-test.cc@5582
PS3, Line 5582:   TestStringValue(sha2fn + ", 512)", expected);
> Please add some tests with some special values, e.g. NULL, empty string
Done


http://gerrit.cloudera.org:8080/#/c/17464/3/be/src/exprs/utility-functions-ir.cc
File be/src/exprs/utility-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17464/3/be/src/exprs/utility-functions-ir.cc@264
PS3, Line 264: x->SetError("O
> What is the return value of SHA* functions and why do we ignore it?
It returns a pointer to the hash. We don't need the return value here because 
sha-hash.ptr points to the same location. I have removed it for now, as the SHA 
functions are not attributed as `warn_unused_result`.



--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 4
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 May 2021 19:03:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-17 Thread Amogh Margoor (Code Review)
Amogh Margoor has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..

IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

Built-in functions to compute the SHA-1 digest and the SHA-2 family of
digests have been added. SHA-2 support includes SHA224, SHA256,
SHA384 and SHA512. In FIPS mode, SHA1, SHA224 and SHA256 are
disabled and will throw an error. The SHA2 functions will also throw an
error for unsupported bit lengths, i.e., any bit length other than 224,
256, 384 or 512.
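
The behaviour described above can be sketched in Python with the standard
hashlib module. This is only an illustration of the rules in this message,
not the Impala code: the function names, the fips_mode parameter, the use of
ValueError and the hex-string return value are assumptions.

```python
import hashlib

# Supported SHA-2 bit lengths, as listed in the commit message.
_SHA2_BY_BITS = {
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512,
}
# SHA-2 variants that the commit message says are disabled in FIPS mode.
_SHA2_FIPS_DISABLED = {224, 256}


def sha1(data, fips_mode=False):
    """Return the SHA-1 digest of a string as hex (return format is assumed)."""
    if fips_mode:
        raise ValueError("SHA1 is disabled in FIPS mode")
    return hashlib.sha1(data.encode("utf-8")).hexdigest()


def sha2(data, bit_length, fips_mode=False):
    """Return the SHA-2 digest of a string for the given bit length as hex."""
    if bit_length not in _SHA2_BY_BITS:
        raise ValueError("unsupported bit length: %d" % bit_length)
    if fips_mode and bit_length in _SHA2_FIPS_DISABLED:
        raise ValueError("SHA%d is disabled in FIPS mode" % bit_length)
    return _SHA2_BY_BITS[bit_length](data.encode("utf-8")).hexdigest()
```

For example, sha2("abc", 512) returns a 128-character hex string, while
sha2("abc", 128) and sha2("abc", 256, fips_mode=True) both raise an error.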

Testing:
1. Added Unit test for expressions.
2. Added end-to-end test for new functions.

Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
---
M be/src/exprs/expr-test.cc
M be/src/exprs/utility-functions-ir.cc
M be/src/exprs/utility-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
5 files changed, 181 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17464/4
--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 4
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10485: Support Iceberg field-id based column resolution in the ORC scanner

2021-05-17 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17398 )

Change subject: IMPALA-10485: Support Iceberg field-id based column resolution 
in the ORC scanner
..


Patch Set 1: Code-Review+1

(2 comments)

Hi Zoltan, LGTM!

http://gerrit.cloudera.org:8080/#/c/17398/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17398/1//COMMIT_MSG@10
PS1, Line 10: field-id
Could you add that this becomes the default for Iceberg tables as well?


http://gerrit.cloudera.org:8080/#/c/17398/1/be/src/exec/orc-metadata-utils.h
File be/src/exec/orc-metadata-utils.h:

http://gerrit.cloudera.org:8080/#/c/17398/1/be/src/exec/orc-metadata-utils.h@62
PS1, Line 62:   enum SchemaResolutionStrategy {
: POSITION,
: ICEBERG_FIELD_ID
:   };
Do you think it would be worth generalizing this with 
TParquetFallbackSchemaResolution? I couldn't find whether Iceberg has resolution 
strategies worth switching between.
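
To make the two strategies in the quoted enum concrete, here is a small
Python sketch of how position-based and field-id-based resolution can differ.
It is not taken from the patch: resolve_column and the dictionary layout of
the columns are hypothetical.

```python
from enum import Enum


class SchemaResolutionStrategy(Enum):
    POSITION = 1
    ICEBERG_FIELD_ID = 2


def resolve_column(table_col, file_cols, strategy):
    """Return the file column matching table_col, or None if there is none.

    table_col: dict with 'position' and 'field_id' (hypothetical layout).
    file_cols: list of dicts with 'name' and 'field_id', in file order.
    """
    if strategy is SchemaResolutionStrategy.POSITION:
        pos = table_col["position"]
        return file_cols[pos] if 0 <= pos < len(file_cols) else None
    # ICEBERG_FIELD_ID: match on the stable field id, ignoring order.
    for col in file_cols:
        if col["field_id"] == table_col["field_id"]:
            return col
    return None


# A file written with columns (a, b); the table later lists b first.
file_cols = [{"name": "a", "field_id": 1}, {"name": "b", "field_id": 2}]
table_b = {"position": 0, "field_id": 2}
by_position = resolve_column(table_b, file_cols, SchemaResolutionStrategy.POSITION)
by_field_id = resolve_column(table_b, file_cols, SchemaResolutionStrategy.ICEBERG_FIELD_ID)
```

In this example the position strategy picks file column "a" for table column
b, while the field-id strategy still finds "b" after the reorder.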



--
To view, visit http://gerrit.cloudera.org:8080/17398
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2b1abcc25ad2268aa96dff032328e8951dbfb9d
Gerrit-Change-Number: 17398
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 17 May 2021 18:40:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Revert "Revert "IMPALA-10613: Standup HMS thrift server in Catalog""

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17429 )

Change subject: Revert "Revert "IMPALA-10613: Standup HMS thrift server in 
Catalog""
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8735/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbe93f3ae4efd585d4b0092a9ac7081b0b2c1c44
Gerrit-Change-Number: 17429
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 17 May 2021 18:00:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17427 )

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd 
response
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 17 May 2021 17:50:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17427 )

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd 
response
..


Patch Set 1:

Thanks for working on this. LGTM


--
To view, visit http://gerrit.cloudera.org:8080/17427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 17 May 2021 17:50:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8734/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 5
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Mon, 17 May 2021 17:50:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "Revert "IMPALA-10613: Standup HMS thrift server in Catalog""

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17429 )

Change subject: Revert "Revert "IMPALA-10613: Standup HMS thrift server in 
Catalog""
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7150/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbe93f3ae4efd585d4b0092a9ac7081b0b2c1c44
Gerrit-Change-Number: 17429
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 17 May 2021 17:41:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "Revert "IMPALA-10613: Standup HMS thrift server in Catalog""

2021-05-17 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/17429 )

Change subject: Revert "Revert "IMPALA-10613: Standup HMS thrift server in 
Catalog""
..

Revert "Revert "IMPALA-10613: Standup HMS thrift server in Catalog""

This reverts commit 829d1a6ab4643b07877fb410971b67f1b1d1b045.

Additionally, this patch has a couple of addenda related
to the original change:
1. A bug fix for the original reverted commit, which used
isSetGetFileMetadata instead of isGetFileMetadata
(see https://gerrit.cloudera.org/#/c/17330/)
2. A fix for intermittent failures in CatalogHmsFileMetadataTest
caused by the catalogd HMS client's requirement that
"hive.metastore.execute.setugi" be set to false.
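
The first item can be illustrated with a short sketch, assuming the two
names are the usual Thrift-generated accessors for an optional boolean field:
one reports whether the field was assigned at all, the other returns its
value. The Python class below is a hypothetical stand-in, not the generated
code.

```python
class PartialRequest:
    """Hypothetical stand-in for a Thrift struct with an optional bool field."""

    def __init__(self):
        self._get_file_metadata = False
        self._get_file_metadata_set = False

    def set_get_file_metadata(self, value):
        self._get_file_metadata = value
        self._get_file_metadata_set = True

    def is_set_get_file_metadata(self):
        # True as soon as the field has been assigned, even to False.
        return self._get_file_metadata_set

    def is_get_file_metadata(self):
        # The actual boolean value of the field.
        return self._get_file_metadata


def wants_file_metadata_buggy(req):
    return req.is_set_get_file_metadata()


def wants_file_metadata_fixed(req):
    return req.is_get_file_metadata()
```

A caller that explicitly sets the field to False is treated as asking for
file metadata by the first check but not by the second.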

Change-Id: Icbe93f3ae4efd585d4b0092a9ac7081b0b2c1c44
---
M be/src/catalog/catalog-server.cc
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
A fe/src/main/java/org/apache/impala/catalog/CatalogHmsAPIHelper.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/GetPartialCatalogObjectRequestBuilder.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogHmsClientUtils.java
A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServer.java
A fe/src/main/java/org/apache/impala/catalog/metastore/CatalogMetastoreServiceHandler.java
A fe/src/main/java/org/apache/impala/catalog/metastore/ICatalogMetastoreServer.java
A fe/src/main/java/org/apache/impala/catalog/metastore/MetastoreServiceHandler.java
A fe/src/main/java/org/apache/impala/catalog/metastore/NoOpCatalogMetastoreServer.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
A fe/src/test/java/org/apache/impala/catalog/metastore/CatalogHmsFileMetadataTest.java
A fe/src/test/java/org/apache/impala/catalog/metastore/EnableCatalogdHmsCacheFlagTest.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M tests/common/impala_test_suite.py
A tests/custom_cluster/test_metastore_service.py
24 files changed, 5,406 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/17429/7
--
To view, visit http://gerrit.cloudera.org:8080/17429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icbe93f3ae4efd585d4b0092a9ac7081b0b2c1c44
Gerrit-Change-Number: 17429
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7149/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 5
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Mon, 17 May 2021 17:29:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

2021-05-17 Thread Yong Yang (Code Review)
Yong Yang has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695: add dedicated thread pool for OSS/JindoFS.
..

IMPALA-10695: add dedicated thread pool for OSS/JindoFS.

OSS is the object store in Alibaba Cloud, similar to s3a, and JindoFS is a gateway 
built on the Alibaba Cloud object store.
For more about JindoFS, see 
https://github.com/aliyun/alibabacloud-jindofs.
Without this change, the Alibaba Cloud object store is treated as a local disk and 
query performance is poor. This change creates a dedicated queue for this kind of 
target, which improves OSS scan performance.
I have tested it in our environment and observed at least double the scan 
speed.
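
The idea of a dedicated queue per storage target can be sketched as follows.
This is a Python illustration only, not the disk-io-mgr code: the class name,
the "oss" and "jfs" URI schemes and the local thread count are assumptions,
and the default of 16 OSS threads is taken from the earlier subject line of
this change.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse

# Assumed URI schemes for OSS and JindoFS paths.
OSS_SCHEMES = {"oss", "jfs"}


class IoDispatcher:
    """Routes read requests to a per-target thread pool (hypothetical)."""

    def __init__(self, num_local_threads=1, num_oss_threads=16):
        self.local_pool = ThreadPoolExecutor(max_workers=num_local_threads)
        self.oss_pool = ThreadPoolExecutor(max_workers=num_oss_threads)

    def pool_name_for(self, path):
        # Paths without a recognized remote scheme fall back to the local pool.
        return "oss" if urlparse(path).scheme in OSS_SCHEMES else "local"

    def submit(self, path, read_fn):
        pool = self.oss_pool if self.pool_name_for(path) == "oss" else self.local_pool
        return pool.submit(read_fn, path)

    def shutdown(self):
        self.local_pool.shutdown()
        self.oss_pool.shutdown()
```

With separate pools, slow remote reads no longer queue behind (or in front
of) local-disk reads, and the remote pool can be sized for network latency
rather than for the number of disks.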

Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Signed-off-by: Yong Yang 
---
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/util/hdfs-util.cc
M be/src/util/hdfs-util.h
5 files changed, 25 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17455/5
--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 5
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 


[Impala-ASF-CR] IMPALA-10695:add OSS/JindoFS support, create a dedicate thread pool for this kind of target, default is 16.

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695:add OSS/JindoFS support,  create a dedicate thread 
pool for this kind of target, default is 16.
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8733/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 4
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Mon, 17 May 2021 17:05:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10695:add OSS/JindoFS support, create a dedicate thread pool for this kind of target, default is 16.

2021-05-17 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695:add OSS/JindoFS support,  create a dedicate thread 
pool for this kind of target, default is 16.
..


Patch Set 4:

(1 comment)

Please take a look at other Impala commit messages to see examples of how to do 
this.
Note I also cleared the 'topic' you set in gerrit as it seemed to make the main 
gerrit page untidy.

http://gerrit.cloudera.org:8080/#/c/17455/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17455/4//COMMIT_MSG@7
PS4, Line 7: IMPALA-10695:add OSS/JindoFS support,  create a dedicate thread 
pool for this kind of target, default is 16.
I think you should have a simple first line like "IMPALA-10695: add dedicated 
thread pool for OSS/JindoFS."
Then, after a blank line a description of what is in the change:
- a sentence describing OSS/JindoFS.
- a link to more information about OSS/JindoFS.
- why the change is useful
- how you tested the change



--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 4
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Mon, 17 May 2021 16:53:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..


Patch Set 1: -Code-Review

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17465/1/be/src/runtime/query-driver.h
File be/src/runtime/query-driver.h:

http://gerrit.cloudera.org:8080/#/c/17465/1/be/src/runtime/query-driver.h@265
PS1, Line 265: registered_retry_query_id_
Should we set the initial value zero for this variable explicitly?



--
To view, visit http://gerrit.cloudera.org:8080/17465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Mon, 17 May 2021 16:45:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10695:add OSS/JindoFS support, create a dedicate thread pool for this kind of target, default is 16.

2021-05-17 Thread Yong Yang (Code Review)
Yong Yang has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: IMPALA-10695:add OSS/JindoFS support,  create a dedicate thread 
pool for this kind of target, default is 16.
..

IMPALA-10695:add OSS/JindoFS support,  create a dedicate thread pool for this 
kind of target, default is 16.

Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Signed-off-by: Yong Yang 
---
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/util/hdfs-util.cc
M be/src/util/hdfs-util.h
5 files changed, 25 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17455/4
--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 4
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-17 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/17465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Mon, 17 May 2021 16:38:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to comptute SHA-1 and SHA-2 digest.

2021-05-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to comptute SHA-1 and SHA-2 
digest.
..


Patch Set 3:

(2 comments)

Looks good, only had some minor comments

http://gerrit.cloudera.org:8080/#/c/17464/3/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/17464/3/be/src/exprs/expr-test.cc@5582
PS3, Line 5582:   TestStringValue(sha2fn + ", 512)", expected);
Please add some tests with some special values, e.g. NULL, empty string


http://gerrit.cloudera.org:8080/#/c/17464/3/be/src/exprs/utility-functions-ir.cc
File be/src/exprs/utility-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17464/3/be/src/exprs/utility-functions-ir.cc@264
PS3, Line 264: discard_result
What is the return value of the SHA* functions, and why do we ignore it?

I think we should either add a code comment explaining why we ignore the 
result, or add a warning to 'ctx'.



--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 3
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 May 2021 16:29:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] add OSS/JindoFS support, impala with this change will create a dedicate thread pool for this kind of target. By default 16 threads would be craeted

2021-05-17 Thread Yong Yang (Code Review)
Yong Yang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: add OSS/JindoFS support, impala with this change will create a 
dedicate thread pool for this kind of target. By default 16 threads would be 
craeted
..


Patch Set 2:

Can I modify this request directly, or do I need to submit a new one?


--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 2
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Mon, 17 May 2021 16:14:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] add OSS/JindoFS support, impala with this change will create a dedicate thread pool for this kind of target. By default 16 threads would be craeted

2021-05-17 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: add OSS/JindoFS support, impala with this change will create a 
dedicate thread pool for this kind of target. By default 16 threads would be 
craeted
..


Patch Set 2:

(1 comment)

Hi Yong Yang,
thanks for your contribution to Impala!
I think the code looks good but the commit message needs some tidying.

http://gerrit.cloudera.org:8080/#/c/17455/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17455/2//COMMIT_MSG@7
PS2, Line 7: add OSS/JindoFS support, impala with this change will create a 
dedicate thread pool for this kind of target.
Can you please rewrite the commit message according to the guidelines in 
https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala ?
Importantly, the first line of the commit message should start with 
"IMPALA-10696:".
A good way to see what a commit message should look like is to read other 
contributors' commit messages.
Also, it might be useful to include a very short description of OSS/JindoFS, or 
a link to a good introduction to this technology.



--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 2
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Mon, 17 May 2021 16:07:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to comptute SHA-1 and SHA-2 digest.

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to comptute SHA-1 and SHA-2 
digest.
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8732/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 3
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 May 2021 14:52:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to comptute SHA-1 and SHA-2 digest.

2021-05-17 Thread Amogh Margoor (Code Review)
Amogh Margoor has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to comptute SHA-1 and SHA-2 
digest.
..

IMPALA-10679: Add builtin functions to comptute SHA-1 and SHA-2 digest.

Built-in functions to compute the SHA-1 digest and the SHA-2 family of
digests have been added. SHA-2 support includes SHA224, SHA256,
SHA384 and SHA512. In FIPS mode, SHA1, SHA224 and SHA256 are
disabled and will throw an error. The SHA2 functions will also throw an
error for an unsupported bit length, i.e., any bit length other than
224, 256, 384 or 512.
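
The bit-length and FIPS rules described above can be sketched with Python's
standard `hashlib`. This is not the Impala implementation: the `sha2_hex` name,
the `fips_mode` parameter, and the choice of a hex string as the return value
are assumptions made for the example.

```python
import hashlib

# SHA-2 variants named in the commit message, keyed by bit length.
_SHA2 = {
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512,
}
# Bit lengths the commit message says are disabled in FIPS mode.
_FIPS_DISABLED = {224, 256}


def sha2_hex(data: bytes, bit_length: int, fips_mode: bool = False) -> str:
    """Return the SHA-2 digest of `data` as hex, or raise ValueError."""
    if bit_length not in _SHA2:
        raise ValueError(f"unsupported bit length: {bit_length}")
    if fips_mode and bit_length in _FIPS_DISABLED:
        raise ValueError(f"SHA{bit_length} is disabled in FIPS mode")
    return _SHA2[bit_length](data).hexdigest()


if __name__ == "__main__":
    # The empty string is one of the special inputs a test should cover.
    print(sha2_hex(b"", 256))
```

A hex digest has `bit_length / 4` characters, which gives a cheap sanity check
for the empty-string case.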

Testing:
1. Added Unit test for expressions.
2. Added end-to-end test for new functions.

Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
---
M be/src/exprs/expr-test.cc
M be/src/exprs/utility-functions-ir.cc
M be/src/exprs/utility-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
5 files changed, 145 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17464/3
--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 3
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10688: Implement ds cpc stringify function

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17373 )

Change subject: IMPALA-10688: Implement ds_cpc_stringify function
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7147/


--
To view, visit http://gerrit.cloudera.org:8080/17373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d089bfada6bebd078d8f388d2e146c79e5285
Gerrit-Change-Number: 17373
Gerrit-PatchSet: 3
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 May 2021 13:55:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10626: Add support for Iceberg's Catalogs API

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17466 )

Change subject: IMPALA-10626: Add support for Iceberg's Catalogs API
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8731/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead
Gerrit-Change-Number: 17466
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 May 2021 13:36:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17026 )

Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most 
common types
..


Patch Set 30:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8729/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287
Gerrit-Change-Number: 17026
Gerrit-PatchSet: 30
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 May 2021 13:18:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10642: Write support for Parquet Bloom filters - most common types

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17262 )

Change subject: IMPALA-10642: Write support for Parquet Bloom filters - most 
common types
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8730/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17262
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie865efd4f0c11b9e111fb94f77d084bf6ee20792
Gerrit-Change-Number: 17262
Gerrit-PatchSet: 11
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 17 May 2021 13:17:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10626: Add support for Iceberg's Catalogs API

2021-05-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17466 )


Change subject: IMPALA-10626: Add support for Iceberg's Catalogs API
..

IMPALA-10626: Add support for Iceberg's Catalogs API

Iceberg recently switched to using its Catalogs class to define
catalog and table properties. Catalog information is stored in
a configuration file such as hive-site.xml, and the table properties
contain information about which catalog is being used and what
the Iceberg table id is.

E.g. in the Hive conf we can have the following properties to define
catalogs:

 iceberg.catalog..type = hadoop
 iceberg.catalog..warehouse = somelocation

 or

 iceberg.catalog..type = hive

And at the table level we can have the following:

iceberg.catalog = 
name = 

Table property 'iceberg.catalog' refers to a Catalog defined in the
configuration file. This is in contradiction with Impala's current
behavior where we are already using 'iceberg.catalog', and it can
have the following values:

 * hive.catalog for HiveCatalog
 * hadoop.catalog for HadoopCatalog
 * hadoop.tables for HadoopTables

To be backward-compatible and also support the new Catalogs properties,
Impala still recognizes the above special values. However, from now on
Impala does not define 'iceberg.catalog' by default. 'iceberg.catalog'
being NULL means HiveCatalog for both Impala and Iceberg's Catalogs API,
hence for Hive and Spark as well.

If 'iceberg.catalog' has a different value than the special values it
indicates that Iceberg's Catalogs API is being used, so Impala will
try to look up the catalog configuration from the Hive config file.
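
The resolution order described above could be sketched as follows. This is an
illustration, not the code in IcebergUtil.java: the `resolve_catalog` helper and
its return strings are hypothetical, and the `iceberg.catalog.<name>.type` key
layout (with a catalog name between the dots) is an assumption, since the
property examples in this message appear to have lost their placeholders.

```python
# Legacy special values that Impala keeps recognizing.
SPECIAL_VALUES = {
    "hive.catalog": "HiveCatalog",
    "hadoop.catalog": "HadoopCatalog",
    "hadoop.tables": "HadoopTables",
}


def resolve_catalog(table_props: dict, hive_conf: dict) -> str:
    """Decide which Iceberg catalog implementation a table refers to."""
    value = table_props.get("iceberg.catalog")
    if value is None:
        # Unset means HiveCatalog for Impala and for the Catalogs API.
        return "HiveCatalog"
    if value in SPECIAL_VALUES:
        return SPECIAL_VALUES[value]
    # Any other value names a catalog defined in the Hive configuration.
    key = f"iceberg.catalog.{value}.type"  # assumed key layout
    if key not in hive_conf:
        raise KeyError(f"catalog '{value}' is not defined in the configuration")
    return f"Catalogs({hive_conf[key]})"


if __name__ == "__main__":
    conf = {"iceberg.catalog.my_catalog.type": "hadoop"}
    print(resolve_catalog({"iceberg.catalog": "my_catalog"}, conf))
```

Checking the special values before the configuration lookup is what keeps
existing tables working unchanged.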

Testing:
 * added SHOW CREATE TABLE tests
 * added e2e tests that create/insert/drop Iceberg tables with Catalogs
 * manually tested interop behavior with Hive

Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java
A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/resources/hive-site.xml.py
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
M tests/query_test/test_iceberg.py
14 files changed, 308 insertions(+), 35 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/17466/1
--
To view, visit http://gerrit.cloudera.org:8080/17466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead
Gerrit-Change-Number: 17466
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17026 )

Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most 
common types
..


Patch Set 30:

(182 comments)

http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h
File be/src/thirdparty/xxhash/xxhash.h:

http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@70
PS30, Line 70: 
https://fastcompression.blogspot.com/2019/03/presenting-xxh3.html?showComment=1552696407071#c3490092340461170735
line too long (112 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@92
PS30, Line 92:  *  
https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@113
PS30, Line 113: #  elif defined (__cplusplus) || (defined (__STDC_VERSION__) && 
(__STDC_VERSION__ >= 199901L) /* C99 */)
line too long (104 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@243
PS30, Line 243: #  define XXH3_64bits_reset_withSecret XXH_NAME2(XXH_NAMESPACE, 
XXH3_64bits_reset_withSecret)
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@253
PS30, Line 253: #  define XXH3_128bits_reset_withSeed XXH_NAME2(XXH_NAMESPACE, 
XXH3_128bits_reset_withSeed)
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@254
PS30, Line 254: #  define XXH3_128bits_reset_withSecret 
XXH_NAME2(XXH_NAMESPACE, XXH3_128bits_reset_withSecret)
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@270
PS30, Line 270: #define XXH_VERSION_NUMBER  (XXH_VERSION_MAJOR *100*100 + 
XXH_VERSION_MINOR *100 + XXH_VERSION_RELEASE)
line too long (103 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@429
PS30, Line 429:  * @param statePtr A pointer to an @ref XXH32_state_t allocated 
with @ref XXH32_createState().
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@441
PS30, Line 441: XXH_PUBLIC_API void XXH32_copyState(XXH32_state_t* dst_state, 
const XXH32_state_t* src_state);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@476
PS30, Line 476: XXH_PUBLIC_API XXH_errorcode XXH32_update (XXH32_state_t* 
statePtr, const void* input, size_t length);
line too long (102 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@628
PS30, Line 628: XXH_PUBLIC_API void XXH64_copyState(XXH64_state_t* dst_state, 
const XXH64_state_t* src_state);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@631
PS30, Line 631: XXH_PUBLIC_API XXH_errorcode XXH64_update (XXH64_state_t* 
statePtr, const void* input, size_t length);
line too long (102 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@700
PS30, Line 700: XXH_PUBLIC_API XXH64_hash_t XXH3_64bits_withSeed(const void* 
data, size_t len, XXH64_hash_t seed);
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@724
PS30, Line 724: XXH_PUBLIC_API XXH64_hash_t XXH3_64bits_withSecret(const void* 
data, size_t len, const void* secret, size_t secretSize);
line too long (120 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@743
PS30, Line 743: XXH_PUBLIC_API void XXH3_copyState(XXH3_state_t* dst_state, 
const XXH3_state_t* src_state);
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@756
PS30, Line 756: XXH_PUBLIC_API XXH_errorcode 
XXH3_64bits_reset_withSeed(XXH3_state_t* statePtr, XXH64_hash_t seed);
line too long (99 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@766
PS30, Line 766: XXH_PUBLIC_API XXH_errorcode 
XXH3_64bits_reset_withSecret(XXH3_state_t* statePtr, const void* secret, size_t 
secretSize);
line too long (121 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@768
PS30, Line 768: XXH_PUBLIC_API XXH_errorcode XXH3_64bits_update (XXH3_state_t* 
statePtr, const void* input, size_t length);
line too long (107 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@791
PS30, Line 791: XXH_PUBLIC_API XXH128_hash_t XXH3_128bits_withSeed(const void* 
data, size_t len, XXH64_hash_t seed);
line too long (100 > 90)


http://gerrit.cloudera.org:8080/#/c/17026/30/be/src/thirdparty/xxhash/xxhash.h@792
PS30, Line 792: XXH_PUBLIC_API XXH128_hash_t XXH3_128bits_withSecret(const 
void* data, size_t len, const void* secret, size_t secretSize);
line too long (122 > 90)


[Impala-ASF-CR] IMPALA-10640: Support reading Parquet Bloom filters - most common types

2021-05-17 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#30). ( 
http://gerrit.cloudera.org:8080/17026 )

Change subject: IMPALA-10640: Support reading Parquet Bloom filters - most 
common types
..

IMPALA-10640: Support reading Parquet Bloom filters - most common types

This change adds read support for Parquet Bloom filters for types that
can reasonably be supported in Impala. Other types, such as CHAR(N),
would be very difficult to support because the length may be different
in Parquet and in Impala which results in truncation or padding, and
that changes the hash which makes using the Bloom filter impossible.
Write support will be added in a later change.
The supported Parquet type - Impala type pairs are the following:

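 ---------------------------------------
|Parquet type |  Impala type            |
|---------------------------------------|
|INT32        |  TINYINT, SMALLINT, INT |
|INT64        |  BIGINT                 |
|FLOAT        |  FLOAT                  |
|DOUBLE       |  DOUBLE                 |
|BYTE_ARRAY   |  STRING                 |
 ---------------------------------------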

The following types are not supported for the given reasons:

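 ----------------------------------------------------------------
|Impala type |  Problem                                          |
|----------------------------------------------------------------|
|VARCHAR(N)  | truncation can change hash                        |
|CHAR(N)     | padding / truncation can change hash              |
|DECIMAL     | multiple encodings supported                      |
|TIMESTAMP   | multiple encodings supported, timezone conversion |
|DATE        | not considered yet                                |
 ----------------------------------------------------------------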

Support may be added for these types later, see IMPALA-10641.
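
To illustrate why padding or truncation makes a Bloom filter unusable for
CHAR(N) and VARCHAR(N), the sketch below hashes a value before and after it is
forced to a fixed length. It uses SHA-256 from the standard library purely as a
stand-in for the filter's hash function, and the `pad_or_truncate` helper is
hypothetical.

```python
import hashlib


def value_hash(value: bytes) -> str:
    """Stand-in for the hash a Bloom filter would compute for a value."""
    return hashlib.sha256(value).hexdigest()


def pad_or_truncate(value: bytes, n: int) -> bytes:
    """Force a value to exactly n bytes, as a CHAR(n) column would."""
    return value[:n].ljust(n, b" ")


if __name__ == "__main__":
    stored = b"abc"                       # value as written in the file
    queried = pad_or_truncate(stored, 5)  # value as seen through CHAR(5)
    # The hashes differ, so a lookup with the padded value would miss.
    print(value_hash(stored) == value_hash(queried))
```

Because the filter is probed with the hash of the value as the reader sees it,
any change in length means the probe no longer matches what was inserted.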

If a Bloom filter is available for a column that is fully dictionary
encoded, the Bloom filter is not used as the dictionary can give exact
results in filtering.

Testing:
  - Added tests/query_test/test_parquet_bloom_filter.py that tests
    whether Parquet Bloom filtering works for the supported types and
    that we do not incorrectly discard row groups for the unsupported
    type VARCHAR. The Parquet file used in the test was generated with
    an external tool.
  - Added unit tests for ParquetBloomFilter in file
    be/src/util/parquet-bloom-filter-test.cc
  - A minor, unrelated change was done in
    be/src/util/bloom-filter-test.cc: the MakeRandom() function had
    return type uint64_t, the documentation claimed it returned a 64 bit
    random number, but the actual number of random bits is 32, which is
    what is intended in the tests. The return type and documentation
    have been corrected to use 32 bits.

Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287
---
M LICENSE.txt
M be/src/exec/parquet/CMakeLists.txt
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
A be/src/exec/parquet/parquet-bloom-filter-util.cc
A be/src/exec/parquet/parquet-bloom-filter-util.h
M be/src/exprs/expr-value.h
M be/src/exprs/literal.cc
M be/src/exprs/literal.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
A be/src/thirdparty/xxhash/README.md
A be/src/thirdparty/xxhash/xxhash.h
M be/src/util/CMakeLists.txt
M be/src/util/bloom-filter-test.cc
M be/src/util/bloom-filter.cc
M be/src/util/bloom-filter.h
A be/src/util/impala-bloom-filter-buffer-allocator.cc
A be/src/util/impala-bloom-filter-buffer-allocator.h
A be/src/util/parquet-bloom-filter-avx2.cc
A be/src/util/parquet-bloom-filter-test.cc
A be/src/util/parquet-bloom-filter.cc
A be/src/util/parquet-bloom-filter.h
M bin/jenkins/critique-gerrit-review.py
M bin/rat_exclude_files.txt
M bin/run_clang_tidy.sh
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M common/thrift/parquet.thrift
M testdata/data/README
A testdata/data/parquet-bloom-filtering.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-bloom-filter-disabled.test
A testdata/workloads/functional-query/queries/QueryTest/parquet-bloom-filter.test
A tests/query_test/test_parquet_bloom_filter.py
37 files changed, 7,410 insertions(+), 127 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/17026/30
--
To view, visit http://gerrit.cloudera.org:8080/17026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287
Gerrit-Change-Number: 17026
Gerrit-PatchSet: 30
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Zoltan 

[Impala-ASF-CR] IMPALA-10642: Write support for Parquet Bloom filters - most common types

2021-05-17 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/17262 )

Change subject: IMPALA-10642: Write support for Parquet Bloom filters - most 
common types
..

IMPALA-10642: Write support for Parquet Bloom filters - most common types

This change adds support for writing Parquet Bloom filters for the types
for which read support was added in IMPALA-10640.

Writing of Parquet Bloom filters can be controlled by the
'parquet_bloom_filter_write' query option and the
'parquet.bloom.filter.columns' table property. The query option has the
following possible values:
  NEVER      - never write Parquet Bloom filters
  IF_NO_DICT - write Parquet Bloom filters if specified in the table
               properties AND if the row group is not fully
               dictionary encoded (the number of distinct values exceeds
               the maximum dictionary size)
  ALWAYS     - always write Parquet Bloom filters if specified in the
               table properties, even if the row group is fully
               dictionary encoded
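
The three option values can be read as a small decision function. The sketch
below only restates the rules listed above; the `should_write_bloom_filter`
name and its boolean parameters are hypothetical and do not mirror the writer
code in the patch.

```python
def should_write_bloom_filter(option: str,
                              column_in_table_property: bool,
                              fully_dict_encoded: bool) -> bool:
    """Apply the 'parquet_bloom_filter_write' rules to one column."""
    if option not in ("NEVER", "IF_NO_DICT", "ALWAYS"):
        raise ValueError(f"unknown option value: {option}")
    if option == "NEVER":
        return False
    # Both remaining modes require the column to be listed in the
    # 'parquet.bloom.filter.columns' table property.
    if not column_in_table_property:
        return False
    if option == "ALWAYS":
        return True
    # IF_NO_DICT: skip the filter when the dictionary already covers
    # every value in the row group.
    return not fully_dict_encoded


if __name__ == "__main__":
    print(should_write_bloom_filter("IF_NO_DICT", True, True))
```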

The 'parquet.bloom.filter.columns' table property is a comma separated
list of 'col_name:bytes' pairs. The 'bytes' part means the size of the
bitset of the Bloom filter, and is optional. If the size is not given,
it will be the maximal Bloom filter size
(ParquetBloomFilter::MAX_BYTES).
Example: 'col1:1024,col2,col4:100'.
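
A minimal sketch of parsing this property format is shown below. The
`parse_bloom_filter_columns` helper is hypothetical, and the `max_bytes`
argument stands in for ParquetBloomFilter::MAX_BYTES, whose actual value is
not given in this message.

```python
def parse_bloom_filter_columns(prop: str, max_bytes: int) -> dict:
    """Parse a comma-separated list of 'col_name[:bytes]' pairs."""
    sizes = {}
    for item in prop.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        name, sep, size = item.partition(":")
        # The size is optional; fall back to the maximum when omitted.
        sizes[name.strip()] = int(size) if sep else max_bytes
    return sizes


if __name__ == "__main__":
    # 4096 is an arbitrary placeholder for the maximum filter size.
    print(parse_bloom_filter_columns("col1:1024,col2,col4:100", 4096))
```

With the example string from the commit message, col1 and col4 keep their
explicit sizes and col2 receives the maximum.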

Testing:
  - Added a test in tests/query_test/test_parquet_bloom_filter.py that
    uses Impala to write the same table as in the test file
    'testdata/data/parquet-bloom-filtering.parquet' and checks whether the
    Parquet Bloom filter header and bitset are identical.
  - 'test_fallback_from_dict' tests falling back from dict encoding to
    plain and using Bloom filters.

Change-Id: Ie865efd4f0c11b9e111fb94f77d084bf6ee20792
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.h
M be/src/exec/parquet/parquet-bloom-filter-util.cc
M be/src/exec/parquet/parquet-bloom-filter-util.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M be/src/util/dict-encoding.h
M be/src/util/parquet-bloom-filter-test.cc
M be/src/util/parquet-bloom-filter.cc
M be/src/util/parquet-bloom-filter.h
M common/thrift/DataSinks.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M tests/query_test/test_parquet_bloom_filter.py
20 files changed, 694 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/17262/11
--
To view, visit http://gerrit.cloudera.org:8080/17262
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie865efd4f0c11b9e111fb94f77d084bf6ee20792
Gerrit-Change-Number: 17262
Gerrit-PatchSet: 11
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17427 )

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd 
response
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8728/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 May 2021 10:39:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17465 )

Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8727/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Mon, 17 May 2021 10:30:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10702: Add warning logs for slow or large catalogd response

2021-05-17 Thread Quanlong Huang (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17427

to look at the new patch set (#2).

Change subject: IMPALA-10702: Add warning logs for slow or large catalogd 
response
..

IMPALA-10702: Add warning logs for slow or large catalogd response

It'd be helpful to log the slow or large responses of catalogd when
debugging scalability issues. This patch adds these warning logs in
JniCatalog, where we serialize thrift responses. See some example
outputs in the jira description.

Responses that are larger than 50MB or take more than 60s to finish
will be logged with the request. Flags are added for these two
thresholds in case users find the warnings too verbose and want to
increase the thresholds.
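
The threshold check described above can be sketched as follows. This is an illustrative Python sketch, not the Java code in JniCatalog. The function name, parameter names and message format are hypothetical, and 50MB is assumed here to mean 50 * 1024 * 1024 bytes.

```python
# Assumed defaults: 50MB read as 50 * 1024 * 1024 bytes, and 60 seconds.
WARN_SIZE_BYTES = 50 * 1024 * 1024
WARN_DURATION_S = 60.0


def response_warning(method, request_desc, response_size_bytes, duration_s,
                     size_threshold=WARN_SIZE_BYTES,
                     duration_threshold=WARN_DURATION_S):
    """Return a warning string for a slow or large response, else None."""
    reasons = []
    if response_size_bytes > size_threshold:
        reasons.append("size %d bytes exceeds %d"
                       % (response_size_bytes, size_threshold))
    if duration_s > duration_threshold:
        reasons.append("duration %.1fs exceeds %.1fs"
                       % (duration_s, duration_threshold))
    if not reasons:
        return None
    # The request is included so the warning can be tied back to its cause.
    return "%s: %s. Request: %s" % (method, "; ".join(reasons), request_desc)
```

Passing larger values for size_threshold or duration_threshold corresponds to raising the two flags when the warnings are too verbose.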

Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
---
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/common/JniUtil.java
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
5 files changed, 129 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/17427/2
--
To view, visit http://gerrit.cloudera.org:8080/17427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icffcfcaad2a718aebf79e2331efb05ca7a9a7671
Gerrit-Change-Number: 17427
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10704: Fix retried query id not being unregistered when retry fails

2021-05-17 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17465 )


Change subject: IMPALA-10704: Fix retried query id not being unregistered when 
retry fails
..

IMPALA-10704: Fix retried query id not being unregistered when retry fails

When query retry fails in RetryQueryFromThread(), the retried query id
may not be unregistered if the failure happens before we store the
retry_request_state. In this case, QueryDriver::Unregister() has no way
to get the retried query id so it's not deleted. Note that the retried
query id is registered in RetryQueryFromThread() so should be deleted
later. This finally results in a leak in the query driver map, where
queries in it are shown as in-flight queries.

test_retry_query_result_cacheing_failed and
test_retry_query_set_query_in_flight_failed (added in IMPALA-10413)
assert one in-flight query at the end. This is satisfied by the leak.
Instead, we should verify that no queries are running at the end.

This patch adds a new field in QueryDriver to remember the registered
retry query id as a backup way for getting it when query retry fails
before we store the ClientRequestState of the retried query (so
retried_client_request_state_ is null).
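
The idea of the backup field can be sketched with a toy model. This Python sketch is not the C++ QueryDriver. The class, its methods and the set-based registry are hypothetical stand-ins that only show why remembering the registered retry query id avoids the leak.

```python
class QueryDriverSketch:
    """Toy model of a driver that registers query ids in a shared set."""

    def __init__(self, query_id, registry):
        self.query_id = query_id
        self.registry = registry
        # Stands in for retried_client_request_state_ (None until stored).
        self.retried_state = None
        # Backup field: remembers the retry id as soon as it is registered.
        self.registered_retry_query_id = None
        registry.add(query_id)

    def retry(self, retry_id, fail_before_store=False):
        """Register the retry id; optionally fail before storing its state."""
        self.registry.add(retry_id)
        self.registered_retry_query_id = retry_id
        if fail_before_store:
            return False  # failure happens before the state is stored
        self.retried_state = {"query_id": retry_id}
        return True

    def unregister(self):
        """Remove the original id and, if any, the retried id."""
        self.registry.discard(self.query_id)
        if self.retried_state is not None:
            retry_id = self.retried_state["query_id"]
        else:
            # Fall back to the backup field when the state was never stored.
            retry_id = self.registered_retry_query_id
        if retry_id is not None:
            self.registry.discard(retry_id)
```

Without the fallback branch, a retry that fails before the state is stored would leave its id in the registry, which matches the leaked in-flight query described above.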

Tests:
 - Run test_retry_query_result_cacheing_failed and
   test_retry_query_set_query_in_flight_failed 100 times.

Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
---
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M tests/custom_cluster/test_query_retries.py
3 files changed, 36 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/17465/1
--
To view, visit http://gerrit.cloudera.org:8080/17465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I074526799d68041a425b2379e74f8d8b45ce892a
Gerrit-Change-Number: 17465
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8726/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 2
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 May 2021 10:14:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8725/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 1
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 May 2021 10:07:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-17 Thread Amogh Margoor (Code Review)
Amogh Margoor has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/17464 )

Change subject: IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 
digest.
..

IMPALA-10679: Add builtin functions to compute SHA-1 and SHA-2 digest.

Built-in functions to compute the SHA-1 digest and the SHA-2 family of
digests have been added. Support for SHA2 digests includes SHA224,
SHA256, SHA384 and SHA512. In FIPS mode SHA1, SHA224 and SHA256 have
been disabled and will return NULL. SHA2 functions will also return
NULL for an unsupported bit length.
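
The behaviour described above can be sketched with Python's hashlib. This is only an illustration of the described semantics, not the implementation in utility-functions-ir.cc. The function names, the fips_mode parameter and the hex-string return format are assumptions, and None stands in for SQL NULL.

```python
import hashlib

# Bit lengths named in the commit message, mapped to hashlib constructors.
_SHA2_BY_BITS = {
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512,
}
# Per the commit message, SHA224 and SHA256 are disabled in FIPS mode.
_SHA2_DISABLED_IN_FIPS = {224, 256}


def sha1_digest(data, fips_mode=False):
    """Hex SHA-1 digest of bytes, or None for NULL input or FIPS mode."""
    if data is None or fips_mode:
        return None
    return hashlib.sha1(data).hexdigest()


def sha2_digest(data, bit_len, fips_mode=False):
    """Hex SHA-2 digest of bytes for the given bit length, or None."""
    if data is None:
        return None
    constructor = _SHA2_BY_BITS.get(bit_len)
    if constructor is None:
        return None  # unsupported bit length
    if fips_mode and bit_len in _SHA2_DISABLED_IN_FIPS:
        return None  # disabled in FIPS mode
    return constructor(data).hexdigest()
```

For example, sha2_digest(b"abc", 100) returns None because 100 is not a supported bit length, while sha2_digest(b"abc", 512, fips_mode=True) still returns a digest.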

Testing:
1. Added Unit test for expressions.
2. Added end-to-end test for new functions.

Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
---
M be/src/exprs/expr-test.cc
M be/src/exprs/utility-functions-ir.cc
M be/src/exprs/utility-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
5 files changed, 142 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17464/2
--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 2
Gerrit-Owner: Amogh Margoor 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10679 Add builtin functions to compute SHA-1 and SHA-2 digest.

2021-05-17 Thread Amogh Margoor (Code Review)
Amogh Margoor has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17464 )


Change subject: IMPALA-10679 Add builtin functions to compute SHA-1 and SHA-2 
digest.
..

IMPALA-10679 Add builtin functions to compute SHA-1 and SHA-2 digest.

Built-in functions to compute the SHA-1 digest and the SHA-2 family of
digests have been added. Support for SHA2 digests includes SHA224,
SHA256, SHA384 and SHA512. In FIPS mode SHA1, SHA224 and SHA256 have
been disabled and will return NULL. SHA2 functions will also return
NULL for an unsupported bit length.

Testing:
1. Added Unit test for expressions.
2. Added end-to-end test for new functions.

Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
---
M be/src/exprs/expr-test.cc
M be/src/exprs/utility-functions-ir.cc
M be/src/exprs/utility-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
5 files changed, 142 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/17464/1
--
To view, visit http://gerrit.cloudera.org:8080/17464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Gerrit-Change-Number: 17464
Gerrit-PatchSet: 1
Gerrit-Owner: Amogh Margoor 


[Impala-ASF-CR] IMPALA-10688: Implement ds_cpc_stringify function

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17373 )

Change subject: IMPALA-10688: Implement ds_cpc_stringify function
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7147/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d089bfada6bebd078d8f388d2e146c79e5285
Gerrit-Change-Number: 17373
Gerrit-PatchSet: 3
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 May 2021 07:56:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10688: Implement ds_cpc_stringify function

2021-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17373 )

Change subject: IMPALA-10688: Implement ds_cpc_stringify function
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d089bfada6bebd078d8f388d2e146c79e5285
Gerrit-Change-Number: 17373
Gerrit-PatchSet: 3
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 May 2021 07:56:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10688: Implement ds_cpc_stringify function

2021-05-17 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17373 )

Change subject: IMPALA-10688: Implement ds_cpc_stringify function
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9d089bfada6bebd078d8f388d2e146c79e5285
Gerrit-Change-Number: 17373
Gerrit-PatchSet: 2
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 17 May 2021 07:55:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] add OSS/JindoFS support, impala with this change will create a dedicated thread pool for this kind of target. By default 16 threads would be created

2021-05-17 Thread Yong Yang (Code Review)
Yong Yang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17455 )

Change subject: add OSS/JindoFS support, impala with this change will create a 
dedicated thread pool for this kind of target. By default 16 threads would be 
created
..


Patch Set 2:

this test passed: https://jenkins.impala.io/job/pre-review-test/942/


--
To view, visit http://gerrit.cloudera.org:8080/17455
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4643105628f3860e3145c85d9ed205fe20291add
Gerrit-Change-Number: 17455
Gerrit-PatchSet: 2
Gerrit-Owner: Yong Yang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yong Yang 
Gerrit-Comment-Date: Mon, 17 May 2021 07:45:10 +
Gerrit-HasComments: No