[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-06 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#18). ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..

IMPALA-9741: Support querying Iceberg table by impala

This patch adds support for querying Iceberg tables through Impala.
An external Iceberg table can be created with the following SQL:
CREATE EXTERNAL TABLE default.iceberg_test (
level string,
event_time timestamp,
message string
)
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
Or specify only the table name and location, like this:
CREATE EXTERNAL TABLE default.iceberg_test
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
'iceberg_file_format' is the file format used by the Iceberg table.
Currently only PARQUET is supported; other formats will be supported
in the future. If this property is not specified in the SQL, the
file format defaults to PARQUET.

This is achieved by treating the Iceberg table as a normal
unpartitioned HDFS table. When querying an Iceberg table, we push
partition column predicates down to Iceberg to decide which data files
need to be scanned, and then pass this information to the BE to
do the actual scan.
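
As a rough illustration of the pruning idea described above, the toy
model below filters a list of data files by a predicate on their
partition values and returns only the paths that still need to be
scanned. It is a sketch only: DataFile, plan_files and the sample paths
are hypothetical names and do not correspond to the Iceberg or Impala
APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DataFile:
    """Hypothetical stand-in for a data file tracked in table metadata."""
    path: str
    partition: Dict[str, str]


def plan_files(files: List[DataFile],
               predicate: Callable[[Dict[str, str]], bool]) -> List[str]:
    """Return the paths of files whose partition values satisfy the predicate.

    Files that cannot match are dropped here, so the scan never opens them.
    """
    return [f.path for f in files if predicate(f.partition)]


files = [
    DataFile("data/a.parquet", {"level": "ERROR"}),
    DataFile("data/b.parquet", {"level": "INFO"}),
    DataFile("data/c.parquet", {"level": "ERROR"}),
]

# Only the files whose partition value matches are handed to the scan.
to_scan = plan_files(files, lambda part: part["level"] == "ERROR")
```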

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in test_scanners.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
---
M be/src/runtime/descriptors.cc
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M testdata/data/README
A testdata/data/iceberg_test/iceberg_non_partitioned/data/1-100-e1a80ed6-1064-494d-9cdd-c4a30c1ab8dc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/3-102-511427f2-85f0-43ae-9b39-a456f8dc57b6-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-103-00fc55e1-6ef7-4241-ace2-6d075b9737fc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-105-ef9e76d5-c060-4040-8aa1-b7c275610daa-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-106-c09c9c8d-9478-44f9-8501-f85f53112bc3-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-108-3b4f06ac-dca3-4f4e-be60-bf42d9927b5b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-110-1e653ccf-0963-4fb0-941c-32c9de13268b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-111-dfa70658-eb4b-4fa0-9ffa-b892cf90d6ac-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-113-2d16e751-e2a4-4856-ab89-145996e3815e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-114-0f710621-cbbf-4509-a93d-b58808978e2e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-116-0b666c79-53df-4507-906c-542e65a83443-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-118-1bc6bc6e-e061-4da3-9d1e-a427a306c471-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-119-ae7b2c67-1538-4429-8246-4998960e3817-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00022-121-8db0f1e1-d88c-4aad-a8b3-24fd07329cdb-0.parquet
A 

[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Aug 2020 05:27:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService

2020-08-06 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16291 )

Change subject: WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService
..


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/runtime/query-exec-mgr.cc@76
PS1, Line 76: << TNetworkAddressToString(MakeNetworkAddress(
:   query_ctx.coord_hostname, query_ctx.coord_krpc_address.port));
> This could be shortened to "<< query_ctx.coord_hostname << ":" << query_ctx
Fixed it


http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/service/control-service.cc
File be/src/service/control-service.cc:

http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/service/control-service.cc@155
PS1, Line 155:  << TNetworkAddressToString(MakeNetworkAddress(
> Same comment about shortening
fixed it


http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/testutil/in-process-servers.cc
File be/src/testutil/in-process-servers.cc:

http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/testutil/in-process-servers.cc@47
PS1, Line 47: // Thrift server ctor allows port to be set to 0. Not supported with KRPC.
:   // So KRPC port must be explicitly set here.
> This comment is a bit weird now, since it was written in contrast to the li
Fixed it as suggested


http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/testutil/in-process-servers.cc@86
PS1, Line 86:   RETURN_IF_ERROR(WaitForServer(FLAGS_hostname, krpc_port_, 10, 100));
> Does this work? WaitForServer() is expecting to connect to a Thrift server
Changed to hs2_http_port_ to be safe.


http://gerrit.cloudera.org:8080/#/c/16291/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/16291/1/common/thrift/ImpalaInternalService.thrift@522
PS1, Line 522:   7: optional Types.TNetworkAddress coord_krpc_address
> Could we rename this coord_ip_address to make it clearer?
Yes, renamed it as suggested.


http://gerrit.cloudera.org:8080/#/c/16291/1/tests/custom_cluster/test_blacklist.py
File tests/custom_cluster/test_blacklist.py:

http://gerrit.cloudera.org:8080/#/c/16291/1/tests/custom_cluster/test_blacklist.py@67
PS1, Line 67: e
> flake8: E501 line too long (91 > 90 characters)
fixed it


http://gerrit.cloudera.org:8080/#/c/16291/1/tests/custom_cluster/test_blacklist.py@114
PS1, Line 114: e
> flake8: E501 line too long (91 > 90 characters)
fixed it.



--
To view, visit http://gerrit.cloudera.org:8080/16291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5fa83c8009590124dded4783f77ef70fa30119e6
Gerrit-Change-Number: 16291
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 07 Aug 2020 05:01:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10044: Fix cleanup for bootstrap_toolchain.py failure case

2020-08-06 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16294 )

Change subject: IMPALA-10044: Fix cleanup for bootstrap_toolchain.py failure 
case
..

IMPALA-10044: Fix cleanup for bootstrap_toolchain.py failure case

If DownloadUnpackTarball::download()'s wget_and_unpack_package call
hits an exception, the exception handler cleans up any created
directories. Currently, it erroneously cleans up the directory where
the tarballs are downloaded even when it is not a temporary directory.
This would delete the entire toolchain.

This fixes the cleanup to only delete that directory if it is a
temporary directory.
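
The intended behavior can be sketched roughly as follows. This is not
the code from bin/bootstrap_toolchain.py: download_with_cleanup and its
fetch parameter are hypothetical names used only to illustrate "delete
the download directory on failure only if we created it as a temporary
directory".

```python
import os
import shutil
import tempfile


def download_with_cleanup(fetch, download_dir=None):
    """Run fetch(download_dir), cleaning up on failure only if the dir is ours.

    If the caller passes download_dir, it is treated as persistent (for
    example, the toolchain directory) and is never deleted here.
    """
    is_temporary = download_dir is None
    if is_temporary:
        download_dir = tempfile.mkdtemp()
    try:
        fetch(download_dir)
    except Exception:
        # Only remove directories this function created itself.
        if is_temporary:
            shutil.rmtree(download_dir, ignore_errors=True)
        raise
    return download_dir
```

With this shape, a failure while downloading into a caller-supplied
directory propagates the exception but leaves that directory untouched.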

Testing:
 - Simulated exception from wget_and_unpack_package and verified
   behavior.

Change-Id: Ia57f56b6717635af94247fce50b955c07a57d113
Reviewed-on: http://gerrit.cloudera.org:8080/16294
Reviewed-by: Laszlo Gaal 
Tested-by: Impala Public Jenkins 
---
M bin/bootstrap_toolchain.py
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Laszlo Gaal: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia57f56b6717635af94247fce50b955c07a57d113
Gerrit-Change-Number: 16294
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-10044: Fix cleanup for bootstrap_toolchain.py failure case

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16294 )

Change subject: IMPALA-10044: Fix cleanup for bootstrap_toolchain.py failure 
case
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia57f56b6717635af94247fce50b955c07a57d113
Gerrit-Change-Number: 16294
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Fri, 07 Aug 2020 03:49:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16296 )

Change subject: IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()
..

IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()

MonoTime is a utility Impala imports from Kudu. The behavior of
MonoTime::GetDeltaSince() was accidentally flipped in
https://gerrit.cloudera.org/#/c/14932/ so we're getting negative
durations where we expect positive durations.

The function is deprecated anyways, so this patch removes all uses of
it and replaces them with the MonoTime '-' operator.
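
To illustrate why an explicit subtraction is harder to get backwards
than a "delta since" helper, here is a small Python analogue. It is not
the Kudu MonoTime API: MonoTimeSketch is a hypothetical class, and the
assert plays the role that a DCHECK would play in the C++ code.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class MonoTimeSketch:
    """Hypothetical monotonic timestamp in nanoseconds."""
    nanos: int

    @staticmethod
    def now() -> "MonoTimeSketch":
        return MonoTimeSketch(time.monotonic_ns())

    def __sub__(self, other: "MonoTimeSketch") -> int:
        # later - earlier yields a non-negative duration in nanoseconds.
        return self.nanos - other.nanos


start = MonoTimeSketch.now()
end = MonoTimeSketch.now()
elapsed_ns = end - start

# Guard against an accidentally flipped subtraction.
assert elapsed_ns >= 0
```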

Testing:
- Manually ran with and without patch and inspected calculated values.
- Added DCHECKs to prevent such an issue from occurring again.

Change-Id: If8cd3eb51a4fd101bbe4b9c44ea9be6ea2ea0d06
Reviewed-on: http://gerrit.cloudera.org:8080/16296
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/service/data-stream-service.cc
3 files changed, 7 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If8cd3eb51a4fd101bbe4b9c44ea9be6ea2ea0d06
Gerrit-Change-Number: 16296
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16296 )

Change subject: IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8cd3eb51a4fd101bbe4b9c44ea9be6ea2ea0d06
Gerrit-Change-Number: 16296
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Fri, 07 Aug 2020 03:47:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16263 )

Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and 
libstdc++ binaries
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6820/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
Gerrit-Change-Number: 16263
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Aug 2020 03:11:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries

2020-08-06 Thread Sahil Takiar (Code Review)
Hello Qifan Chen, Tim Armstrong, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16263

to look at the new patch set (#5).

Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and 
libstdc++ binaries
..

IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries

Strip debug symbols from libkudu_client.so and libstdc++.so. The same
technique used to strip debug symbols from impalad binaries is used.

This decreases the Docker image sizes by about 100 MB.
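
A minimal sketch of such a strip step is shown below. The STRIP value
and the strip_command helper are assumptions made for illustration; only
the "--strip-debug ... -o" invocation shape and the
strip_debug_symbols(src_file, dst_file) signature appear in the review
thread, and the real body in docker/setup_build_context.py may differ.

```python
from subprocess import check_call

# Assumption: the binutils strip tool is on PATH. The real script may
# resolve it from the toolchain instead.
STRIP = "strip"


def strip_command(src_file, dst_file):
    """Build the argv that writes a debug-stripped copy of src_file to dst_file."""
    return [STRIP, "--strip-debug", src_file, "-o", dst_file]


def strip_debug_symbols(src_file, dst_file):
    """Strip debug symbols from src_file, leaving src_file itself unmodified."""
    check_call(strip_command(src_file, dst_file))
```

Keeping the command construction separate from the call makes the
invocation easy to check without running strip.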

Test:
* Ran Dockerized tests

Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
---
M docker/setup_build_context.py
1 file changed, 28 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/16263/5
--
To view, visit http://gerrit.cloudera.org:8080/16263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
Gerrit-Change-Number: 16263
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries

2020-08-06 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16263 )

Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and 
libstdc++ binaries
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16263/4/docker/setup_build_context.py
File docker/setup_build_context.py:

http://gerrit.cloudera.org:8080/#/c/16263/4/docker/setup_build_context.py@71
PS4, Line 71: def strip_debug_symbols(src_file, dst_file):
> flake8: E302 expected 2 blank lines, found 1
Done



--
To view, visit http://gerrit.cloudera.org:8080/16263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
Gerrit-Change-Number: 16263
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Aug 2020 02:43:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16300 )

Change subject: IMPALA-9851: Truncate long error message.
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Aug 2020 01:00:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16300 )

Change subject: IMPALA-9851: Truncate long error message.
..

IMPALA-9851: Truncate long error message.

Error message length was unbounded and could grow to a couple of MB
in size. This patch truncates error messages to a maximum of 128kb in
size.

This patch also fixes a potentially long error message related to
BufferPool::Client::DebugString(). Before this patch, DebugString()
printed all pages in the 'pinned_pages_', 'dirty_unpinned_pages_', and
'in_flight_write_pages_' PageLists. With this patch, DebugString()
includes at most the first 100 pages of each PageList.
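
The two limits can be illustrated with the Python sketch below. It is
not the C++ implementation: the function names are hypothetical, and
reading "128kb" as 128 * 1024 bytes is an assumption.

```python
# Assumption: "128kb" is interpreted as 128 * 1024 bytes.
MAX_ERROR_MSG_BYTES = 128 * 1024
MAX_PAGES_PER_LIST = 100


def truncate_message(msg, limit=MAX_ERROR_MSG_BYTES):
    """Return msg cut to at most `limit` bytes of UTF-8."""
    data = msg.encode("utf-8")
    if len(data) <= limit:
        return msg
    # errors="ignore" drops a multi-byte character split by the cut.
    return data[:limit].decode("utf-8", errors="ignore")


def debug_page_list(name, pages, max_pages=MAX_PAGES_PER_LIST):
    """Describe a page list, printing at most the first max_pages entries."""
    lines = ["%s: %d pages" % (name, len(pages))]
    lines.extend("  %s" % (page,) for page in pages[:max_pages])
    if len(pages) > max_pages:
        lines.append("  ... %d more pages omitted" % (len(pages) - max_pages))
    return "\n".join(lines)
```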

Testing:
- Add be test BufferPoolTest.ShortDebugString
- Add test within ErrorMsg.GenericFormatting to test for truncation.
- Run and pass core tests.

Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Reviewed-on: http://gerrit.cloudera.org:8080/16300
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/internal-queue.h
8 files changed, 163 insertions(+), 62 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..


Patch Set 27:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/16220/27//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16220/27//COMMIT_MSG@77
PS27, Line 77: loggerd
nit: typo


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.h@448
PS27, Line 448:   /// memory tracked by query memory trackers. The top element in the queue is the
nit: by all children query mem trackers.


http://gerrit.cloudera.org:8080/#/c/16220/23/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/16220/23/be/src/runtime/mem-tracker.cc@481
PS23, Line 481: MemTracker* MemTracker::GetRootMemTracker() {
  :   MemTracker* ancestor = this;
  :   while (ancestor && ancestor->parent()) {
  :     ancestor = ancestor->parent();
  :   }
  :   return ancestor;
  : }
is this used anywhere?


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc@422
PS27, Line 422: UpdatePoolStatsForQueries
can you also add a test for this in mem-tracker-test


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc@458
PS27, Line 458: else {
Add a DCHECK(tracker->is_query_mem_tracker_) to make sure these stats are 
collected only for query mem trackers, since they don't make sense for trackers 
lower in the mem tracker hierarchy.

Also mention in the method comment for UpdatePoolStatsForQueries() that it 
should only be called for mem-trackers that are either query mem trackers or 
higher in the mem tracker hierarchy.


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller-test.cc
File be/src/scheduling/admission-controller-test.cc:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller-test.cc@972
PS27, Line 972: void hook_me(const char* hook)
  : {
  :   bool echo = false;
  :   FILE* fp = nullptr;
  :
  :   while (!(fp = fopen(hook, "r"))) {
  :     if ( !echo ) {
  :       std::cout << "gdb -pid " << getpid() << ", then touch " << hook << std::endl;
  :       echo = true;
  :     }
  :   }
  :   fclose(fp);
  : }
remove?


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h@735
PS27, Line 735: contains
nit: contain


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h@1055
PS27, Line 1055:   /// A helper type to glue information together to compute the topN queries
   :   /// out of topM queries.
   :   typedef std::tuple Item;
   :   const int64_t& getMemConsumed(const Item& item) const { return std::get<0>(item); }
   :   const string& getName(const Item& item) const { return std::get<1>(item); }
   :   const TUniqueId& getTUniqueId(const Item& item) const { return std::get<2>(item); }
   :   const TPoolStats* getTPoolStats(const Item& item) const { return std::get<3>(item); }
I guess I am a bit biased toward structs since I am used to using them 
in this codebase, but this seems good too.


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@272
PS27, Line 272: DebugPoolStatsForConsumedMemory
nit: AppendStatsForConsumedMemory


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@303
PS27, Line 303: // Return a debug string for memory consumption part of the pool stats.
  : string AdmissionController::PoolStats::DebugPoolStatsForConsumedMemory(
  :     const TPoolStats& stats) const {
  :   stringstream ss;
  :   DebugPoolStatsForConsumedMemory(ss, stats);
  :   return ss.str();
  : }
Since DebugPoolStats is the only method using this, we can probably get rid of 
it and pass the ss object from DebugPoolStats directly; as written, the string 
is generated twice (2 separate calls to ss.str()).


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@361
PS27, Line 361: DebugTopNQueriesForAllPoolsInHost
nit:  

[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6819/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Aug 2020 00:41:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16263 )

Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and 
libstdc++ binaries
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6818/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
Gerrit-Change-Number: 16263
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Aug 2020 00:34:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6244/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Aug 2020 00:15:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-06 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Fang-Yu Rao, David Rorke, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16280

to look at the new patch set (#4).

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..

IMPALA-10034: Add remaining TPC-DS queries to workload.

Add the remaining TPC-DS queries to the testdata workload definition.

Q8 and Q38 were using non-standard variants; those have been
replaced by the official query versions. Q35 is using an official
variant. Had to escape a table alias in Q90 as we treat 'AT' as a
reserved keyword.

Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
---
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q23-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q24-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q28.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q35.test
D testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38-rewrite.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q44.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q49.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q66.test
M testdata/workloads/tpcds/queries/tpcds-decimal_v2-q8.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q87.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q90.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q93.test
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
17 files changed, 1,248 insertions(+), 104 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/16280/4
--
To view, visit http://gerrit.cloudera.org:8080/16280
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries

2020-08-06 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16263 )

Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and 
libstdc++ binaries
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16263/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16263/3//COMMIT_MSG@9
PS3, Line 9: so
> Just wonder if some other .so files in toolchain are worth the stripping ef
Most of these aren't copied into the Docker images. I double-checked, and now 
we strip debug symbols from all .so files in the Docker images that have a 
non-trivial size.


http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py
File docker/setup_build_context.py:

http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py@87
PS3, Line 87: .py
> +1, if we're copying them into the container, it's a mistake
Done


http://gerrit.cloudera.org:8080/#/c/16263/3/docker/setup_build_context.py@91
PS3, Line 91:   check_call([STRIP, "--strip-debug", libstdcpp_so, "-o",
:   os.path.join(LIB_DIR, os.path.basename(libstdcpp_so))])
> Nit: I think it would be good to factor out this strip call into a function
Done



--
To view, visit http://gerrit.cloudera.org:8080/16263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
Gerrit-Change-Number: 16263
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Aug 2020 00:05:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu client and libstdc++ binaries

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16263 )

Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and 
libstdc++ binaries
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16263/4/docker/setup_build_context.py
File docker/setup_build_context.py:

http://gerrit.cloudera.org:8080/#/c/16263/4/docker/setup_build_context.py@71
PS4, Line 71: def strip_debug_symbols(src_file, dst_file):
flake8: E302 expected 2 blank lines, found 1



--
To view, visit http://gerrit.cloudera.org:8080/16263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
Gerrit-Change-Number: 16263
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 07 Aug 2020 00:05:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10029: Strip debug symbols from libkudu client and libstdc++ binaries

2020-08-06 Thread Sahil Takiar (Code Review)
Hello Qifan Chen, Tim Armstrong, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16263

to look at the new patch set (#4).

Change subject: IMPALA-10029: Strip debug symbols from libkudu_client and 
libstdc++ binaries
..

IMPALA-10029: Strip debug symbols from libkudu_client and libstdc++ binaries

Strip debug symbols from libkudu_client.so and libstdc++.so, using the
same technique that is used to strip debug symbols from impalad binaries.

This decreases the Docker image sizes by about 100 MB.
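
A minimal sketch of what a factored-out strip helper could look like, based
only on the check_call(...) line and the strip_debug_symbols(src_file,
dst_file) signature quoted in the review comments. The value of STRIP and the
strip_command helper are assumptions made for illustration; they are not taken
from docker/setup_build_context.py.

```python
import subprocess

# Assumption: the strip binary is found on PATH under this name.
STRIP = "strip"


def strip_command(src_file, dst_file, strip_bin=STRIP):
    """Build the argv that writes a debug-stripped copy of src_file to dst_file."""
    return [strip_bin, "--strip-debug", src_file, "-o", dst_file]


def strip_debug_symbols(src_file, dst_file, strip_bin=STRIP):
    """Run strip; raises CalledProcessError if the command fails."""
    subprocess.check_call(strip_command(src_file, dst_file, strip_bin))
```

Keeping the argv construction separate from the call makes the command easy to
check without a strip binary being installed.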

Test:
* Ran Dockerized tests

Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
---
M docker/setup_build_context.py
1 file changed, 27 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/16263/4
--
To view, visit http://gerrit.cloudera.org:8080/16263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61fdf47041bd96248ecb48ae57dde143de2da294
Gerrit-Change-Number: 16263
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9645 Port LLVM codegen to adapt aarch64

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15718 )

Change subject: IMPALA-9645 Port LLVM codegen to adapt aarch64
..


Patch Set 20: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3f30ee84ea9bf5245da88154632bb69079103d11
Gerrit-Change-Number: 15718
Gerrit-PatchSet: 20
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 23:33:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16252 )

Change subject: IMPALA-9988 (part 2): Integrate ldap filters and 
impala.doas.user
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6816/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070
Gerrit-Change-Number: 16252
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 23:13:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP - IMPALA-9979: part 2: partitioned top-n

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16242 )

Change subject: WIP - IMPALA-9979: part 2: partitioned top-n
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6817/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5
Gerrit-Change-Number: 16242
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 23:13:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP - IMPALA-9979: part 2: partitioned top-n

2020-08-06 Thread Tim Armstrong (Code Review)
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16242

to look at the new patch set (#13).

Change subject: WIP - IMPALA-9979: part 2: partitioned top-n
..

WIP - IMPALA-9979: part 2: partitioned top-n

Planner changes:
---
The planner now identifies predicates that can be converted into
limits in a partitioned or unpartitioned top-n with the following
method:
* Push down predicates that reference the analytic tuple into the inline
  view. These will be evaluated after the analytic plan for the inline
  SelectStmt is generated.
* Identify predicates that reference the analytic tuple and could
  be converted to limits.
* If they can be applied to the last sort group of the analytic
  plan, and the windows are all compatible, then the lowest
  limit gets converted into a limit in the top N.
* Otherwise generate a select node with the conjuncts. We add
  logic to merge SELECT nodes to avoid generating duplicates
  from inside and outside the inline view.

The optimization can be disabled by setting
ANALYTIC_RANK_PUSHDOWN_THRESHOLD=0. By default it is
only enabled for limits of 1000 or less, because the
in-memory Top-N may perform significantly worse than
a full sort for large heaps (since updating the heap
for every input row ends up being more expensive than
doing a traditional sort). We could probably optimize
this more with better tuning so that it can gracefully
fall back to doing the full sort at runtime.
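
The limit selection described above can be sketched roughly as follows. This
is an illustration only, not the planner code: the function names are
hypothetical, predicates are modelled as (operator, value) pairs on
rank()/row_number(), and the assumption that only "<" and "<=" are convertible
is mine. Trivially false predicates are not modelled.

```python
def limit_from_predicate(op, value):
    """Upper bound implied by `rank() <op> value`, or None if not convertible."""
    if op == "<=":
        return value
    if op == "<":
        return value - 1
    return None  # assumption: other operators are left as ordinary conjuncts


def choose_pushdown_limit(predicates, threshold=1000):
    """Pick the lowest implied limit, honouring the pushdown threshold.

    A threshold of 0 disables the optimization; limits above the
    threshold are not pushed down.
    """
    if threshold <= 0:
        return None
    limits = [limit_from_predicate(op, value) for op, value in predicates]
    limits = [limit for limit in limits if limit is not None]
    if not limits:
        return None
    lowest = min(limits)
    return lowest if lowest <= threshold else None
```

For example, choose_pushdown_limit([("<=", 10), ("<", 5)]) picks 4, while the
same call with threshold=0 returns None.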

rank() and row_number() are handled. rank() needs support in
the TopN node to include ties for the last place, which is
also added in this patch.
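
To illustrate why rank() needs tie handling, here is a small sketch of a top-n
that keeps every row tied with the last place. It sorts the whole input for
clarity, which is not how the backend TopN node works; the function name is
hypothetical.

```python
def top_n_with_ties(rows, n, key=lambda row: row):
    """Return the rows whose rank() over ascending `key` is <= n."""
    if n <= 0:
        return []
    ordered = sorted(rows, key=key)
    if len(ordered) <= n:
        return ordered
    # Every row that compares equal to the n-th row shares its rank,
    # so it has to be returned as well.
    cutoff = key(ordered[n - 1])
    return [row for row in ordered if key(row) <= cutoff]
```

With input [5, 1, 3, 3, 2] and n=3, both rows with value 3 are returned, so the
output has four rows rather than three.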

If predicates are trivially false, we generate empty nodes.

The logic to choose between TopNNode and SortNode based
on TOPN_BYTES_LIMIT is moved from SingleNodePlanner to
SortNode so it can be reused.

Backend changes:
---
The top-n node in the backend is augmented to handle both
the partitioning (for which we use a std::map and a
comparator based on the partition exprs) and the tie-handling
semantics required by rank() predicates. The partitioned
top-n node has a soft limit of 64MB on the size of the
in-memory heaps and can spill with use of an embedded Sorter.
The current implementation tries to evict heaps that are
less effective at filtering rows.
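
A rough sketch of the per-partition heap idea, under simplifying assumptions:
a Python dict stands in for the std::map keyed on partition exprs, order values
are plain numbers, and the 64MB soft limit, heap eviction, spilling and tie
handling are all left out. The function name is hypothetical.

```python
import heapq
from collections import defaultdict


def partitioned_top_n(rows, n):
    """Keep the n smallest order values for each partition.

    `rows` is an iterable of (partition_key, order_value) pairs.
    """
    if n <= 0:
        return {}
    heaps = defaultdict(list)
    for part, value in rows:
        heap = heaps[part]
        if len(heap) < n:
            # Values are negated so heap[0] is the largest value kept so far.
            heapq.heappush(heap, -value)
        elif value < -heap[0]:
            # The new row beats the current worst row in this partition.
            heapq.heapreplace(heap, -value)
    return {part: sorted(-v for v in heap) for part, heap in heaps.items()}
```

Each input row costs one heap update at most, which is the per-row overhead the
planner section weighs against a full sort for large limits.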

We currently use the partitioned top-n node to implement
rank() pushdown in all cases because of the tie-handling
support. We also cannot use the merging exchange for
rank() because the limit does not handle ties in the same way,
so we need to generate an unordered exchange with a partitioned
top-n node on top of the exchange.

Limitations:
---
There are several possible extensions to this that we did not do:
* dense_rank() is not supported because it would require additional
  backend support - IMPALA-10014.
* Only one predicate per analytic is pushed.
* Redundant rank()/row_number() predicates are not merged,
  only the lowest is chosen.
* Lower bounds are not converted into OFFSET.
* The analytic operator cannot be eliminated even if the analytic
  expression was only used in the predicate.
* This doesn't push predicates into UNION - IMPALA-10013
* Always false predicates don't result in empty plan - IMPALA-10015

Tests:
-
* Planner tests - added tests that exercise the interesting code
  paths added in planning.
  - Predicate ordering in SELECT nodes changed in a couple of cases
because some predicates were pushed into the inline views.
* Modified SORT targeted perf tests to avoid conversion to Top-N
* Added targeted perf test for partitioned top-n.
* End-to-end tests
 - Unpartitioned Top-N end-to-end tests
 - Basic partitioning and duplicate handling tests on functional
 - Similar basic tests on larger inputs from TPC-DS and with
   larger partition counts.
 - I inspected the results and also ran the same tests with
   analytic_rank_pushdown_threshold=0 to confirm that the
   results were the same as with the full sort.
 - Fallback to spilling sort.

* TODO
* e2e tests with smaller batch size to catch boundary conditions
* more eviction/spilling tests.

Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/exec-node.cc
M be/src/exec/topn-node-ir.cc
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/exprs/slot-ref.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/tuple-row-compare.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M 

[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user

2020-08-06 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16252 )

Change subject: IMPALA-9988 (part 2): Integrate ldap filters and 
impala.doas.user
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16252/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16252/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user
> OK by me
Done



--
To view, visit http://gerrit.cloudera.org:8080/16252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070
Gerrit-Change-Number: 16252
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 22:40:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user

2020-08-06 Thread Thomas Tauber-Marshall (Code Review)
Hello Tamas Mate, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16252

to look at the new patch set (#4).

Change subject: IMPALA-9988 (part 2): Integrate ldap filters and 
impala.doas.user
..

IMPALA-9988 (part 2): Integrate ldap filters and impala.doas.user

This patch fixes the integration between LDAP filters and proxy
users by ensuring that the 'impala.doas.user' HS2 config option is
considered when applying filters. This requires deferring checking the
filters until the OpenSession() call.

This patch also introduces new flags --ldap_bind_dn and
--ldap_bind_password_cmd which must be specified in order to use LDAP
filters, unless the LDAP server is set up to allow anonymous binds.

It also uses some gflag utilities from Kudu to tag
--ldap_bind_password_cmd as sensitive and redact it on the webui and
in logging in order to increase security in case a user specifies it
as 'echo '
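
The redaction idea can be illustrated with a small sketch. It does not use the
Kudu gflag utilities the patch refers to; the set of sensitive flags, the
placeholder text and the function name are assumptions for illustration.

```python
# Assumption: flags tagged as sensitive are tracked by name.
SENSITIVE_FLAGS = {"ldap_bind_password_cmd"}
REDACTED = "<redacted>"


def render_flags(flags, sensitive=SENSITIVE_FLAGS):
    """Format flags for display, hiding the values of sensitive ones."""
    return [
        "--%s=%s" % (name, REDACTED if name in sensitive else value)
        for name, value in sorted(flags.items())
    ]
```

Anything that prints flags (a web page or a log line) would go through such a
function, so the value of a sensitive flag never reaches the output.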

These config options are modeled after equivalent options in Hue:
https://github.com/cloudera/hue/blob/master/desktop/conf.dist/hue.ini#L425

Testing:
- Added a test that uses the 'impala.doas.user' config with LDAP
  filters.

Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070
---
M be/src/common/logging.cc
M be/src/rpc/authentication.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/default-path-handlers.cc
M be/src/util/ldap-util.cc
M be/src/util/ldap-util.h
M be/src/util/webserver.cc
M fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java
M fe/src/test/java/org/apache/impala/customcluster/LdapImpalaShellTest.java
M fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java
M fe/src/test/java/org/apache/impala/testutil/LdapUtil.java
13 files changed, 235 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/16252/4
--
To view, visit http://gerrit.cloudera.org:8080/16252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ca8e1a0466288225efbe05b2d0068b8241df070
Gerrit-Change-Number: 16252
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10044: Fix cleanup for bootstrap toolchain.py failure case

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16294 )

Change subject: IMPALA-10044: Fix cleanup for bootstrap_toolchain.py failure 
case
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6243/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia57f56b6717635af94247fce50b955c07a57d113
Gerrit-Change-Number: 16294
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Thu, 06 Aug 2020 22:37:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16296 )

Change subject: IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8cd3eb51a4fd101bbe4b9c44ea9be6ea2ea0d06
Gerrit-Change-Number: 16296
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 06 Aug 2020 22:32:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16296 )

Change subject: IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6242/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8cd3eb51a4fd101bbe4b9c44ea9be6ea2ea0d06
Gerrit-Change-Number: 16296
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 06 Aug 2020 22:32:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10044: Fix cleanup for bootstrap toolchain.py failure case

2020-08-06 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16294 )

Change subject: IMPALA-10044: Fix cleanup for bootstrap_toolchain.py failure 
case
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia57f56b6717635af94247fce50b955c07a57d113
Gerrit-Change-Number: 16294
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Thu, 06 Aug 2020 22:31:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10039: Fixed Expr-test crash caused by thread unsafe function

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16299 )

Change subject: IMPALA-10039: Fixed Expr-test crash caused by thread unsafe 
function
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6815/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85245bf4bffb469913d53741847e67773b7d4627
Gerrit-Change-Number: 16299
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 06 Aug 2020 20:52:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10054: Fix flakiness in test multiple sort run bytes limits

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16301 )

Change subject: IMPALA-10054: Fix flakiness in 
test_multiple_sort_run_bytes_limits
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6814/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84a8b579c943cddba4432cf183f7f002ef8ec6ad
Gerrit-Change-Number: 16301
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Thu, 06 Aug 2020 20:41:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10039: Fixed Expr-test crash caused by thread unsafe function

2020-08-06 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/16299 )

Change subject: IMPALA-10039: Fixed Expr-test crash caused by thread unsafe 
function
..

IMPALA-10039: Fixed Expr-test crash caused by thread unsafe function

A recent patch for IMPALA-5746 registers a callback function for
cluster membership updates. The callback function cancels the
queries scheduled by failed coordinators. This callback function
was called during Expr-test and caused a crash, since QueryState::Cancel()
was called before the thread-unsafe function QueryState::Init() had
completed.
This patch makes QueryState::Cancel() wait until QueryState::Init()
has completed, checks whether the process is running for tests, and only
registers the callback function if it is not running for BE/FE tests.
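
The "Cancel() waits for Init()" ordering can be sketched with a
threading.Event. This is a Python illustration of the synchronization idea
only, not the C++ QueryState code; the class and function names are
hypothetical.

```python
import threading


class QueryStateSketch:
    """Toy stand-in showing cancel() blocking until init() has finished."""

    def __init__(self):
        self._init_done = threading.Event()
        self.events = []

    def init(self):
        self.events.append("init")
        self._init_done.set()  # signal that initialization is complete

    def cancel(self):
        self._init_done.wait()  # block until init() has completed
        self.events.append("cancel")


def run_cancel_before_init():
    """Start cancel() on another thread before init() runs."""
    state = QueryStateSketch()
    canceller = threading.Thread(target=state.cancel)
    canceller.start()
    state.init()
    canceller.join()
    return state.events
```

Even though cancel() is started first, the recorded order is always init and
then cancel, because cancel() cannot proceed until the event is set.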

Testing:
 - The issue could be reproduced by running expr-test for 10-20
   iterations. Verified the fix by running expr-test for over 1000
   iterations without a crash.
 - Passed TestProcessFailures::test_kill_coordinator.
 - Passed core tests.

Change-Id: I85245bf4bffb469913d53741847e67773b7d4627
---
M be/src/runtime/exec-env.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
3 files changed, 23 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16299/3
--
To view, visit http://gerrit.cloudera.org:8080/16299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I85245bf4bffb469913d53741847e67773b7d4627
Gerrit-Change-Number: 16299
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-10054: Fix flakiness in test multiple sort run bytes limits

2020-08-06 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16301 )


Change subject: IMPALA-10054: Fix flakiness in 
test_multiple_sort_run_bytes_limits
..

IMPALA-10054: Fix flakiness in test_multiple_sort_run_bytes_limits

test_multiple_sort_run_bytes_limits seems to have become flaky in
ubuntu-16.04-dockerised-tests. The flakiness may come from a change in
the accuracy of query estimates, or the mem_limit specified in the test
may no longer fit. This patch tunes the parameters of the first and
second test cases of test_multiple_sort_run_bytes_limits so that the
assertion passes. The assertion is also changed slightly to make
debugging easier if the test regresses again in the future.

Testing:
- Ran and passed test_sort.py

Change-Id: I84a8b579c943cddba4432cf183f7f002ef8ec6ad
---
M tests/query_test/test_sort.py
1 file changed, 6 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/16301/1
--
To view, visit http://gerrit.cloudera.org:8080/16301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I84a8b579c943cddba4432cf183f7f002ef8ec6ad
Gerrit-Change-Number: 16301
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()

2020-08-06 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16296 )

Change subject: IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()
..


Patch Set 1: Code-Review+2

Nice catch, LGTM


--
To view, visit http://gerrit.cloudera.org:8080/16296
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8cd3eb51a4fd101bbe4b9c44ea9be6ea2ea0d06
Gerrit-Change-Number: 16296
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 06 Aug 2020 20:14:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-06 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16278 )

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..

IMPALA-10005: Fix Snappy decompression for non-block filesystems

Snappy-compressed text always uses THdfsCompression::SNAPPY_BLOCKED
type compression in the backend. However, for non-block filesystems,
the frontend is incorrectly passing THdfsCompression::SNAPPY instead.
On debug builds, this leads to a DCHECK when trying to read
Snappy-compressed text. On release builds, it fails to decompress
the data.

This fixes the frontend to always pass THdfsCompression::SNAPPY_BLOCKED
for Snappy-compressed text.

This reworks query_test/test_compressed_formats.py to provide better
coverage:
 - Changed the RC and Seq test cases to verify that the file extension
   doesn't matter. Added Avro to this case as well.
 - Fixed the text case to use appropriate extensions (fixing IMPALA-9004)
 - Changed the utility function so it doesn't use Hive. This allows it
   to be enabled on non-HDFS filesystems like S3.
 - Changed the test to use unique_database and allow parallel execution.
 - Changed the test to run in the core job, so it now has coverage on
   the usual S3 test configuration. It is reasonably quick (1-2 minutes)
   and runs in parallel.

Testing:
 - Exhaustive job
 - Core s3 job
 - Changed the frontend to force it to use the code for non-block
   filesystems (i.e. the TFileSplitGeneratorSpec code) and
   verified that it is now able to read Snappy-compressed text.

Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Reviewed-on: http://gerrit.cloudera.org:8080/16278
Tested-by: Impala Public Jenkins 
Reviewed-by: Sahil Takiar 
---
M fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java
M tests/query_test/test_compressed_formats.py
2 files changed, 132 insertions(+), 84 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Sahil Takiar: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/16278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-06 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16278 )

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..


Patch Set 2:

Thanks for the review!


--
To view, visit http://gerrit.cloudera.org:8080/16278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 06 Aug 2020 20:12:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-06 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16278 )

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..


Patch Set 2: Code-Review+2

Thanks for the explanations. LGTM.


--
To view, visit http://gerrit.cloudera.org:8080/16278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 06 Aug 2020 20:11:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService

2020-08-06 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16291 )

Change subject: WIP IMPALA-9180 (part 1): Remove legacy ImpalaInternalService
..


Patch Set 1:

(5 comments)

Just a few small comments, I'll take a deeper look when it's not a WIP anymore. 
Looks pretty good, though.

http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/runtime/query-exec-mgr.cc@76
PS1, Line 76: << TNetworkAddressToString(MakeNetworkAddress(
:  query_ctx.coord_hostname, query_ctx.coord_krpc_address.port));
This could be shortened to "<< query_ctx.coord_hostname << ":" << 
query_ctx.coord_krpc_address.port"


http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/service/control-service.cc
File be/src/service/control-service.cc:

http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/service/control-service.cc@155
PS1, Line 155:  << TNetworkAddressToString(MakeNetworkAddress(
Same comment about shortening


http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/testutil/in-process-servers.cc
File be/src/testutil/in-process-servers.cc:

http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/testutil/in-process-servers.cc@47
PS1, Line 47: // Thrift server ctor allows port to be set to 0. Not supported with KRPC.
:   // So KRPC port must be explicitly set here.
This comment is a bit weird now, since it was written in contrast to the line 
above which is no longer there. Maybe change it to:
This flag is read directly in several places to find the address of the backend 
interface, so we must set it here.


http://gerrit.cloudera.org:8080/#/c/16291/1/be/src/testutil/in-process-servers.cc@86
PS1, Line 86:   RETURN_IF_ERROR(WaitForServer(FLAGS_hostname, krpc_port_, 10, 100));
Does this work? WaitForServer() is expecting to connect to a Thrift server but 
we're directing it to a krpc server now.

I guess it's fine because all WaitForServer does is try to open a socket, which 
will work either way?
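
A sketch of the "just try to open a socket" behaviour discussed above, written
in Python rather than C++. Reading the trailing arguments in the quoted call
(10, 100) as a retry count and a retry interval in milliseconds is an
assumption; the function below is illustrative and is not Impala's
WaitForServer().

```python
import socket
import time


def wait_for_server(host, port, num_retries, retry_interval_ms):
    """Return True once a TCP connection to host:port succeeds.

    Only a plain socket connect is attempted, so the check is
    independent of the protocol the server speaks on that port.
    """
    for attempt in range(num_retries):
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            if attempt + 1 < num_retries:
                time.sleep(retry_interval_ms / 1000.0)
    return False
```

Since nothing protocol-specific is exchanged, a check of this shape succeeds
against any listening TCP port.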


http://gerrit.cloudera.org:8080/#/c/16291/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/16291/1/common/thrift/ImpalaInternalService.thrift@522
PS1, Line 522:   7: optional Types.TNetworkAddress coord_krpc_address
Could we rename this coord_ip_address to make it clearer?



--
To view, visit http://gerrit.cloudera.org:8080/16291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5fa83c8009590124dded4783f77ef70fa30119e6
Gerrit-Change-Number: 16291
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 06 Aug 2020 20:11:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans

2020-08-06 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
..


Patch Set 25:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16098/25/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16098/25/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1199
PS25, Line 1199: hasCorruptTableStats_
should this be 'partitionsWithCorruptOrMissingStats.size() != 0'

you want to check if any partitions have either corrupt or missing stats, right?


http://gerrit.cloudera.org:8080/#/c/16098/25/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:

http://gerrit.cloudera.org:8080/#/c/16098/25/tests/metadata/test_explain.py@132
PS25, Line 132: # Set the number of rows at the table level to -1.
  : self.execute_query(
  :   "alter table %s set tblproperties('numRows'='-1')" % 
mixed_tbl)
> It is needed to trigger the code to take the code path of estimation. Witho
I think this might actually be a bug (see other comment). Only one of the 
partitions has stats; the others should have missing stats.



--
To view, visit http://gerrit.cloudera.org:8080/16098

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 25
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 20:05:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-06 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16278 )

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16278/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16278/2//COMMIT_MSG@9
PS2, Line 9: Snappy-compressed text always uses THdfsCompression::SNAPPY_BLOCKED
> is this true of all engines? do all engines use SNAPPY_BLOCKED for snappy-c
Under the covers, we were already using SNAPPY_BLOCKED before. All our Snappy 
text test tables are written by Hive, and Hive uses SNAPPY_BLOCKED. Our text 
scanner has an assert that verifies that we don't pass THdfsCompression::SNAPPY 
into it.

In the old code, the way this worked is that we translated SNAPPY to 
THdfsCompression::SNAPPY_BLOCKED when we converted to thrift here:
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java#L79

The non-block filesystems don't go through that Thrift codepath. They go 
through a flatbuffers codepath, which inadvertently didn't do the conversion 
from SNAPPY to SNAPPY_BLOCKED.
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java#L93

Long story short: the behavior should be the same as before (except non-block 
is fixed)
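
The failure mode described above (one serialization path applies the SNAPPY to SNAPPY_BLOCKED conversion, the other does not) can be sketched like this. It is a simplified Python illustration, not the HdfsCompression.java code; the names Codec, normalize_text_codec, to_thrift and to_flatbuffer are hypothetical.

```python
from enum import Enum


class Codec(Enum):
    NONE = "NONE"
    SNAPPY = "SNAPPY"
    SNAPPY_BLOCKED = "SNAPPY_BLOCKED"


def normalize_text_codec(codec):
    """Map SNAPPY to SNAPPY_BLOCKED and leave every other codec unchanged."""
    return Codec.SNAPPY_BLOCKED if codec is Codec.SNAPPY else codec


def to_thrift(codec):
    # Stand-in for the Thrift path, which already did the conversion.
    return normalize_text_codec(codec).value


def to_flatbuffer(codec):
    # Stand-in for the flatbuffers path. Routing it through the same
    # helper keeps the two paths from drifting apart.
    return normalize_text_codec(codec).value
```

Sharing a single conversion helper is one way to keep both paths in agreement; whether the actual patch is structured this way is not stated here.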


http://gerrit.cloudera.org:8080/#/c/16278/2//COMMIT_MSG@10
PS2, Line 10: for non-block filesystems
> why does this only happen for non-block filesystems?
Rolled the answer to this into the other comment.


http://gerrit.cloudera.org:8080/#/c/16278/2//COMMIT_MSG@24
PS2, Line 24: Changed the utility function so it doesn't use Hive.
> adding coverage for S3 is nice, but do we lose any inter-operability covera
The Hive statements are "create table like" statements and "drop table" 
statements, which are purely metadata. The compression codec is not part of the 
metadata.

I don't think we lose any coverage that we don't have covered by other tests.

The data that we are using was already written by Hive during dataload (i.e. we 
aren't writing data with Hive in this test).



--
To view, visit http://gerrit.cloudera.org:8080/16278

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 06 Aug 2020 20:02:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16300 )

Change subject: IMPALA-9851: Truncate long error message.
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6241/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16300

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 19:48:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16300 )

Change subject: IMPALA-9851: Truncate long error message.
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16300

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 19:48:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16300 )

Change subject: IMPALA-9851: Truncate long error message.
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16300

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 19:48:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10005: Fix Snappy decompression for non-block filesystems

2020-08-06 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16278 )

Change subject: IMPALA-10005: Fix Snappy decompression for non-block filesystems
..


Patch Set 2:

(3 comments)

mostly questions

http://gerrit.cloudera.org:8080/#/c/16278/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16278/2//COMMIT_MSG@9
PS2, Line 9: Snappy-compressed text always uses THdfsCompression::SNAPPY_BLOCKED
is this true of all engines? do all engines use SNAPPY_BLOCKED for 
snappy-compressed text? I guess, put another way, if we write snappy-compressed 
text via Hive, can Impala still read it after this change?


http://gerrit.cloudera.org:8080/#/c/16278/2//COMMIT_MSG@10
PS2, Line 10: for non-block filesystems
why does this only happen for non-block filesystems?


http://gerrit.cloudera.org:8080/#/c/16278/2//COMMIT_MSG@24
PS2, Line 24: Changed the utility function so it doesn't use Hive.
adding coverage for S3 is nice, but do we lose any inter-operability coverage 
here between Hive and Impala?



--
To view, visit http://gerrit.cloudera.org:8080/16278

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Gerrit-Change-Number: 16278
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 06 Aug 2020 19:30:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16283 )

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6238/


--
To view, visit http://gerrit.cloudera.org:8080/16283

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 19:22:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9963: Implement ds_kll_n() function

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16259 )

Change subject: IMPALA-9963: Implement ds_kll_n() function
..

IMPALA-9963: Implement ds_kll_n() function

This function receives a serialized Apache DataSketches KLL sketch
and returns how many input values were fed into this sketch.

Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
Reviewed-on: http://gerrit.cloudera.org:8080/16259
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exprs/datasketches-common.h
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test
5 files changed, 56 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16259

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
Gerrit-Change-Number: 16259
Gerrit-PatchSet: 6
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9963: Implement ds_kll_n() function

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16259 )

Change subject: IMPALA-9963: Implement ds_kll_n() function
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16259

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
Gerrit-Change-Number: 16259
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 19:15:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9645 Port LLVM codegen to adapt aarch64

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15718 )

Change subject: IMPALA-9645 Port LLVM codegen to adapt aarch64
..


Patch Set 19:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/6813/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15718

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3f30ee84ea9bf5245da88154632bb69079103d11
Gerrit-Change-Number: 15718
Gerrit-PatchSet: 19
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 18:56:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16300 )

Change subject: IMPALA-9851: Truncate long error message.
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6812/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16300

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 18:51:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10039: Fixed Expr-test crash

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16299 )

Change subject: IMPALA-10039: Fixed Expr-test crash
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6239/


--
To view, visit http://gerrit.cloudera.org:8080/16299

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85245bf4bffb469913d53741847e67773b7d4627
Gerrit-Change-Number: 16299
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 06 Aug 2020 18:39:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9645 Port LLVM codegen to adapt aarch64

2020-08-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15718 )

Change subject: IMPALA-9645 Port LLVM codegen to adapt aarch64
..


Patch Set 19: Code-Review+2

Fixed the clang-tidy error.


--
To view, visit http://gerrit.cloudera.org:8080/15718

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3f30ee84ea9bf5245da88154632bb69079103d11
Gerrit-Change-Number: 15718
Gerrit-PatchSet: 19
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 18:34:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16300 )

Change subject: IMPALA-9851: Truncate long error message.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16300/1/be/src/util/internal-queue.h
File be/src/util/internal-queue.h:

http://gerrit.cloudera.org:8080/#/c/16300/1/be/src/util/internal-queue.h@279
PS1, Line 279:   // reached. If 'fn' returns false, terminate iteration. It is 
invalid to call other
> Comment what happens if n is greater than the size of the list?
Done



--
To view, visit http://gerrit.cloudera.org:8080/16300

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 18:34:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9645 Port LLVM codegen to adapt aarch64

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15718 )

Change subject: IMPALA-9645 Port LLVM codegen to adapt aarch64
..


Patch Set 20:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6240/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15718

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3f30ee84ea9bf5245da88154632bb69079103d11
Gerrit-Change-Number: 15718
Gerrit-PatchSet: 20
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 18:34:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Riza Suminto (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16300

to look at the new patch set (#2).

Change subject: IMPALA-9851: Truncate long error message.
..

IMPALA-9851: Truncate long error message.

Error message length was unbounded and could grow to a couple of MB
in size. This patch truncates error messages to a maximum of 128 KB in
size.

This patch also fixes a potentially long error message related to
BufferPool::Client::DebugString(). Before this patch, DebugString()
printed all pages in the 'pinned_pages_', 'dirty_unpinned_pages_', and
'in_flight_write_pages_' PageLists. With this patch, DebugString()
includes at most the first 100 pages of each PageList.
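
The two limits described above can be sketched as follows. This is a minimal Python illustration, not the C++ code in error-util.cc or buffer-pool.cc; the helper names, the truncation marker and the "more pages omitted" line are assumptions, and the length is counted in characters rather than bytes.

```python
MAX_ERROR_MSG_LEN = 128 * 1024        # 128 KB cap from the commit message
MAX_PAGES_PER_LIST = 100              # per-PageList cap from the commit message
TRUNCATION_MARKER = "...(truncated)"  # hypothetical marker text


def truncate_error_message(msg, max_len=MAX_ERROR_MSG_LEN):
    """Return msg unchanged if it fits, else cut it so the result is max_len long.

    Assumes max_len is at least as long as the marker.
    """
    if len(msg) <= max_len:
        return msg
    keep = max(max_len - len(TRUNCATION_MARKER), 0)
    return msg[:keep] + TRUNCATION_MARKER


def page_list_debug_string(name, pages, max_pages=MAX_PAGES_PER_LIST):
    """Describe at most the first max_pages entries of a page list."""
    lines = [f"{name}: {len(pages)} pages"]
    lines.extend(f"  {page}" for page in pages[:max_pages])
    omitted = len(pages) - max_pages
    if omitted > 0:
        lines.append(f"  ... {omitted} more pages omitted")
    return "\n".join(lines)
```

Capping the page list at the source keeps the debug string small before the generic message truncation ever has to apply.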

Testing:
- Add be test BufferPoolTest.ShortDebugString
- Add test within ErrorMsg.GenericFormatting to test for truncation.
- Run and pass core tests.

Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
---
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/internal-queue.h
8 files changed, 163 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/16300/2
--
To view, visit http://gerrit.cloudera.org:8080/16300

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9645 Port LLVM codegen to adapt aarch64

2020-08-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#19) to the change originally 
created by zhaoren...@hotmail.com. ( http://gerrit.cloudera.org:8080/15718 )

Change subject: IMPALA-9645 Port LLVM codegen to adapt aarch64
..

IMPALA-9645 Port LLVM codegen to adapt aarch64

On aarch64, the lowered type of struct {bool, int128} has the form
{ {i8}, {i128} }, with no padding added. This differs from x86-64,
where it is { {i8}, {15*i8}, {i128} } with padding added automatically.

This patch also adds some type conversions between x86 and aarch64 data types.

It also adds some aarch64 CPU features.
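
The 15*i8 element in the x86-64 form is the padding needed to bring a 16-byte-aligned field up to its alignment after a 1-byte field. The arithmetic can be sketched in Python as below. This is only an illustration of the padding calculation, with hypothetical helper names; it does not model how LLVM lowers struct types and says nothing about why the aarch64 form omits the explicit element.

```python
def padding_for(offset, alignment):
    """Bytes needed to round offset up to the next multiple of alignment."""
    return (-offset) % alignment


def with_explicit_padding(fields):
    """Lay out (name, size, alignment) fields, emitting explicit pad entries."""
    layout = []
    offset = 0
    for name, size, alignment in fields:
        pad = padding_for(offset, alignment)
        if pad:
            # Represent padding as a run of bytes, like the 15*i8 element.
            layout.append((f"{pad}*i8", pad))
            offset += pad
        layout.append((name, size))
        offset += size
    return layout


# A 1-byte bool followed by a 16-byte integer with 16-byte alignment.
X86_64_STYLE = with_explicit_padding([("i8", 1, 1), ("i128", 16, 16)])
```

Here X86_64_STYLE contains the i8 field, a 15-byte pad entry, and the i128 field, matching the three-element form quoted above.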

Change-Id: I3f30ee84ea9bf5245da88154632bb69079103d11
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/llvm-codegen.cc
M be/src/exec/text-converter.cc
M be/src/exprs/scalar-fn-call.cc
4 files changed, 175 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/15718/19
--
To view, visit http://gerrit.cloudera.org:8080/15718

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3f30ee84ea9bf5245da88154632bb69079103d11
Gerrit-Change-Number: 15718
Gerrit-PatchSet: 19
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16300 )

Change subject: IMPALA-9851: Truncate long error message.
..


Patch Set 1: Code-Review+2

(1 comment)

One minor comment

http://gerrit.cloudera.org:8080/#/c/16300/1/be/src/util/internal-queue.h
File be/src/util/internal-queue.h:

http://gerrit.cloudera.org:8080/#/c/16300/1/be/src/util/internal-queue.h@279
PS1, Line 279:   // from 'fn'.
Comment what happens if n is greater than the size of the list?



--
To view, visit http://gerrit.cloudera.org:8080/16300

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 18:11:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10039: Fixed Expr-test crash

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16299 )

Change subject: IMPALA-10039: Fixed Expr-test crash
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16299

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85245bf4bffb469913d53741847e67773b7d4627
Gerrit-Change-Number: 16299
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 06 Aug 2020 18:04:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10039: Fixed Expr-test crash

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16299 )

Change subject: IMPALA-10039: Fixed Expr-test crash
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6239/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16299

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85245bf4bffb469913d53741847e67773b7d4627
Gerrit-Change-Number: 16299
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 06 Aug 2020 18:04:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10039: Fixed Expr-test crash

2020-08-06 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16299 )

Change subject: IMPALA-10039: Fixed Expr-test crash
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16299

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85245bf4bffb469913d53741847e67773b7d4627
Gerrit-Change-Number: 16299
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 06 Aug 2020 18:03:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6236/


--
To view, visit http://gerrit.cloudera.org:8080/16280

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 3
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 17:46:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..


Patch Set 27:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6811/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16220

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 17:46:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..


Patch Set 26:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6810/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16220

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 26
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 17:44:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#27). ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

IMPALA-9989 Improve admission control pool stats logging

This work addresses the current limitation in the admission controller by
appending the last known memory consumption statistics about a pool or
a host to the existing memory exhaustion message. The statistics are
logged in impalad.INFO when a query is queued or timed out due to
memory pressure on the pool or on the host. The statistics can also be
part of the query profile.

The BNF of the new memory consumption statistics is as follows.

  topN_query_stats ::=
queries: a list of query Ids for up to 5 queries with top memory
 consumptions
total_mem_consumed: total memory consumed by these topN queries
percentage_mem_consumed_per_pool: total memory consumed divided
  by pool memory usage (if
  feasible to report)

  all_query_stats ::=
min: the minimal memory consumption of all running queries
max: the maximal memory consumption of all running queries
total: the total memory consumption of all running queries
average: the average memory consumption of all running queries
 (if feasible to report)

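  pool_stats_per_host ::=
    <pool_name> ':' <topN_query_stats> <all_query_stats>
  pool_stats ::= List of <pool_stats_per_host>

  host_stats_per_pool ::= <host_name> ':' <topN_query_stats>
  host_stats ::= List of <host_stats_per_pool>

  memory_consumption_statistics ::= <pool_stats> | <host_stats>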

The pool_stats describes memory consumption in all pools in a host and
is useful in analyzing memory exhaustion in that host. The host_stats
describes the memory consumption in all hosts for a pool and is useful
in analyzing memory exhaustion in that pool.

Example of pool_stats_per_host:

   pool_name=root.queueD:
 topN_query_stats:
queries=[
   0003:0012,
   0003:0011
],
total_mem_consumed=18.00 MB
fraction_of_pool_total_mem=0.19
 all_query_stats:
num_running=20,
min=1.00 MB,
max=9.00 MB,
total_mem_consumed=95.00 MB,
average=4.75 MB

Example of host_stats_per_pool:

   host_name=host2:25000:
 topN_query_stats:
queries=[
   00020002:0001,
   00020002:0002,
   00020002:,
   00020002:0004
],
total_mem_consumed=55.00 MB
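
As a rough illustration of how the fields in the examples above relate to each other, the following Python sketch derives them from a map of query id to memory consumed. It is not the admission-controller.cc implementation; the function names and the dict-based input are assumptions, and the field names follow the examples rather than the BNF.

```python
def top_n_query_stats(mem_by_query, pool_total_mem=None, n=5):
    """Summarize the up-to-n queries with the highest memory consumption."""
    top = sorted(mem_by_query.items(), key=lambda kv: kv[1], reverse=True)[:n]
    total = sum(mem for _, mem in top)
    stats = {
        "queries": [query_id for query_id, _ in top],
        "total_mem_consumed": total,
    }
    # Reported only when a pool total is available ("if feasible to report").
    if pool_total_mem:
        stats["fraction_of_pool_total_mem"] = total / pool_total_mem
    return stats


def all_query_stats(mem_by_query):
    """Summarize memory consumption over all running queries."""
    values = list(mem_by_query.values())
    stats = {"num_running": len(values)}
    if values:
        total = sum(values)
        stats.update(
            min=min(values),
            max=max(values),
            total_mem_consumed=total,
            average=total / len(values),
        )
    return stats
```

With the numbers from the pool example (18.00 MB for the top queries out of 95.00 MB total), the fraction works out to about 0.19, and 95.00 MB over 20 running queries gives the 4.75 MB average.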

When a query request is queued due to memory exhaustion, the above
memory_consumption_statistics is logged when the logging is set
at level 2 or higher.

When a query request is timed out due to memory exhaustion, the above
memory_consumption_statistics is reported when the logging is set
at level 1 or higher.

Testing:
1. Added a new test TopNQueryCheck in admission-controller-test.cc to
   verify that the topN query memory consumption details are reported
   correctly.
2. Add two new tests in test_admission_controller.py to simulate
   queries being queued and then timed out due to pool or host memory
   pressure.
3. Core tests.

Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
---
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/util/container-util.h
M common/thrift/StatestoreService.thrift
M tests/custom_cluster/test_admission_controller.py
8 files changed, 885 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/27
--
To view, visit http://gerrit.cloudera.org:8080/16220
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#26). ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

IMPALA-9989 Improve admission control pool stats logging

This work addresses the current limitation in the admission controller by
appending the last known memory consumption statistics about a pool or
a host to the existing memory exhaustion message. The statistics are
logged in impalad.INFO when a query is queued or timed out due to
memory pressure on the pool or on the host. The statistics can also be
part of the query profile.

The BNF of the new memory consumption statistics is as follows.

  topN_query_stats ::=
queries: a list of query Ids for up to 5 queries with top memory
 consumptions
total_mem_consumed: total memory consumed by these topN queries
percentage_mem_consumed_per_pool: total memory consumed divided
  by pool memory usage (if
  feasible to report)

  all_query_stats ::=
min: the minimal memory consumption of all running queries
max: the maximal memory consumption of all running queries
total: the total memory consumption of all running queries
average: the average memory consumption of all running queries
 (if feasible to report)

  pool_stats_per_host ::=  ':'  
  pool_stats ::= List of 

  host_stats_per_pool ::=  ':' 
  host_stats ::= List of 

  memory_consumption_statistics ::=  | 

The pool_stats describes memory consumption in all pools in a host and is
useful in analyzing memory exhaustion in that host. The host_stats describes
the memory consumption in all hosts for a pool and is useful in analyzing
memory exhaustion in that pool.
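
As a rough illustration of the two building blocks described above, the
sketch below computes a topN_query_stats and an all_query_stats summary
from a map of query id to bytes consumed. It is an assumption-laden
sketch, not the C++ implementation in the patch: the function names,
the dict layout and the rounding to two decimals are all hypothetical.

```python
from typing import Dict, Optional

MB = 1024 * 1024


def top_n_query_stats(mem_by_query: Dict[str, int],
                      pool_total: Optional[int] = None,
                      n: int = 5) -> dict:
    """Summarize the n queries with the highest memory consumption."""
    # Highest consumption first; ties are broken by query id for determinism.
    ranked = sorted(mem_by_query.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
    total = sum(mem for _, mem in ranked)
    stats = {"queries": [qid for qid, _ in ranked],
             "total_mem_consumed": total}
    # The fraction is only reported when a non-zero pool total is known.
    if pool_total:
        stats["fraction_of_pool_total_mem"] = round(total / pool_total, 2)
    return stats


def all_query_stats(mem_by_query: Dict[str, int]) -> dict:
    """Min, max, total and average consumption over all running queries."""
    values = list(mem_by_query.values())
    stats = {"num_running": len(values)}
    if values:  # min, max and average are undefined without running queries
        total = sum(values)
        stats.update(min=min(values), max=max(values),
                     total_mem_consumed=total, average=total / len(values))
    return stats
```

The optional fields mirror the "if feasible to report" wording: the
fraction is omitted when no pool total is available, and the aggregate
values are omitted when nothing is running.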

Example of pool_stats_per_host:

   pool_name=root.queueD:
 topN_query_stats:
queries=[
   0003:0012,
   0003:0011
],
total_mem_consumed=18.00 MB
fraction_of_pool_total_mem=0.19
 all_query_stats:
num_running=20,
min=1.00 MB,
max=9.00 MB,
total_mem_consumed=95.00 MB,
average=4.75 MB

Example of host_stats_per_pool:

   host_name=host2:25000:
 topN_query_stats:
queries=[
   00020002:0001,
   00020002:0002,
   00020002:,
   00020002:0004
],
total_mem_consumed=55.00 MB

When a query request is queued due to memory exhaustion, the above
memory_consumption_statistics is logged when the logging is set
at level 2 or higher.

When a query request is timed out due to memory exhaustion, the above
memory_consumption_statistics is reported when the logging is set
at level 1 or higher.

Testing:
1. Added a new test TopNQueryCheck in admission-controller-test.cc to
   verify that the topN query memory consumption details are reported
   correctly.
2. Add two new tests in test_admission_controller.py to simulate
   queries being queued and then timed out due to pool or host memory
   pressure.
3. Core tests.

Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
---
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/util/container-util.h
M common/thrift/StatestoreService.thrift
M tests/custom_cluster/test_admission_controller.py
8 files changed, 885 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/26
--
To view, visit http://gerrit.cloudera.org:8080/16220

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 26
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9341: Check current delegateAdmin value when performing REVOKE

2020-08-06 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has abandoned this change. ( http://gerrit.cloudera.org:8080/16013 )

Change subject: IMPALA-9341: Check current delegateAdmin value when performing 
REVOKE
..


Abandoned

Abandoning this patch since we adopted the other approach.
--
To view, visit http://gerrit.cloudera.org:8080/16013

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ib02c51bd15c94c1fb6b1776ecfd7ec3eeafc4e2c
Gerrit-Change-Number: 16013
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..


Patch Set 25:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6809/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16220

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 25
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 15:12:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..


Patch Set 24:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6808/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16220

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 24
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 15:08:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10047: Revert core piece of IMPALA-6984

2020-08-06 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16288 )

Change subject: IMPALA-10047: Revert core piece of IMPALA-6984
..

IMPALA-10047: Revert core piece of IMPALA-6984

Performance testing on TPC-DS found a performance regression
on short queries due to delayed exec status reports. Further
testing traced this back to IMPALA-6984's behavior of
cancelling backends on EOS. The coordinator log shows the
CancelBackends() call intermittently taking 10 seconds due
to timing out in the RPC layer.

As a temporary workaround, this reverts the core part of
IMPALA-6984 that added that CancelBackends() call for EOS.
It leaves the rest of IMPALA-6984 intact, as other code has built
on top of it.

Testing:
 - Core job
 - Performance tests

Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
(cherry picked from commit b91f3c0e064d592f3cdf2a2e089ca6546133ba55)
Reviewed-on: http://gerrit.cloudera.org:8080/16288
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/coordinator.cc
1 file changed, 1 insertion(+), 3 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/16288

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
Gerrit-Change-Number: 16288
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16300 )

Change subject: IMPALA-9851: Truncate long error message.
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6807/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16300

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 15:04:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..


Patch Set 25:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16220/25/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/16220/25/tests/custom_cluster/test_admission_controller.py@896
PS25, Line 896: "
flake8: E122 continuation line missing indentation or outdented


http://gerrit.cloudera.org:8080/#/c/16220/25/tests/custom_cluster/test_admission_controller.py@923
PS25, Line 923: "
flake8: E122 continuation line missing indentation or outdented
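
For context, E122 is raised when a continuation line is not indented
relative to the line it continues. The flagged lines 896 and 923 are
not shown in this message, so the snippet below is only a generic
illustration of a string continuation that avoids the warning; the
function name and the query text are made up.

```python
def build_query(table: str) -> str:
    """Build a query string that is split across two source lines."""
    # The second string starts under the first one, inside the open
    # parenthesis. Starting it at the indentation of 'return' (or further
    # left) is the layout that E122 complains about.
    return ("select count(*) "
            "from " + table)


if __name__ == "__main__":
    print(build_query("functional.alltypes"))
```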



--
To view, visit http://gerrit.cloudera.org:8080/16220

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 25
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 14:42:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#25). ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

IMPALA-9989 Improve admission control pool stats logging

This work addresses the current limitation in the admission controller by
appending the last known memory consumption statistics about a pool or
a host to the existing memory exhaustion message. The message is
logged in impalad.INFO when a query is queued or timed out due to
memory pressure on the pool or on the host.

The new memory consumption statistics cover the following content:
  topN_query_stats ::=
queries: a list of query Ids for up to 5 queries with top memory
 consumptions
total_mem_consumed: total memory consumed by these topN queries
percentage_mem_consumed_per_pool: total memory consumed divided
  by pool memory usage (if
  feasible to report)
  all_query_stats ::=
min: the minimal memory consumption of all running queries
max: the maximal memory consumption of all running queries
total: the total memory consumption of all running queries
average: the average memory consumption of all running queries
 (if feasible to report)

  pool_stats_per_host ::=
:  
  pool_stats::=
List of 

  host_stats_per_pool ::=
: 
  host_stats::=
List of 

  memory_consumption_statistics ::=
 | 

pool_stats describes memory consumption in all pools in a host
and is useful in analyzing memory exhaustion in that host.
host_stats describes the memory consumption for all hosts in a pool
and is useful in analyzing memory exhaustion in that pool.
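
The difference between the two views is only the grouping key. The
sketch below assumes a hypothetical flat list of
(host, pool, query_id, bytes) records, which is not how the patch
stores its data, and shows how pool_stats groups by pool for one host
while host_stats groups by host for one pool.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (host, pool, query_id, bytes_consumed); a hypothetical record layout.
Record = Tuple[str, str, str, int]


def pool_stats(records: Iterable[Record],
               host: str) -> Dict[str, Dict[str, int]]:
    """Memory by query for every pool on one host (the pool_stats view)."""
    by_pool: Dict[str, Dict[str, int]] = defaultdict(dict)
    for rec_host, pool, query_id, mem in records:
        if rec_host == host:
            by_pool[pool][query_id] = mem
    return dict(by_pool)


def host_stats(records: Iterable[Record],
               pool: str) -> Dict[str, Dict[str, int]]:
    """Memory by query for every host in one pool (the host_stats view)."""
    by_host: Dict[str, Dict[str, int]] = defaultdict(dict)
    for host, rec_pool, query_id, mem in records:
        if rec_pool == pool:
            by_host[host][query_id] = mem
    return dict(by_host)
```

Each inner dict could then be summarized into topN and all-query
figures for the pool or host it belongs to.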

Example of pool_stats_per_host:

   pool_name=root.queueD:
 topN_query_stats:
queries=[
   0003:0012,
   0003:0011
],
total_mem_consumed=18.00 MB
fraction_of_pool_total_mem=0.19
 all_query_stats:
num_running=20,
min=1.00 MB,
max=9.00 MB,
total_mem_consumed=95.00 MB,
average=4.75 MB

Example of host_stats_per_pool:

   host_name=host2:25000:
 topN_query_stats:
queries=[
   00020002:0001,
   00020002:0002,
   00020002:,
   00020002:0004
],
total_mem_consumed=55.00 MB

When a query request is queued due to memory exhaustion, the above
memory_consumption_statistics is logged when the logging is set
at level 2 or higher.

When a query request is timed out due to memory exhaustion, the above
memory_consumption_statistics is reported when the logging is set
at level 1 or higher.

Testing:
1. Added a new test TopNQueryCheck in admission-controller-test.cc to
verify that the topN query memory consumption details are reported
correctly.
2. Add two new tests in test_admission_controller.py to simulate
queries being queued and then timed out due to pool or host memory
pressure.
3. Core tests.

Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
---
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/util/container-util.h
M common/thrift/StatestoreService.thrift
M tests/custom_cluster/test_admission_controller.py
8 files changed, 885 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/25
--
To view, visit http://gerrit.cloudera.org:8080/16220

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 25
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..


Patch Set 24:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16220/24/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/16220/24/tests/custom_cluster/test_admission_controller.py@896
PS24, Line 896: "
flake8: E122 continuation line missing indentation or outdented


http://gerrit.cloudera.org:8080/#/c/16220/24/tests/custom_cluster/test_admission_controller.py@923
PS24, Line 923: "
flake8: E122 continuation line missing indentation or outdented



--
To view, visit http://gerrit.cloudera.org:8080/16220

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 24
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 14:38:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging

2020-08-06 Thread Qifan Chen (Code Review)
Qifan Chen has uploaded a new patch set (#24). ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

IMPALA-9989 Improve admission control pool stats logging

This work addresses the current limitation in the admission controller by
appending the last known memory consumption statistics about a pool or
a host to the existing memory exhaustion message. The message is
logged in impalad.INFO when a query is queued or timed out due to
memory pressure on the pool or on the host.

The new memory consumption statistics cover the following content:
  topN_query_stats ::=
queries: a list of query Ids for up to 5 queries with top memory
 consumptions
total_mem_consumed: total memory consumed by these topN queries
percentage_mem_consumed_per_pool: total memory consumed divided
  by pool memory usage (if
  feasible to report)
  all_query_stats ::=
min: the minimal memory consumption of all running queries
max: the maximal memory consumption of all running queries
total: the total memory consumption of all running queries
average: the average memory consumption of all running queries
 (if feasible to report)

  pool_stats_per_host ::=
:  
  pool_stats::=
List of 

  host_stats_per_pool ::=
: 
  host_stats::=
List of 

  memory_consumption_statistics ::=
 | 

pool_stats describes memory consumption in all pools in a host
and is useful in analyzing memory exhaustion in that host.
host_stats describes the memory consumption for all hosts in a pool
and is useful in analyzing memory exhaustion in that pool.

Example of pool_stats_per_host:

   pool_name=root.queueD:
 topN_query_stats:
queries=[
   0003:0012,
   0003:0011
],
total_mem_consumed=18.00 MB
fraction_of_pool_total_mem=0.19
 all_query_stats:
num_running=20,
min=1.00 MB,
max=9.00 MB,
total_mem_consumed=95.00 MB,
average=4.75 MB

Example of host_stats_per_pool:

   host_name=host2:25000:
 topN_query_stats:
queries=[
   00020002:0001,
   00020002:0002,
   00020002:,
   00020002:0004
],
total_mem_consumed=55.00 MB

When a query request is queued due to memory exhaustion, the above
memory_consumption_statistics is logged when the logging is set
at level 2 or higher.

When a query request is timed out due to memory exhaustion, the above
memory_consumption_statistics is reported when the logging is set
at level 1 or higher.

Testing:
1. Added a new test TopNQueryCheck in admission-controller-test.cc to
verify that the topN query memory consumption details are reported
correctly.
2. Add two new tests in test_admission_controller.py to simulate
queries being queued and then timed out due to pool or host memory
pressure.
3. Core tests.

Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
---
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/util/container-util.h
M common/thrift/StatestoreService.thrift
M tests/custom_cluster/test_admission_controller.py
8 files changed, 885 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/24
--
To view, visit http://gerrit.cloudera.org:8080/16220

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 24
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9851: Truncate long error message.

2020-08-06 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16300 )


Change subject: IMPALA-9851: Truncate long error message.
..

IMPALA-9851: Truncate long error message.

Error message length was unbounded and could grow to a couple of MB
in size. This patch truncates error messages to a maximum of 128kb in
size.

This patch also fixes a potentially long error message related to
BufferPool::Client::DebugString(). Before this patch, DebugString()
printed all pages in the 'pinned_pages_', 'dirty_unpinned_pages_', and
'in_flight_write_pages_' PageLists. With this patch, DebugString()
includes at most the first 100 pages of each PageList.
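
Both limits can be sketched in a few lines. This is an illustration
only, not the C++ code in error-util.cc or buffer-pool.cc: the function
names are hypothetical, "128kb" is assumed to mean 128 * 1024 bytes,
and the "(N more)" suffix is an invented detail.

```python
MAX_ERROR_MSG_BYTES = 128 * 1024  # assumed reading of "128kb"
MAX_PAGES_PRINTED = 100


def truncate_message(msg: str, limit: int = MAX_ERROR_MSG_BYTES) -> str:
    """Cut msg so that its UTF-8 encoding is at most `limit` bytes."""
    data = msg.encode("utf-8")
    if len(data) <= limit:
        return msg
    # errors="ignore" drops a multi-byte character split by the cut.
    return data[:limit].decode("utf-8", errors="ignore")


def debug_string(pages: list, limit: int = MAX_PAGES_PRINTED) -> str:
    """Print at most the first `limit` pages and note how many were cut."""
    shown = ", ".join(str(p) for p in pages[:limit])
    omitted = len(pages) - limit
    suffix = f" ... ({omitted} more)" if omitted > 0 else ""
    return f"[{shown}]{suffix}"
```

Capping the list before formatting keeps the cost proportional to the
limit rather than to the full list length.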

Testing:
- Add be test BufferPoolTest.ShortDebugString
- Add test within ErrorMsg.GenericFormatting to test for truncation.
- Run and pass core tests.

Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
---
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/internal-queue.h
8 files changed, 162 insertions(+), 62 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/16300/1
--
To view, visit http://gerrit.cloudera.org:8080/16300

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Gerrit-Change-Number: 16300
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16283 )

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16283

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 14:07:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16283 )

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6238/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16283

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 14:07:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9963: Implement ds_kll_n() function

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16259 )

Change subject: IMPALA-9963: Implement ds_kll_n() function
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16259

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
Gerrit-Change-Number: 16259
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 14:01:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9963: Implement ds_kll_n() function

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16259 )

Change subject: IMPALA-9963: Implement ds_kll_n() function
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6237/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16259

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
Gerrit-Change-Number: 16259
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 14:01:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10017: Implement ds_kll_union() function

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16267 )

Change subject: IMPALA-10017: Implement ds_kll_union() function
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6806/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16267

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I020aea28d36f9b6ef9fb57c08411f2170f5c24bf
Gerrit-Change-Number: 16267
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 13:15:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16283 )

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16283

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 13:12:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10017: Implement ds_kll_union() function

2020-08-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16267 )

Change subject: IMPALA-10017: Implement ds_kll_union() function
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16267/3/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/16267/3/be/src/exprs/aggregate-functions-ir.cc@1851
PS3, Line 1851: etch)) {
> The code you linked is unrelated here, it is for HLL sketches. However, the
Can you also add a similar block for HLL (line 1796)? It is ok to do that in 
another patch, but I think it is simplest to do it here.



--
To view, visit http://gerrit.cloudera.org:8080/16267

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I020aea28d36f9b6ef9fb57c08411f2170f5c24bf
Gerrit-Change-Number: 16267
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 13:12:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9963: Implement ds_kll_n() function

2020-08-06 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16259 )

Change subject: IMPALA-9963: Implement ds_kll_n() function
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16259

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
Gerrit-Change-Number: 16259
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 13:06:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10017: Implement ds_kll_union() function

2020-08-06 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16267 )

Change subject: IMPALA-10017: Implement ds_kll_union() function
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16267/3/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/16267/3/be/src/exprs/aggregate-functions-ir.cc@1851
PS3, Line 1851: etch)) {
> Can you add a try-catch block? I randomly checked a deserialize function an
The code you linked is unrelated here, it is for HLL sketches. However, the one 
for KLL can also throw:
https://github.com/apache/impala/blob/074731e2bcf37643710f2fdf236829991a462fc3/be/src/thirdparty/datasketches/kll_sketch_impl.hpp#L534
E.g. ensure_minimum_memory() throws; I haven't checked the rest. I'll add a 
try-catch block, thanks for spotting.
Done


http://gerrit.cloudera.org:8080/#/c/16267/3/be/src/exprs/aggregate-functions-ir.cc@1922
PS3, Line 1922:   DCHECK(!dst->is_null);
> merge can throw an exception, please put it in a try-catch block:
Done.
I also found one occurrence above, changed that as well.
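
The pattern discussed in this thread (guarding a deserialize or merge
call that may throw) looks roughly like the following. This is a Python
sketch of the idea only; the actual change is C++ in
aggregate-functions-ir.cc, and both function names here are
hypothetical.

```python
from typing import Callable, Optional


def safe_deserialize(raw: bytes,
                     deserialize: Callable[[bytes], object]) -> Optional[object]:
    """Return the deserialized sketch, or None if deserialization raises."""
    try:
        return deserialize(raw)
    except Exception as exc:
        # Report the problem and signal "no result" instead of propagating.
        print(f"unable to deserialize sketch: {exc}")
        return None


def toy_deserialize(raw: bytes) -> list:
    """Hypothetical deserializer: rejects inputs shorter than 4 bytes."""
    if len(raw) < 4:
        raise ValueError("input too short")
    return list(raw)
```

A caller can then treat None as a malformed input and skip it rather
than letting the exception escape the aggregate function.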



--
To view, visit http://gerrit.cloudera.org:8080/16267

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I020aea28d36f9b6ef9fb57c08411f2170f5c24bf
Gerrit-Change-Number: 16267
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 12:52:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10017: Implement ds_kll_union() function

2020-08-06 Thread Gabor Kaszab (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16267

to look at the new patch set (#4).

Change subject: IMPALA-10017: Implement ds_kll_union() function
..

IMPALA-10017: Implement ds_kll_union() function

This function receives a set of serialized Apache DataSketches KLL
sketches produced by ds_kll_sketch() and merges them into a single
sketch.

An example usage is to create a sketch for each partition of a table,
write these sketches to a separate table and then, depending on which
partitions the user is interested in, union the relevant sketches
together to get an estimate. E.g.:
  SELECT
  ds_kll_quantile(ds_kll_union(sketch_col), 0.5)
  FROM sketch_tbl
  WHERE partition_col=1 OR partition_col=5;
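
The per-partition pattern above can be sketched in C++ with a toy stand-in. This is only an illustration under stated assumptions: ToySummary keeps every value in sorted order, whereas a real KLL sketch keeps a bounded sample and returns approximate quantiles. BuildSummary(), UnionSummaries() and Quantile() are hypothetical names, not Impala or DataSketches functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy exact summary standing in for a KLL sketch (assumption: it stores all
// values, so it is not space-bounded like a real sketch).
struct ToySummary {
  std::vector<double> sorted_values;
};

// Builds one summary per partition, analogous to ds_kll_sketch().
ToySummary BuildSummary(std::vector<double> values) {
  std::sort(values.begin(), values.end());
  return ToySummary{values};
}

// Merges two summaries into one, analogous to ds_kll_union().
ToySummary UnionSummaries(const ToySummary& a, const ToySummary& b) {
  ToySummary out;
  out.sorted_values.resize(a.sorted_values.size() + b.sorted_values.size());
  std::merge(a.sorted_values.begin(), a.sorted_values.end(),
             b.sorted_values.begin(), b.sorted_values.end(),
             out.sorted_values.begin());
  return out;
}

// Returns the value at the given fraction, analogous to ds_kll_quantile().
double Quantile(const ToySummary& s, double fraction) {
  assert(!s.sorted_values.empty());
  assert(fraction >= 0.0 && fraction <= 1.0);
  const size_t n = s.sorted_values.size();
  const size_t idx = static_cast<size_t>(fraction * static_cast<double>(n - 1));
  return s.sorted_values[idx];
}
```

Building one summary for each of partitions 1 and 5, passing both to UnionSummaries() and calling Quantile() with 0.5 on the result mirrors the query above.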

Testing:
  - Apart from the automated tests I added to this patch, I also
    tested ds_kll_union() on a bigger dataset to check that the
    serialization, deserialization and merging steps work well. I
    took TPCH25.lineitem, created a number of sketches grouped
    by l_shipdate and called ds_kll_union() on those sketches.

Change-Id: I020aea28d36f9b6ef9fb57c08411f2170f5c24bf
---
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M testdata/data/README
A testdata/data/kll_sketches_from_impala.parquet
M testdata/workloads/functional-query/queries/QueryTest/datasketches-kll.test
M tests/query_test/test_datasketches.py
7 files changed, 199 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/16267/4
--
To view, visit http://gerrit.cloudera.org:8080/16267

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I020aea28d36f9b6ef9fb57c08411f2170f5c24bf
Gerrit-Change-Number: 16267
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10034: Add remaining TPC-DS queries to workload.

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16280 )

Change subject: IMPALA-10034: Add remaining TPC-DS queries to workload.
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6236/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16280

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5436689390f149694f14e6da1df624de4f5f7ad
Gerrit-Change-Number: 16280
Gerrit-PatchSet: 3
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 06 Aug 2020 12:33:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10044: Fix cleanup for bootstrap_toolchain.py failure case

2020-08-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16294 )

Change subject: IMPALA-10044: Fix cleanup for bootstrap_toolchain.py failure 
case
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6235/


--
To view, visit http://gerrit.cloudera.org:8080/16294

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia57f56b6717635af94247fce50b955c07a57d113
Gerrit-Change-Number: 16294
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Thu, 06 Aug 2020 07:14:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10018: Implement ds_kll_rank() function

2020-08-06 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16283 )

Change subject: IMPALA-10018: Implement ds_kll_rank() function
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16283/2/be/src/exprs/datasketches-functions-ir.cc
File be/src/exprs/datasketches-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/16283/2/be/src/exprs/datasketches-functions-ir.cc@70
PS2, Line 70:   return sketch.get_rank(probe_value.val);
> A try-catch block could be added to be sure.
get_rank() doesn't throw, so there is no need for a try-catch block.
https://github.com/apache/impala/blob/master/be/src/thirdparty/datasketches/kll_sketch_impl.hpp#L313



--
To view, visit http://gerrit.cloudera.org:8080/16283

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95857886dfbb8c84aeeaf718c0e610012fda4be0
Gerrit-Change-Number: 16283
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 06:23:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9963: Implement ds_kll_n() function

2020-08-06 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16259 )

Change subject: IMPALA-9963: Implement ds_kll_n() function
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16259/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16259/1//COMMIT_MSG@9
PS1, Line 9:
> I think this should be in singular.
Done


http://gerrit.cloudera.org:8080/#/c/16259/4/be/src/exprs/datasketches-functions-ir.cc
File be/src/exprs/datasketches-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/16259/4/be/src/exprs/datasketches-functions-ir.cc@70
PS4, Line 70:   return sketch.get_n();
> A try-catch block could be added to be sure.
get_n() doesn't throw; it simply returns a member variable.
https://github.com/apache/impala/blob/master/be/src/thirdparty/datasketches/kll_sketch_impl.hpp#L232



--
To view, visit http://gerrit.cloudera.org:8080/16259

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
Gerrit-Change-Number: 16259
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 06 Aug 2020 06:11:08 +
Gerrit-HasComments: Yes