[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8936 )

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..

IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

Introduced new error message when scanning a corrupt Sequence or RCFile.
Added new checks to detect buffer overrun while handling Sequence or RCFile.

Testing:
  a) Made changes to fuzz test for RCFile/Sequence file, ran fuzz test in a loop
  with 200 iteration without failure.

  b) Ran exhaustive test on the changes without failure.

Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Reviewed-on: http://gerrit.cloudera.org:8080/8936
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/read-write-util-test.cc
M be/src/exec/read-write-util.h
M be/src/exec/scanner-context.inline.h
M be/src/util/decompress.cc
M tests/query_test/test_scanners_fuzz.py
9 files changed, 259 insertions(+), 72 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 21
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8936 )

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..


Patch Set 20: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 20
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 18 May 2018 05:05:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..

IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

The is the final change to clarify and break up the Coordinator's lock.
The state machine for the coordinator is made explicit, distinguishing
between executing state and multiple terminal states. Logic to
transition into a terminal state is centralized in one location and
executes exactly once for each coordinator object.

Derived from a patch for IMPALA_5384 by Marcel Kornacker.

Testing:
- exhaustive functional tests
- stress test on minicluster with memory overcommitment. Verified from
  the logs that this exercises all these paths:
  - successful queries
  - client requested cancellation
  - error from exec FInstances RPC
  - error reported asynchronously via report status RPC
  - eos before backend execution completed
- loop query_test & failure for 12 hours with no dchecks or crashes
  (This had previously reproduced IMPALA-7030 and IMPALA-7033 with
  the previous version of this change).

Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Reviewed-on: http://gerrit.cloudera.org:8080/10440
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
M be/src/service/impala-server.h
M be/src/util/counting-barrier.h
6 files changed, 396 insertions(+), 391 deletions(-)

Approvals:
  Dan Hecht: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 4
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 04:36:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..

IMPALA-6941: load more text scanner compression plugins

Add extensions for LZ4 and ZSTD (which are supported by Hadoop).
Even without a plugin this results in better behaviour because
we don't try to treat the files with unknown extensions as
uncompressed text.

Also allow loading tables containing files with unsupported
compression types. There was weird behaviour before we knew
of the file extension but didn't support querying the table -
the catalog would load the table but the impalad would fail
processing the catalog update. The simplest way to fix it
is to just allow loading the tables.

Similarly, make the "LOAD DATA" operation more permissive -
we can copy files into a directory even if we can't
decompress them.

Switch to always checking plugin version - running mismatched plugin
is inherently unsafe.

Testing:
Positive case where LZO is loaded is exercised. Added
coverage for negative case where LZO is disabled.

Fixed test gaps:
* Querying LZO table with LZO plugin not available.
* Interacting with tables with known but unsupported text
  compressions.
* Querying files with unknown compression suffixes (which are
  treated as uncompressed text).

Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Reviewed-on: http://gerrit.cloudera.org:8080/10165
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M be/src/common/global-flags.cc
M be/src/exec/CMakeLists.txt
D be/src/exec/hdfs-lzo-text-scanner.cc
D be/src/exec/hdfs-lzo-text-scanner.h
A be/src/exec/hdfs-plugin-text-scanner.cc
A be/src/exec/hdfs-plugin-text-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java
M fe/src/main/java/org/apache/impala/catalog/HdfsCompression.java
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
A testdata/workloads/functional-query/queries/QueryTest/disable-lzo-plugin.test
A 
testdata/workloads/functional-query/queries/QueryTest/unsupported-compression-partitions.test
A tests/custom_cluster/test_scanner_plugin.py
M tests/metadata/test_partition_metadata.py
18 files changed, 458 insertions(+), 280 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 12: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 03:44:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2501/


--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 18 May 2018 03:18:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-17 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..

IMPALA-5737: Tighten minicluster memory limit

This patch limits minicluster memory based on the memory available.
Two memory profiles are introduced. If there is less than 32GB memory
available, memory limit of the minicluster will be set to ~27GB.
Otherwise it will be set to ~32GB. This will hopefully help with
minicluster with more than 3 nodes.

The 27GB profile works on m2.4xlarge which has 8 cores. It should be
more than enough for 4-core machines.

Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
---
M bin/impala-config.sh
M bin/start-impala-cluster.py
M testdata/bin/run-hbase.sh
M testdata/bin/run-hive-server.sh
M testdata/bin/run-sentry-service.sh
M testdata/cluster/node_templates/common/etc/init.d/common.tmpl
M testdata/cluster/node_templates/common/etc/init.d/hdfs-common
M testdata/cluster/node_templates/common/etc/init.d/kudu-common
8 files changed, 83 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/10277/6
--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] test-with-docker: work with git worktree

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10335 )

Change subject: test-with-docker: work with git worktree
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b
Gerrit-Change-Number: 10335
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 May 2018 02:18:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] test-with-docker: work with git worktree

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10335 )

Change subject: test-with-docker: work with git worktree
..

test-with-docker: work with git worktree

This commit adds a little of git-wrangling to allow test-with-docker to
work when invoked from git directories managed by "git worktree". These
are different in that they reference another git directory elsewhere on
the file system, which also needs to be mounted into the container.

Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b
Reviewed-on: http://gerrit.cloudera.org:8080/10335
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
---
M docker/entrypoint.sh
M docker/test-with-docker.py
2 files changed, 21 insertions(+), 2 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b
Gerrit-Change-Number: 10335
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-17 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 10:

> Uploaded patch set 10.

No notable conflicts during rebase.


--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 10
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 01:56:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8936 )

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..


Patch Set 20:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2503/


--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 20
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 18 May 2018 01:55:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3833: Fix invalid data handling in Sequence and RCFile scanners

2018-05-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8936 )

Change subject: IMPALA-3833: Fix invalid data handling in Sequence and RCFile 
scanners
..


Patch Set 20: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9cfc38af3f30c65ada9734eb471dbfa6ecdd74a
Gerrit-Change-Number: 8936
Gerrit-PatchSet: 20
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Fri, 18 May 2018 01:55:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-17 Thread Bikramjeet Vig (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10060

to look at the new patch set (#10).

Change subject: IMPALA-5216: Make admission control queuing async
..

IMPALA-5216: Make admission control queuing async

Implement asynchronous admission control queuing. This is achieved by
running the admission control code-path in a separate thread. Major
changes include: propagating cancellation to the admission control
thread and dequeuing thread, and adding a new Query Operation State
called "PENDING" that represents the state between completion of
planning and starting of query execution.

Testing:
- Added a deterministic end to end test
- Ran multiple stress tests successfully with a cancellation probability
of 60% and with different values for the following parameters:
max_requests, queue_wait_timeout_ms. Ensured that the impalad was in a
valid state afterwards (no orphan fragments or wrong metrics).
- Ran all exhaustive tests and ASAN core tests successfully.

TODO: change terminology of "in_flight_query" to "submitted_queries".
  Need to identify all references of this terminology, eg. in
  comments, tests, variable names, etc.

Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
---
M be/src/common/atomic.h
M be/src/common/logging.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M be/src/util/promise-test.cc
M be/src/util/promise.h
M common/thrift/ImpalaService.thrift
M tests/authorization/test_authorization.py
M tests/beeswax/impala_beeswax.py
M tests/common/impala_connection.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/hs2/hs2_test_suite.py
M tests/hs2/test_hs2.py
M tests/query_test/test_observability.py
M www/query_backends.tmpl
29 files changed, 659 insertions(+), 227 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10060/10
--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 10
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10418 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 4:

This change did not cherrypick successfully into branch 2.x. To resolve this, 
please do the cherry-pick manually and submit it to Gerrit at refs/for/2.x or 
add an exception to the branch 2.x copy of bin/ignored_commits.json. Thanks, 
your friendly bot at https://jenkins.impala.io/job/cherrypick-2.x-and-test/512/ 
.


--
To view, visit http://gerrit.cloudera.org:8080/10418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10418
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Fri, 18 May 2018 01:45:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-17 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 9:

> > (2 comments)
 > >
 > > will rebase in next patch, please hold off on reviewing it till
 > > then
 >
 > Please try to do rebases in independent patchsets though. It makes
 > looking at diffs of patchsets much easier to keep the rebases
 > separate.

Yup, thats exactly what I am doing :). Just wanted to hold off reviewing till 
then so that after rebase, I can highlight any conflict resolutions that might 
be important to the core changes in this patch.


--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 9
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 01:33:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 01:22:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2502/


--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 3
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 01:23:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 9:

> (2 comments)
 >
 > will rebase in next patch, please hold off on reviewing it till
 > then

Please try to do rebases in independent patchsets though. It makes looking at 
diffs of patchsets much easier to keep the rebases separate.


--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 9
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 01:14:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-17 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10060 )

Change subject: IMPALA-5216: Make admission control queuing async
..


Patch Set 9:

(2 comments)

will rebase in next patch, please hold off on reviewing it till then

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/service/impala-server.cc@663
PS7, Line 663: return Status::OK();
> Right, depends on the client. Maybe check to see if/how CM and Hue and any
will update this once i get a definite response from both teams


http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/util/promise.h
File be/src/util/promise.h:

http://gerrit.cloudera.org:8080/#/c/10060/7/be/src/util/promise.h@77
PS7, Line 77: /// Set() is done and the promise is safe to delete.
> Sure. I was thinking of something simpler just to refactor the code. But I'
Done



--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 9
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 18 May 2018 01:07:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5216: Make admission control queuing async

2018-05-17 Thread Bikramjeet Vig (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10060

to look at the new patch set (#9).

Change subject: IMPALA-5216: Make admission control queuing async
..

IMPALA-5216: Make admission control queuing async

Implement asynchronous admission control queuing. This is achieved by
running the admission control code-path in a separate thread. Major
changes include: propagating cancellation to the admission control
thread and dequeuing thread, and adding a new Query Operation State
called "PENDING" that represents the state between completion of
planning and starting of query execution.

Testing:
- Added a deterministic end to end test
- Ran multiple stress tests successfully with a cancellation probability
of 60% and with different values for the following parameters:
max_requests, queue_wait_timeout_ms. Ensured that the impalad was in a
valid state afterwards (no orphan fragments or wrong metrics).
- Ran all exhaustive tests and ASAN core tests successfully.

TODO: change terminology of "in_flight_query" to "submitted_queries".
  Need to identify all references of this terminology, eg. in
  comments, tests, variable names, etc.

Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
---
M be/src/common/atomic.h
M be/src/common/logging.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/debug-util.cc
M be/src/util/debug-util.h
M be/src/util/promise-test.cc
M be/src/util/promise.h
M common/thrift/ImpalaService.thrift
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
M tests/authorization/test_authorization.py
M tests/beeswax/impala_beeswax.py
M tests/common/impala_connection.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_krpc_mem_usage.py
M tests/hs2/hs2_test_suite.py
M tests/hs2/test_hs2.py
M tests/query_test/test_observability.py
M www/query_backends.tmpl
30 files changed, 656 insertions(+), 232 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/10060/9
--
To view, visit http://gerrit.cloudera.org:8080/10060
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Gerrit-Change-Number: 10060
Gerrit-PatchSet: 9
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 00:19:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2500/


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 00:19:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6941: load more text scanner compression plugins

2018-05-17 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10165 )

Change subject: IMPALA-6941: load more text scanner compression plugins
..


Patch Set 11: Code-Review+2

Makes a lot of sense to me. FE changes lgtm


--
To view, visit http://gerrit.cloudera.org:8080/10165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2a9c4a4a11bed81df706e9e834400bfedfe48e6
Gerrit-Change-Number: 10165
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 00:13:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7029: Clone LHS when rewriting a between predicate

2018-05-17 Thread Tianyi Wang (Code Review)
Tianyi Wang has abandoned this change. ( http://gerrit.cloudera.org:8080/10402 )

Change subject: IMPALA-7029: Clone LHS when rewriting a between predicate
..


Abandoned

This turns out to be a non-issue. The case should be that I changed some expr 
in place which is wrong.
--
To view, visit http://gerrit.cloudera.org:8080/10402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I2db3af99593bc207ee8dc2a1550acf21ebd61c41
Gerrit-Change-Number: 10402
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-6131: Track time of last statistics update in metadata

2018-05-17 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10116 )

Change subject: IMPALA-6131: Track time of last statistics update in metadata
..


Patch Set 12: Code-Review+2

(1 comment)

lgtm after addressing my last comment

http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/10116/12/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2868
PS12, Line 2868: // catalogd would read it back after alter_table, but 
as this would not work for
I don't think it's true that we'd always read it back after calling this 
function.

I think the bigger issue is that if we let the HMS set lastDdlTime, then we'd 
have to fetch the HMS table metadata again only to get that updated 
lastDdlTime. So with it's more efficient for us to set it directly.



-- 
To view, visit http://gerrit.cloudera.org:8080/10116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59a671ac29d352bd92ce40d5cb6662bb23f146b5
Gerrit-Change-Number: 10116
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 18 May 2018 00:05:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2499/


--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 18 May 2018 00:04:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5168: Codegen HASH PARTITIONED KrpcDataStreamSender::Send()

2018-05-17 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10421 )

Change subject: IMPALA-5168: Codegen HASH_PARTITIONED 
KrpcDataStreamSender::Send()
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/10421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c44cc9312c062cc7a5a4ac9156ceaa31fb887ff
Gerrit-Change-Number: 10421
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 17 May 2018 23:56:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5168: Codegen HASH PARTITIONED KrpcDataStreamSender::Send()

2018-05-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10421 )

Change subject: IMPALA-5168: Codegen HASH_PARTITIONED 
KrpcDataStreamSender::Send()
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10421/1/testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test
File testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test:

http://gerrit.cloudera.org:8080/#/c/10421/1/testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test@61
PS1, Line 61:
> would you be adding the tests in the next patch?
Oh sorry, forgot to reply. I added it in the new file 
datastream-sender-codegen.test.



--
To view, visit http://gerrit.cloudera.org:8080/10421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c44cc9312c062cc7a5a4ac9156ceaa31fb887ff
Gerrit-Change-Number: 10421
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 17 May 2018 23:38:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-17 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..

IMPALA-5737: Tighten minicluster memory limit

This patch limits minicluster memory based on the memory available.
Two memory profiles are introduced. If there is less than 32GB memory
available, memory limit of the minicluster will be set to ~27GB.
Otherwise it will be set to ~32GB. This will hopefully help with
minicluster with more than 3 nodes.

The 27GB profile works on m2.4xlarge which has 8 cores. It should be
more than enough for 4-core machines.

Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
---
M bin/impala-config.sh
M bin/start-impala-cluster.py
M testdata/bin/run-hbase.sh
M testdata/bin/run-hive-server.sh
M testdata/bin/run-sentry-service.sh
M testdata/cluster/node_templates/common/etc/init.d/common.tmpl
M testdata/cluster/node_templates/common/etc/init.d/hdfs-common
M testdata/cluster/node_templates/common/etc/init.d/kudu-common
8 files changed, 83 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/10277/5
--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 23:26:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] test-with-docker: work with git worktree

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10335 )

Change subject: test-with-docker: work with git worktree
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2497/


--
To view, visit http://gerrit.cloudera.org:8080/10335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b
Gerrit-Change-Number: 10335
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 22:55:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6662: Make stress test resilient to hangs due to client crashes

2018-05-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9635 )

Change subject: IMPALA-6662: Make stress test resilient to hangs due to client 
crashes
..


Patch Set 6:

What's the next step here?


--
To view, visit http://gerrit.cloudera.org:8080/9635
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I10c5dc9b8c2fffc471bac2279e348bc1d1fec3b7
Gerrit-Change-Number: 9635
Gerrit-PatchSet: 6
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 17 May 2018 22:48:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] test-with-docker: work with git worktree

2018-05-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10335 )

Change subject: test-with-docker: work with git worktree
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10335
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9186e0b6f068aacc25f8d691508165c04329fa8b
Gerrit-Change-Number: 10335
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 22:45:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-17 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10277/4/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/10277/4/bin/impala-config.sh@653
PS4, Line 653: # Get cgroup available memory
nit: is this misaligned now?



--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Thu, 17 May 2018 22:40:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5168: Codegen HASH PARTITIONED KrpcDataStreamSender::Send()

2018-05-17 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10421 )

Change subject: IMPALA-5168: Codegen HASH_PARTITIONED 
KrpcDataStreamSender::Send()
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc@718
PS1, Line 718:  
RETURN_IF_ERROR(partition_exprs_[i]->GetCodegendComputeFn(codegen, 
_fn));
 :
 : // Load the expression evaluator for the i-th partition 
expression
 : llvm::Function* get_expr_eval_fn =
 : 
codegen->GetFunction(IRFunction::KRPC_DSS_GET_PART_EXPR_EVAL, false);
 : DCHECK(get_expr_eval_fn != nullptr);
> The above is actually the unoptimized version of the code. The reason I avo
sounds good


http://gerrit.cloudera.org:8080/#/c/10421/1/testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test
File testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test:

http://gerrit.cloudera.org:8080/#/c/10421/1/testdata/workloads/functional-query/queries/QueryTest/disable-codegen.test@61
PS1, Line 61:
> add cases for "codegen not supported/needed"
would you be adding the tests in the next patch?



--
To view, visit http://gerrit.cloudera.org:8080/10421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c44cc9312c062cc7a5a4ac9156ceaa31fb887ff
Gerrit-Change-Number: 10421
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 17 May 2018 22:40:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](2.x) IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10446 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10446
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 22:39:46 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-17 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10446 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 1:

Trivial cherry-pick; simply ignored the second copy of the 'kms' file.


--
To view, visit http://gerrit.cloudera.org:8080/10446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10446
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 22:37:16 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-17 Thread Philip Zeyliger (Code Review)
Hello Joe McDonnell, Impala Public Jenkins,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/10446

to review the following change.


Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..

IMPALA-7035: Configure jceks.key.serialFilter for KMS.

Configures a Java property for KMS to account for JDK 8u171's security fixes. I
was seeing impala-py.test tests/metadata/test_hdfs_encryption.py fail with the
following error:

  AssertionError: Error creating encryption zone: RemoteException: Can't 
recover key for testkey1 from keystore 
file:/home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore

The issue is described in HDFS-13494, and I imagine it'll be fixed in due time. 
In the
meanwhile, setting this property seems to do the trick.

Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Reviewed-on: http://gerrit.cloudera.org:8080/10418
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
---
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
1 file changed, 6 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/10446/1
--
To view, visit http://gerrit.cloudera.org:8080/10446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10446
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-17 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10404/5/be/src/common/init.cc
File be/src/common/init.cc:

http://gerrit.cloudera.org:8080/#/c/10404/5/be/src/common/init.cc@302
PS5, Line 302: const string UBSAN_DEFAULT_OPTIONS = Substitute(
> A couple of other nits I forgot:
I moved to a static variable inside the function. I think then there's no 
question about the static initialization order stuff, and we're also clearly 
not leaking the name.


http://gerrit.cloudera.org:8080/#/c/10404/5/be/src/common/init.cc@304
PS5, Line 304: ==
> !=
Yep, caught this in further testing. I must have done a round of testing 
without the #if defined(UNDEFINED_SANITIZER) stuff; it turns out that it wasn't 
defined (hence a new CMakeFiles change to define it).


http://gerrit.cloudera.org:8080/#/c/10404/5/be/src/common/init.cc@305
PS5, Line 305: extern "C" const char *__ubsan_default_options() {
> Oh, and actually one other thing - how confident are you that this avoids t
See above.



--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 22:31:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6070: Adding ASAN, --tail to test-with-docker.

2018-05-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10319 )

Change subject: IMPALA-6070: Adding ASAN, --tail to test-with-docker.
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51451cdf1352fc0f9516d729b9a77700488d993f
Gerrit-Change-Number: 10319
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 22:31:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@695
PS1, Line 695: is_fragment_failure ? 
> as mentioned in this thread, it will be no worse than before this change. I
sounds good. Thanks for the explanation.



--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 22:30:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10404/5/be/src/common/init.cc
File be/src/common/init.cc:

http://gerrit.cloudera.org:8080/#/c/10404/5/be/src/common/init.cc@304
PS5, Line 304: ==
!=



--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 22:17:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10418 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..

IMPALA-7035: Configure jceks.key.serialFilter for KMS.

Configures a Java property for KMS to account for JDK 8u171's security fixes. I
was seeing impala-py.test tests/metadata/test_hdfs_encryption.py fail with the
following error:

  AssertionError: Error creating encryption zone: RemoteException: Can't 
recover key for testkey1 from keystore 
file:/home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore

The issue is described in HDFS-13494, and I imagine it'll be fixed in due time. 
In the
meanwhile, setting this property seems to do the trick.

Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Reviewed-on: http://gerrit.cloudera.org:8080/10418
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 
---
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
M testdata/cluster/node_templates/cdh6/etc/init.d/kms
2 files changed, 12 insertions(+), 0 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10418
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10418 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10418
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 22:15:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5168: Codegen HASH PARTITIONED KrpcDataStreamSender::Send()

2018-05-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10421 )

Change subject: IMPALA-5168: Codegen HASH_PARTITIONED 
KrpcDataStreamSender::Send()
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc@718
PS1, Line 718:  
RETURN_IF_ERROR(partition_exprs_[i]->GetCodegendComputeFn(codegen, 
_fn));
 :
 : // Load the expression evaluator for the i-th partition 
expression
 : llvm::Function* get_expr_eval_fn =
 : 
codegen->GetFunction(IRFunction::KRPC_DSS_GET_PART_EXPR_EVAL, false);
 : DCHECK(get_expr_eval_fn != nullptr);
> can we use Builder.CreateExtractValue() and get rid of this function call.
The above is actually the unoptimized version of the code. The reason I avoided 
using CreateExtractValue() or GetElementPtr() etc is to avoid the need to make 
assumption about the order of the fields inside the struct plus it's easier to 
review in general.

I confirmed in the optimized version of the code that this function call is 
actually inlined and optimized away as dead code.


http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc@803
PS1, Line 803: t string sender_name = PartitionTypeName() + " Send
> nit: Invert condition to avoid indentation.
Done


http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc@818
PS1, Line 818: ced;
> nit: HashRow()
Done


http://gerrit.cloudera.org:8080/#/c/10421/1/be/src/runtime/krpc-data-stream-sender.cc@820
PS1, Line 820: en->ReplaceCallSitesWithValue(hash_and_add_
> nit: maybe use a named const like
Done



--
To view, visit http://gerrit.cloudera.org:8080/10421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c44cc9312c062cc7a5a4ac9156ceaa31fb887ff
Gerrit-Change-Number: 10421
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Comment-Date: Thu, 17 May 2018 22:11:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5168: Codegen HASH PARTITIONED KrpcDataStreamSender::Send()

2018-05-17 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/10421 )

Change subject: IMPALA-5168: Codegen HASH_PARTITIONED 
KrpcDataStreamSender::Send()
..

IMPALA-5168: Codegen HASH_PARTITIONED KrpcDataStreamSender::Send()

This change codegens the hash partitioning logic of
KrpcDataStreamSender::Send() when the partitioning strategy
is HASH_PARTITIONED. It does so by unrolling the loop which
evaluates each row against the partitioning expressions and
hashes the result. It also replaces the number of channels
of that sender with a constant at runtime.

With this change, we get reasonable speedup with some benchmarks:

++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
++---+-++++
| TPCH(_300) | parquet / none / none | 20.03   | -6.44% | 13.56  | 
-7.15% |
++---+-++++

+-+---+-++++
| Workload| File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
+-+---+-++++
| TARGETED-PERF(_300) | parquet / none / none | 58.59   | -5.56% | 12.28
  | -5.30% |
+-+---+-++++

+-+---+-++++
| Workload| File Format   | Avg (s) | Delta(Avg) | 
GeoMean(s) | Delta(GeoMean) |
+-+---+-++++
| TPCDS-UNMODIFIED(_1000) | parquet / none / none | 15.60   | -3.10% | 7.16 
  | -4.33% |
+-+---+-++++

+---+---+-++++
| Workload  | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) 
| Delta(GeoMean) |
+---+---+-++++
| TPCH_NESTED(_300) | parquet / none / none | 30.93   | -3.02% | 17.46  
| -4.71% |
+---+---+-++++

Change-Id: I1c44cc9312c062cc7a5a4ac9156ceaa31fb887ff
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exchange-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/partial-sort-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/sort-node.cc
M be/src/exec/topn-node.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/fragment-instance-state.cc
A be/src/runtime/krpc-data-stream-sender-ir.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/raw-value-ir.cc
M be/src/runtime/raw-value.cc
M be/src/runtime/runtime-state.h
M be/src/util/runtime-profile.h
A 
testdata/workloads/functional-query/queries/QueryTest/datastream-sender-codegen.test
M tests/query_test/test_codegen.py
25 files changed, 427 insertions(+), 100 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/10421/2
--
To view, visit http://gerrit.cloudera.org:8080/10421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1c44cc9312c062cc7a5a4ac9156ceaa31fb887ff
Gerrit-Change-Number: 10421
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Bikramjeet Vig 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Dan Hecht (Code Review)
Hello Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10440

to look at the new patch set (#2).

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..

IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

The is the final change to clarify and break up the Coordinator's lock.
The state machine for the coordinator is made explicit, distinguishing
between executing state and multiple terminal states. Logic to
transition into a terminal state is centralized in one location and
executes exactly once for each coordinator object.

Derived from a patch for IMPALA_5384 by Marcel Kornacker.

Testing:
- exhaustive functional tests
- stress test on minicluster with memory overcommitment. Verified from
  the logs that this exercises all these paths:
  - successful queries
  - client requested cancellation
  - error from exec FInstances RPC
  - error reported asynchronously via report status RPC
  - eos before backend execution completed
- loop query_test & failure for 12 hours with no dchecks or crashes
  (This had previously reproduced IMPALA-7030 and IMPALA-7033 with
  the previous version of this change).

Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
---
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
M be/src/service/impala-server.h
M be/src/util/counting-barrier.h
6 files changed, 396 insertions(+), 391 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/10440/2
--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.h@351
PS1, Line 351: Cannot finalize execution until exec RPCs are no longer
 :   /// being sent.
> We added a DCHECK, but this comment suggests that this method will ensure t
Yeah, I can see how the comment is confusing. Updated the comments. 
(Ultimately, in a future change I likely will move the wait() but want to hold 
off on that to "prove" that the wait is only needed for the 
UpdateBackendExecStatus() call path)


http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@530
PS1, Line 530: // TODO: IMPALA-7011 is this needed? This will also happen on 
the "backend" side of
 :   // cancel rpc. And in the case of EOS, sink already knows this.
> I think we can update this comment, since we figured out that we do need it
Done. (Though I haven't given up on this quite yet -- I think it's needed given 
the current plan-root-sink code, but it might be possible to fix that code to 
make it not needed. But the JIRA can track that work).


http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@695
PS1, Line 695: exec_rpcs_complete_barrier_->Wait();
> should we be concerned about adding a possible long wait() call on an RPC c
as mentioned in this thread, it will be no worse than before this change. In 
the old code, we hold the Coordinator::lock_ while sending the exec RPCs. 
Additionally, we take the Coordinator::lock_ inside UpdateStatus() and also at 
line 688 -- so this RPC handler will always block until the exec rpcs are done 
sending in the old code.

In the new code, the blocking only happens when the query hits an error.

I agree it's not ideal, but we're better off. And I plan to improve things with 
IMPALA-6788.



--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 22:00:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR](2.x) Ignore "IMPALA-7003: Deflake erasure coding data loading"

2018-05-17 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10445 )

Change subject: Ignore "IMPALA-7003: Deflake erasure coding data loading"
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9beafd4b0f0fc163ebe969fc39b4fdb6b27c0fa
Gerrit-Change-Number: 10445
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Thu, 17 May 2018 21:58:13 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) Ignore "IMPALA-7003: Deflake erasure coding data loading"

2018-05-17 Thread Tianyi Wang (Code Review)
Tianyi Wang has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10445 )

Change subject: Ignore "IMPALA-7003: Deflake erasure coding data loading"
..

Ignore "IMPALA-7003: Deflake erasure coding data loading"

The commit message in IMPALA-7003 mistakenly didn't include
"not for 2.x".

Change-Id: Ic9beafd4b0f0fc163ebe969fc39b4fdb6b27c0fa
Reviewed-on: http://gerrit.cloudera.org:8080/10445
Reviewed-by: Joe McDonnell 
Tested-by: Tianyi Wang 
---
M bin/ignored_commits.json
1 file changed, 3 insertions(+), 1 deletion(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Tianyi Wang: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic9beafd4b0f0fc163ebe969fc39b4fdb6b27c0fa
Gerrit-Change-Number: 10445
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR](2.x) Ignore "IMPALA-7003: Deflake erasure coding data loading"

2018-05-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10445 )

Change subject: Ignore "IMPALA-7003: Deflake erasure coding data loading"
..


Patch Set 1: Code-Review+2

Thanks. Run this to validate the JSON:
python -m json.tool bin/ignored_commits.json

Then you can +1 verify and merge.


--
To view, visit http://gerrit.cloudera.org:8080/10445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic9beafd4b0f0fc163ebe969fc39b4fdb6b27c0fa
Gerrit-Change-Number: 10445
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Thu, 17 May 2018 21:55:57 +
Gerrit-HasComments: No


[Impala-ASF-CR](2.x) Ignore "IMPALA-7003: Deflake erasure coding data loading"

2018-05-17 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10445


Change subject: Ignore "IMPALA-7003: Deflake erasure coding data loading"
..

Ignore "IMPALA-7003: Deflake erasure coding data loading"

The commit message in IMPALA-7003 mistakenly didn't include
"not for 2.x".

Change-Id: Ic9beafd4b0f0fc163ebe969fc39b4fdb6b27c0fa
---
M bin/ignored_commits.json
1 file changed, 3 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/10445/1
--
To view, visit http://gerrit.cloudera.org:8080/10445
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: 2.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic9beafd4b0f0fc163ebe969fc39b4fdb6b27c0fa
Gerrit-Change-Number: 10445
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.h@351
PS1, Line 351: Cannot finalize execution until exec RPCs are no longer
 :   /// being sent.
> this comment is new
We added a DCHECK, but this comment suggests that this method will ensure that 
finalization fails(like returns a non-ok status) if exec RPCs are pending.
we should either move the wait() call to this method or move this comment to 
UpdateBackendExecStatus() and update the one for this to say that "It should 
not be called if exec RPCs are pending"


http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@530
PS1, Line 530: // TODO: IMPALA-7011 is this needed? This will also happen on 
the "backend" side of
 :   // cancel rpc. And in the case of EOS, sink already knows this.
I think we can update this comment, since we figured out that we do need it.


http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@695
PS1, Line 695: exec_rpcs_complete_barrier_->Wait();
should we be concerned about adding a possible long wait() call on an RPC 
callback?



--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 21:48:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls

2018-05-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8523 )

Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls
..


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/Frontend.thrift@375
PS12, Line 375:   per_node_scan_ranges
Thanks, the new thrift structures make more sense to me. Maybe it's simpler by 
moving the "union" up one level. i.e. make this:

map>

Then the backend can be simplified: the "ConcreteList" is already in the format 
you need, and the GeneratorSpecList can be converted to a ConcreteList in one 
go that you pass to the scheduler. So then no need for the iterator thingy.

What do you think?


http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

http://gerrit.cloudera.org:8080/#/c/8523/12/common/thrift/PlanNodes.thrift@202
PS12, Line 202: THdfsFileSplitGeneratorSpec
update



--
To view, visit http://gerrit.cloudera.org:8080/8523
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5
Gerrit-Change-Number: 8523
Gerrit-PatchSet: 12
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 17 May 2018 21:48:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-17 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..


Patch Set 3:

I forgot to test the latest change outsider of a docker. (/proc/self/croup 
always exists so we can't assume it's inside a docker with that condition)


--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Thu, 17 May 2018 21:39:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2495/


--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Thu, 17 May 2018 21:33:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks

2018-05-17 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10396 )

Change subject: IMPALA-3134: Support different proc mem limits among impalads 
for admission control checks
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10396/3/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/10396/3/be/src/scheduling/admission-controller.cc@408
PS3, Line 408: pair min_proc_mem_limit =
 :   make_pair(nullptr, std::numeric_limits::max());
@Tim: let me know if you decide not to adopt this pattern for your patch for 
IMPALA-6035



--
To view, visit http://gerrit.cloudera.org:8080/10396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e
Gerrit-Change-Number: 10396
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 21:24:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks

2018-05-17 Thread Bikramjeet Vig (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10396

to look at the new patch set (#3).

Change subject: IMPALA-3134: Support different proc mem limits among impalads 
for admission control checks
..

IMPALA-3134: Support different proc mem limits among impalads for
admission control checks

Currently the admission controller assumes that all backends have the
same process mem limit as the impalad it itself is running on. With
this patch the proc mem limit for each impalad is available to the
admission controller and it uses it for making correct admisssion
decisions. It currently works under the assumption that the
per-process memory limit does not change dynamically.

Testing:
Added an e2e test.

Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/impala-server.cc
M bin/start-impala-cluster.py
M common/thrift/StatestoreService.thrift
M tests/custom_cluster/test_admission_controller.py
8 files changed, 119 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/10396/3
--
To view, visit http://gerrit.cloudera.org:8080/10396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e
Gerrit-Change-Number: 10396
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3134: Support different proc mem limits among impalads for admission control checks

2018-05-17 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10396 )

Change subject: IMPALA-3134: Support different proc mem limits among impalads 
for admission control checks
..


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/10396/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10396/2//COMMIT_MSG@14
PS2, Line 14: decisions.
> Can you mention that this assumes that the per-process memory limit does no
Done


http://gerrit.cloudera.org:8080/#/c/10396/2/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/10396/2/be/src/scheduling/admission-controller.cc@49
PS2, Line 49: int64_t GetProcMemLimit() {
> This is dead code now
Done


http://gerrit.cloudera.org:8080/#/c/10396/2/be/src/scheduling/admission-controller.cc@152
PS2, Line 152: const string HOST_MEM_NOT_AVAILABLE = "Not enough memory 
available on host $0."
> Maybe this error message should include the total process memory limit, e.g
good idea. Done.


http://gerrit.cloudera.org:8080/#/c/10396/2/be/src/scheduling/admission-controller.cc@450
PS2, Line 450:
> Unnecessary vertical whitespace
Done


http://gerrit.cloudera.org:8080/#/c/10396/2/be/src/scheduling/admission-controller.cc@458
PS2, Line 458:
> Vertical whitespace
Done


http://gerrit.cloudera.org:8080/#/c/10396/2/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

http://gerrit.cloudera.org:8080/#/c/10396/2/be/src/scheduling/query-schedule.h@69
PS2, Line 69:   // The process memory limit of this backend.
> Maybe mention that it's the value obtained from the statestore during sched
Done


http://gerrit.cloudera.org:8080/#/c/10396/2/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/10396/2/bin/start-impala-cluster.py@87
PS2, Line 87: seperated
> separated
Done


http://gerrit.cloudera.org:8080/#/c/10396/2/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/10396/2/tests/custom_cluster/test_admission_controller.py@429
PS2, Line 429: 1000
> Can we set this lower? Is there a reason that it needs to sit in the queue
forgot to update that.


http://gerrit.cloudera.org:8080/#/c/10396/2/tests/custom_cluster/test_admission_controller.py@431
PS2, Line 431:   def test_heterogeneous_proc_mem_limit(self, vector):
> Can we add a 3GB query that succeeds? E.g. with num_nodes=1, submitted to o
Done


http://gerrit.cloudera.org:8080/#/c/10396/2/tests/custom_cluster/test_admission_controller.py@433
PS2, Line 433: mem limits of each impalad"""
> Maybe add an extra sentence to explain the concept of the test. e.g. "Start
Done


http://gerrit.cloudera.org:8080/#/c/10396/2/tests/custom_cluster/test_admission_controller.py@434
PS2, Line 434: # Choose a query that runs on all 3 backends.
> Are all these queries submitted to the first impalad? It's not clear if tha
modified a test to send a query to a different impalad. (check the queuing test)



--
To view, visit http://gerrit.cloudera.org:8080/10396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e
Gerrit-Change-Number: 10396
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 21:18:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4025: Part 1: Add percentile disc aggregation function

2018-05-17 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9777 )

Change subject: IMPALA-4025: Part 1: Add percentile_disc aggregation function
..


Patch Set 7:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java@99
PS7, Line 99: outputGroupingExprSmap_
is this used (or planned to be used) anywhere?


http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@187
PS7, Line 187:   void setContainsPercentile() {
public?


http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@217
PS7, Line 217: boolean containsPercentile = false;
make public as well?


http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@354
PS7, Line 354:   boolean selectBlockContainsPercentile() { return 
containsPercentile; }
intentional visibility?


http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/9777/7/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@108
PS7, Line 108: rewriteSelectStatementHook(stmt, analyzer);
Any way to separate this refactor into its own change? first example I saw: 
where did the helper 'hasSubqueryInDisunction' go? I'm guessing this doesn't 
have anything to specifically do with the change for percentile. Looks like a 
good cleanup overall, but since there is some subtle existing functionality 
here, it would be helpful to separate clean-up from the actual change.


http://gerrit.cloudera.org:8080/#/c/9777/5/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
File fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java:

http://gerrit.cloudera.org:8080/#/c/9777/5/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java@253
PS5, Line 253: if (isLogical_) {
change this to a precondition



--
To view, visit http://gerrit.cloudera.org:8080/9777
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iacef7b3fcd74c4c73d88400ce27307c3baa0121e
Gerrit-Change-Number: 9777
Gerrit-PatchSet: 7
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 17 May 2018 21:11:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-110: Planner support for multiple distinct aggregations.

2018-05-17 Thread Alex Behm (Code Review)
Alex Behm has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10444


Change subject: IMPALA-110: Planner support for multiple distinct aggregations.
..

IMPALA-110: Planner support for multiple distinct aggregations.

Adds planner support for multiple distinct aggregations in
a single SELECT block.

Design summary:
- The existing tree-based plan shape with a two-phased
  aggregation is maintained.
- Existing plans are not changed.
- A single query block may contain multiple distinct aggregates.
- Aggregates are grouped into "aggregation classes" based on their
  expressions in the distinct portion which may be empty for
  non-distinct aggregates.
- The aggregation framework is generalized to simultaneously process
  multiple aggregation classes within the tree-based plan. This process
  splits the results of different aggregation classes into separate rows,
  so a final aggregation is needed to transpose the results into the
  desired form.
- Main challenge: Each aggregation class consumes and produces
  different tuples, so conceptually a union-type of tuples flows through
  the runtime. The tuple union is represented by a TupleRow with one tuple
  per aggregation class. Only one tuple in such a TupleRow is non-NULL.
- Backend exec nodes in the aggregation plan will be aware of this
  tuple-union either explicitly in their implementation or by relying on
  expressions that distinguish the aggregation classes.
- To distinguish the aggregation classes, e.g. in hash exchanges,
  CASE expressions are crafted to hash/group on the appropriate slots.

Deferred FE work:
- Beautify/condense the long CASE exprs
- Push applicable conjuncts into individual aggregators before
  the transposition step
- Added a few testing TODOs to reduce the size of this patch
- Decide whether we want to change existing plans to the new model

Testing:
- Added analyzer and planner tests
- Ran end-to-end queries based on a prototype BE implementation
- Ran hdfs/core tests

Change-Id: I4c5cb348f9431350d2e5bf2c84325dcc44d38d2f
---
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exprs/CMakeLists.txt
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M be/src/exprs/scalar-expr.cc
A be/src/exprs/valid-tuple-id.cc
A be/src/exprs/valid-tuple-id.h
M common/thrift/Exprs.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
A fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
A fe/src/main/java/org/apache/impala/analysis/ValidTupleIdExpr.java
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/multiple-distinct-limit.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/multiple-distinct-materialization.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/multiple-distinct-predicates.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/multiple-distinct.test
37 files changed, 4,809 insertions(+), 591 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/10444/1
--
To view, visit http://gerrit.cloudera.org:8080/10444
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4c5cb348f9431350d2e5bf2c84325dcc44d38d2f
Gerrit-Change-Number: 10444
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2495/


--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Thu, 17 May 2018 20:44:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5737: Tighten minicluster memory limit

2018-05-17 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/10277 )

Change subject: IMPALA-5737: Tighten minicluster memory limit
..

IMPALA-5737: Tighten minicluster memory limit

This patch limits minicluster memory based on the memory available.
Two memory profiles are introduced. If there is less than 32GB memory
available, memory limit of the minicluster will be set to ~27GB.
Otherwise it will be set to ~32GB. This will hopefully help with
minicluster with more than 3 nodes.

The 27GB profile works on m2.4xlarge which has 8 cores. It should be
more than enough for 4-core machines.

Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
---
M bin/impala-config.sh
M bin/start-impala-cluster.py
M testdata/bin/run-hbase.sh
M testdata/bin/run-hive-server.sh
M testdata/bin/run-sentry-service.sh
M testdata/cluster/node_templates/common/etc/init.d/common.tmpl
M testdata/cluster/node_templates/common/etc/init.d/hdfs-common
M testdata/cluster/node_templates/common/etc/init.d/kudu-common
8 files changed, 81 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/10277/3
--
To view, visit http://gerrit.cloudera.org:8080/10277
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If8240551e726c6da546a926a1ce3444f41ef87fe
Gerrit-Change-Number: 10277
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
..

IMPALA-5842: Write page index in Parquet files

This commit builds on the previous work of
Pooja Nilangekar: https://gerrit.cloudera.org/#/c/7464/

The commit implements the write path of PARQUET-922:
"Add column indexes to parquet.thrift". As specified in the
parquet-format, Impala writes the page indexes just before
the footer. This allows much more efficient page filtering
than using the same information from the 'statistics' field
of DataPageHeader.

I updated Pooja's python tests as well.

Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Reviewed-on: http://gerrit.cloudera.org:8080/9693
Reviewed-by: Zoltan Borok-Nagy 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/parquet-column-stats.h
M be/src/exec/parquet-column-stats.inline.h
M be/src/util/CMakeLists.txt
A be/src/util/string-util-test.cc
A be/src/util/string-util.cc
A be/src/util/string-util.h
M common/thrift/parquet.thrift
M testdata/bin/load-dependent-tables.sql
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/query_test/test_chars.py
A tests/query_test/test_parquet_page_index.py
M tests/util/get_parquet_metadata.py
14 files changed, 1,002 insertions(+), 113 deletions(-)

Approvals:
  Zoltan Borok-Nagy: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 21
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
..


Patch Set 20: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 20
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 May 2018 20:22:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 5: Code-Review+1

Actually I'll let Jim do +2


--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 19:47:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10404/2//COMMIT_MSG
Commit Message:

PS2:
> Nice work; you got me to do it. ;)
We have TestInfo::is_test(), but I'm not sure if that is initialized early 
enough.



--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 19:46:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-17 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 5:

(2 comments)

If this is getting too hairy for UBSAN, feel free to move it back to the shell 
files.

http://gerrit.cloudera.org:8080/#/c/10404/5/be/src/common/init.cc
File be/src/common/init.cc:

http://gerrit.cloudera.org:8080/#/c/10404/5/be/src/common/init.cc@302
PS5, Line 302: const string UBSAN_DEFAULT_OPTIONS = Substitute(
A couple of other nits I forgot:

I think the style guide says to call this kUbsanDefaultOptions and to put it in 
an unnamed namespace.


http://gerrit.cloudera.org:8080/#/c/10404/5/be/src/common/init.cc@305
PS5, Line 305: extern "C" const char *__ubsan_default_options() {
Oh, and actually one other thing - how confident are you that this avoids the 
static initialization order fiasco?

https://isocpp.org/wiki/faq/ctors#static-init-order



--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 19:46:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@693
PS1, Line 693:   // We may start receiving status reports before all exec 
rpcs are complete.
 :   // Can't apply state transition until no more exec rpcs 
will be sent.
 :   exec_rpcs_complete_barrier_->Wait();
> Just to make sure I understand the bug correctly, we only need to wait for
For the bug we were hitting. But we also need to wait for all exec rpcs to 
avoid the case of the cancel rpc passing the exec rpc for the other backends 
(the new comment in HandleExecStateTransition is meant to explain that - lmk if 
it needs clarification).

I think what we'll want to do is preserve this barrier, but be able to skip 
sending exec rpcs when there is an error to unblock this barrier sooner.



--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 19:16:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10418 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2494/


--
To view, visit http://gerrit.cloudera.org:8080/10418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10418
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 18:55:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-17 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10404/2//COMMIT_MSG
Commit Message:

PS2:
> Should we do this for the other sanitizers, too?
Nice work; you got me to do it. ;)

I'm not aware of automated things that use UBSAN and TSAN for extra testing, 
but I think this is safe enough.

If someone's familiar with a way to check if I'm in a test at runtime, I'm all 
ears. I couldn't bring myself to parse argv[0].


http://gerrit.cloudera.org:8080/#/c/10404/4/be/src/common/init.cc
File be/src/common/init.cc:

http://gerrit.cloudera.org:8080/#/c/10404/4/be/src/common/init.cc@302
PS4, Line 302: string UBSAN_DEFAULT_OPTIONS = Substitute(
> nit: const
Done



--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 18:50:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-17 Thread Philip Zeyliger (Code Review)
Hello Jim Apple, Tim Armstrong, Joe McDonnell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10404

to look at the new patch set (#5).

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..

Moving default sanitizer options into init.cc from shell scripts.

When running tests with ASAN, you need to set ASAN_OPTIONS explicitly,
to avoid various failures. In particular, backend tests fail complaining
about memory leaks and tests that use the parquet-reader binary complain
similarly. It turns out that we can shove the default options into our
code base directly, avoiding the need for users to set it explicitly.

I've done the same thing for TSAN and UBSAN.

I've manually checked that these are being read. In the UBSAN case, I
checked both with gdb and with inotifywatch on the suppressions file.

Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
---
M be/src/common/init.cc
M bin/run-backend-tests.sh
M bin/start-catalogd.sh
M bin/start-impalad.sh
M bin/start-statestored.sh
5 files changed, 27 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/10404/5
--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-17 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10404 )

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10404/4/be/src/common/init.cc
File be/src/common/init.cc:

http://gerrit.cloudera.org:8080/#/c/10404/4/be/src/common/init.cc@302
PS4, Line 302: string UBSAN_DEFAULT_OPTIONS = Substitute(
nit: const



--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 18:48:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-17 Thread Philip Zeyliger (Code Review)
Hello Jim Apple, Tim Armstrong, Joe McDonnell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10404

to look at the new patch set (#4).

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..

Moving default sanitizer options into init.cc from shell scripts.

When running tests with ASAN, you need to set ASAN_OPTIONS explicitly,
to avoid various failures. In particular, backend tests fail complaining
about memory leaks and tests that use the parquet-reader binary complain
similarly. It turns out that we can shove the default options into our
code base directly, avoiding the need for users to set it explicitly.

I've done the same thing for TSAN and UBSAN.

I've manually checked that these are being read. In the UBSAN case, I
checked both with gdb and with inotifywatch on the suppressions file.

Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
---
M be/src/common/init.cc
M bin/run-backend-tests.sh
M bin/start-catalogd.sh
M bin/start-impalad.sh
M bin/start-statestored.sh
5 files changed, 27 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/10404/4
--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Moving default sanitizer options into init.cc from shell scripts.

2018-05-17 Thread Philip Zeyliger (Code Review)
Hello Jim Apple, Tim Armstrong, Joe McDonnell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10404

to look at the new patch set (#3).

Change subject: Moving default sanitizer options into init.cc from shell 
scripts.
..

Moving default sanitizer options into init.cc from shell scripts.

When running tests with ASAN, you need to set ASAN_OPTIONS explicitly,
to avoid various failures. In particular, backend tests fail complaining
about memory leaks and tests that use the parquet-reader binary complain
similarly. It turns out that we can shove the default options into our
code base directly, avoiding the need for users to set it explicitly.

I've done the same thing for TSAN and UBSAN.

I've manually checked that these are being read. In the UBSAN case, I
checked both with gdb and with inotifywatch on the suppressions file.

Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
---
M be/src/common/init.cc
M bin/run-backend-tests.sh
M bin/start-catalogd.sh
M bin/start-impalad.sh
M bin/start-statestored.sh
5 files changed, 26 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/10404/3
--
To view, visit http://gerrit.cloudera.org:8080/10404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3cbbd210c67750a48003f336bea1f3e1cb2d9e6b
Gerrit-Change-Number: 10404
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7000: [DOCS] Correct info about dedicated executors

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10357 )

Change subject: IMPALA-7000: [DOCS] Correct info about dedicated executors
..

IMPALA-7000: [DOCS] Correct info about dedicated executors

Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
Reviewed-on: http://gerrit.cloudera.org:8080/10357
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M docs/topics/impala_scalability.xml
1 file changed, 7 insertions(+), 5 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
Gerrit-Change-Number: 10357
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7000: [DOCS] Correct info about dedicated executors

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10357 )

Change subject: IMPALA-7000: [DOCS] Correct info about dedicated executors
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
Gerrit-Change-Number: 10357
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 18:31:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10418 )

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10418
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 17 May 2018 18:23:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7000: [DOCS] Correct info about dedicated executors

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10357 )

Change subject: IMPALA-7000: [DOCS] Correct info about dedicated executors
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/292/


--
To view, visit http://gerrit.cloudera.org:8080/10357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
Gerrit-Change-Number: 10357
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 18:22:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7000: [DOCS] Correct info about dedicated executors

2018-05-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10357 )

Change subject: IMPALA-7000: [DOCS] Correct info about dedicated executors
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10357
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b7e6c57188a41a45d5813882b6dbc37cf47cf1f
Gerrit-Change-Number: 10357
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 18:21:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6802 (part 4): Clean up authorization tests

2018-05-17 Thread Adam Holley (Code Review)
Adam Holley has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10442


Change subject: IMPALA-6802 (part 4): Clean up authorization tests
..

IMPALA-6802 (part 4): Clean up authorization tests

The third part of this patch is to rewrite the following authorization
tests:
- describe

Testing:
- Added new authorization tests
- Ran all front-end tests

Change-Id: Ic4cf3a4751b41908ef81ec35d89a2713d9fa0dc4
Cherry-picks: not for 2.x
---
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTestV2.java
1 file changed, 262 insertions(+), 35 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/10442/1
--
To view, visit http://gerrit.cloudera.org:8080/10442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic4cf3a4751b41908ef81ec35d89a2713d9fa0dc4
Gerrit-Change-Number: 10442
Gerrit-PatchSet: 1
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Fredy Wijaya 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 1: Code-Review+1

(1 comment)

Bikram may want to take a look too.

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@693
PS1, Line 693:   // We may start receiving status reports before all exec 
rpcs are complete.
 :   // Can't apply state transition until no more exec rpcs 
will be sent.
 :   exec_rpcs_complete_barrier_->Wait();
> Also note that we are still better off than before this patch because previ
Just to make sure I understand the bug correctly, we only need to wait for exec 
RPCs for this backend, right? I.e. waiting for all exec RPCs is sufficient but 
not necessary.

Not suggesting we need to change it.



--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 18:14:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..


Patch Set 8: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 8
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 17 May 2018 18:12:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..

IMPALA-6827: [DOCS] Updated the download link for the tutorial data

Updated the link to download the Parquet airline files for tutorial.

Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Reviewed-on: http://gerrit.cloudera.org:8080/10393
Reviewed-by: Michael Brown 
Tested-by: Impala Public Jenkins 
---
M docs/topics/impala_tutorial.xml
1 file changed, 456 insertions(+), 634 deletions(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 9
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4025: Part 2: Add percentile cont & median aggregation functions

2018-05-17 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/9778 )

Change subject: IMPALA-4025: Part 2: Add percentile_cont & median aggregation 
functions
..

IMPALA-4025: Part 2: Add percentile_cont & median aggregation functions

percentile_cont is implemented in the similar way as percentile_disc,
except for using a BE custom aggregation function for interpolating the
final result. median is rewritten into percentile_cont(0.5).

Some EE tests are added. Tests not related to error handling are
verified against PostgreSQL.

Change-Id: I2cc184682bb1bf4a5011b69a89e9ae253f3fd88d
---
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/PercentileAggExpr.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
11 files changed, 242 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/9778/6
--
To view, visit http://gerrit.cloudera.org:8080/9778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2cc184682bb1bf4a5011b69a89e9ae253f3fd88d
Gerrit-Change-Number: 9778
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 


[Impala-ASF-CR] IMPALA-4025: Part 1: Add percentile disc aggregation function

2018-05-17 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/9777 )

Change subject: IMPALA-4025: Part 1: Add percentile_disc aggregation function
..

IMPALA-4025: Part 1: Add percentile_disc aggregation function

This patch adds aggregation function percentile_disc. The implementation
is rewriting it into an inline view. The inline view computes the row
number on the ordering expr using analytic functions. The parent query
then picks the desired row using aggregation.
An Example of such rewrite is in StmtRewriter.java.

The behavior of this function is mostly the same as in PostgreSQL. The
handling of percentile expr not in [0, 1] is different: PostgreSQL
throws an error and impala returns NULL.

Some FE and EE tests are added. EE tests not related to the above
difference are verified against PostgreSQL.

Change-Id: Iacef7b3fcd74c4c73d88400ce27307c3baa0121e
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
A fe/src/main/java/org/apache/impala/analysis/PercentileAggExpr.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
22 files changed, 2,204 insertions(+), 1,021 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/9777/7
--
To view, visit http://gerrit.cloudera.org:8080/9777
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iacef7b3fcd74c4c73d88400ce27307c3baa0121e
Gerrit-Change-Number: 9777
Gerrit-PatchSet: 7
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..


Patch Set 8:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/291/


--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 8
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 17 May 2018 18:02:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-17 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 8
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 17 May 2018 18:01:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6929: Support multi-column range partitions for Kudu

2018-05-17 Thread Thomas Marshall (Code Review)
Thomas Marshall has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10441


Change subject: IMPALA-6929: Support multi-column range partitions for Kudu
..

IMPALA-6929: Support multi-column range partitions for Kudu

Kudu allows specifying range partitions over multiple columns. Impala
already has support for doing this when the partitions are specified
with '=', but if the partitions are specified with '<' or '<=', the
parser would return an error.

This patch modifies the parser to allow for creating Kudu tables like:
create table kudu_test (a int, b int, primary key(a, b))
  partition by range(a, b) (partition (0, 0) <= values < (1, 1));
and similary to alter partitions like:
alter table kudu_test add range partition (1, 1) <= values < (2, 2);

Testing:
- Modified functional_kudu.jointbl's schema so that we have a table
  in functional with a multi-column range partition to test things
  against.
- Added FE and E2E tests for CREATE and ALTER.

Change-Id: I0141dd3344a4f22b186f513b7406f286668ef1e7
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/RangePartition.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
6 files changed, 104 insertions(+), 30 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/10441/1
--
To view, visit http://gerrit.cloudera.org:8080/10441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0141dd3344a4f22b186f513b7406f286668ef1e7
Gerrit-Change-Number: 10441
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Marshall 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@693
PS1, Line 693:   // We may start receiving status reports before all exec 
rpcs are complete.
 :   // Can't apply state transition until no more exec rpcs 
will be sent.
 :   exec_rpcs_complete_barrier_->Wait();
> this is the fix for IMPALA-7030/IMPALA-7033 race. Also, I plan to add a reg
Also note that we are still better off than before this patch because 
previously, the Coordinator::lock_ was held while sending exec rpcs and also 
taken by this callback, and so all status report RPCs would block until we 
finish exec rpcs. Now, we only block a failure status.

Also, i plan to improve this further with IMPALA-6788 (which will stop sending 
exec rpcs once we get the first failure report or exec rpc fails).



--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Thu, 17 May 2018 17:48:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-17 Thread Alex Rodoni (Code Review)
Hello Michael Brown, Jim Apple, Harsh J, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10393

to look at the new patch set (#8).

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..

IMPALA-6827: [DOCS] Updated the download link for the tutorial data

Updated the link to download the Parquet airline files for tutorial.

Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
---
M docs/topics/impala_tutorial.xml
1 file changed, 456 insertions(+), 634 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/10393/8
--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 8
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10440 )

Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..


Patch Set 1:

(4 comments)

This is the same as the previous reviewed patch at 
https://gerrit.cloudera.org/#/c/10158/ except where noted in this comment. I 
backed out the change in order to add a fix for IMPALA-7030/7033.

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.h@351
PS1, Line 351: Cannot finalize execution until exec RPCs are no longer
 :   /// being sent.
this comment is new


http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@523
PS1, Line 523:   // Can't transition until the exec RPCs are no longer in 
progress. Otherwise, a
 :   // cancel RPC could be missed, and resources freed before a 
backend has had a chance
 :   // to take a resource reference.
 :   DCHECK(exec_rpcs_complete_barrier_ != nullptr &&
 :   exec_rpcs_complete_barrier_->pending() <= 0) << "exec rpcs 
not completed";
this DCHECK is new


http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@664
PS1, Line 664: PrintId(query_id())
I folded in the PrintId() into this version of the change.


http://gerrit.cloudera.org:8080/#/c/10440/1/be/src/runtime/coordinator.cc@693
PS1, Line 693:   // We may start receiving status reports before all exec 
rpcs are complete.
 :   // Can't apply state transition until no more exec rpcs 
will be sent.
 :   exec_rpcs_complete_barrier_->Wait();
this is the fix for IMPALA-7030/IMPALA-7033 race. Also, I plan to add a 
regression test specifically for this race with IMPALA-7046.



--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Comment-Date: Thu, 17 May 2018 17:45:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

2018-05-17 Thread Dan Hecht (Code Review)
Dan Hecht has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10440


Change subject: IMPALA-5384, part 2: Simplify Coordinator locking and clarify 
state
..

IMPALA-5384, part 2: Simplify Coordinator locking and clarify state

The is the final change to clarify and break up the Coordinator's lock.
The state machine for the coordinator is made explicit, distinguishing
between executing state and multiple terminal states. Logic to
transition into a terminal state is centralized in one location and
executes exactly once for each coordinator object.

Derived from a patch for IMPALA_5384 by Marcel Kornacker.

Testing:
- exhaustive functional tests
- stress test on minicluster with memory overcommitment. Verified from
  the logs that this exercises all these paths:
  - successful queries
  - client requested cancellation
  - error from exec FInstances RPC
  - error reported asynchronously via report status RPC
  - eos before backend execution completed
- loop query_test & failure for 12 hours with no dchecks or crashes
  (This had previously reproduced IMPALA-7030 and IMPALA-7033 with
  the previous version of this change).

Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
---
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
M be/src/service/impala-server.h
M be/src/util/counting-barrier.h
6 files changed, 398 insertions(+), 391 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/10440/1
--
To view, visit http://gerrit.cloudera.org:8080/10440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6dc08da1295f1df3c9dce6d35d65d887b2c00a1c
Gerrit-Change-Number: 10440
Gerrit-PatchSet: 1
Gerrit-Owner: Dan Hecht 


[Impala-ASF-CR] IMPALA-7035: Configure jceks.key.serialFilter for KMS.

2018-05-17 Thread Philip Zeyliger (Code Review)
Hello Joe McDonnell,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10418

to look at the new patch set (#2).

Change subject: IMPALA-7035: Configure jceks.key.serialFilter for KMS.
..

IMPALA-7035: Configure jceks.key.serialFilter for KMS.

Configures a Java property for KMS to account for JDK 8u171's security fixes. I
was seeing impala-py.test tests/metadata/test_hdfs_encryption.py fail with the
following error:

  AssertionError: Error creating encryption zone: RemoteException: Can't 
recover key for testkey1 from keystore 
file:/home/impdev/Impala/testdata/cluster/cdh6/node-1/data/kms.keystore

The issue is described in HDFS-13494, and I imagine it'll be fixed in due time. 
In the
meanwhile, setting this property seems to do the trick.

Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
---
M testdata/cluster/node_templates/cdh5/etc/init.d/kms
M testdata/cluster/node_templates/cdh6/etc/init.d/kms
2 files changed, 12 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/10418/2
--
To view, visit http://gerrit.cloudera.org:8080/10418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2d21c9cce3b91e8fd8b2b4f1cda75e3958c977d5
Gerrit-Change-Number: 10418
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
..


Patch Set 20:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2493/


--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 20
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 May 2018 16:30:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5842: Write page index in Parquet files

2018-05-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
..


Patch Set 20: Code-Review+2

Fixed clang-tidy issue, adjusted stats-extrapolation test
Carrying +2


--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 20
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 17 May 2018 16:30:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6827: [DOCS] Updated the download link for the tutorial data

2018-05-17 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10393 )

Change subject: IMPALA-6827: [DOCS] Updated the download link for the tutorial 
data
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10393/7/docs/topics/impala_tutorial.xml
File docs/topics/impala_tutorial.xml:

http://gerrit.cloudera.org:8080/#/c/10393/7/docs/topics/impala_tutorial.xml@1823
PS7, Line 1823: >   GROUOP BY
This typo is still here.



--
To view, visit http://gerrit.cloudera.org:8080/10393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6823d1688169e0a6f09d5b552026bc18a3770828
Gerrit-Change-Number: 10393
Gerrit-PatchSet: 7
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Harsh J 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Thu, 17 May 2018 15:31:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations

2018-05-17 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9943 )

Change subject: IMPALA-5706: Spilling sort optimisations
..


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

http://gerrit.cloudera.org:8080/#/c/9943/8/be/src/runtime/sorter.cc@1676
PS8, Line 1676: --num_runs_in_one_merge;
> It's possible to have var-len slots but no var-len pages if all strings are
Thanks for the heads-up! I count for 2 buffers for an intermediate result then.


http://gerrit.cloudera.org:8080/#/c/9943/12/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/9943/12/tests/query_test/test_sort.py@135
PS12, Line 135: doesn't consume all the memory from the second Sort. This 
query takes approx. 25s to
I gave this a second look and apparently I mis-measured something the last 
time. This in fact runs for at leas 25s according to my measurements. I spent a 
few hours to reduce this but according to my observations in order to see this 
behavior manifest in the # of merges we need this scale of data and runtime.
I'm open for suggestions about the next step here. I thought this shouldn't run 
in core tests for sure, so I restricted it for the exhaustive runs. However, 
that would run this test in a number of permutations of test configs and 
multiplying that with this 25s per run I find it too much.
Currently I'd prefer just to drop this test.

What do you think?



--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 12
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 17 May 2018 15:21:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5706: Spilling sort optimisations

2018-05-17 Thread Gabor Kaszab (Code Review)
Hello Tim Armstrong, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9943

to look at the new patch set (#12).

Change subject: IMPALA-5706: Spilling sort optimisations
..

IMPALA-5706: Spilling sort optimisations

This patch covers multiple changes with the purpose of optimizing
spilling sort mechanism:
  - Remove the hard-coded maximum limit of buffers that can be used
for merging the sorted runs. Instead this number is calculated
based on the available memory through buffer pool.
  - The already sorted runs are distributed more optimally between
the last intermediate merge and the final merge to avoid that a
heavy intermediate merge is followed by a light final merge.
  - Right before starting the merging phase Sorter tries to allocate
additional memory through the buffer pool.
  - An output run is not allocated anymore for the final merge.

Note, double-buffering the runs during a merge was also planned with
this patch. However, performance testing showed that except some
exotic queries with unreasonably small amount of buffer pool memory
available double-buffering doesn't add to the overall performance.
It's basically because the half of the available buffers have to be
sacrificed to do double-buffering and as a result the merge tree can
get deeper. In addition the amount of I/O wait time is not reaching
the level where double-buffering could countervail the reduced number
of runs during a particular merge.

Performance measurements were made during manual testing to verify
that this is in fact an optimization:
  - In case doing a sort on top of a join when working with a
restricted amount of memory then the Sort node successfully
allocates additional memory right before the merging phase. This
is feasible because once Join finishes sending new input data and
calls InputDone() then it releases memory that can be picked up
by the Sorter. This results in shallower merging trees (more runs
grabbed for a merge).
  - On a multi-node cluster I verified that in cases when at least one
merging step is done then this change reduces the execution time
for sorts.
  - The more merging steps are done the bigger the performance gain is
compared to the baseline.

Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
---
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M tests/query_test/test_sort.py
3 files changed, 188 insertions(+), 114 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/9943/12
--
To view, visit http://gerrit.cloudera.org:8080/9943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74857c1694802e81f1cfc765d2b4e8bc644387f9
Gerrit-Change-Number: 9943
Gerrit-PatchSet: 12
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 


  1   2   >