[Impala-ASF-CR] IMPALA-5031: unsafe random number generation in buffer-pool-test

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5031: unsafe random number generation in buffer-pool-test
..


IMPALA-5031: unsafe random number generation in buffer-pool-test

The bug is that ConcurrentRegistration shares one random number
generator between all the threads. This isn't safe and UBSAN was
unhappy with it.

The fix is to create one RNG per thread. We already do that in a
different test so the code is factored out into a utility function.

Testing:
Ran buffer-pool-test under UBSAN

Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Reviewed-on: http://gerrit.cloudera.org:8080/7596
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/testutil/rand-util.h
2 files changed, 22 insertions(+), 11 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Sailesh Mukil: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5031: unsafe random number generation in buffer-pool-test

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5031: unsafe random number generation in buffer-pool-test
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4703: reservation denial debug action

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4703: reservation denial debug action
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/993/

-- 
To view, visit http://gerrit.cloudera.org:8080/7022
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ied39bb091b12156e5dc61b528c6c0cd8de3fe657
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4703: reservation denial debug action

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4703: reservation denial debug action
..


Patch Set 12: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/7022
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ied39bb091b12156e5dc61b528c6c0cd8de3fe657
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5768: Better developer documentation

2017-08-04 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5768: Better developer documentation
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7350/2/bin/impala-config.sh
File bin/impala-config.sh:

Line 456: LIBHDFS_OPTS="${LIBHDFS_OPTS}:${IMPALA_HOME}/be/build/debug/service"
> I agree that it's wrong, and /latest/ is the correct way to point it at tha
I got what appears to be a clean and functional build out of it - pretty sure 
all this stuff that keeps mentioning libbackend.so is obsolete.


-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..


IMPALA-4674: Part 2: port backend exec to BufferPool

Always create global BufferPool at startup using 80% of memory and
limit reservations to 80% of query memory (same as BufferedBlockMgr).
The query's initial reservation is computed in the planner, claimed
centrally (managed by the InitialReservations class) and distributed
to query operators from there.

min_spillable_buffer_size and default_spillable_buffer_size query
options control the buffer size that the planner selects for
spilling operators.

Port ExecNodes to use BufferPool:
  * Each ExecNode has to claim its reservation during Open()
  * Port Sorter to use BufferPool.
  * Switch from BufferedTupleStream to BufferedTupleStreamV2
  * Port HashTable to use BufferPool via a Suballocator.

This also makes PAGG memory consumption more efficient (avoid wasting buffers)
and improve the spilling algorithm:
* Allow preaggs to execute with 0 reservation - if streams and hash tables
  cannot be allocated, it will pass through rows.
* Halve the buffer requirement for spilling aggs - avoid allocating
  buffers for aggregated and unaggregated streams simultaneously.
* Rebuild spilled partitions instead of repartitioning (IMPALA-2708)

TODO in follow-up patches:
* Rename BufferedTupleStreamV2 to BufferedTupleStream
* Implement max_row_size query option.

Testing:
* Updated tests to reflect new memory requirements

Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Reviewed-on: http://gerrit.cloudera.org:8080/5801
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-table-test.cc
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/partial-sort-node.cc
M be/src/exec/partial-sort-node.h
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/partitioned-hash-join-node.inline.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/runtime/CMakeLists.txt
D be/src/runtime/buffered-block-mgr-test.cc
D be/src/runtime/buffered-block-mgr.cc
D be/src/runtime/buffered-block-mgr.h
D be/src/runtime/buffered-tuple-stream-test.cc
D be/src/runtime/buffered-tuple-stream.cc
D be/src/runtime/buffered-tuple-stream.h
D be/src/runtime/buffered-tuple-stream.inline.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
A be/src/runtime/initial-reservations.cc
A be/src/runtime/initial-reservations.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/service/client-request-state.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/bloom-filter.h
M be/src/util/static-asserts.cc
M common/thrift/Frontend.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M 

[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..


Patch Set 40: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 40
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-04 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 1:

> I'm testing this out.

https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/13

-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5598: Fix excessive dumping in MemLimitExceeded

2017-08-04 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7597

Change subject: IMPALA-5598: Fix excessive dumping in MemLimitExceeded
..

IMPALA-5598: Fix excessive dumping in MemLimitExceeded

ExecQueryFInstances RPC timeouts in stress tests were
traced to excessive dumping in MemLimitExceeded() and
LogUsage(). The QueryState::Init() hits the process
memory limit, and this causes MemLimitExceeded to dump
the process memory tracker. On the stress test, this
involves dumping hundreds of queries and all the
structures underneath. This is expensive and the
contention between threads accessing these structures
causes RPC timeouts.

This adds the ability to the limit the recursive depth
when dumping memory trackers. It also modifies the
dumping in MemLimitExceeded() to have the following
semantics:
1. The query memory tracker is always logged in full
if it exists.
2. The process memory tracker is logged if the query
memory tracker doesn't exist or if the process memory
limit is being hit. The process memory tracker is
limited to dumping only its immediate children.

Other uses of LogUsage() are not limited (e.g. /memz
and the query memory page on the web UI).

A stress test run with the process memory tracker
limited to dumping its immediate children showed no
RPC timeouts.

Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
---
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
2 files changed, 39 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/7597/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-5768: Better developer documentation

2017-08-04 Thread Zach Amsden (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7350

to look at the new patch set (#4).

Change subject: IMPALA-5768: Better developer documentation
..

IMPALA-5768: Better developer documentation

Guide to important environment variables for
build, impala paths and config cleanup.

Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
---
M README.md
M bin/impala-config.sh
2 files changed, 75 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/7350/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5768

2017-08-04 Thread Zach Amsden (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7350

to look at the new patch set (#3).

Change subject: IMPALA-5768
..

IMPALA-5768

Guide to important environment variables for
build, impala paths and config cleanup.

Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
---
M README.md
M bin/impala-config.sh
2 files changed, 75 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/7350/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5572: Timestamp codegen for text scanner

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5572: Timestamp codegen for text scanner
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/991/

-- 
To view, visit http://gerrit.cloudera.org:8080/7556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I00cbf8ec7784ca9594e14e952f46dc54a5ede44b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5666: ASAN poisoning for MemPool and BufferPool

2017-08-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5666: ASAN poisoning for MemPool and BufferPool
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7591/1//COMMIT_MSG
Commit Message:

Line 24:   these checks uncovered.
> Did you run the full test suite with ASAN?
Not yet, will do (and update) before I commit. Have run a good subset of the 
be-tests and some full queries.


http://gerrit.cloudera.org:8080/#/c/7591/1/be/src/runtime/bufferpool/buffer-allocator.cc
File be/src/runtime/bufferpool/buffer-allocator.cc:

Line 400: // Ensure that the memory is unpoisoned when it's next allocated 
by the system.
> Is this one needed? Won't the system allocator call free(), which will pois
You would have thought so, but I thought I saw one case where memory returned 
to the system was re-allocated as poisoned. Better safe than sorry.


http://gerrit.cloudera.org:8080/#/c/7591/1/be/src/runtime/bufferpool/buffer-pool.h
File be/src/runtime/bufferpool/buffer-pool.h:

Line 414:   void Poison() { ASAN_POISON_MEMORY_REGION(data(), len()); }
> I doubt it makes much of a difference but it would be nice if we could make
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7591
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8d5a28dfee2b7c631981aac75524bde9acc0b36a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-04 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7581/1//COMMIT_MSG
Commit Message:

Line 9: For allowing multiple different distributions to build against the same
> Maybe briefly mention the relevant environment variables so it's more disco
Done


Line 14: 
> can you comment on how other 'packages' can be set up? e.g. what's required
Yeah, these two changes are somewhat intertwined.  I'll put most of the 
explanation in README.md rather than the changelog which nobody will read ;)


http://gerrit.cloudera.org:8080/#/c/7581/1/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

Line 39: HOST = "https://native-toolchain.s3.amazonaws.com/build;
> It might make sense to also allow configuring this URL, so it can be pointe
I considered it, but it's easy to open a slippery slope to 'Everything is an 
environment variable!' and this gets used for both the components and the 
toolchain with different paths, so I refrained from letting the change snowball.


PS1, Line 340: llama,
> nit: remove llama (though not llama-minikdc)
Done


PS1, Line 341: minikidc
> typo: minikdc
Done.  Wait - that wasn't even my tyop!  Will fix anyway.


http://gerrit.cloudera.org:8080/#/c/7581/1/bin/impala-config.sh
File bin/impala-config.sh:

PS1, Line 131: cdh
> Keeping it as cdh_components kind-of makes sense to me as long as we're put
My intention was to leave this alone so that the existing workflows for both 
external and internal developers continue to work.

The override I had in mind for C6 looks something like:

PACKAGED_COMPONENTS_NAME=cdh6
PACKAGED_COMPONENTS_PATH="/cdh6_components/"
PACKAGED_COMPONENTS="avro,hadoop,hbase,hive,sentry"

Then we can have both C5 & C6 builder jobs run separately and deposit 
components where they won't conflict.  Also note we can pull in avro and drop 
llama-minikdc (or also pull in hbase-minikdc, if someone is brave enough to set 
that up)

And of course external thirdparty components are pretty easy to add as well, 
but with this arrangement (components override toolchain, as I want to happen 
for avro), one might argue for adding the other Apache components directly into 
the toolchain and letting the build override them.

But one step at a time, this should suffice for now.


-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5696: Enable cipher configuration when using TLS / Thrift

2017-08-04 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5696: Enable cipher configuration when using TLS / Thrift
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I735ae36eebfdf7228f235686c9c69642c3c9d84f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: unsafe random number generation in buffer-pool-test

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5031: unsafe random number generation in buffer-pool-test
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/992/

-- 
To view, visit http://gerrit.cloudera.org:8080/7596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: unsafe random number generation in buffer-pool-test

2017-08-04 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5031: unsafe random number generation in buffer-pool-test
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: unsafe random number generation in buffer-pool-test

2017-08-04 Thread Tim Armstrong (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7596

to look at the new patch set (#3).

Change subject: IMPALA-5031: unsafe random number generation in buffer-pool-test
..

IMPALA-5031: unsafe random number generation in buffer-pool-test

The bug is that ConcurrentRegistration shares one random number
generator between all the threads. This isn't safe and UBSAN was
unhappy with it.

The fix is to create one RNG per thread. We already do that in a
different test so the code is factored out into a utility function.

Testing:
Ran buffer-pool-test under UBSAN

Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
---
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/testutil/rand-util.h
2 files changed, 22 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/7596/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5031: unsafe random number generation in buffer-pool-test

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5031: unsafe random number generation in buffer-pool-test
..


Patch Set 1:

Added a testing section to clarify Jim's question.

-- 
To view, visit http://gerrit.cloudera.org:8080/7596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5031: unsafe random number generation in buffer-pool-test

2017-08-04 Thread Tim Armstrong (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7596

to look at the new patch set (#2).

Change subject: IMPALA-5031: unsafe random number generation in buffer-pool-test
..

IMPALA-5031: unsafe random number generation in buffer-pool-test

The bug is that ConcurrentRegistration shares one random number
generator between all the threads. This isn't safe and UBSAN was
unhappy with it.

The fix is to create one RNG per thread. We already do that in a
different test so the code is factored out into a utility function.

Testing:
Ran buffer-pool-test under UBSAN

Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
---
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/testutil/rand-util.h
2 files changed, 21 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/7596/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5031: unsafe random number generation in buffer-pool-test

2017-08-04 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5031: unsafe random number generation in buffer-pool-test
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7596/1/be/src/testutil/rand-util.h
File be/src/testutil/rand-util.h:

PS1, Line 34:   static void SeedRng(const char* env_var, std::mt19937* rng) {
: const char* seed_str = getenv(env_var);
: int64_t seed;
: if (seed_str != nullptr) {
:   seed = atoi(seed_str);
: } else {
:   seed = time(nullptr);
: }
: LOG(INFO) << "Random seed (overridable with " << env_var << 
"): " << seed;
: rng->seed(seed);
:   }
Why not change this to use std::random_device instead?

See 
https://github.com/apache/incubator-impala/blob/master/be/src/rpc/authentication.cc#L493-L494


-- 
To view, visit http://gerrit.cloudera.org:8080/7596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5031: unsafe random number generation in buffer-pool-test

2017-08-04 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-5031: unsafe random number generation in buffer-pool-test
..


Patch Set 1: Code-Review+2

You tested with UBSan?

-- 
To view, visit http://gerrit.cloudera.org:8080/7596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: No


[Impala-ASF-CR] Guide to important environment variables for build, test, and mini-cluster operations.

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Guide to important environment variables for build, test, and 
mini-cluster operations.
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7350/2/bin/impala-config.sh
File bin/impala-config.sh:

Line 456: LIBHDFS_OPTS="${LIBHDFS_OPTS}:${IMPALA_HOME}/be/build/debug/service"
> mabe safer to put be/build/latest/service ?
I agree that it's wrong, and /latest/ is the correct way to point it at that 
directly, but I was thinking that the wrongness provided evidence that it 
wasn't actually needed.

You're right that we should probably validate it before changing it, so it's a 
bad idea to tack it onto this change.


Line 473: 
LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${IMPALA_HOME}/be/build/debug/service"
> Agree, but unsure about the combinations of shared / static builds needed t
Yeah probably best not to tack it on if we're not confident.


-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5031: unsafe random number generation in buffer-pool-test

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7596

Change subject: IMPALA-5031: unsafe random number generation in buffer-pool-test
..

IMPALA-5031: unsafe random number generation in buffer-pool-test

The bug is that ConcurrentRegistration shares one random number
generator between all the threads. This isn't safe and UBSAN was
unhappy with it.

The fix is to create one RNG per thread. We already do that in a
different test so the code is factored out into a utility function.

Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
---
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/testutil/rand-util.h
2 files changed, 21 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/7596/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7596
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0eb3bec152a58d9ec39413780cb2c431dd8d4fa8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5696: Enable cipher configuration when using TLS / Thrift

2017-08-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5696: Enable cipher configuration when using TLS / Thrift
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7524/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

PS2, Line 172: ssl_cipher_list
> I'm wondering if we should default to "intermediate" compatibility:
It's a good idea, but I'm really keen not to break any existing clients. I'll 
leave a comment for 3.0!


-- 
To view, visit http://gerrit.cloudera.org:8080/7524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I735ae36eebfdf7228f235686c9c69642c3c9d84f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5696: Enable cipher configuration when using TLS / Thrift

2017-08-04 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#3).

Change subject: IMPALA-5696: Enable cipher configuration when using TLS / Thrift
..

IMPALA-5696: Enable cipher configuration when using TLS / Thrift

The 'cipher suite' is a description of the set of algorithms used by SSL
and TLS to execute key exchange, encryption, message authentication, and
random number generation functions. SSL implementations allow the cipher
suite to be configured so that ciphers may be removed from the whitelist
if they are shown to be weak.

* Add a flag --ssl_cipher_list which controls cipher selection for both
  thrift servers and clients. Default is blank, which means use all
  available cipher suites.
* Add ThriftServerBuilder to simplify construction of
  ThriftServers (whose constructors were otherwise getting very long).

Testing: new tests added to thrift-server-test. Test cases added follow:

* A client cannot connect to a server which does not have any ciphers in
  common with it.
* If ciphers are identical on clients and servers, that ssl connections
  can be made.
* Bad cipher strings lead to errors on both client and server.

Change-Id: I735ae36eebfdf7228f235686c9c69642c3c9d84f
---
M be/src/benchmarks/network-perf-benchmark.cc
M be/src/catalog/catalogd-main.cc
M be/src/rpc/thrift-client.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/runtime/data-stream-test.cc
M be/src/service/impala-server.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
A be/src/testutil/scoped-flag-setter.h
M be/src/util/webserver-test.cc
13 files changed, 427 insertions(+), 142 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/7524/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I735ae36eebfdf7228f235686c9c69642c3c9d84f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] Guide to important environment variables for build, test, and mini-cluster operations.

2017-08-04 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: Guide to important environment variables for build, test, and 
mini-cluster operations.
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7350/2/bin/impala-config.sh
File bin/impala-config.sh:

Line 343: if type nproc >/dev/null 2>&1; then
> I used "getconf _NPROCESSORS_ONLN" in infra/python/bootstrap_virtualenv.py.
Done


PS2, Line 433: nproc
> Hmm, should this use $CORES?
No, I think we only want to reduce concurrency for tests, which might actually 
start doing multi-threaded things, but for the build, full steam ahead.


Line 456: LIBHDFS_OPTS="${LIBHDFS_OPTS}:${IMPALA_HOME}/be/build/debug/service"
> Pretty sure we don't need this, since everything works on a release build. 
mabe safer to put be/build/latest/service ?


Line 473: 
LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${IMPALA_HOME}/be/build/debug/service"
> Same here. Pretty sure that we don't need to add this to LD_LIBRARY_PATH.
Agree, but unsure about the combinations of shared / static builds needed to 
test this hypothesis.


Line 491: alias 
gerrit-verify-only="${IMPALA_AUX_TEST_HOME}/jenkins/gerrit-verify-only.sh"
> We can remove this while we're here too - it's referring to a Cloudera-inte
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5666: ASAN poisoning for MemPool and BufferPool

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5666: ASAN poisoning for MemPool and BufferPool
..


Patch Set 1:

(4 comments)

Had a few very minor comments. This should be very helpful in catching any 
use-after-free bugs.

http://gerrit.cloudera.org:8080/#/c/7591/1//COMMIT_MSG
Commit Message:

Line 7: IMPALA-5666: ASAN poisoning for MemPool and BufferPool
The JIRA number and name are a bit ominous


Line 24: Testing:
Did you run the full test suite with ASAN?


http://gerrit.cloudera.org:8080/#/c/7591/1/be/src/runtime/bufferpool/buffer-allocator.cc
File be/src/runtime/bufferpool/buffer-allocator.cc:

Line 400: buffer.Unpoison();
Is this one needed? Won't the system allocator call free(), which will poison 
it anyway? Or is my understanding faulty?


http://gerrit.cloudera.org:8080/#/c/7591/1/be/src/runtime/bufferpool/buffer-pool.h
File be/src/runtime/bufferpool/buffer-pool.h:

Line 414:   void Poison();
I doubt it makes much of a difference but it would be nice if we could make 
this an inline function and optimise it out for non-ASAN builds.


-- 
To view, visit http://gerrit.cloudera.org:8080/7591
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8d5a28dfee2b7c631981aac75524bde9acc0b36a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5744: Add dummy 'use krpc' flag and create DataStream interface

2017-08-04 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#8).

Change subject: IMPALA-5744: Add dummy 'use_krpc' flag and create DataStream 
interface
..

IMPALA-5744: Add dummy 'use_krpc' flag and create DataStream interface

This patch introduces a dummy 'use_krpc' flag and creates an abstract
interface for the DataStreamRecvr/Mgr.

The 'use_krpc' flag defaults to 'false'. Cluster startup will abort
with an error if the flag is switched to 'true'.

The DataStreamSender implements the same virtual interface as the
DataSink, so a pure virtual class for the DataStreamSender would
essentially be an empty class. Therefore, it is not implemented.

The new interfaces are pure virtual base classes and are named
DataStream*Base.

Stubs for the Krpc implementations are also introduced and are named
KrpcDataStream*. They currently only abort with a fatal error if they
are used. Their actual implementations will be filled in a later
patch.

Since having both the Thrift and KRPC implementations of the
DataStream* classes are only expected to be temporary for now, this
was written and optimized with the end goal of having only the KRPC
versions of the DataStreamMgr/Recvr, at which point we will get rid
of the DataStream*Base classes, the Thrift versions of the classes
and rename KrpcDataStream* to DataStream*. We will also rename all
the references that the clients have to DataStream*Base to DataStream*.

Also did some spurious includes cleanup.

Change-Id: I5d52245154e910529a68f53049520238eca16241
---
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
A be/src/runtime/data-stream-recvr-base.h
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
A be/src/runtime/krpc-data-stream-mgr.cc
A be/src/runtime/krpc-data-stream-mgr.h
A be/src/runtime/krpc-data-stream-recvr.cc
A be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/service/impala-server.cc
20 files changed, 442 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/7542/8
-- 
To view, visit http://gerrit.cloudera.org:8080/7542
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5d52245154e910529a68f53049520238eca16241
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5744: Add dummy 'use krpc' flag and create DataStream interface

2017-08-04 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5744: Add dummy 'use_krpc' flag and create DataStream 
interface
..


Patch Set 7:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/data-stream-mgr-base.h
File be/src/runtime/data-stream-mgr-base.h:

PS7, Line 36: Thrfit and KRPC
: /// respectively. Remove this when we evenually get rid of the 
Thrift implementation and
: /// replace client references to this class with the KRPC's 
version of the
: /// DataStreamMgrBase.
> same comments and typos as elsewhere.
Done


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/data-stream-mgr.h
File be/src/runtime/data-stream-mgr.h:

PS7, Line 22: #include "runtime/data-stream-mgr-base.h"
> Move to line 33, with the other Impala includes.
Done


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/data-stream-recvr-base.h
File be/src/runtime/data-stream-recvr-base.h:

PS7, Line 33: Thrfit
> typo
Done


PS7, Line 34: respectively
> remove (there's nothing in this sentence for them to be respective to).
Done


PS7, Line 34: when we evenually get rid of the Thrift implementation and
: /// replace client references to this class with the KRPC's 
version of the
: /// DataStreamRecvrBase.
> "in favor of the KRPC implementation when possible".
Done


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/data-stream-recvr.h
File be/src/runtime/data-stream-recvr.h:

PS7, Line 42: /// Single receiver of an m:n data stream.
> This should say something about the fact that it's the thrift implementatio
Done


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/data-stream-test.cc
File be/src/runtime/data-stream-test.cc:

PS7, Line 460: ImpalaTestBackend
> Is it hard to convert ImpalaTestBackend to accept a DataStreamMgrBase ?
That would mean I have to do the dynamic cast of the 'mgr_' object in L100 from 
DataStreamMgrBase to DataStreamMgr, which I thought was a little more ugly than 
this test client class explicitly using a DataStreamMgr.


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/exec-env.h
File be/src/runtime/exec-env.h:

PS7, Line 93: enforced by
> 'part of' ?
Done


PS7, Line 94: (This is due to the stream mgrs using different AddData() 
signatures)
> Would remove this.
Done


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

Line 52: }
> return Status::OK()? Otherwise clang-tidy might complain. Here and elsewher
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7542
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5d52245154e910529a68f53049520238eca16241
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Update .gitinore files

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Update .gitinore files
..


Patch Set 3: Code-Review+2 Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Update .gitinore files

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged.

Change subject: Update .gitinore files
..


Update .gitinore files

I noticed a bunch of new things had crept in.

Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Reviewed-on: http://gerrit.cloudera.org:8080/7590
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M .gitignore
M be/.gitignore
M testdata/.gitignore
3 files changed, 13 insertions(+), 0 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Update .gitinore files

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Update .gitinore files
..


Patch Set 2: Verified+1

Hit the flaky TPC-DS data loading. It built fine. This change as no influence 
on tests so will manually verify.

-- 
To view, visit http://gerrit.cloudera.org:8080/7590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Update .gitinore files

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Update .gitinore files
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/989/

-- 
To view, visit http://gerrit.cloudera.org:8080/7590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-04 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7587/1/bin/bootstrap_development.sh
File bin/bootstrap_development.sh:

PS1, Line 51: postgresql
> Someone (maybe Bikram) mentioned to me that they needed to specify postgres
I tested; it worked with both 14.04 and 16.04


-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5744: Add dummy 'use krpc' flag and create DataStream interface

2017-08-04 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5744: Add dummy 'use_krpc' flag and create DataStream 
interface
..


Patch Set 7:

(10 comments)

Looks pretty close to me.

http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/data-stream-mgr-base.h
File be/src/runtime/data-stream-mgr-base.h:

PS7, Line 36: Thrfit and KRPC
: /// respectively. Remove this when we evenually get rid of the 
Thrift implementation and
: /// replace client references to this class with the KRPC's 
version of the
: /// DataStreamMgrBase.
same comments and typos as elsewhere.


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/data-stream-mgr.h
File be/src/runtime/data-stream-mgr.h:

PS7, Line 22: #include "runtime/data-stream-mgr-base.h"
Move to line 33, with the other Impala includes.


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/data-stream-recvr-base.h
File be/src/runtime/data-stream-recvr-base.h:

PS7, Line 33: Thrfit
typo


PS7, Line 34: respectively
remove (there's nothing in this sentence for them to be respective to).


PS7, Line 34: when we evenually get rid of the Thrift implementation and
: /// replace client references to this class with the KRPC's 
version of the
: /// DataStreamRecvrBase.
"in favor of the KRPC implementation when possible".


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/data-stream-recvr.h
File be/src/runtime/data-stream-recvr.h:

PS7, Line 42: /// Single receiver of an m:n data stream.
This should say something about the fact that it's the thrift implementation.


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/data-stream-test.cc
File be/src/runtime/data-stream-test.cc:

PS7, Line 460: ImpalaTestBackend
Is it hard to convert ImpalaTestBackend to accept a DataStreamMgrBase ?


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/exec-env.h
File be/src/runtime/exec-env.h:

PS7, Line 93: enforced by
'part of' ?


PS7, Line 94: (This is due to the stream mgrs using different AddData() 
signatures)
Would remove this.


http://gerrit.cloudera.org:8080/#/c/7542/7/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

Line 52: }
return Status::OK()? Otherwise clang-tidy might complain. Here and elsewhere.


-- 
To view, visit http://gerrit.cloudera.org:8080/7542
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5d52245154e910529a68f53049520238eca16241
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5725: coalesce() with outer join incorrectly rewritten

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5725: coalesce() with outer join incorrectly rewritten
..


IMPALA-5725: coalesce() with outer join incorrectly rewritten

A recent change, IMPALA-5016, added an expr rewrite rule to simplfy
coalesce(). This rule eliminates the coalesce() when its first
parameter (that isn't constant null) is a SlotRef pointing to a
SlotDescriptor that is non-nullable (for example because it is from
a non-nullable Kudu column or because it is from an HDFS partition
column with no null partitions), under the assumption that the SlotRef
could never have a null value.

This assumption is violated when the SlotRef is the output of an
outer join, leading to incorrect results being returned. The problem
is that the nullability of a SlotDescriptor (which determines whether
there is a null indicator bit in the tuple for that slot) is a
slightly different property than the nullability of a SlotRef pointing
to that SlotDescriptor (since the SlotRef can still be NULL if the
entire tuple is NULL).

This patch removes the portion of the rewrite rule that considers
the nullability of the SlotDescriptor. This means that we're missing
out on some optimizations opportunities and we should revisit this in
a way that works with outer joins (IMPALA-5753)

Testing:
- Updated FE tests.
- Added regression tests to exprs.test

Change-Id: I1ca6df949f9d416ab207016236dbcb5886295337
Reviewed-on: http://gerrit.cloudera.org:8080/7567
Reviewed-by: Matthew Jacobs 
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
4 files changed, 29 insertions(+), 44 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Matthew Jacobs: Looks good to me, approved
  Thomas Tauber-Marshall: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I1ca6df949f9d416ab207016236dbcb5886295337
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5725: coalesce() with outer join incorrectly rewritten

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5725: coalesce() with outer join incorrectly rewritten
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1ca6df949f9d416ab207016236dbcb5886295337
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-04 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 1:

I'm testing this out.

-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7587/1/bin/bootstrap_development.sh
File bin/bootstrap_development.sh:

PS1, Line 51: postgresql
Someone (maybe Bikram) mentioned to me that they needed to specify 
postgresql-9.5 on Ubuntu 16.04. I'm unsure how accurate that is.


-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5666: ASAN poisoning for MemPool and BufferPool

2017-08-04 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7591

Change subject: IMPALA-5666: ASAN poisoning for MemPool and BufferPool
..

IMPALA-5666: ASAN poisoning for MemPool and BufferPool

* Use ASAN_[UN]POISON_MEMORY_REGION to indicate to ASAN runtime that
  memory is not 'valid' from Impala's perspective (but still valid from
  kernel's perspective).

* Add this to MemPool and BufferPoolAllocator. For both, memory goes
  through the following lifecycle: when it is allocated from the OS and
  returned to the user, it must be unpoisoned. When the user returns it
  to a cache, it becomes poisoned. When the cache is freed to the OS, it
  must be unpoisoned again (otherwise future allocations elsewhere in
  the system might return poisoned memory).

* Also add checks to FreePool which uses a MemPool underneath, but has
  its own freelist cache. Fix a couple of bugs in free-pool-test that
  these checks uncovered.

Testing:

* Tests that only run if ASAN is enabled to confirm that poisoning
  catches simple use-after-return errors. These are 'death' tests which
  catch expected process exits.

Change-Id: I8d5a28dfee2b7c631981aac75524bde9acc0b36a
---
M be/src/runtime/bufferpool/buffer-allocator-test.cc
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/free-list.h
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
M be/src/runtime/mem-pool-test.cc
M be/src/runtime/mem-pool.cc
M be/src/runtime/mem-pool.h
10 files changed, 152 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/7591/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7591
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8d5a28dfee2b7c631981aac75524bde9acc0b36a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-5696: Enable cipher configuration when using TLS / Thrift

2017-08-04 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5696: Enable cipher configuration when using TLS / Thrift
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7524/1/be/src/rpc/thrift-server.h
File be/src/rpc/thrift-server.h:

PS1, Line 254: /// Sets the auth provider for this server. Default is the 
system global auth provider.
 :   ThriftServerBuilder& auth_provider(AuthProvider* provider) {
 : auth_provider_ = provider;
> The idea is that the c'tors take mandatory parameters. Doing it this way me
I was alluding to the caller explicitly setting what options they want. But I 
guess we can argue for it either way, so I'm fine leaving it the way it is too.


http://gerrit.cloudera.org:8080/#/c/7524/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

PS1, Line 1947: be_builder.metrics(exec_env->metrics())
> Hm, why? That introduces an extra statement, and is kind of against the poi
I was thinking about it from a readability point of view, but this is fine too.


http://gerrit.cloudera.org:8080/#/c/7524/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

PS2, Line 172: ssl_cipher_list
I'm wondering if we should default to "intermediate" compatibility:

Ciphersuites: 
ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS

This is the recommended yet conservative set of ciphers.

>From https://wiki.mozilla.org/Security/Server_Side_TLS

Older ciphers are not recommended, and if anyone wants to use any of them, they 
will have to specifically add them to this list.

Risk: Might break existing clients after they upgrade.
Reward: Better default security.

What do you think?


http://gerrit.cloudera.org:8080/#/c/7524/1/be/src/statestore/statestore-subscriber.cc
File be/src/statestore/statestore-subscriber.cc:

PS1, Line 203: ThriftServer* server;
 : RETURN_IF_ERROR(builder.Build());
 : heartbeat_server_.reset(server);
> Not sure this is warranted (you could argue the same thing about 'builder')
Yes, but 'server' will be an invalid pointer after the reset(). But it's fine 
to leave it as it is.


-- 
To view, visit http://gerrit.cloudera.org:8080/7524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I735ae36eebfdf7228f235686c9c69642c3c9d84f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5572: Timestamp codegen for text scanner

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5572: Timestamp codegen for text scanner
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/991/

-- 
To view, visit http://gerrit.cloudera.org:8080/7556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I00cbf8ec7784ca9594e14e952f46dc54a5ede44b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5572: Timestamp codegen for text scanner

2017-08-04 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change.

Change subject: IMPALA-5572: Timestamp codegen for text scanner
..


Patch Set 4:

> (1 comment)

Done

-- 
To view, visit http://gerrit.cloudera.org:8080/7556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I00cbf8ec7784ca9594e14e952f46dc54a5ede44b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5572: Timestamp codegen for text scanner

2017-08-04 Thread Tianyi Wang (Code Review)
Hello Impala Public Jenkins, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7556

to look at the new patch set (#4).

Change subject: IMPALA-5572: Timestamp codegen for text scanner
..

IMPALA-5572: Timestamp codegen for text scanner

Currently codegen is disabled when scanning text tables with timestamp
columns. The message is "Timestamp not yet supported for codegen."
This patch adds support for timestamp codegen.
A simple query in the comment section of this issue performs a little
better (4%) than interpreted version.

Testing: The patch passed test with exhaustive workload exploration
strategy.

Change-Id: I00cbf8ec7784ca9594e14e952f46dc54a5ede44b
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/text-converter.cc
M be/src/util/string-parser.h
5 files changed, 34 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/7556/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I00cbf8ec7784ca9594e14e952f46dc54a5ede44b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5572: Timestamp codegen for text scanner

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5572: Timestamp codegen for text scanner
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7556/3/be/src/exec/text-converter.cc
File be/src/exec/text-converter.cc:

Line 269: if (parse_fn->arg_size() == 3) builder.CreateStore(parse_return, 
slot);
There was a warning from clang-tidy:
  20:19:26 ] /home/ubuntu/Impala/be/src/exec/text-converter.cc:269:56: warning: 
variable 'parse_return' may be uninitialized when used here 
[clang-diagnostic-conditional-uninitialized]

It's not actually possible, but we could initialise parse_return to nullptr to 
avoid it.


-- 
To view, visit http://gerrit.cloudera.org:8080/7556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I00cbf8ec7784ca9594e14e952f46dc54a5ede44b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Update .gitinore files

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Update .gitinore files
..


Patch Set 2: Code-Review+2

carry

-- 
To view, visit http://gerrit.cloudera.org:8080/7590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..


Patch Set 40:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/990/

-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 40
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..


Patch Set 40: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/988/

-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 40
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5572: Timestamp codegen for text scanner

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5572: Timestamp codegen for text scanner
..


Patch Set 3: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/986/

-- 
To view, visit http://gerrit.cloudera.org:8080/7556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I00cbf8ec7784ca9594e14e952f46dc54a5ede44b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Update .gitinore files

2017-08-04 Thread Tim Armstrong (Code Review)
Hello Bharath Vissapragada,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7590

to look at the new patch set (#2).

Change subject: Update .gitinore files
..

Update .gitinore files

I noticed a bunch of new things had crept in.

Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
---
M .gitignore
M be/.gitignore
M testdata/.gitignore
3 files changed, 13 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/7590/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bharath Vissapragada 


[Impala-ASF-CR] Update .gitinore files

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: Update .gitinore files
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/989/

-- 
To view, visit http://gerrit.cloudera.org:8080/7590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Update .gitinore files

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Update .gitinore files
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7590/1//COMMIT_MSG
Commit Message:

PS1, Line 7: a
> remove
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5757: Make tbl property toSql deterministic

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5757: Make tbl property toSql deterministic
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2391e04eb40e398098704b77588ad352d514df7d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5757: Make tbl property toSql deterministic

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5757: Make tbl property toSql deterministic
..


IMPALA-5757: Make tbl property toSql deterministic

On Ubuntu 16.04, it seems that table properties may be
returned in a different order from that expected by
TestShowCreateTable in test_kudu.py.

The fix is to change ToSqlUtils to print properties in
a deterministic way.

Change-Id: I2391e04eb40e398098704b77588ad352d514df7d
Reviewed-on: http://gerrit.cloudera.org:8080/7580
Reviewed-by: Matthew Jacobs 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
2 files changed, 16 insertions(+), 6 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Matthew Jacobs: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I2391e04eb40e398098704b77588ad352d514df7d
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] Update .gitinore filesa

2017-08-04 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: Update .gitinore filesa
..


Patch Set 1: Code-Review+2

(1 comment)

Could you also add *.orig and *.rej files? (Outcome of git rebase/merge)

http://gerrit.cloudera.org:8080/#/c/7590/1//COMMIT_MSG
Commit Message:

PS1, Line 7: a
remove


-- 
To view, visit http://gerrit.cloudera.org:8080/7590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Update .gitinore filesa

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7590

Change subject: Update .gitinore filesa
..

Update .gitinore filesa

I noticed a bunch of new things had crept in.

Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
---
M .gitignore
M be/.gitignore
M testdata/.gitignore
3 files changed, 13 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/7590/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie6ef085357a3bf026f2b42689ee642192a7791e7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] Problem: Improve error message when subquery is used in the ON clause

2017-08-04 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: Problem: Improve error message when subquery is used in the ON 
clause
..


Patch Set 1:

(2 comments)

The fix looks Ok. Couple of nits in the commit message. Additionally,

- Can you please add a unit test in AnalyzeStmtsTest#TestOnClause()? Generally 
unit tests are expected for every commit, unless it is pretty difficult to 
reproduce.

- Please update your gerrit profile to include your name, else it shows up as 
"Anonymous Coward" in email updates.

http://gerrit.cloudera.org:8080/#/c/7588/1//COMMIT_MSG
Commit Message:

PS1, Line 7: Problem
Could you replace it with the jira ID?  Typically we include jira ID in the 
messages to make git log searchable.  Something like,

IMPALA-:  Improve error message when subquery is used in the ON clause


PS1, Line 10: Fix: Print the error stating that "Suquery not allowed in the ON 
clause"
Move above the change-Id.


-- 
To view, visit http://gerrit.cloudera.org:8080/7588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..


Patch Set 40:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/988/

-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 40
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..


Patch Set 40:

Michael Brown and Matt Mulder have been helping me out with testing. So far 
this has held up to stress testing and it looks like performance under 
concurrency is similar to master.

-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 40
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..


Patch Set 40: Code-Review+2

Rebased.

-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 40
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7079/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 1086: void Coordinator::ReleaseResources() {
I'm having trouble understanding whether this runs before or after 
QueryState::ReleaseResources(). It seems like we to release memory tracked by 
filter_mem_tracker_ before QueryState::ReleaseResources(), but then we need to 
release the admission control resources after QueryState::ReleaseResources() is 
called.

There's a good chance I'm misunderstanding something but that's what I'm stuck 
on right now.


-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool

2017-08-04 Thread Tim Armstrong (Code Review)
Hello Thomas Tauber-Marshall, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5801

to look at the new patch set (#40).

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
..

IMPALA-4674: Part 2: port backend exec to BufferPool

Always create global BufferPool at startup using 80% of memory and
limit reservations to 80% of query memory (same as BufferedBlockMgr).
The query's initial reservation is computed in the planner, claimed
centrally (managed by the InitialReservations class) and distributed
to query operators from there.

min_spillable_buffer_size and default_spillable_buffer_size query
options control the buffer size that the planner selects for
spilling operators.

Port ExecNodes to use BufferPool:
  * Each ExecNode has to claim its reservation during Open()
  * Port Sorter to use BufferPool.
  * Switch from BufferedTupleStream to BufferedTupleStreamV2
  * Port HashTable to use BufferPool via a Suballocator.

This also makes PAGG memory consumption more efficient (avoid wasting buffers)
and improve the spilling algorithm:
* Allow preaggs to execute with 0 reservation - if streams and hash tables
  cannot be allocated, it will pass through rows.
* Halve the buffer requirement for spilling aggs - avoid allocating
  buffers for aggregated and unaggregated streams simultaneously.
* Rebuild spilled partitions instead of repartitioning (IMPALA-2708)

TODO in follow-up patches:
* Rename BufferedTupleStreamV2 to BufferedTupleStream
* Implement max_row_size query option.

Testing:
* Updated tests to reflect new memory requirements

Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-table-test.cc
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/partial-sort-node.cc
M be/src/exec/partial-sort-node.h
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/partitioned-hash-join-node.inline.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/runtime/CMakeLists.txt
D be/src/runtime/buffered-block-mgr-test.cc
D be/src/runtime/buffered-block-mgr.cc
D be/src/runtime/buffered-block-mgr.h
D be/src/runtime/buffered-tuple-stream-test.cc
D be/src/runtime/buffered-tuple-stream.cc
D be/src/runtime/buffered-tuple-stream.h
D be/src/runtime/buffered-tuple-stream.inline.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
A be/src/runtime/initial-reservations.cc
A be/src/runtime/initial-reservations.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/service/client-request-state.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/bloom-filter.h
M be/src/util/static-asserts.cc
M common/thrift/Frontend.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/PlanNodes.thrift
M common/thrift/generate_error_codes.py
M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/JoinBuildSink.java
M 

[Impala-ASF-CR] IMPALA-5749: coordinator race hits DCHECK 'num remaining backends > 0'

2017-08-04 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5749: coordinator race hits DCHECK 
'num_remaining_backends_ > 0'
..


Patch Set 2:

(1 comment)

> > Does this trigger only when there are two concurrent calls to
 > > UpdateBackendExecStatus() from the same backend? If so, do we
 > > understand why that happens so often?
 > 
 > My understanding is this:
 > A fragment instance sends reports every 'n' seconds. Due to a
 > congested network, two of these reports for the same fragment
 > instance from a backend can arrive at the coordinator and start
 > being processed at around the same time, hence leading to this
 > issue.
 > 
 > Ideally a second report cannot be send until the first one is ACKd
 > by the coordinator, since a lock is held until the report is ACKd,
 > in the ReportProfileThread(); but there is only one case where a
 > second report will be sent before the first one is responded to,
 > i.e.  from FragmentInstanceState::Finalize().
 > 
 > So ReportProfileThread() sends the one report of the last
 > finstance, then Finalize() sends the second report of the same
 > finstance before the first one is responded to.

This may be possible, but the query I've been using to repro it runs fast 
enough that there are never any updates sent by the report thread, just from 
Finalize, so I haven't observed it.

The reason I've been seeing it fail is: there are two different fragments from 
the same backend that send reports at the same time. When they both reach 
Coordinator::UpdateBackendExecStatus(), they both get past the IsDone() check.

Then ApplyExecStatusReport() gets called twice but both calls return true in 
the out parameter 'done', because the first call contained a fragment with an 
error status which causes the backend to be considered done, so both threads 
end up decreasing num_remaining_backends_ for the same backend.

http://gerrit.cloudera.org:8080/#/c/7577/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

Line 75:   /// this update was not applied because this backend has previously 
completed. This can
> Please also add:
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7577
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1528661e5df6d9732ebfeb414576c82ec5c92241
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Problem: Improve error message when subquery is used in the ON clause

2017-08-04 Thread Anonymous Coward (Code Review)
psi...@cloudera.com has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7588

Change subject: Problem: Improve error message when subquery is used in the ON 
clause
..

Problem: Improve error message when subquery is used in the ON clause

Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2
Fix: Print the error stating that "Suquery not allowed in the ON clause"
---
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
1 file changed, 4 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7588/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: psi...@cloudera.com


[Impala-ASF-CR] IMPALA-5749: coordinator race hits DCHECK 'num remaining backends > 0'

2017-08-04 Thread Thomas Tauber-Marshall (Code Review)
Hello Michael Ho, Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7577

to look at the new patch set (#2).

Change subject: IMPALA-5749: coordinator race hits DCHECK 
'num_remaining_backends_ > 0'
..

IMPALA-5749: coordinator race hits DCHECK 'num_remaining_backends_ > 0'

In Coordinator::UpdateBackendExecStatus(), we check if the backend
has already completed with BackendState::IsDone() and return without
applying the update if so to avoid updating num_remaining_backends_
twice for the same completed backend.

The problem is that the value of BackendState::IsDone() is updated by
the call to BackendState::ApplyExecStatusReport() that comes after it,
but these operations are not performed atomically, so if there are
two simultaneous calls to UpdateBackendExecStatus(), they can both
call IsDone(), both get 'false', and then proceed to erroneously both
update num_remaining_backends_, hitting a DCHECK.

The solution is to perform both the call to IsDone() and the update to
it atomically by holding the BackendState::lock_.

Testing:
- Ran test_finst_cancel_when_query_complete 10,000 times without
  hitting the DCHECK (previously, it would hit about once per 300
  runs).

Change-Id: I1528661e5df6d9732ebfeb414576c82ec5c92241
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
3 files changed, 18 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/7577/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7577
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1528661e5df6d9732ebfeb414576c82ec5c92241
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5715: (mitigation only) defer destruction of MemTrackers

2017-08-04 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-5715: (mitigation only) defer destruction of MemTrackers
..


Patch Set 3:

Thanks for updating it. Will do a pass today.

-- 
To view, visit http://gerrit.cloudera.org:8080/7492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I205abb0076d1ffd08cb93c0f1671c8b81e7fba0f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5725: coalesce() with outer join incorrectly rewritten

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5725: coalesce() with outer join incorrectly rewritten
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/987/

-- 
To view, visit http://gerrit.cloudera.org:8080/7567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1ca6df949f9d416ab207016236dbcb5886295337
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5725: coalesce() with outer join incorrectly rewritten

2017-08-04 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5725: coalesce() with outer join incorrectly rewritten
..


Patch Set 2: Code-Review+2

GVO failed due to IMPALA-5760

-- 
To view, visit http://gerrit.cloudera.org:8080/7567
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1ca6df949f9d416ab207016236dbcb5886295337
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-04 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7587

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..

IMPALA-4407: Move Impala setup procedures to main repo

Before this change, Impala has relied on a chef setup in
https://github.com/awleblang/impala-setup for setting up a development
environment. This has a number of downsides:

1. It makes understanding what the script is doing difficult: there
are 40k or so lines in that repo last I checked.

2. It makes porting to new distributions difficult unless the
providers of various chef "recipes" have already ported their code.

3. It makes coordinated changes between the main repo and the
impala-setup repo more awkward.

This patch adds a shell script to replace that repo. It works on
Ubuntu 14.04 and 16.04, while impala-setup repo only works on 14.04
and the now-unmaintained Ubuntu 15.04.

Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
---
M bin/bootstrap_development.sh
1 file changed, 97 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/7587/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 


[Impala-ASF-CR] IMPALA-5602: Fix kudu queries being incorrectly optimized as small query

2017-08-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5602: Fix kudu queries being incorrectly optimized as 
small query
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7560/3/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

Line 527: return !kuduConjuncts_.isEmpty();
nit oneline


http://gerrit.cloudera.org:8080/#/c/7560/3/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

PS3, Line 219: (){
> nit: missing space
You are correct, that should be fixed as well.


http://gerrit.cloudera.org:8080/#/c/7560/3/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

PS3, Line 1019: self.client.execute("set explain_level=3")
you can actually use client.set_configuration(config_map)

e.g. from test_ddl.py:

 def test_sync_ddl_drop(self, vector, unique_database):
"""Verifies the catalog gets updated properly when dropping objects with 
sync_ddl
enabled"""
self.client.set_configuration({'sync_ddl': 1})
# Drop the database immediately after creation (within a statestore 
heartbeat) and
# verify the catalog gets updated properly.
self.client.execute("drop database {0}".format(unique_database))


PS3, Line 1024: assert str(result).find("hosts=3 instances=3") != -1
the more python-y way to say this is

assert "hosts..." in str(result)


-- 
To view, visit http://gerrit.cloudera.org:8080/7560
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I93822d67ebda41d5d0456095c429e3915a3f40c4
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7581/1/bin/impala-config.sh
File bin/impala-config.sh:

PS1, Line 131: cdh
> Keeping it as cdh_components kind-of makes sense to me as long as we're put
I see, if that's the use case that makes sense. It might be helpful to have a 
comment indicating how you might also include X_components.


-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7581/1//COMMIT_MSG
Commit Message:

Line 9: For allowing multiple different distributions to build against the same
Maybe briefly mention the relevant environment variables so it's more 
discoverable to people looking at the commit message only.


http://gerrit.cloudera.org:8080/#/c/7581/1/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

Line 39: HOST = "https://native-toolchain.s3.amazonaws.com/build;
It might make sense to also allow configuring this URL, so it can be pointed to 
an alternative source of the components.


http://gerrit.cloudera.org:8080/#/c/7581/1/bin/impala-config.sh
File bin/impala-config.sh:

PS1, Line 131: cdh
> is it too painful to change this string? i guess renaming would break exist
Keeping it as cdh_components kind-of makes sense to me as long as we're putting 
CDH components in there, to allow parallel sets of components (e.g. 
apache_components).

Or I guess maybe it could also make sense to have hadoop_components/ 
minicluster_components or something like that with everything under it, 
distinguished by version numbers.

I don't feel strongly either way.


-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Guide to important environment variables for build, test, and mini-cluster operations.

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Guide to important environment variables for build, test, and 
mini-cluster operations.
..


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] Guide to important environment variables for build, test, and mini-cluster operations.

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Guide to important environment variables for build, test, and 
mini-cluster operations.
..


Patch Set 2:

(7 comments)

Looks good to me. I noticed a few more things we could clean up but we don't 
necessarily need to bundle them in this change.

http://gerrit.cloudera.org:8080/#/c/7350/2/bin/impala-config.sh
File bin/impala-config.sh:

PS2, Line 337: 
Nice catch


Line 460
Thank you for removing this.


Line 343: if type nproc >/dev/null 2>&1; then
I used "getconf _NPROCESSORS_ONLN" in infra/python/bootstrap_virtualenv.py. My 
understanding is that it works pretty much anywhere so we could avoid this 
branch.. Tangential but might as well simplify this script while we're here.


PS2, Line 433: nproc
Hmm, should this use $CORES?


Line 456: LIBHDFS_OPTS="${LIBHDFS_OPTS}:${IMPALA_HOME}/be/build/debug/service"
Pretty sure we don't need this, since everything works on a release build. It's 
kind of tangential but while we're here...


Line 473: 
LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${IMPALA_HOME}/be/build/debug/service"
Same here. Pretty sure that we don't need to add this to LD_LIBRARY_PATH.


Line 491: alias 
gerrit-verify-only="${IMPALA_AUX_TEST_HOME}/jenkins/gerrit-verify-only.sh"
We can remove this while we're here too - it's referring to a Cloudera-internal 
script which a) doesn't belong in the ASF repo and b) no longer works since we 
moved to jenkins.impala.io


-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5602: Fix kudu queries being incorrectly optimized as small query

2017-08-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5602: Fix kudu queries being incorrectly optimized as 
small query
..


Patch Set 3:

(3 comments)

Looking pretty good.

http://gerrit.cloudera.org:8080/#/c/7560/3/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

Line 219:   public boolean hasPushedConjuncts(){
I think DataSourceScanNode can also push conjuncts.

The distinction might not be important for now since I believe that always runs 
on the coordinator but should probably be precise.


PS3, Line 219: (){
nit: missing space


http://gerrit.cloudera.org:8080/#/c/7560/3/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 1007:   def test_missing_table_stats(self, cursor, kudu_client, 
unique_database):
This needs @SkipIfNotHdfsMinicluster.plans since it's assuming a 3 node 
minicluster.


-- 
To view, visit http://gerrit.cloudera.org:8080/7560
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I93822d67ebda41d5d0456095c429e3915a3f40c4
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5572: Timestamp codegen for text scanner

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5572: Timestamp codegen for text scanner
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/986/

-- 
To view, visit http://gerrit.cloudera.org:8080/7556
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I00cbf8ec7784ca9594e14e952f46dc54a5ede44b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7581/1//COMMIT_MSG
Commit Message:

Line 14: 
> can you comment on how other 'packages' can be set up? e.g. what's required
Nevermind, it looks like you started this in a separate review. That was lower 
down in my inbox :)


-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7581/1//COMMIT_MSG
Commit Message:

Line 14: 
can you comment on how other 'packages' can be set up? e.g. what's required, 
and in what form. It might be useful to leave a brief description here and 
start a wiki page on the topic.


http://gerrit.cloudera.org:8080/#/c/7581/1/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

PS1, Line 340: llama,
nit: remove llama (though not llama-minikdc)


PS1, Line 341: minikidc
typo: minikdc


http://gerrit.cloudera.org:8080/#/c/7581/1/bin/impala-config.sh
File bin/impala-config.sh:

PS1, Line 131: cdh
is it too painful to change this string? i guess renaming would break existing 
dev environments? maybe it's ok if we just force folks to bootstrap again. It'd 
be nice to remove 'cdh'.


-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5757: Make tbl property toSql deterministic

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5757: Make tbl property toSql deterministic
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/985/

-- 
To view, visit http://gerrit.cloudera.org:8080/7580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2391e04eb40e398098704b77588ad352d514df7d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5757: Make tbl property toSql deterministic

2017-08-04 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5757: Make tbl property toSql deterministic
..


Patch Set 2: Code-Review+2

I only ran backend tests locally so I missed ToSqlTest. Updated in ps2.

-- 
To view, visit http://gerrit.cloudera.org:8080/7580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2391e04eb40e398098704b77588ad352d514df7d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5757: Make tbl property toSql deterministic

2017-08-04 Thread Matthew Jacobs (Code Review)
Hello Impala Public Jenkins, Jim Apple,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7580

to look at the new patch set (#2).

Change subject: IMPALA-5757: Make tbl property toSql deterministic
..

IMPALA-5757: Make tbl property toSql deterministic

On Ubuntu 16.04, it seems that table properties may be
returned in a different order from that expected by
TestShowCreateTable in test_kudu.py.

The fix is to change ToSqlUtils to print properties in
a deterministic way.

Change-Id: I2391e04eb40e398098704b77588ad352d514df7d
---
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
2 files changed, 16 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/7580/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2391e04eb40e398098704b77588ad352d514df7d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-5757: Make tbl property toSql deterministic

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5757: Make tbl property toSql deterministic
..


Patch Set 1:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/984/

-- 
To view, visit http://gerrit.cloudera.org:8080/7580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2391e04eb40e398098704b77588ad352d514df7d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5742: De-allocate buffer in parquet-reader on exit

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5742: De-allocate buffer in parquet-reader on exit
..


IMPALA-5742: De-allocate buffer in parquet-reader on exit

Testing: ran with ASAN with detect_leaks=1. Leak does not reproduce with
fix.

Change-Id: Iedf163f858b42d2ca63a7a65d6e457539de59ab9
Reviewed-on: http://gerrit.cloudera.org:8080/7572
Reviewed-by: Henry Robinson 
Tested-by: Impala Public Jenkins
---
M be/src/util/parquet-reader.cc
1 file changed, 20 insertions(+), 20 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Henry Robinson: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7572
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iedf163f858b42d2ca63a7a65d6e457539de59ab9
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5742: De-allocate buffer in parquet-reader on exit

2017-08-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5742: De-allocate buffer in parquet-reader on exit
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7572
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iedf163f858b42d2ca63a7a65d6e457539de59ab9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5744: Add dummy 'use krpc' flag and create DataStream interface

2017-08-04 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5744: Add dummy 'use_krpc' flag and create DataStream 
interface
..


Patch Set 7:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7542/6/be/src/exec/exchange-node.h
File be/src/exec/exchange-node.h:

PS6, Line 39: 
> nit: long line
Done


http://gerrit.cloudera.org:8080/#/c/7542/6/be/src/runtime/data-stream-mgr-base.h
File be/src/runtime/data-stream-mgr-base.h:

PS6, Line 36:  one each for Thrfit and KRPC
> for Thrift and KRPC respectively.
Done


PS6, Line 36: implementations 
> typo.
Done


http://gerrit.cloudera.org:8080/#/c/7542/6/be/src/runtime/data-stream-recvr-base.h
File be/src/runtime/data-stream-recvr-base.h:

PS6, Line 33:  one each for Thrfit and KRPC
> for Thrift and KRPC respectively.
Done


PS6, Line 33: implementations 
> implementations
Done


PS6, Line 47:  
> extra /
Done


http://gerrit.cloudera.org:8080/#/c/7542/6/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

PS6, Line 79: Kudu
> "Kudu RPC" for clarity.
Done


http://gerrit.cloudera.org:8080/#/c/7542/6/be/src/runtime/exec-env.h
File be/src/runtime/exec-env.h:

PS6, Line 92:   /// Clients of the implementations of DataStreamMgrBase should 
not use these functions
:   /// unless they need to access members that are not enforced by 
the DataStreamMgrBase
:   /// interface. (This is due to the stream mgrs using different 
AddData() signatures)
:   DataStreamMgr* ThriftStreamMgr();
> IMHO, this doesn't seem to provide much value in explaining it here. Can yo
I added it because we have a habit of including getters in the header file, and 
this is an exception. But I too agree that it's not very necessary to explain, 
so I removed it.

Added the extra explanation too.


http://gerrit.cloudera.org:8080/#/c/7542/6/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

PS6, Line 48: 
> nit: blank space
Done


http://gerrit.cloudera.org:8080/#/c/7542/6/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

PS6, Line 28: Shouldn't reach here unless startup flag 'use_krpc' "
:   "is true.";
> Can be simplified to "Shouldn't reach here unless startup flag 'use_krpc' i
Done


http://gerrit.cloudera.org:8080/#/c/7542/6/be/src/runtime/krpc-data-stream-recvr.h
File be/src/runtime/krpc-data-stream-recvr.h:

PS6, Line 30: KRPC
> nit: KRPC
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7542
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5d52245154e910529a68f53049520238eca16241
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5744: Add dummy 'use krpc' flag and create DataStream interface

2017-08-04 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#7).

Change subject: IMPALA-5744: Add dummy 'use_krpc' flag and create DataStream 
interface
..

IMPALA-5744: Add dummy 'use_krpc' flag and create DataStream interface

This patch introduces a dummy 'use_krpc' flag and creates an abstract
interface for the DataStreamRecvr/Mgr.

The 'use_krpc' flag defaults to 'false'. Cluster startup will abort
with an error if the flag is switched to 'true'.

The DataStreamSender implements the same virtual interface as the
DataSink, so a pure virtual class for the DataStreamSender would
essentially be an empty class. Therefore, it is not implemented.

The new interfaces are pure virtual base classes and are named
DataStream*Base.

Stubs for the Krpc implementations are also introduced and are named
KrpcDataStream*. They currently only abort with a fatal error if they
are used. Their actual implementations will be filled in a later
patch.

Since having both the Thrift and KRPC implementations of the
DataStream* classes are only expected to be temporary for now, this
was written and optimized with the end goal of having only the KRPC
versions of the DataStreamMgr/Recvr, at which point we will get rid
of the DataStream*Base classes, the Thrift versions of the classes
and rename KrpcDataStream* to DataStream*. We will also rename all
the references that the clients have to DataStream*Base to DataStream*.

Also did some spurious includes cleanup.

Change-Id: I5d52245154e910529a68f53049520238eca16241
---
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-mgr.h
A be/src/runtime/data-stream-recvr-base.h
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
A be/src/runtime/krpc-data-stream-mgr.cc
A be/src/runtime/krpc-data-stream-mgr.h
A be/src/runtime/krpc-data-stream-recvr.cc
A be/src/runtime/krpc-data-stream-recvr.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/service/impala-server.cc
20 files changed, 442 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/7542/7
-- 
To view, visit http://gerrit.cloudera.org:8080/7542
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5d52245154e910529a68f53049520238eca16241
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil