[Impala-ASF-CR] IMPALA-5754: Improve randomness of rand()/random()

2017-11-07 Thread Kim Jin Chul (Code Review)
Hello Jim Apple,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8355

to look at the new patch set (#4).

Change subject: IMPALA-5754: Improve randomness of rand()/random()
..

IMPALA-5754: Improve randomness of rand()/random()

Currently implementation of rand/random built-in functions
use rand_r of C library. We recognized its randomness was poor.
std::mt19937 in the C++11 standard libarary shows better randomness
than rand_r.

Testing:
rand-util-test is newly addded. It checks randomness,
deterministic and range.

Change-Id: Idafdd5fe7502ff242c76a91a815c565146108684
---
M be/src/exprs/expr-test.cc
M be/src/exprs/math-functions-ir.cc
M be/src/util/CMakeLists.txt
A be/src/util/rand-util-test.cc
A be/src/util/rand-util.cc
A be/src/util/rand-util.h
6 files changed, 185 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/8355/4
--
To view, visit http://gerrit.cloudera.org:8080/8355
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idafdd5fe7502ff242c76a91a815c565146108684
Gerrit-Change-Number: 8355
Gerrit-PatchSet: 4
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8417 )

Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes
..

IMPALA-2281: Replace FNV with FastHash in exchange nodes

FNV is not a good enough hash function. This patch introduces FastHash
into the codebase and uses it in exchange nodes.

Testing: Two test cases involving arbitrary ordering are changed.
Single node performance benchmark shows no performance difference.

Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Reviewed-on: http://gerrit.cloudera.org:8080/8417
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins
---
M LICENSE.txt
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/raw-value-test.cc
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/util/hash-util-ir.cc
M be/src/util/hash-util.h
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test
13 files changed, 209 insertions(+), 103 deletions(-)

Approvals:
  Jim Apple: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Gerrit-Change-Number: 8417
Gerrit-PatchSet: 7
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8417 )

Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Gerrit-Change-Number: 8417
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 07:39:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 16: Code-Review+2

Forgot that this needed a coordinated change with the Impala-lzo plugin. That 
has one callsite:

 context_->ReleaseCompletedResources(nullptr, false);

So I just added a wrapper function with the old API that we can remove once 
it's not needed.


--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 06:45:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-07 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Mostafa Mokhtar, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8023

to look at the new patch set (#10).

Change subject: IMPALA-4856: Port data stream service to KRPC
..

IMPALA-4856: Port data stream service to KRPC

This patch implements a new data stream service which utilizes KRPC.
Similar to the thrift RPC implementation, there are 3 major components
to the data stream services: KrpcDataStreamSender serializes and sends
row batches materialized by a fragment instance to a KrpcDataStreamRecvr.
KrpcDataStreamMgr is responsible for routing an incoming row batch to
the appropriate receiver. The data stream service runs on the port
FLAGS_krpc_port which is 29000 by default.

Unlike the implementation with thrift RPC, KRPC provides an asynchronous
interface for invoking remote methods. As a result, KrpcDataStreamSender
doesn't need to create a thread per connection. There is one connection
between two Impalad nodes for each direction (i.e. client and server).
Multiple queries can multi-plex on the same connection for transmitting
row batches between two Impalad nodes. The asynchronous interface also
prevents some issues with thrift RPC in which a thread may be stuck in
an RPC call without being able to check for cancellation. A TransmitData()
call with KRPC is in essence a trio of RpcController, a serialized protobuf
request buffer and a protobuf response buffer. The call is invoked via a
DataStreamService proxy object. The serialized tuple offsets and row batches
are sent via "sidecars" in KRPC to avoid extra copy into the serialized
request buffer.

Each impalad node creates a singleton DataStreamService object at start-up
time. All incoming calls are served by a service thread pool created as part
of DataStreamService. By default, there are 64 service threads. The service
threads are shared across all queries so the RPC handler should avoid
blocking as much as possible. In thrift RPC implementation, we make a thrift
thread handling a TransmitData() RPC to block for extended period of time
when the receiver is not yet created when the call arrives. In KRPC
implementation, we store TransmitData() or EndDataStream() requests
which arrive before the receiver is ready in a per-receiver early sender list
stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to
when the receiver is created or when timeout occurs.

Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr.
If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit
to exceed, the request will be stashed in a blocked_sender_ queue to be 
processed
later. The stashed RPC request will not be responded to until it is processed
so as to exert back pressure to the client. An alternative would be to reply 
with
an error and the request / row batches need to be sent again. This may end up
consuming more network bandwidth than the thrift RPC implementation. This change
adopts the behavior of allowing one stashed request per sender.

All rpc requests and responses are serialized using protobuf. The equivalent of
TRowBatch would be ProtoRowBatch which contains a serialized header about the
meta-data of the row batch and two Kudu Slice objects which contain pointers to
the actual data (i.e. tuple offsets and tuple data).

This patch is based on an abandoned patch by Henry Robinson.

TESTING
---

* Build passed with FLAGS_use_krpc=true.

TO DO
-

* Port some BE tests to KRPC services.

Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
---
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/kudu-util.h
M be/src/rpc/CMakeLists.txt
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-sender.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/krpc-data-stream-sender.cc
A be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/service/CMakeLists.txt
A be/src/service/data-stream-service.cc
A be/src/service/data-stream-service.h
M be/src/service/impala-server.cc
M cmake_modules/FindProtobuf.cmake
M common/protobuf/CMakeLists.txt
A common/protobuf/common.proto
A common/protobuf/data_stream_service.proto
A common/protobuf/row_batch.proto
M common/thrift/generate_error_codes.py
33 files changed, 3,159 insertions(+), 183 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8023/10
--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, 

[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-07 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 10:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc@62
PS9, Line 62:   FLAGS_datastream_service_num_deserialization_threads, 1,
> why disallow the setting of this greater than num_cores? Given that it take
Yes. The assumption is that the critical section is small and most threads 
won't block for too long so no point in pushing above number of cores. To be 
consistent with other knobs we have (e.g. # service threads, reactor threads), 
we shouldn't impose an implicit upper bound.


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@67
PS9, Line 67: the
> the resources
Done


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@195
PS9, Line 195: current_ba
> current_batch_
Done


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@225
PS9, Line 225: DCHECK
> shoudl we also DCHECK that num_pending_enqueue_ == 0 (otherwise, we could h
Done


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@248
PS9, Line 248: current_batch_.reset(resu
> this is part of the same operation as line 244 (both are adjusting the queu
Done


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@252
PS9, Line 252:   // It's important that the dequeuing of 'deferred_rpcs_' is 
done after the entry
> how about moving this to line 250 now that it's that scope that drops the l
Done



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 10
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 08 Nov 2017 03:35:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes

2017-11-07 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8417 )

Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes
..


Patch Set 6: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/util/hash-util.h
File be/src/util/hash-util.h:

http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/util/hash-util.h@32
PS5, Line 32:  public:
> Done. I followed the practice in be/src/kudu/security/init.cc and assumed t
I'm not 100% on that, since http://www.apache.org/legal/src-headers.html#3party 
says, "Do not modify or remove any copyright notices or licenses within 
third-party works," but if Kudu does it, it's probably the right thing to do.



--
To view, visit http://gerrit.cloudera.org:8080/8417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Gerrit-Change-Number: 8417
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 04:17:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Alex Behm, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8172

to look at the new patch set (#16).

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..

IMPALA-5307: Part 4: copy out uncompressed text and seq

This is the final patch for IMPALA-5307.

Text and Seq scanners are converted to use the same approach
as Avro.

contains_tuple_data is now false so a bunch of dead code in
ScannerContext can be removed. We also no longer attach I/O
buffers to row batches so that logic can be removed.

Testing:
Ran core ASAN tests.

Perf:
I reran the same benchmarks as in Part 2. There was no measurable
difference before and after - for both text and seq processing time
is dominated by text parsing.

Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/text-converter.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
23 files changed, 144 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8172/16
--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 16:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1447/


--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 06:45:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6084: Avoid using of global namespace for llvm

2017-11-07 Thread Kim Jin Chul (Code Review)
Kim Jin Chul has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/8489 )

Change subject: IMPALA-6084: Avoid using of global namespace for llvm
..

IMPALA-6084: Avoid using of global namespace for llvm

There are a large number of places in the backend where
we import everything from the llvm namespace into the
global namespace. (e.g. using namespace llvm;)

Here are the reasons why we don't prefer this:
* It could make symbol conflicts if a newly added code
has same symbole name.
* It makes code readability uncomfortable. We may not
recognize a symbol came from.

We adopt a sequence of namespace specifiers in each use case.

Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-callgraph.cc
M be/src/codegen/codegen-symbol-emitter.cc
M be/src/codegen/codegen-util.cc
M be/src/codegen/instruction-counter-test.cc
M be/src/codegen/instruction-counter.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/filter-context.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/select-node.cc
M be/src/exec/text-converter.cc
M be/src/exec/topn-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/agg-fn.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr-test.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/types.cc
M be/src/util/tuple-row-compare.cc
41 files changed, 1,452 insertions(+), 1,378 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/8489/3
--
To view, visit http://gerrit.cloudera.org:8080/8489
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
Gerrit-Change-Number: 8489
Gerrit-PatchSet: 3
Gerrit-Owner: Kim Jin Chul 


[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues

2017-11-07 Thread Bikramjeet Vig (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8487

to look at the new patch set (#2).

Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
..

IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues

Some test runs are currently failing randomly in test_ir_functions due
to LLVM linkage error. This happens when impala tries to link a function
from  a lib file on hdfs, but it somehow got removed from the lib cache
before it could link. This results in a new file being cached but with
a slightly different local filename, and since impala only uses local
filenames to check for collision for linking of LLVM modules, it ends
up linking the same file twice and hence encounters an error.
This patch fixes this issue by using the hdfs file path to check for
collision of lib files.

Testing:
Added a backend test that tries to link the same lib twice.

Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757
---
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/util/filesystem-util.cc
M be/src/util/filesystem-util.h
5 files changed, 39 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/8487/2
--
To view, visit http://gerrit.cloudera.org:8080/8487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757
Gerrit-Change-Number: 8487
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option

2017-11-07 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8490 )

Change subject: IMPALA-2248: Make idle_session_timeout a query option
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8490/3/fe/src/test/java/org/apache/impala/service/JdbcTest.java
File fe/src/test/java/org/apache/impala/service/JdbcTest.java:

http://gerrit.cloudera.org:8080/#/c/8490/3/fe/src/test/java/org/apache/impala/service/JdbcTest.java@594
PS3, Line 594:
"0, 5, 10"'s and "timeout * 1.5"'s relation could be more explicit, for example:
float timeout_tolerance = 0.5;
... Arrays.asList(0, 5, 5*(1 + 2*timeout_tolerance))
...
int sleepPeriod = (int)(timeout * (1 + timeout_tolerance));



--
To view, visit http://gerrit.cloudera.org:8080/8490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e
Gerrit-Change-Number: 8490
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 00:21:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8487 )

Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757
Gerrit-Change-Number: 8487
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 02:36:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8417 )

Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1446/


--
To view, visit http://gerrit.cloudera.org:8080/8417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Gerrit-Change-Number: 8417
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 04:17:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 15: Code-Review+2

carry


--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 01:06:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8172

to look at the new patch set (#14).

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..

IMPALA-5307: Part 4: copy out uncompressed text and seq

This is the final patch for IMPALA-5307.

Text and Seq scanners are converted to use the same approach
as Avro.

contains_tuple_data is now false so a bunch of dead code in
ScannerContext can be removed. We also no longer attach I/O
buffers to row batches so that logic can be removed.

Testing:
Ran core ASAN tests.

Perf:
I reran the same benchmarks as in Part 2. There was no measurable
difference before and after - for both text and seq processing time
is dominated by text parsing.

Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/text-converter.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
23 files changed, 137 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8172/14
--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes

2017-11-07 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8417 )

Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/runtime/raw-value-test.cc
File be/src/runtime/raw-value-test.cc:

http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/runtime/raw-value-test.cc@104
PS5, Line 104: FastHash
> "FastHash"
Done


http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/runtime/raw-value.h@74
PS5, Line 74:   /// Get a 64-bit hash value using the FastHash function.
> Please add a reference. https://www.google.com/search?q=fasthash doesn't di
Done


http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/util/hash-util.h
File be/src/util/hash-util.h:

http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/util/hash-util.h@32
PS5, Line 32:  public:
> Please add the name of this file in the appropriate place in LICENSE.txt. Y
Done. I followed the practice in be/src/kudu/security/init.cc and assumed the 
license here should be removed.



--
To view, visit http://gerrit.cloudera.org:8080/8417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Gerrit-Change-Number: 8417
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 01:14:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes

2017-11-07 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/8417 )

Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes
..

IMPALA-2281: Replace FNV with FastHash in exchange nodes

FNV is not a good enough hash function. This patch introduces FastHash
into the codebase and uses it in exchange nodes.

Testing: Two test cases involving arbitrary ordering are changed.
Single node performance benchmark shows no performance difference.

Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
---
M LICENSE.txt
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/raw-value-test.cc
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/util/hash-util-ir.cc
M be/src/util/hash-util.h
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test
13 files changed, 209 insertions(+), 103 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/8417/6
--
To view, visit http://gerrit.cloudera.org:8080/8417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Gerrit-Change-Number: 8417
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8487 )

Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
..

IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues

Some test runs are currently failing randomly in test_ir_functions due
to LLVM linkage error. This happens when impala tries to link a function
from  a lib file on hdfs, but it somehow got removed from the lib cache
before it could link. This results in a new file being cached but with
a slightly different local filename, and since impala only uses local
filenames to check for collision for linking of LLVM modules, it ends
up linking the same file twice and hence encounters an error.
This patch fixes this issue by using the hdfs file path to check for
collision of lib files.

Testing:
Added a backend test that tries to link the same lib twice.

Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757
Reviewed-on: http://gerrit.cloudera.org:8080/8487
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/util/filesystem-util.cc
M be/src/util/filesystem-util.h
5 files changed, 39 insertions(+), 35 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757
Gerrit-Change-Number: 8487
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..

IMPALA-6134: Update code base to use impala::ConditionVariable

boost::condtion_variable supports thread interruption
which has some overhead. In some places we already use
impala::ConditionVariable which is a very thin layer
around pthread, therefore it has less overhead.

This commit substitues every boost::condition_variable in
the codebase (except under kudu/) to impala::ConditionVariable.

It also extends impala::ConditionVariable class to support
waiting with a given timeout. The WaitFor function takes a duration
as parameter. The WaitUntil function takes an absolute time as
parameter.

Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Reviewed-on: http://gerrit.cloudera.org:8080/8428
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/rpc/thrift-server.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-recvr.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-reader-context.cc
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr-stress.cc
M be/src/runtime/disk-io-mgr-stress.h
M be/src/runtime/disk-io-mgr-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/statestore/statestore.h
M be/src/util/blocking-queue.h
M be/src/util/condition-variable.h
M be/src/util/promise.h
M be/src/util/thread-pool.h
31 files changed, 137 insertions(+), 122 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 08 Nov 2017 02:16:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 15: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1445/


--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 02:12:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-07 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.h
File be/src/runtime/krpc-data-stream-recvr.h:

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.h@128
PS9, Line 128: Takes over the RPC payload of an early sender to 
'deferred_rpcs_' queue of the
Takes over the RPC state of an early sender for deferred processing and kicks...



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 08 Nov 2017 00:54:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8202 )

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..


Patch Set 22: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 22
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 08 Nov 2017 00:55:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8172/13/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8172/13/be/src/exec/hdfs-scanner.h@274
PS13, Line 274: //
> nit: ///
Done



--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 01:06:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1445/


--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 08 Nov 2017 01:07:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 9:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc@62
PS9, Line 62:   min(FLAGS_datastream_service_num_deserialization_threads, 
CpuInfo::num_cores()),
why disallow the setting of this greater than num_cores? Given that it takes 
locks, there could be some benefit to doing so, right?


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@67
PS9, Line 67: data
the resources


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@195
PS9, Line 195: cur_batch_
current_batch_


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@225
PS9, Line 225: DCHECK
shoudl we also DCHECK that num_pending_enqueue_ == 0 (otherwise, we could have 
acquired the lock while it was dropped for deserialization and there is still a 
row batch)?


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@248
PS9, Line 248: batch_queue_.pop_front();
this is part of the same operation as line 244 (both are adjusting the queue 
state), so mind moving it to be adjacent?


http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@252
PS9, Line 252: // Don't hold lock when calling EnqueueDeserializeTask() as 
it may block.
how about moving this to line 250 now that it's that scope that drops the lock. 
Also, it's important that this happens after batch_queue_.pop_front(), right? 
maybe note that.



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 9
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 08 Nov 2017 01:32:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-07 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Mostafa Mokhtar, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8023

to look at the new patch set (#9).

Change subject: IMPALA-4856: Port data stream service to KRPC
..

IMPALA-4856: Port data stream service to KRPC

This patch implements a new data stream service which utilizes KRPC.
Similar to the thrift RPC implementation, there are 3 major components
to the data stream services: KrpcDataStreamSender serializes and sends
row batches materialized by a fragment instance to a KrpcDataStreamRecvr.
KrpcDataStreamMgr is responsible for routing an incoming row batch to
the appropriate receiver. The data stream service runs on the port
FLAGS_krpc_port which is 29000 by default.

Unlike the implementation with thrift RPC, KRPC provides an asynchronous
interface for invoking remote methods. As a result, KrpcDataStreamSender
doesn't need to create a thread per connection. There is one connection
between two Impalad nodes for each direction (i.e. client and server).
Multiple queries can multi-plex on the same connection for transmitting
row batches between two Impalad nodes. The asynchronous interface also
prevents some issues with thrift RPC in which a thread may be stuck in
an RPC call without being able to check for cancellation. A TransmitData()
call with KRPC is in essence a trio of RpcController, a serialized protobuf
request buffer and a protobuf response buffer. The call is invoked via a
DataStreamService proxy object. The serialized tuple offsets and row batches
are sent via "sidecars" in KRPC to avoid extra copy into the serialized
request buffer.

Each impalad node creates a singleton DataStreamService object at start-up
time. All incoming calls are served by a service thread pool created as part
of DataStreamService. By default, there are 64 service threads. The service
threads are shared across all queries so the RPC handler should avoid
blocking as much as possible. In thrift RPC implementation, we make a thrift
thread handling a TransmitData() RPC to block for extended period of time
when the receiver is not yet created when the call arrives. In KRPC
implementation, we store TransmitData() or EndDataStream() requests
which arrive before the receiver is ready in a per-receiver early sender list
stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to
when the receiver is created or when timeout occurs.

Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr.
If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit
to exceed, the request will be stashed in a blocked_sender_ queue to be 
processed
later. The stashed RPC request will not be responded to until it is processed
so as to exert back pressure to the client. An alternative would be to reply 
with
an error and the request / row batches need to be sent again. This may end up
consuming more network bandwidth than the thrift RPC implementation. This change
adopts the behavior of allowing one stashed request per sender.

All rpc requests and responses are serialized using protobuf. The equivalent of
TRowBatch would be ProtoRowBatch which contains a serialized header about the
meta-data of the row batch and two Kudu Slice objects which contain pointers to
the actual data (i.e. tuple offsets and tuple data).

This patch is based on an abandoned patch by Henry Robinson.

TESTING
---

* Build passed with FLAGS_use_krpc=true.

TO DO
-

* Port some BE tests to KRPC services.

Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
---
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/exchange-node.cc
M be/src/exec/kudu-util.h
M be/src/rpc/CMakeLists.txt
M be/src/rpc/rpc-mgr.cc
M be/src/rpc/rpc-mgr.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/data-stream-mgr-base.h
M be/src/runtime/data-stream-mgr.h
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-sender.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/krpc-data-stream-mgr.cc
M be/src/runtime/krpc-data-stream-mgr.h
M be/src/runtime/krpc-data-stream-recvr.cc
M be/src/runtime/krpc-data-stream-recvr.h
A be/src/runtime/krpc-data-stream-sender.cc
A be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/service/CMakeLists.txt
A be/src/service/data-stream-service.cc
A be/src/service/data-stream-service.h
M be/src/service/impala-server.cc
M cmake_modules/FindProtobuf.cmake
M common/protobuf/CMakeLists.txt
A common/protobuf/common.proto
A common/protobuf/data_stream_service.proto
A common/protobuf/row_batch.proto
M common/thrift/generate_error_codes.py
33 files changed, 3,156 insertions(+), 183 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8023/9
--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, 

[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8484 )

Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover 
the query.
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c
Gerrit-Change-Number: 8484
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Wed, 08 Nov 2017 00:13:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-11-07 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/8404 )

Change subject: IMPALA-5017: Error on decimal overflow
..

IMPALA-5017: Error on decimal overflow

Before this patch, decimal operations would either silently overflow (in
the case of sum() and avg()), or produce a warning.

In this patch, the behaviour is changed so that an error is produced in
the case of overflow when DECIMAL_v2 is enabled. Decimal v1 behaviour is
unchanged.

We introduce overflow checks when computing sum() and avg(). This
results in a ~30% performance regression when we are in decimal v2 mode
compared to decimal v1.

Benchmarks:
  Query:
  select sum(dec_38_19) from decimal_tbl
Decimal v1: 13.09s
Decimal v2: 17.10s

  Query:
  select avg(dec_38_19) from decimal_tbl
Decimal v1: 13.60s
Decimal v2: 19.10s

  Query:
  select avg(dec_9_5) from decimal_tbl
Decimal v1: 12.06s
Decimal v2: 18.07s

Testing:
- Added several end to end tests.
- Updated Expr tests to check for error in case of overflow.

Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
---
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr-test.cc
M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test
M testdata/workloads/functional-query/queries/QueryTest/decimal.test
6 files changed, 214 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8404/2
--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-07 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8447/8/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8447/8/be/src/service/impala-server.cc@1215
PS8, Line 1215:   string_map["support_start_over"] = "false";
> This may not be related to this change, but it would be good to think a bit
I agree that support_start_over implementation is a bit weird.
Note1, this change is only about introducing visibility levels on query 
options, not to refactor how they are implemented.
Note2, currently this change only affects the Beeswax interface that is used by 
impala-shell, but doesn't change HS2 interface towards e.g. HUE. So this option 
won't be hidden from HUE.



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 08 Nov 2017 00:05:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8487 )

Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1444/


--
To view, visit http://gerrit.cloudera.org:8080/8487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757
Gerrit-Change-Number: 8487
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 23:15:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-11-07 Thread Vuk Ercegovac (Code Review)
Hello Juan Yu, Michael Brown, Philip Zeyliger, Balazs Jeszenszky, Dimitris 
Tsirogiannis, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8202

to look at the new patch set (#22).

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..

IMPALA-4704: Turns on client connections when local catalog initialized.

Currently, impalad starts beeswax and hs2 servers even if the
catalog has not yet been initialized. As a result, client
connections see an error message stating that the impalad
is not yet ready.

This patch changes the impalad startup sequence to wait
until the catalog is received before opening beeswax and hs2 ports
and starting their servers.

Testing:
- python e2e tests that start a cluster without a catalog
  and check that client connections are rejected as expected.

Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/common/global-flags.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M bin/start-impala-cluster.py
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_cluster.py
M tests/common/impala_service.py
A tests/custom_cluster/test_catalog_wait.py
M tests/custom_cluster/test_coordinators.py
20 files changed, 353 insertions(+), 216 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/8202/22
--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 22
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1443/


--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 22:51:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 4:

Looks like it hit IMPALA-6092 plus some network flakiness trying to send RPCs 
to itself. I don't see how they're likely to be related to the change.


--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 22:50:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-11-07 Thread Vuk Ercegovac (Code Review)
Hello Juan Yu, Michael Brown, Philip Zeyliger, Balazs Jeszenszky, Dimitris 
Tsirogiannis, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8202

to look at the new patch set (#21).

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..

IMPALA-4704: Turns on client connections when local catalog initialized.

Currently, impalad starts beeswax and hs2 servers even if the
catalog has not yet been initialized. As a result, client
connections see an error message stating that the impalad
is not yet ready.

This patch changes the impalad startup sequence to wait
until the catalog is received before opening beeswax and hs2 ports
and starting their servers.

Testing:
- python e2e tests that start a cluster without a catalog
  and check that client connections are rejected as expected.

Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/common/global-flags.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M bin/start-impala-cluster.py
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_cluster.py
M tests/common/impala_service.py
A tests/custom_cluster/test_catalog_wait.py
M tests/custom_cluster/test_coordinators.py
20 files changed, 355 insertions(+), 216 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/8202/21
--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 21
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-07 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8447/8/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8447/8/be/src/service/impala-server.cc@1215
PS8, Line 1215:   string_map["support_start_over"] = "false";
This may not be related to this change, but it would be good to think a bit 
about support_start_over option. Unlike other options, it is not an integer, 
and it should not be changed. It is read by Hue ccording to 
https://www.cloudera.com/documentation/enterprise/5-4-x/topics/impala_support_start_over.html

I do not know how Hue accesses query options, but if via the output of "set;" 
hiding it by default may actually change the behavior of Hue. Even if this is 
not an issue, it would be probably better to separate this option clearly e.g. 
by giving it a separate level like "capability" and treat these query options 
as read only.

( this comment added support_start_over a long time ago:
https://github.com/cloudera/Impala/commit/94a5ed487e04b7d2c9bf6bee399f0f464f289211b
 )



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 23:07:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 13: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8172/13/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8172/13/be/src/exec/hdfs-scanner.h@274
PS13, Line 274: //
nit: ///



--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 22:50:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-11-07 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8202 )

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..


Patch Set 20:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h@89
PS19, Line 89: must call the following
 : /// methods:
> That DCHECK:
combined them.


http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc@1936
PS20, Line 1936:
> I think a quick comment is worth it to highlight the dependency between the
Done



--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 20
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 07 Nov 2017 23:08:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8401 )

Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-docs-submit/171/


--
To view, visit http://gerrit.cloudera.org:8080/8401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Gerrit-Change-Number: 8401
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 07 Nov 2017 22:22:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8401 )

Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Gerrit-Change-Number: 8401
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 07 Nov 2017 22:29:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8401 )

Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options
..

IMPALA-5473: [DOCS] Document TLS min version & cipher options

Under the doc JIRA IMPALA-6065.

Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Reviewed-on: http://gerrit.cloudera.org:8080/8401
Reviewed-by: Michael Brown 
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins
---
M docs/topics/impala_ssl.xml
1 file changed, 71 insertions(+), 0 deletions(-)

Approvals:
  Michael Brown: Looks good to me, but someone else must approve
  Sailesh Mukil: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Gerrit-Change-Number: 8401
Gerrit-PatchSet: 4
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-07 Thread John Russell (Code Review)
Hello Greg Rahn, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8418

to look at the new patch set (#3).

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..

IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
---
M docs/topics/impala_known_issues.xml
1 file changed, 45 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/8418/3
--
To view, visit http://gerrit.cloudera.org:8080/8418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-07 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8418 )

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
> I think TSBs probably use a different scale though. Most of the medium seve
Sure! Changed.


http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
> I agree it's a little strange. But Cloudera issued a technical service bull
Done


http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
> Missed this on the first pass - shouldn't this be high?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 22:21:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1441/


--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 22:07:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-07 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 9:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.h
File be/src/runtime/krpc-data-stream-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.h@435
PS8, Line 435: 'num_requ
> 'num_request' requests
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc
File be/src/runtime/krpc-data-stream-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc@62
PS8, Line 62: ds, C
> how was that chosen? do we have a test case that causes this queue to fill
Just a large enough number to reduce the chance of the queue filling up.  Yes, 
need to come up with a test case to fill up this queue.


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc@109
PS8, Line 109:   // Let the receiver take over the RPC payloads of early 
senders and process them
 :   // asynchronously.
 :   for (unique_ptr this comment seems out of place. this is more an implementation detail of t
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h
File be/src/runtime/krpc-data-stream-recvr.h:

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@129
PS8, Line 129: n
> and start a deserialization task to process it asynchronously.
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@130
PS8, Line 130: ask to p
> this "transfer" is in the oppose direction of how our "Transfer" methods us
Renamed to TakeOverEarlySender().


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@197
PS8, Line 197: eive
> cpu time or wall time?
wall-clock time.


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@200
PS8, Line 200: _;
> same question
wall-clock time.


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@67
PS8, Line 67: data
> the resources
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@72
PS8, Line 72: imit, the
> RPC state
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@73
PS8, Line 73: nto 'deferred_
> we shouldn't normally refer to private fields in public class comments, but
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@87
PS8, Line 87:   void TakeOverEarlySender(std::unique_ptr ctx);
> same comment as recvr header comment.
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@195
PS8, Line 195:   // cur_batch_ must be replaced with the
> I don't think we need this loop. see other comments in this function.
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@197
PS8, Line 197: atch = nullptr;
> nit: consider swapping the order of these so that the fast case comes first
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@210
PS8, Line 210: }
> nit: i think we could do without some of the blank lines in this method to
I leave a blank line between each loop exiting conditions.


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@211
PS8, Line 211:
> Actually, I think the loop exiting condition is not quite right, which led
There is an invariant that EOS cannot be sent by a sender when there is 
outstanding TransmitData() RPC so we should be able to get by by just checking 
for the termination condition of: (num_remaining_senders_ == 0 && 
batch_queue_.empty())


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@218
PS8, Line 218:
> given that we just checked the other two loop exit conditions, isn't this d
Converted to DCHECK().


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@229
PS8, Line 229: d
> to parallelize the CPU bound deserialization work.
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@239
PS8, Line 239: }
> once you get rid of the loop, I think you'll be able to eliminate this unlo
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@389
PS8, Line 389:   status.ToProto(ctx->response->mutable_status());
> at this point, 'ctx' effectively takes ownership, right? we should add a co
Done


http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@406
PS8, Line 406: const RowBatchHeaderPB& header = 

[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8484 )

Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover 
the query.
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1442/


--
To view, visit http://gerrit.cloudera.org:8080/8484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c
Gerrit-Change-Number: 8484
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Tue, 07 Nov 2017 20:55:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.

2017-11-07 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8484 )

Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover 
the query.
..


Patch Set 3: Code-Review+1

I haven't dabbled enough with test files to know that VERIFY_IS_EQUAL_SORTED 
even exists. I agree with its use over changing the TPC-DS queries.


--
To view, visit http://gerrit.cloudera.org:8080/8484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c
Gerrit-Change-Number: 8484
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Tue, 07 Nov 2017 20:53:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow

2017-11-07 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8404 )

Change subject: IMPALA-5017: Error on decimal overflow
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12
PS1, Line 12: In this patch, the beharior is changed so that an error is 
produced in
> I'll also accept behaviour ;)
Done


http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@27
PS1, Line 27:   select avg(dec_38_19) from decimal_tbl
> I guess these regressions occur regardless of the width of the input decima
Correct. Added another benchmark to illustrate this.


http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@413
PS1, Line 413:   abs(avg->sum_val16) > 
DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val4))) {
> This isn't correct if sum_val16 and src.val* have opposite sign, is it?
That's true. Fixed and added a test case.


http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@420
PS1, Line 420:   abs(avg->sum_val16) > 
DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val8))) {
> What do you think about only checking for overflow of the integer type at t
1. Are you suggesting to check for something like abs(avg->sum_val16) > 2**128 
- abs(src.val8))) ?
I don't really see why it is faster to check against 2**128 instead of against 
MAX_UNSCALED_DECIMAL.

2. What do you mean by initial check? Where would this check be done? Something 
like (decimal_v2 && sum_val16 > 2**32 && abs(avg->sum_val16) > 
MAX_UNSCALED_DECIMAL - abs(src.val8)))?



--
To view, visit http://gerrit.cloudera.org:8080/8404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019
Gerrit-Change-Number: 8404
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 07 Nov 2017 21:59:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-07 Thread Gabor Kaszab (Code Review)
Hello Lars Volker, Laszlo Gaal, Zoltan Borok-Nagy, Philip Zeyliger, Attila 
Jeges, Tim Armstrong, Csaba Ringhofer, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8447

to look at the new patch set (#8).

Change subject: IMPALA-2181: Add query option levels for display
..

IMPALA-2181: Add query option levels for display

Three display levels are introduced for each query option: REGULAR, ADVANCED
and DEPRECATED. When the query options are diplayed in Impala shell using
SET then only the REGULAR options are shown. A new command called SET ALL
shows all the options grouped by their option levels.

Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
---
M be/src/service/child-query.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/beeswax.thrift
M shell/impala_client.py
M shell/impala_shell.py
M tests/shell/test_shell_interactive.py
10 files changed, 225 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/8447/8
--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8487 )

Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757
Gerrit-Change-Number: 8487
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 21:21:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..

IMPALA-6151: add query-level fragment/backend counters

This adds NumBackends, NumFragments and NumFragmentInstances
counters to the query profile to make it easier to manually or
programmatically analyse the query.

Also add a num-queries-registered metric to track the number
of queries that have been executed but are not yet unregistered.

Testing:
Ran "select count(*) from alltypessmall" and checked profile:

Backend startup latencies: Count: 3, min / max: 1ms / 1ms, 25th %-ile: 1ms, 
50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 1ms, 99.9th 
%-ile: 1ms
Per Node Peak Memory Usage: tarmstrong-box:22000(1.10 MB) 
tarmstrong-box:22001(1.02 MB) tarmstrong-box:22002(1.02 MB)
 - FiltersReceived: 0 (0)
 - FinalizationTimer: 0.000ns
 - NumBackends: 3 (3)
 - NumFragmentInstances: 4 (4)
 - NumFragments: 2 (2)

Ran some query tests (both beeswax and HS2) and manually checked the
num-queries-registered metric on the /metrics page when the queries
were running and after they finished. Added the metric to
test_metrics_are_zero() to make sure that there are no accounting
errors.

Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Reviewed-on: http://gerrit.cloudera.org:8080/8461
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/coordinator.cc
M be/src/service/impala-server.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
M tests/verifiers/metric_verifier.py
6 files changed, 37 insertions(+), 5 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 21:44:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues

2017-11-07 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8487 )

Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen-test.cc
File be/src/codegen/llvm-codegen-test.cc:

http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen-test.cc@485
PS1, Line 485: NULL
> nullptr
Done


http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen.h@584
PS1, Line 584: LinkModule
> Maybe LinkModuleFromLocalFs() or something like that to disambiguate it the
Done



--
To view, visit http://gerrit.cloudera.org:8080/8487
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757
Gerrit-Change-Number: 8487
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 21:15:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h
File be/src/runtime/disk-io-mgr-reader-context.h:

PS7:
> I assume this was a wholesale move of the class without modifications, exce
There was some minor cleanup - initialising member variables inline, adding a 
DCHECK in the destructor, etc. I created a diff here to see how the class 
definition changed: 
https://gist.github.com/timarmstrong/f640dd200260556e1167d8ca69a67f51/revisions?diff=split


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@662
PS7, Line 662:   void UnregisterContext(DiskIoRequestContext* context);
> Given that the context cannot be used after this call, should we move the u
I don't feel strongly. I think your argument has merit but destroying the data 
structure goes (somewhat) against the idea that we shouldn't tear down control 
structures until the very end of the query.



--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 21:10:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.

2017-11-07 Thread Tim Wood (Code Review)
Tim Wood has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8484 )

Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover 
the query.
..


Patch Set 3:

> Uploaded patch set 3.

Applied Tim A.'s suggestion.


--
To view, visit http://gerrit.cloudera.org:8080/8484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c
Gerrit-Change-Number: 8484
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Tue, 07 Nov 2017 20:42:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.

2017-11-07 Thread Tim Wood (Code Review)
Hello Greg Rahn, Michael Brown, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8484

to look at the new patch set (#3).

Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover 
the query.
..

IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.

Use VERIFY_IS_EQUAL_SORTED tag on RESULTS section to allow low-order
sort deviations to compare.

Testing:
- Passed local tests/run-tests.py ...
- Changed order of expected rows by low-order column and verified test still 
passes.
- Analyzed all TPC-DS query files for similar patterns, found no others.

Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c
---
M testdata/workloads/tpcds/queries/tpcds-q77a.test
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/8484/3
--
To view, visit http://gerrit.cloudera.org:8080/8484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c
Gerrit-Change-Number: 8484
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Tim Wood 


[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8484 )

Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover 
the query.
..


Patch Set 3: Code-Review+2

Thanks Tim W.


--
To view, visit http://gerrit.cloudera.org:8080/8484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c
Gerrit-Change-Number: 8484
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Wood 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Tim Wood 
Gerrit-Comment-Date: Tue, 07 Nov 2017 20:54:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-07 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 8:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.h@516
PS7, Line 516:   /// Given a query option name this function gets the option 
level for it to use when
 :   /// displaying from Impala shell and adds the level to the 
ConfigVariable parameter
 :   /// in numeric format. For values see TQueryOptionLevel enum.
> nit: For me, this comment is too complex and tells too much implementation
Thx, that sounds better indeed.
Done


http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.cc@1228
PS7, Line 1228: const string& option_key
> Why do we pass option_key here? It is already there in config, isn't it?
In this case it is. But if I relied on that than it would introduce some risk 
when someone starts to rearrange the part where config is populated as this 
function can't guarantee that config contains the key every time.

To be on the safe side I think it's reasonable to keep the key as a parameter.


http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py@240
PS7, Line 240:   if print_mode == QueryOptionDisplayModes.ALL_OPTIONS:
 : if len(advanced_options) > 0:
 :   print '\nAdvanced Query Options:'
 :   self._print_option_group(advanced_options)
 : if len(deprecated_options) > 0:
 :   print '\nDeprecated Query Options:'
> nit: this can be simplified as:
Done


http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py@620
PS7, Line 620: ESS
 :
 : # Remove any extra spaces surrounding the tokens.
> These members are already available in _print_with_set through self. And as
Good point!
Done



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 20:51:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Bump Kudu version to 1520b39

2017-11-07 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8492


Change subject: Bump Kudu version to 1520b39
..

Bump Kudu version to 1520b39

Change-Id: Ib3d5d88bd4d3347447a64fac75ca3e0427b11ec6
---
M bin/impala-config.sh
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/8492/1
--
To view, visit http://gerrit.cloudera.org:8080/8492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib3d5d88bd4d3347447a64fac75ca3e0427b11ec6
Gerrit-Change-Number: 8492
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes

2017-11-07 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/8417 )

Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes
..

IMPALA-2281: Replace FNV with FastHash in exchange nodes

FNV is not a good enough hash function. This patch introduces FastHash
into the codebase and uses it in exchange nodes.

Testing: Two test cases involving arbitrary ordering are changed.
Single node performance benchmark shows no performance difference.

Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/raw-value-test.cc
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/util/hash-util-ir.cc
M be/src/util/hash-util.h
M 
testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test
M testdata/workloads/tpcds/queries/tpcds-q77a.test
13 files changed, 201 insertions(+), 101 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/8417/4
--
To view, visit http://gerrit.cloudera.org:8080/8417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Gerrit-Change-Number: 8417
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..


Patch Set 11:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner-ir.cc
File be/src/exec/hdfs-scanner-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner-ir.cc@61
PS11, Line 61: return 0;
> Is there a reason why we don't treat this as parse error and returns -1 ?
No that's a good point. I changed it.


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner.h@347
PS11, Line 347:   /// Returns -1 if parsing should be aborted due to parse 
errors.
> May help to also add "Returns 0 if copying strings into 'pool' failed'.
Clarified that it could be a memory allocation error. Per the other comment, it 
probably makes most sense to just return -1.


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-sequence-scanner.cc
File be/src/exec/hdfs-sequence-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-sequence-scanner.cc@a282
PS11, Line 282:
> It may be trivial but does it make sense to add DCHECK(row_batch->num_tuple
The assumption of 1 tuple per row for scans is pretty deeply baked into the 
scanners and planner, so it seemed silly to have this speculative generality 
here. I added a DCHECK to HdfsScanner to document that the assumption is valid 
for all HDFS scanners.


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc@849
PS11, Line 849: tuples_returned
> There is a subtle behavior here if Tuple::CopyStrings() failed in WriteAlig
Yeah I think I just made a mistake returning 0. Returning -1 makes more sense. 
I manually tested with this query:
[localhost:21000] > set num_nodes=1; set disable_codegen=1; set 
mem_limit=5m; select * from widetable_1000_cols;
NUM_NODES set to 1
DISABLE_CODEGEN set to 1
MEM_LIMIT set to 5m
Query: select * from widetable_1000_cols
Query submitted at: 2017-11-07 11:52:01 (Coordinator: 
http://tarmstrong-box:25000)
Query progress can be monitored at: 
http://tarmstrong-box:25000/query_plan?query_id=9948285e1aa8e172:f73fe971
ERROR: Memory limit exceeded: HdfsScanner::WriteAlignedTuples() failed to 
allocate 125 bytes for strings.
HDFS_SCAN_NODE (id=0) could not allocate 125.00 B without exceeding limit.
Error occurred on backend tarmstrong-box:22000 by fragment 
9948285e1aa8e172:f73fe971
Memory left in process limit: 8.01 GB
Memory left in query limit: 947.59 KB
Query(9948285e1aa8e172:f73fe971): Limit=5.00 MB Reservation=0 
ReservationLimit=0 OtherMemory=4.07 MB Total=4.07 MB Peak=4.07 MB
  Fragment 9948285e1aa8e172:f73fe971: Reservation=0 
OtherMemory=4.07 MB Total=4.07 MB Peak=4.07 MB
HDFS_SCAN_NODE (id=0): Total=4.07 MB Peak=4.07 MB
PLAN_ROOT_SINK: Total=0 Peak=0

It would be nice to turn it into an end-to-end test but I cant see how to make 
it robustly fail in that place.


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc@857
PS11, Line 857: num_tuples
> Why is this not tuples_returned here ? Is there any guarantee max_added_tup
I think num_fields is tracking the number of fields consumed from the input.



--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 19:54:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8172

to look at the new patch set (#13).

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
..

IMPALA-5307: Part 4: copy out uncompressed text and seq

This is the final patch for IMPALA-5307.

Text and Seq scanners are converted to use the same approach
as Avro.

contains_tuple_data is now false so a bunch of dead code in
ScannerContext can be removed. We also no longer attach I/O
buffers to row batches so that logic can be removed.

Testing:
Ran core ASAN tests.

Perf:
I reran the same benchmarks as in Part 2. There was no measurable
difference before and after - for both text and seq processing time
is dominated by text parsing.

Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/text-converter.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
23 files changed, 137 insertions(+), 212 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8172/13
--
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8424 )

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h@184
PS6, Line 184: //
> Can remove this TODO
Done



--
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 19:55:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8424

to look at the new patch set (#7).

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..

IMPALA-4835 (prep only): create io subfolder and namespace

Instead of using the DiskIoMgr class as a namespace, which prevents
forward-declaration of inner classes, create an impala::io namespace
and unnested the inner class.

This is done in anticipation of DiskIoMgr depending on BufferPool. This
helps avoid a circular dependency between DiskIoMgr, TmpFileMgr and
BufferPool headers that could not be broken with forward declarations.

Testing:
Ran core tests.

Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
---
M be/CMakeLists.txt
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/CMakeLists.txt
D be/src/runtime/disk-io-mgr.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
A be/src/runtime/io/CMakeLists.txt
R be/src/runtime/io/disk-io-mgr-internal.h
R be/src/runtime/io/disk-io-mgr-stress-test.cc
R be/src/runtime/io/disk-io-mgr-stress.cc
R be/src/runtime/io/disk-io-mgr-stress.h
R be/src/runtime/io/disk-io-mgr-test.cc
R be/src/runtime/io/disk-io-mgr.cc
A be/src/runtime/io/disk-io-mgr.h
R be/src/runtime/io/handle-cache.h
R be/src/runtime/io/handle-cache.inline.h
R be/src/runtime/io/request-context.cc
R be/src/runtime/io/request-context.h
A be/src/runtime/io/request-ranges.h
R be/src/runtime/io/scan-range.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
38 files changed, 1,416 insertions(+), 1,311 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/8424/7
-- 
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-07 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8424 )

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..


Patch Set 6:

(1 comment)

One small comment

http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h@184
PS6, Line 184: /// TODO: Break this up the .h/.cc into multiple files under an 
/io subdirectory.
Can remove this TODO



--
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 19:39:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8267 )

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..


Patch Set 14: Code-Review+1

Rebased, carry Lars' +1. Lars are you comfortable with bringing this to a +2?


--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 19:12:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8267

to look at the new patch set (#14).

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..

IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

Switch the decoders to using more batch-oriented interfaces. As an
intermediate step this doesn't make the interfaces of LevelDecoder
or DictDecoder batch-oriented, only the lower-level utility classes.

The next step would be to change those interfaces to be batch-oriented
and make according optimisations in parquet. This could deliver much
larger perf improvements than the current patch.

The high-level changes are.
* BitReader -> BatchedBitReader, which is built to unpack runs of 32
  bit-packed values efficiently.
* RleDecoder -> RleBatchDecoder, which exposes the repeated and literal
  runs to the caller and uses BatchedBitReader to unpack literal runs
  efficiently.
* Dict decoding uses RleBatchDecoder to decode repeated runs efficiently
  and uses the BitPacking utilities to unpack and encode in a single
  step.

Also removes an older benchmark that isn't too interesting (since
the batch-oriented approach to encoding and decoding is so much
faster than the value-by-value approach).

Testing:
* Ran core tests.
* Updated unit tests to exercise new code.
* Added test coverage for the deprecated bit-packed level encoding to
  that it still works (there was no coverage previously).

Perf:
Single-node benchmarks showed a few % performance gain. 16 node cluster
benchmarks only showed a gain for TPC-H nested.

Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
---
M be/src/benchmarks/CMakeLists.txt
M be/src/benchmarks/bit-packing-benchmark.cc
D be/src/benchmarks/rle-benchmark.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
D be/src/experiments/bit-stream-utils.8byte.h
D be/src/experiments/bit-stream-utils.8byte.inline.h
M be/src/util/bit-packing.h
M be/src/util/bit-packing.inline.h
M be/src/util/bit-stream-utils.h
M be/src/util/bit-stream-utils.inline.h
M be/src/util/dict-encoding.h
M be/src/util/dict-test.cc
M be/src/util/parquet-reader.cc
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
M testdata/data/README
A testdata/data/alltypes_agg_bitpacked_def_levels.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
20 files changed, 1,251 insertions(+), 966 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/8267/14
--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8267 )

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8267/12/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/8267/12/be/src/exec/parquet-column-readers.h@96
PS12, Line 96: for different encodings.
> what do we mean by that? Isn't this just for RLE?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:57:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8408 )

Change subject: IMPALA-6121: remove I/O mgr request context cache
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h
File be/src/runtime/disk-io-mgr-reader-context.h:

PS7:
I assume this was a wholesale move of the class without modifications, except 
for the new methods defined in the .cc file?


http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@662
PS7, Line 662:   void UnregisterContext(DiskIoRequestContext* context);
Given that the context cannot be used after this call, should we move the 
unique_ptr into this call and delete it then?



--
To view, visit http://gerrit.cloudera.org:8080/8408
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108
Gerrit-Change-Number: 8408
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:56:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-11-07 Thread Tim Armstrong (Code Review)
Hello Lars Volker, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8267

to look at the new patch set (#13).

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..

IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

Switch the decoders to using more batch-oriented interfaces. As an
intermediate step this doesn't make the interfaces of LevelDecoder
or DictDecoder batch-oriented, only the lower-level utility classes.

The next step would be to change those interfaces to be batch-oriented
and make according optimisations in parquet. This could deliver much
larger perf improvements than the current patch.

The high-level changes are.
* BitReader -> BatchedBitReader, which is built to unpack runs of 32
  bit-packed values efficiently.
* RleDecoder -> RleBatchDecoder, which exposes the repeated and literal
  runs to the caller and uses BatchedBitReader to unpack literal runs
  efficiently.
* Dict decoding uses RleBatchDecoder to decode repeated runs efficiently
  and uses the BitPacking utilities to unpack and encode in a single
  step.

Also removes an older benchmark that isn't too interesting (since
the batch-oriented approach to encoding and decoding is so much
faster than the value-by-value approach).

Testing:
* Ran core tests.
* Updated unit tests to exercise new code.
* Added test coverage for the deprecated bit-packed level encoding to
  that it still works (there was no coverage previously).

Perf:
Single-node benchmarks showed a few % performance gain. 16 node cluster
benchmarks only showed a gain for TPC-H nested.

Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
---
M be/src/benchmarks/CMakeLists.txt
M be/src/benchmarks/bit-packing-benchmark.cc
D be/src/benchmarks/rle-benchmark.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
D be/src/experiments/bit-stream-utils.8byte.h
D be/src/experiments/bit-stream-utils.8byte.inline.h
M be/src/util/bit-packing.h
M be/src/util/bit-packing.inline.h
M be/src/util/bit-stream-utils.h
M be/src/util/bit-stream-utils.inline.h
M be/src/util/dict-encoding.h
M be/src/util/dict-test.cc
M be/src/util/parquet-reader.cc
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
M testdata/data/README
A testdata/data/alltypes_agg_bitpacked_def_levels.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
20 files changed, 1,252 insertions(+), 966 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/8267/13
--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1441/


--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:49:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8428 )

Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable
..


Patch Set 4: Code-Review+2

Thanks Zoltan - I really appreciate the cleanup.


--
To view, visit http://gerrit.cloudera.org:8080/8428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9
Gerrit-Change-Number: 8428
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:49:26 +
Gerrit-HasComments: No


[native-toolchain-CR] Bump Kudu version to 1520b39

2017-11-07 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8491 )

Change subject: Bump Kudu version to 1520b39
..

Bump Kudu version to 1520b39

Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08
---
M buildall.sh
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Thomas Tauber-Marshall: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8491
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08
Gerrit-Change-Number: 8491
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Bump Kudu version to 1520b39

2017-11-07 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8491 )

Change subject: Bump Kudu version to 1520b39
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8491
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08
Gerrit-Change-Number: 8491
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:38:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option

2017-11-07 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8490 )

Change subject: IMPALA-2248: Make idle_session_timeout a query option
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/client-request-state.cc@205
PS3, Line 205:   int32_t requested_timeout = atoi(value.c_str());
Could you extract the body of the 'if' to a new function in order to keep this 
part more verbose?


http://gerrit.cloudera.org:8080/#/c/8490/3/fe/src/test/java/org/apache/impala/util/Metrics.java
File fe/src/test/java/org/apache/impala/util/Metrics.java:

http://gerrit.cloudera.org:8080/#/c/8490/3/fe/src/test/java/org/apache/impala/util/Metrics.java@26
PS3, Line 26: public class Metrics {
Could you please write a short comment to describe the purpose of this class?



--
To view, visit http://gerrit.cloudera.org:8080/8490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e
Gerrit-Change-Number: 8490
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:36:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 8:

> It's still technically accurate

Yeah, I guess that's true.


--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:34:06 +
Gerrit-HasComments: No


[native-toolchain-CR] Bump Kudu version to 1520b39

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8491 )

Change subject: Bump Kudu version to 1520b39
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8491
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08
Gerrit-Change-Number: 8491
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:33:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 8:

It's still technically accurate in that the queries consume some amount of 
memory, especially if the result cache is used. Could we tweak the wording 
somewhat, e.g. "To free any remaining resources they are using..."


--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:32:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1440/


--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:24:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 6: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:24:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc@953
PS5, Line 953:   ImpaladMetrics::NUM_QUERIES_REGISTERED->Increment(1L);
> On the /queries webpage, I think "num_in_flight_queries" is this same numbe
Oh, I see, I didn't look at the queries json. The queries page seems internally 
inconsistent - num_in_flight_queries includes "queries in flight" and "queries 
waiting to be closed". But yeah, I deliberately avoided using the term "in 
flight" because it's unclear whether it should include queries that are waiting 
to be closed.

I don't like inventing a new term but this does seem less ambiguous going 
forward.

It would be nice to re-evaluate all the names but probably best to do that all 
at once instead of piecemeal, since it will require considering impact on other 
tools, like you mentioned.



--
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:24:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8202 )

Change subject: IMPALA-4704: Turns on client connections when local catalog 
initialized.
..


Patch Set 20:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h@89
PS19, Line 89: must call the following
 : /// methods:
> good question-- I've kept it the same as before so not sure what the reason
That DCHECK:

  DCHECK(exec_env.process_mem_tracker() != nullptr)
  << "ExecEnv::StartServices() must be called before starting RPC services";

looks stale since the process_mem_tracker() is set in exec_env_.Init() which is 
called earlier in main (not to mention StartServices() doesn't exist anymore). 
I also don't see what the specific dependency is that this is trying to point 
out (this DCHECK should have been closer to the code that requires this to be 
true).

If you want to keep the dcheck, I see no reason it couldn't be moved before 
both Init and Start. Then we could combine them, since there doesn't appear to 
be any logic behind the separation of those two.  But you can defer cleaning 
this up if you prefer (though the dcheck wording should be updated at least).


http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc@1936
PS20, Line 1936:
I think a quick comment is worth it to highlight the dependency between these 
two lines:
// Subscribe to the catalog topic and then wait for the initial catalog update.
(or whatever is accurate)



--
To view, visit http://gerrit.cloudera.org:8080/8202
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Gerrit-Change-Number: 8202
Gerrit-PatchSet: 20
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 07 Nov 2017 18:04:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8461 )

Change subject: IMPALA-6151: add query-level fragment/backend counters
..


Patch Set 5: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc@953
PS5, Line 953:   ImpaladMetrics::NUM_QUERIES_REGISTERED->Increment(1L);
On the /queries webpage, I think "num_in_flight_queries" is this same number 
(the count of client_request_state_map_ entries). Should we make the names 
consistent? Probably "registered" is a better name than "in-flight" (since 
internally, we use in-flight to mean something slightly different and in-flight 
seems more ambiguous). But I'm not sure if we can change the webserver names 
without breaking something.



-- 
To view, visit http://gerrit.cloudera.org:8080/8461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9
Gerrit-Change-Number: 8461
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 17:32:27 +
Gerrit-HasComments: Yes


[native-toolchain-CR] Bump Kudu version to 1520b39

2017-11-07 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8491 )

Change subject: Bump Kudu version to 1520b39
..


Patch Set 1:

http://unittest.jenkins.cloudera.com/job/verify-impala-toolchain-package-build/482/


--
To view, visit http://gerrit.cloudera.org:8080/8491
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08
Gerrit-Change-Number: 8491
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 07 Nov 2017 17:32:26 +
Gerrit-HasComments: No


[native-toolchain-CR] Bump Kudu version to 1520b39

2017-11-07 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8491


Change subject: Bump Kudu version to 1520b39
..

Bump Kudu version to 1520b39

Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08
---
M buildall.sh
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/91/8491/1
--
To view, visit http://gerrit.cloudera.org:8080/8491
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08
Gerrit-Change-Number: 8491
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-6148: Specifying thirdparty deps as URLs

2017-11-07 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8456 )

Change subject: IMPALA-6148: Specifying thirdparty deps as URLs
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8456/2/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/8456/2/bin/bootstrap_toolchain.py@112
PS2, Line 112:   return v
> Why not cache v instead?
Also, would it be appropriate to cache the value as an attribute of Package() 
instead?



--
To view, visit http://gerrit.cloudera.org:8080/8456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4628d86022d4bd8b762313f7056d76416a58b422
Gerrit-Change-Number: 8456
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 07 Nov 2017 17:29:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 8:

Maybe this tooltip should be updated?

  document->AddMember("waiting-tooltip", "These queries are no longer 
executing, either "
  "because they encountered an error or because they have returned all of 
their "
  "results, but they are still active so that their results can be 
inspected. To "
  "free the resources they are using, they must be closed.",


--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Tue, 07 Nov 2017 17:28:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options

2017-11-07 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8401 )

Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Gerrit-Change-Number: 8401
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 07 Nov 2017 17:05:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option

2017-11-07 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8490 )

Change subject: IMPALA-2248: Make idle_session_timeout a query option
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/client-request-state.cc@204
PS3, Line 204: if (iequals(key, "idle_session_timeout")) {
Its unfortunate this is special cased here. Could you add a comment explaining 
why that's necessary.


http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/client-request-state.cc@213
PS3, Line 213: key, value,
 :   _->set_query_options,
single line


http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/impala-server.cc@189
PS3, Line 189: DEFINE_int32(idle_session_timeout, 0, "The time, in seconds, 
that a session may be idle"
Note how this interacts with the query option.


http://gerrit.cloudera.org:8080/#/c/8490/3/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/8490/3/common/thrift/ImpalaService.thrift@292
PS3, Line 292:   // The time, in seconds, that a session may be idle for before 
it is closed (and all
Note how this interacts with the flag.



--
To view, visit http://gerrit.cloudera.org:8080/8490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e
Gerrit-Change-Number: 8490
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 17:03:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6148: Specifying thirdparty deps as URLs

2017-11-07 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8456 )

Change subject: IMPALA-6148: Specifying thirdparty deps as URLs
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8456/2/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/8456/2/bin/bootstrap_toolchain.py@112
PS2, Line 112:   return v
Why not cache v instead?



--
To view, visit http://gerrit.cloudera.org:8080/8456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4628d86022d4bd8b762313f7056d76416a58b422
Gerrit-Change-Number: 8456
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 07 Nov 2017 16:45:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options

2017-11-07 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8401 )

Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options
..


Patch Set 3: Code-Review+1

Sailesh, if you are happy with this, want to +2 it? Thanks!


--
To view, visit http://gerrit.cloudera.org:8080/8401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e
Gerrit-Change-Number: 8401
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 07 Nov 2017 16:34:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-07 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.h@516
PS7, Line 516:   /// Given a query option name this function gets the option 
level for it to use when
 :   /// displaying from Impala shell and adds the level to the 
ConfigVariable parameter
 :   /// in numeric format. For values see TQueryOptionLevel enum.
nit: For me, this comment is too complex and tells too much implementation 
details. How about something like this:
This function sets the option level for parameter 'option' based on the mapping 
stored in 'query_option_levels_'.
The option level is used by the Impala shell when it displays the options.


http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.cc@1228
PS7, Line 1228: const string& option_key
Why do we pass option_key here? It is already there in config, isn't it?


http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py@620
PS7, Line 620: self.imp_client.default_query_options,
 :   self.set_query_options,
 :   self.imp_client.query_option_levels
These members are already available in _print_with_set through self. And as far 
as I can see the same members are passed on every callsites.
I think it would be easier to read if _print_with_set didn't take these params.



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 7
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 16:25:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6105: Clarify argument order in single node perf run

2017-11-07 Thread Jim Apple (Code Review)
Jim Apple has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8470 )

Change subject: IMPALA-6105: Clarify argument order in single_node_perf_run
..

IMPALA-6105: Clarify argument order in single_node_perf_run

single_node_perf_run.py uses git_hash_A vs. git_hash_B, distinguish
them by their position in the command-line
arguments. single_node_perf_run.py calls report_benchmark_results.py,
which uses the "reference vs. input", distinguished by their
command-line flags. The output of report_benchmark_results.py uses
"{empty string} vs Base".

In the long run, I think it would be better to fix all three to use
the same terminology, but this comment hopefully adds clarity.

Change-Id: Ib236ce7e83dc193ef1382f630ce58759a639
Reviewed-on: http://gerrit.cloudera.org:8080/8470
Tested-by: Impala Public Jenkins
Reviewed-by: Jim Apple 
---
M bin/single_node_perf_run.py
1 file changed, 22 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Jim Apple: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/8470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib236ce7e83dc193ef1382f630ce58759a639
Gerrit-Change-Number: 8470
Gerrit-PatchSet: 3
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6105: Clarify argument order in single node perf run

2017-11-07 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8470 )

Change subject: IMPALA-6105: Clarify argument order in single_node_perf_run
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib236ce7e83dc193ef1382f630ce58759a639
Gerrit-Change-Number: 8470
Gerrit-PatchSet: 2
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 16:16:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC

2017-11-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8023 )

Change subject: IMPALA-4856: Port data stream service to KRPC
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc
File be/src/runtime/krpc-data-stream-recvr.cc:

http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@211
PS8, Line 211: deferred_rpcs_.empty() && batch_queue_.empty()
> why not write this condition as:
Actually, I think the loop exiting condition is not quite right, which led to 
this confusin conditional.  The loop exiting condition for "we're done" should 
check that there are no more senders, and that there's nothing left to drain 
from the deferred_rpcs_ queue and that there's no pending insert into 
batch_queue_.

So, the third wait loop condition should be something like:

(num_remaining_senders > 0 || !deferred_rpcs_.empty() || num_pending_enqueue_ > 
0)

and then this if-stmt conditional can just be:
if (batch_queue_.empty())

and then the DCHECK can be the negation of that third wait loop conditional:

// Wait loop is exited with an empty batch_queue_ only when there will be no 
more batches.
DCHECK(num_remaining_senders == 0 && deferred_rpcs_.empty() && 
num_pending_enqueue_ == 0);

And then you can get rid of the outer loop. That outer loop should be removed 
since it's effectively a busy wait (and I think we could get into a busy wait 
state in the previous patchsets).



--
To view, visit http://gerrit.cloudera.org:8080/8023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1
Gerrit-Change-Number: 8023
Gerrit-PatchSet: 8
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 07 Nov 2017 15:58:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option

2017-11-07 Thread Zoltan Borok-Nagy (Code Review)
Hello Michael Ho, Laszlo Gaal, Gabor Kaszab, Attila Jeges, Tim Armstrong, Csaba 
Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8490

to look at the new patch set (#3).

Change subject: IMPALA-2248: Make idle_session_timeout a query option
..

IMPALA-2248: Make idle_session_timeout a query option

This commit makes idle_session_timeout a query option.
idle_session_timeout currently can be set as a command line
option, which will be the default timeout for sessions.
HS2 sessions can override it with a smaller value by setting
it in the configuration overlay of HS2 OpenSession().

However, we can't override idle_session_timeout for JDBC/ODBC
connections, because we cannot put this in the connection string.

This commit is a workaround for this problem, it allows JDBC/ODBC
connections to set the session timeout as a query option
with the SET statement.

I created an automated test case in JdbcTest.java based on
test_hs2.py::test_concurrent_session_mixed_idle_timeout.

Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e
---
M be/src/service/client-request-state.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/test/java/org/apache/impala/service/JdbcTest.java
A fe/src/test/java/org/apache/impala/util/Metrics.java
10 files changed, 236 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/8490/3
--
To view, visit http://gerrit.cloudera.org:8080/8490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e
Gerrit-Change-Number: 8490
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-11-07 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..


Patch Set 11:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/7793/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/7793/9//COMMIT_MSG@27
PS9, Line 27:
> Thanks for explaining.
https://docs.google.com/document/d/1G-SPZelateebNxzVb67urEVtjc5Itw-B_jjfS85bSCE/edit?usp=sharing


http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/exec/filter-context.h
File be/src/exec/filter-context.h:

http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/exec/filter-context.h@118
PS11, Line 118: Materialize filter values.
> what does that mean?
Done


http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/runtime/runtime-filter-bank.h
File be/src/runtime/runtime-filter-bank.h:

http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/runtime/runtime-filter-bank.h@78
PS11, Line 78:   /// may both be NULL, representing a filter that allows all 
rows to pass.
> is it the case that at most one of bloom_filter and min_max_filter should b
Done


http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/util/min-max-filter.h@62
PS11, Line 62: Materialize filter values
> what does that mean?
Done


http://gerrit.cloudera.org:8080/#/c/7793/9/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

http://gerrit.cloudera.org:8080/#/c/7793/9/common/thrift/PlanNodes.thrift@103
PS9, Line 103:   12: optional string kudu_col_name
> case sensitive?
Done


http://gerrit.cloudera.org:8080/#/c/7793/11/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/7793/11/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@359
PS11, Line 359: public Operator getJoinOp() { return exprCmpOp_; }
> getExprCmpOp()
Done


http://gerrit.cloudera.org:8080/#/c/7793/11/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@602
PS11, Line 602: if (!(targetExpr instanceof SlotRef)
> I think only explicit casts are problematic. Implicit casts should be ok, o
Right, we should be able to support integer implicit casts.

The complication is that if, say, the calculated max is outside of the range 
for the type of the targeted column, we can't just pass that value into Kudu as 
it will complain.

In that case, we would need to convert the max we send to be the max for the 
type of the targeted column. I've added some code to the BE to do this 
conversion.


http://gerrit.cloudera.org:8080/#/c/7793/11/testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test
File testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test:

http://gerrit.cloudera.org:8080/#/c/7793/11/testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test@144
PS11, Line 144: on a.month = cast(b.month + 1 as int);
> Was this your change? Why the change?
This was necessary if we didn't support implicit casts on the target. Removed 
now.



--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 11
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 07 Nov 2017 15:06:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

2017-11-07 Thread Thomas Tauber-Marshall (Code Review)
Hello Michael Ho, Lars Volker, Matthew Jacobs, Anonymous Coward #345, Tim 
Armstrong, Todd Lipcon, Mostafa Mokhtar, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/7793

to look at the new patch set (#12).

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
..

IMPALA-4252: Min-max runtime filters for Kudu

This patch implements min-max filters for runtime filters. Each
runtime filter generates a bloom filter or a min-max filter,
depending on if it has HDFS or Kudu targets, respectively.

In RuntimeFilterGenerator in the planner, each hash join node
generates a bloom and min-max filter for each equi-join predicate, but
only those filters that end up being assigned to a target make it into
the final plan.

Min-max filters are only assigned to Kudu scans if the target expr is
a column, as Kudu doesn't support bounds on general exprs, and only if
the join op is '=' and not 'is distinct from', as Kudu doesn't support
returning NULLs if a bound is set.

Min-max filters are inserted into by the PartitionedHashJoinBuilder.
Codegen is used to eliminate branching on the type of filter. String
min-max filters truncate their bounds at 1024 chars, so that the max
amount of memory used by min-max filters is negligible.

For now, min-max filters are only applied at the KuduScanner, which
passes them into the Kudu client.

Future work will address applying min-max filters at HDFS scan nodes
and applying bloom filters at Kudu scan nodes.

Functional Testing:
- Added new planner tests and updated the old ones. (in old tests, a
  lot of runtime filters are renumbered as we always generate min-max
  filters even if they don't end up getting assigned and they take up
  some of the RF ids).
- Updated existing runtime filter tests to work with Kudu.
- Added e2e tests for min-max filter specific functionality.

Perf Testing:
- All tests run on Kudu stress cluster (10 nodes) and tpch_100_kudu,
  timings are averages of 3 runs.
- Ran a contrived query with a filter that does not eliminate any rows
  (full self join of lineitem). The difference in running time was
  negligible - 24.46s with filters on, 24.15s with filters off for
  a ~1% slowdown.
- Ran a contrived query with a filter that elimiates all rows (self
  join on lineitem with a join condition that never matches). The
  filters resulted in a significant speedup - 0.26s with filters on,
  1.46s with filters off for a ~5.6x speedup. This query is added to
  targeted-perf.

Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-scan-node-mt.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-scanner.h
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/scan-node.cc
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/runtime/runtime-filter-ir.cc
M be/src/runtime/runtime-filter.cc
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-filter.inline.h
M be/src/runtime/timestamp-value.h
M be/src/service/impala-internal-service.cc
M be/src/util/CMakeLists.txt
A be/src/util/min-max-filter-ir.cc
A be/src/util/min-max-filter-test.cc
A be/src/util/min-max-filter.cc
A be/src/util/min-max-filter.h
M common/thrift/Data.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 

[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-07 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py@240
PS7, Line 240:   if len(advanced_options) > 0 and print_mode == 
QueryOptionDisplayModes.ALL_OPTIONS:
 : print '\nAdvanced Query Options:'
 : self._print_option_group(advanced_options, set_options)
 :   if len(deprecated_options) > 0 and print_mode == 
QueryOptionDisplayModes.ALL_OPTIONS:
 : print '\nDeprecated Query Options:'
 : self._print_option_group(deprecated_options, set_options)
nit: this can be simplified as:

if print_mode == QueryOptionDisplayModes.ALL_OPTIONS:
  if advanced_options:
print '\nAdvanced Query Options:'
self._print_option_group(advanced_options, set_options)
  if deprecated_options:
print '\nDeprecated Query Options:'
self._print_option_group(deprecated_options, set_options)



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 7
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 07 Nov 2017 15:01:23 +
Gerrit-HasComments: Yes


  1   2   >