[Impala-ASF-CR] IMPALA-4259: build Impala without any test cluster setup.

2016-10-12 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4259: build Impala without any test cluster setup.
..


Patch Set 7: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4685
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..


Patch Set 4:

Rebase onto Alex's change.

-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches
..


Patch Set 8: Code-Review+1

Carry

-- 
To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#4).

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..

IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

This change enables codegen for all builtin aggregate functions,
e.g. timestamp functions and group_concat.

There are several parts to the change:
* Adding support for generic UDAs. Previous the codegen code did not
  handle multiple input arguments or NULL return values.
* Defaulting to using the UDA interface when there is not a special
  codegen path (we have implementations of all builtin aggregate
  functions for the interpreted path).
* Remove all the logic to disable codegen for the special cases that now
  are supported.

Also fix the generation of code to get/set NULL bits since I needed
to add functionality there anyway.

Testing:
Add tests that check that codegen was enabled for builtin aggregate
functions. Also fix some gaps in the preexisting tests.

Also add tests for UDAs that check input/output nulls are handled
correctly, in anticipation of enabling codegen for arbitrary UDAs.

Perf:
Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1,
since my original approach regressed it ~5%. In the end the problem was
to do with the ordering of loads/stores to the slot and null bit in the
generated code: the previous version of the code exploited some
properties of the particular aggregate function. I ended up replicating
this behaviour to avoid regressing perf.

Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/aggregation-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/tuple.cc
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M testdata/workloads/functional-query/queries/QueryTest/uda.test
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_udfs.py
37 files changed, 744 insertions(+), 484 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..

IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

This change enables codegen for all builtin aggregate functions,
e.g. timestamp functions and group_concat.

There are several parts to the change:
* Adding support for generic UDAs. Previous the codegen code did not
  handle multiple input arguments or NULL return values.
* Defaulting to using the UDA interface when there is not a special
  codegen path (we have implementations of all builtin aggregate
  functions for the interpreted path).
* Remove all the logic to disable codegen for the special cases that now
  are supported.

Also fix the generation of code to get/set NULL bits since I needed
to add functionality there anyway.

Testing:
Add tests that check that codegen was enabled for builtin aggregate
functions. Also fix some gaps in the preexisting tests.

Also add tests for UDAs that check input/output nulls are handled
correctly, in anticipation of enabling codegen for arbitrary UDAs.

Perf:
Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1,
since my original approach regressed it ~5%. In the end the problem was
to do with the ordering of loads/stores to the slot and null bit in the
generated code: the previous version of the code exploited some
properties of the particular aggregate function. I ended up replicating
this behaviour to avoid regressing perf.

Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/aggregation-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/tuple.cc
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java
M testdata/workloads/functional-query/queries/QueryTest/uda.test
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_udfs.py
38 files changed, 748 insertions(+), 486 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4123: Fast bit unpacking

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4123: Fast bit unpacking
..


Patch Set 8:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/4494/8//COMMIT_MSG
Commit Message:

PS8, Line 17: 64
> 32, now?
Done


PS8, Line 18: ors at 64-bit boundaries
> What does it mean to have an or at a k-bit boundary?
Reworded to be hopefully clearer.


http://gerrit.cloudera.org:8080/#/c/4494/8/be/src/benchmarks/bit-packing-benchmark.cc
File be/src/benchmarks/bit-packing-benchmark.cc:

Line 273: #include "common/names.h"
> This isn't needed if using namespace std;, yes?
I think I prefer keeping this and removing the "using namespace" declarations 
below.


http://gerrit.cloudera.org:8080/#/c/4494/8/be/src/benchmarks/bswap-benchmark.cc
File be/src/benchmarks/bswap-benchmark.cc:

Line 123
> Thanks for fixing this. When this is committed, can you file a bug against 
Sure, will do.


http://gerrit.cloudera.org:8080/#/c/4494/8/be/src/testutil/mem-util.h
File be/src/testutil/mem-util.h:

PS8, Line 31: posix_memalign
> #include stdlib.h. I'd say cstdlib, but posix_memalign doesn't seem to be p
Added cstdlib. I think in practice it's reasonable to assume that cstdlib 
includes the same things as stdlib.h in some form.

The glibc one from the toolchain literally has #include  in it.


Line 45:  private:
> DISALLOW_COPY_AND_ASSIGN
Done


http://gerrit.cloudera.org:8080/#/c/4494/8/be/src/util/bit-packing-test.cc
File be/src/util/bit-packing-test.cc:

Line 39: void UnpackSubset(const uint32_t* in, const uint8_t* packed, int 
num_in_values,
> I think it would aid clarity to add a short description of what the meaning
Added descriptions and elaborated a little on what the function is doing.


PS8, Line 45: uint32_t
> const uint32_t * in
Done


PS8, Line 72: INFO
> Was this just for debugging, or did you mean to leave it in?
Didn't mean to leave it in - removed.


PS8, Line 84: 8
> Why 8 and not 4?
Used CHAR_BIT and added an intermediate variable to make it a bit clearer.


http://gerrit.cloudera.org:8080/#/c/4494/8/be/src/util/bit-packing.inline.h
File be/src/util/bit-packing.inline.h:

PS8, Line 64: //LOG(INFO) << "bit_width " << BIT_WIDTH << "\n"
:   //  << "in_bytes " << in_bytes << " num_values " << num_values 
<< " max_input_values " << max_input_values << "\n"
:// << "values_to_read " << values_to_read << " batches_to_read 
" << batches_to_read << "\n"
:// << "remainder_values " << remainder_values;
> remove
Oops, sorry missed this.


-- 
To view, visit http://gerrit.cloudera.org:8080/4494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4123: Fast bit unpacking

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#9).

Change subject: IMPALA-4123: Fast bit unpacking
..

IMPALA-4123: Fast bit unpacking

Adds utility functions for fast unpacking of batches of bit-packed
values. These support reading batches of any number of values provided
that the start of the batch is aligned to a byte boundary. Callers that
want to read smaller batches that don't align to byte boundaries will
need to implement their own buffering.

The unpacking code uses only portable C++ and no SIMD intrinsics, but is
fairly efficient because unpacking a full batch of 32 values compiles
down to 32-bit loads, shifts by constants, masks by constants, bitwise
ors when a value straddles 32-bit words and stores. Further speedups
should be possible using SIMD intrinsics.

Testing:
Added unit tests for unpacking, exhaustively covering different
bitwidths with additional test dimensions (memory alignment, various
input sizes, etc).

Tested under ASAN to ensure the bit unpacking doesn't read past the end
of buffers.

Perf:
Added microbenchmark that shows on average an 8-9x speedup over the
existing BitReader code.

Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/bit-packing-benchmark.cc
M be/src/benchmarks/bswap-benchmark.cc
A be/src/testutil/mem-util.h
M be/src/util/CMakeLists.txt
A be/src/util/bit-packing-test.cc
A be/src/util/bit-packing.h
A be/src/util/bit-packing.inline.h
M be/src/util/bit-stream-utils.h
M be/src/util/bit-stream-utils.inline.h
10 files changed, 883 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/4494/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-12 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..


Patch Set 5:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4678/5//COMMIT_MSG
Commit Message:

PS5, Line 10: use
Would "mount" be an appropriate verb. If so, I think it's more descriptive.


PS5, Line 19: need the
"need to move the"?


PS5, Line 21: rsync
Why rsync versus, say, just cp or mv?


http://gerrit.cloudera.org:8080/#/c/4678/5/tests/comparison/leopard/README
File tests/comparison/leopard/README:

PS5, Line 17: Basic Configuration
It's a nit, but it might be nicer if this line, and "External Volume 
Configuration" below, looked a little more like headers.


PS5, Line 43: use
"mount external Docker volumes that contain the necessary testdata"?


PS5, Line 54: path on TARGET_HOST where the
: external volume will reside
This is just a directory, right?


http://gerrit.cloudera.org:8080/#/c/4678/5/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

Line 298: if os.environ.get('KUDU_IS_SUPPORTED') == 'true':
Does it makes sense to reference KUDU-1419 here, or at least the README, to 
explain why the extra work is being done?


Line 308: 'rsync -e "ssh -i {priv_key} -o StrictHostKeyChecking=no '
I'm just curious -- how long does this process take?


-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3002/1473: Cardinality observability cleanup

2016-10-12 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#2).

Change subject: IMPALA-3002/1473: Cardinality observability cleanup
..

IMPALA-3002/1473: Cardinality observability cleanup

IMPALA-3002:
The shell prints an incorrect value for '#Rows' in the exec
summary for broadcast nodes due to incorrect logic around
whether to use max or agg stats. This patch makes the behavior
consistent with the way the be treats exec summaries in
summary-util.cc

IMPALA-1473:
When there is a merging exchange with a limit, we may copy rows
into the output batch beyond the limit. In this case, we currently
update the output batch's size to reflect the limit, but we also
need to update ExecNode::num_rows_returned_ or the exec summary
may show that the exchange node returned more rows than it really
did.

Additionally, PlanFragmentExecutor::GetNext does not update
rows_produced_counter_ in some cases, leading the runtime profile
to display an incorrect value for 'RowsProduced'.

Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
---
M be/src/exec/exchange-node.cc
M be/src/runtime/plan-fragment-executor.cc
M shell/impala_client.py
A tests/query_test/test_observability.py
4 files changed, 45 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/4679/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4679
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment

2016-10-12 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment
..


Patch Set 4:

Tim, I added you in case you wanted to talk about how this might work with 
Ubuntu 16.04

-- 
To view, visit http://gerrit.cloudera.org:8080/4674
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If166a8a286d7559af547da39f6cc09e723f34c7e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4259: build Impala without any test cluster setup.

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4259: build Impala without any test cluster setup.
..


Patch Set 7: Code-Review+2

The merge failed because the merge jump runs buildall.sh from a different 
directory. Calling make without changing directories broke that.

-- 
To view, visit http://gerrit.cloudera.org:8080/4685
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()
..


Patch Set 3:

(8 comments)

Looking good, just some small things then I'll +1.

http://gerrit.cloudera.org:8080/#/c/4651/1/be/src/exec/aggregation-node.cc
File be/src/exec/aggregation-node.cc:

Line 174: void AggregationNode::Codegen(RuntimeState* state) {
> I thought about something similar but this has the side effect of putting "
Ah, I didn't think about that. This works ok for now then.


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exec/aggregation-node.cc
File be/src/exec/aggregation-node.cc:

Line 169: runtime_profile()->AddCodegenMsg(false, "disabled by query option 
DISABLE_CODEGEN");
Thanks, I think this will be clearer.

It's still a little weird that we don't print anything when expr codegen is 
disabled (since that can make a bit perf difference), but I guess that will be 
fixed naturally by a follow-up patch.


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exec/hash-join-node.cc
File be/src/exec/hash-join-node.cc:

Line 172:   Function* codegen_process_probe_batch_fn = 
CodegenProcessProbeBatch(codegen, hash_fn);
Long line.


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 622: return Status("Disabled by query option.");
Dead code I think.


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

Line 153: Status HdfsScanNodeBase::Prepare(RuntimeState* state) {
It doesn't look like we add the codegen disabled message hre.


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

Line 700: return Status("Disabled by query option.");
Dead code?


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

Line 562: #define ARG_DECL_LIST(n) \
Thanks for doing this.

Can you add a comment just giving an example of what the macros expand to?


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

Line 104:   state.CreateCodegen(), env, JniUtil::internal_exc_class(), 
result_bytes);
Tab?


-- 
To view, visit http://gerrit.cloudera.org:8080/4651
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I207566bc9f4c6a159271ecdbc4bbdba3d78c6651
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-12 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4678/4/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

PS4, Line 23: normpath
> Perhaps parentheses around these. It took me a little bit to parse this.
Done


PS4, Line 55: os.path.sep + join_path('home', DOCKER_USER_NAME, 'Impala', 
'testdata', 'cluster')
> This is a bit long, and make it hard to read. Maybe it could be saved as DE
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-12 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new patch set (#5).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to use external Docker volumes as a workaround to KUDU-1419. This
patch introduces a series of environment variables a user may tweak in
order to help with that purpose. The patch assumes a viable, reasonable
Docker container based on a standard Linux distribution like Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need the testdata off the container
onto the host running Docker Engine. To do that we suggest a strategy
using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 198 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4259: build Impala without any test cluster setup.

2016-10-12 Thread Tim Armstrong (Code Review)
Hello David Knupp,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4685

to look at the new patch set (#6).

Change subject: IMPALA-4259: build Impala without any test cluster setup.
..

IMPALA-4259: build Impala without any test cluster setup.

The main outcome of this change is to avoid making unnecessary
modification to the Impala or other source trees when we don't need the
test cluster.

To achieve that, this refactors the script to make the flow easier
to understand and makes it more consistent which build steps are
executed in which modes.

Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137
---
M bin/clean.sh
M buildall.sh
M ext-data-source/CMakeLists.txt
M fe/CMakeLists.txt
4 files changed, 179 insertions(+), 163 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/4685/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4685
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4259: build Impala without any test cluster setup.

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4259: build Impala without any test cluster setup.
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4685/5/buildall.sh
File buildall.sh:

Line 263:   NEED_TEST_CLUSTER=1
> NEED_MINICLUSTER?
Done


Line 316:   pushd "${IMPALA_FE_DIR}"
> I'm a little confused by having a "fe" cmake target that we are using in bu
Done. Also made the fe/ext-data-source targets use mvn-quiet.


-- 
To view, visit http://gerrit.cloudera.org:8080/4685
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3718: Add test cancellation tests for Kudu

2016-10-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3718: Add test_cancellation tests for Kudu
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4700/1/tests/query_test/test_cancellation.py
File tests/query_test/test_cancellation.py:

Line 94:   pytest.skip()
What happens if we don't skip this? Are those edge cases potentially 
interesting?


-- 
To view, visit http://gerrit.cloudera.org:8080/4700
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icf3d3853e7075991f6d12f125407ebdbe6a287e2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4259: build Impala without any test cluster setup.

2016-10-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4259: build Impala without any test cluster setup.
..


Patch Set 5:

(2 comments)

So much better

http://gerrit.cloudera.org:8080/#/c/4685/5/buildall.sh
File buildall.sh:

Line 263:   NEED_TEST_CLUSTER=1
NEED_MINICLUSTER?


Line 316:   pushd "${IMPALA_FE_DIR}"
I'm a little confused by having a "fe" cmake target that we are using in 
build_fe() but not here. Looks like building the fe with make would take care 
of building its dependencies also, so maybe we can simplify the flow here?


-- 
To view, visit http://gerrit.cloudera.org:8080/4685
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR](hadoop-next) IMPALA-4277: temporary hack to avoid compile errors for HDFS block location API

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4701

Change subject: IMPALA-4277: temporary hack to avoid compile errors for HDFS 
block location API
..

IMPALA-4277: temporary hack to avoid compile errors for HDFS block location API

This is a temporary workaround to get Impala building against HDFS 3.0
that can be undone once IMPALA-4172 is committed.

This builds if I put together a directory with the minimum files required to
build the backend against HDFS:

hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libgplcompression.a
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libhdfs.so.0.0.0
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libgplcompression.so
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libgplcompression.so.0.0.0
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libhdfs.so
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libgplcompression.la
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libhdfs.a
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libgplcompression.so.0
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/include/hdfs.h

Change-Id: Ice765b1be62c0c3d1b99282d8c652dd35d3d766d
---
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
3 files changed, 26 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/4701/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4701
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ice765b1be62c0c3d1b99282d8c652dd35d3d766d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: hadoop-next
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] WIP: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-12 Thread Todd Lipcon (Code Review)
Todd Lipcon has uploaded a new patch set (#2).

Change subject: WIP: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..

WIP: Use AUTO_FLUSH_BACKGROUND for Kudu sink

Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
---
M be/src/exec/kudu-table-sink-test.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
7 files changed, 73 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/4670/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-3718: Add test cancellation tests for Kudu

2016-10-12 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4700

Change subject: IMPALA-3718: Add test_cancellation tests for Kudu
..

IMPALA-3718: Add test_cancellation tests for Kudu

Additional functional tests for Kudu.

Change-Id: Icf3d3853e7075991f6d12f125407ebdbe6a287e2
---
M tests/query_test/test_cancellation.py
1 file changed, 13 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/4700/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4700
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Icf3d3853e7075991f6d12f125407ebdbe6a287e2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4287: EE tests fail to run when KUDU IS SUPPORTED�lse

2016-10-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4287: EE tests fail to run when KUDU_IS_SUPPORTED=false
..


Patch Set 1: Code-Review+2

Thanks!

-- 
To view, visit http://gerrit.cloudera.org:8080/4697
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib97572b94c54bb36ad1d7184aa45744e8a42ffa4
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3943: Address post-merge comments.

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3943: Address post-merge comments.
..


Patch Set 1: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4696
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I72ccf00191afddb8583ac961f1eaf11e5eb28791
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2789: More compact mem layout with null bits at the end.

2016-10-12 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-2789: More compact mem layout with null bits at the end.
..


Patch Set 7: Code-Review+1

Thanks!

-- 
To view, visit http://gerrit.cloudera.org:8080/4673
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6510c75d841bddafa6638f1bd2ac6731a7053f6
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2789: More compact mem layout with null bits at the end.

2016-10-12 Thread Alex Behm (Code Review)
Hello Matthew Jacobs, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4673

to look at the new patch set (#7).

Change subject: IMPALA-2789: More compact mem layout with null bits at the end.
..

IMPALA-2789: More compact mem layout with null bits at the end.

There are two motivations for this change:
1. Reduce memory consumption.
2. Pave the way for full memory layout compatibility between
   Impala and Kudu to eventually enable zero-copy scans. This
   patch is a only first step towards that goal.

New Memory Layout
Slots are placed in descending order by size with trailing bytes to
store null flags. Null flags are omitted for non-nullable slots. There
is no padding between tuples when stored back-to-back in a row batch.

Example: select bool_col, int_col, string_col, smallint_col
 from functional.alltypes
Slots:   string_col|int_col|smallint_col|bool_col|null_byte
Offsets: 0  16  20   22   23

The main change is to move the null indicators to the end of tuples.
The new memory layout is fully packed with no padding in between
slots or tuples.

Performance:
Our standard cluster perf tests showed no significant difference in
query response times as well as consumed cycles, and a slight
reduction in peak memory consumption.

Testing:
An exhaustive test run passed. Ran a few select tests like TPC-H/DS
with ASAN locally.

These follow-on changes are planned:
1. Planner needs to mark slots non-nullable if they correspond
   to a non-nullable Kudu column.
2. Update Kudu scan node to copy tuples with memcpy.
3. Kudu client needs to support transferring ownership of the
   tuple memory (maybe do direct and indirect buffers separately).
4. Update Kudu scan node to use memory transfer instead of copy

Change-Id: Ib6510c75d841bddafa6638f1bd2ac6731a7053f6
---
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-scanner.h
M be/src/exec/row-batch-list-test.cc
M be/src/exec/text-converter.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/collection-value-builder-test.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/row-batch-test.cc
M be/src/runtime/tuple.cc
M be/src/runtime/tuple.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/desc-tbl-builder.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
26 files changed, 402 insertions(+), 312 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/4673/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4673
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib6510c75d841bddafa6638f1bd2ac6731a7053f6
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR](hadoop-next) IMPALA-4277: bump Hadoop component versions except for Hadoop itself

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4698

Change subject: IMPALA-4277: bump Hadoop component versions except for Hadoop 
itself
..

IMPALA-4277: bump Hadoop component versions except for Hadoop itself

The Hive version bump required some minor changes to deal with renames
and added function arguments. The other components required no changes
to build the frontend against their API (I haven't been able to test
the JNI backend stuff).

IMPALA-4172 is blocking the hadoop version bump.

This builds if I manually put hive_metastore.thrift in the right place
and set SKIP_TOOLCHAIN_BOOTSTRAP=false. Tests can't run since we don't
have the special tarballs available that are used for the test cluster.

I manually constructed the Hive build dependency by taking
hive_metastore.thrift and putting it at
$CDH_COMPONENTS_HOME/hive-2.1.0-cdh6.x-SNAPSHOT/src/metastore/if/hive_metastore.thrift

Change-Id: Ia0be6bbe76d929ceaeb2fa9ac4ebba1820c4dab7
---
M bin/impala-config.sh
M common/thrift/TCLIService.thrift
M common/thrift/cli_service.thrift
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/TableName.java
M fe/src/main/java/org/apache/impala/analysis/TypeDef.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
16 files changed, 38 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/4698/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia0be6bbe76d929ceaeb2fa9ac4ebba1820c4dab7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: hadoop-next
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2789: More compact mem layout with null bits at the end.

2016-10-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2789: More compact mem layout with null bits at the end.
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4673/6//COMMIT_MSG
Commit Message:

Line 9: The main motivation of this change is to pave the way for full
> There are 2 main motivations of this change, no?
Done


Line 12: that goal.
> Can you mention what is left after this? 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4673
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6510c75d841bddafa6638f1bd2ac6731a7053f6
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4287: EE tests fail to run when KUDU IS SUPPORTED�lse

2016-10-12 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4697

Change subject: IMPALA-4287: EE tests fail to run when KUDU_IS_SUPPORTED=false
..

IMPALA-4287: EE tests fail to run when KUDU_IS_SUPPORTED=false

When Kudu tests were added to the functional-query test
dimensions, we needed to make sure the dimension was skipped
when KUDU_IS_SUPPORTED=false.

Change-Id: Ib97572b94c54bb36ad1d7184aa45744e8a42ffa4
---
M tests/common/test_dimensions.py
1 file changed, 4 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/4697/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4697
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib97572b94c54bb36ad1d7184aa45744e8a42ffa4
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-3943: Do not throw scan errors for empty Parquet files.

2016-10-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3943: Do not throw scan errors for empty Parquet files.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4693/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 413: if (row_group.num_rows == 0 || file_metadata_.num_rows == 0) 
continue;
> I actually find it really confusing because without context it's almost imp
http://gerrit.cloudera.org:8080/4696


-- 
To view, visit http://gerrit.cloudera.org:8080/4693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I50ac3df6ff24bc5c384ef22e0f804a5132adb62e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3943: Address post-merge comments.

2016-10-12 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4696

Change subject: IMPALA-3943: Address post-merge comments.
..

IMPALA-3943: Address post-merge comments.

Adds code comments and issues a warning for Parquet files
with num_rows=0 but at least one non-empty row group.

Change-Id: I72ccf00191afddb8583ac961f1eaf11e5eb28791
---
M be/src/exec/hdfs-parquet-scanner.cc
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/parquet-zero-rows.test
3 files changed, 29 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4696/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4696
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I72ccf00191afddb8583ac961f1eaf11e5eb28791
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] Impala-3342 Adding thread counters to measure time spent during plan fragment execution

2016-10-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Impala-3342 Adding thread counters to measure time spent during 
plan fragment execution
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4633/2//COMMIT_MSG
Commit Message:

Line 14: 
What does the profile look like? Would be helpful to see what the plan fragment 
profile looks like.


http://gerrit.cloudera.org:8080/#/c/4633/2/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

PS2, Line 360: 
SCOPED_THREAD_COUNTER_MEASUREMENT(runtime_state_->plan_fragment_counters());
> Is this thread safe?
Yep, scanner_thread_counters() is also shared between scanner threads.


http://gerrit.cloudera.org:8080/#/c/4633/2/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

Line 213:   ADD_COUNTER(profile(), PER_HOST_PEAK_MEM_COUNTER, TUnit::BYTES);
Let's create the new counter here along with the other ones so that the setup 
happens in the same place.


http://gerrit.cloudera.org:8080/#/c/4633/2/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 241:   ///Fragment thread counters
The comment isn't too helpful. Maybe:

  /// Total CPU utilization for all threads in this plan fragment.


-- 
To view, visit http://gerrit.cloudera.org:8080/4633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-12 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4678/4/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

PS4, Line 23: normpath
Perhaps parentheses around these. It took me a little bit to parse this.


(join as join_path, normpath)


(Probably would have been easier if I were familiar with normpath.)


PS4, Line 55: os.path.sep + join_path('home', DOCKER_USER_NAME, 'Impala', 
'testdata', 'cluster')
This is a bit long, and make it hard to read. Maybe it could be saved as 
DEFAULT_TESTDATA_VOLUME_PATH?


-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3644 Make predicate order deterministic

2016-10-12 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3644 Make predicate order deterministic
..


Patch Set 4:

This change now still passes on Java 7 and fixes all but 3 tests in 
PlannerTest. I will run the test again on Java 8 and attach the resulting error 
log to the Jira, so we can continue to investigate.

Can we go on and merge this one, so it doesn't rot?

-- 
To view, visit http://gerrit.cloudera.org:8080/4671
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id11010bfeaff368869e6d430eeb4773ddf41faff
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3644 Make predicate order deterministic

2016-10-12 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#3).

Change subject: IMPALA-3644 Make predicate order deterministic
..

IMPALA-3644 Make predicate order deterministic

This adds a tie-break to make sure that we sort predicates in a
deterministic order on Java 7 and 8. This was suggested by Alex in
IMPALA-3644.

There are still three broken tests, for which I will open subsequent
Jiras and link them in IMPALA-3644.

Change-Id: Id11010bfeaff368869e6d430eeb4773ddf41faff
---
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/lineage.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
23 files changed, 394 insertions(+), 388 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/4671/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4671
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id11010bfeaff368869e6d430eeb4773ddf41faff
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4270: Gracefully fail unsupported queries with mt dop > 0.

2016-10-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4270: Gracefully fail unsupported queries with mt_dop > 
0.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4677/1/testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test:

Line 5:  PLAN
> but it's not executable, because the join sink isn't executable.
We need the current mode for executing joins in subplans where the build will 
hopefully never be in a separate fragment.


-- 
To view, visit http://gerrit.cloudera.org:8080/4677
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I91a60ea7b6e3ae4ee44be856615ddd3cd0af476d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-12 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new patch set (#4).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to use external Docker volumes as a workaround to KUDU-1419. This
patch introduces a series of environment variables a user may tweak in
order to help with that purpose. The patch assumes a viable, reasonable
Docker container based on a standard Linux distribution like Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need the testdata off the container
onto the host running Docker Engine. To do that we suggest a strategy
using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 192 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables

2016-10-12 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-3719: Simplify CREATE TABLE statements with Kudu tables
..


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/4414/5/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

PS5, Line 73: parser.add_option("--kudu_masters", default="127.0.0.1",
Could we import the default value from tests.common instead of hardcoding it?


http://gerrit.cloudera.org:8080/#/c/4414/5/tests/common/kudu_test_suite.py
File tests/common/kudu_test_suite.py:

PS5, Line 38: @SkipIf.kudu_not_supported
This is extremely risky due to bugs in our version of old pytest. See 
IMPALA-3614 for an example of how this is risky. I think you will have to work 
around this by skipping in setup_class.


http://gerrit.cloudera.org:8080/#/c/4414/5/tests/custom_cluster/test_kudu.py
File tests/custom_cluster/test_kudu.py:

PS5, Line 1: # Copyright (c) 2016 Cloudera, Inc. All rights reserved.
Remove Cloudera copyright and make sure the license text is correct otherwise.


http://gerrit.cloudera.org:8080/#/c/4414/5/tests/metadata/test_ddl.py
File tests/metadata/test_ddl.py:

PS5, Line 220: 
Why remove this skip?


PS5, Line 221: 
Is this removal safe?


PS5, Line 221: self.expected_exceptions = 2
I can't see where this is used; delete?


http://gerrit.cloudera.org:8080/#/c/4414/5/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

PS5, Line 87: print("Describe formatted output:")
: pprint(table_desc)
Use LOG.info()


-- 
To view, visit http://gerrit.cloudera.org:8080/4414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4270: Gracefully fail unsupported queries with mt dop > 0.

2016-10-12 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#3).

Change subject: IMPALA-4270: Gracefully fail unsupported queries with mt_dop > 
0.
..

IMPALA-4270: Gracefully fail unsupported queries with mt_dop > 0.

MT_DOP > 0 is only supported for plans without distributed joins
or table sinks. Adds validation to fail unsupported queries
gracefully in planning.

For scans in queries that are executable with MT_DOP > 0 we either
use the optimized MT scan node BE implementation (only Parquet), or
we use the conventional scan node with num_scanner_threads=1.

TODO: Still need to add end-to-end tests.

Change-Id: I91a60ea7b6e3ae4ee44be856615ddd3cd0af476d
---
M be/src/exec/exec-node.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
A 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
10 files changed, 367 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/4677/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4677
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I91a60ea7b6e3ae4ee44be856615ddd3cd0af476d
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-4270: Gracefully fail unsupported queries with mt dop > 0.

2016-10-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4270: Gracefully fail unsupported queries with mt_dop > 
0.
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/4677/2/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

Line 272:   DCHECK_EQ(state->query_options().num_scanner_threads, 1);
> roll into a single dcheck with a compound predicate
Done


http://gerrit.cloudera.org:8080/#/c/4677/2/common/thrift/PlanNodes.thrift
File common/thrift/PlanNodes.thrift:

Line 205:   
> trailing
Done


Line 207:   // If this is true then the MT_DOP query option must be > 0.
> leave todo that it should be removed later
Done


http://gerrit.cloudera.org:8080/#/c/4677/2/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

Line 314:* Retrurns a set of file formats being scanned.
> spelling
Done


Line 562: msg.hdfs_scan_node.setUse_mt_scan_nodeIsSet(useMtScanNode_);
> ..IsSet? it doesn't look like you're calling the actual setter
Weird. That seemed to work also. Fixed.


http://gerrit.cloudera.org:8080/#/c/4677/2/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 206: if (ctx_.getQueryOptions().mt_dop > 0) {
> the if is redundant, but you can make it a checkstate at the start of the f
Added Preconditions check.


http://gerrit.cloudera.org:8080/#/c/4677/2/fe/src/main/java/org/apache/impala/planner/PlannerContext.java
File fe/src/main/java/org/apache/impala/planner/PlannerContext.java:

Line 83:   public TQueryOptions getQueryOptions() { return 
getRootAnalyzer().getQueryOptions(); }
> why the change?
I added a convenience function Analyzer.getQueryOptions() and thought it would 
make sense to use it here.


http://gerrit.cloudera.org:8080/#/c/4677/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

Line 155:* - MT_DOP > 0 is not supported for plans with distributed joins 
or table sinks.
> it's only supported for QueryStmts. take a look at Frontend.createExecReque
My thinking was that from a user's perspective running a statement with mt_dop 
> 0 should have one of two outcomes:
1. The statement runs fine in multi-threaded mode.
2. The statement returns an error indicating that mt_dop is not supported for 
that statement.

If that's the behavior we want, then it's irrelevant whether we can actually 
generate parallel plans for inserts/ctas.


http://gerrit.cloudera.org:8080/#/c/4677/2/testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test:

Line 51: |  tuple-ids=0,1 row-size=8B cardinality=unavailable
> what happened here?
functional_parquet has no stats.

I preferred to write the tests on Parquet because that's the  preferred file 
format, happy to change it back or add more tests?


-- 
To view, visit http://gerrit.cloudera.org:8080/4677
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I91a60ea7b6e3ae4ee44be856615ddd3cd0af476d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4047: Remove occurrences of 'CDH'/'cdh' from repo

2016-10-12 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4047: Remove occurrences of 'CDH'/'cdh' from repo
..


Patch Set 9: Code-Review+2

Thanks Lars!

-- 
To view, visit http://gerrit.cloudera.org:8080/4187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icb37e2ef0cd9fa0e581d359c5dd3db7812b7b2c8
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2789: More compact mem layout with null bits at the end.

2016-10-12 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-2789: More compact mem layout with null bits at the end.
..


Patch Set 6: Code-Review+1

(2 comments)

Thanks! LGTM, just a few comments for the commit msg. I didn't look at the 
codegen code changes in detail.

http://gerrit.cloudera.org:8080/#/c/4673/6//COMMIT_MSG
Commit Message:

Line 9: The main motivation of this change is to pave the way for full
There are 2 main motivations of this change, no?
1) More compact memory layout.
2) Zero copy (or at least avoiding conversions) with Kudu scans... (as you 
already have here)


Line 12: that goal.
Can you mention what is left after this? 
My understanding is that it is (correct me if I'm wrong):
1) planner needs to set null flags for nullable columns
2) kudu scan node can now _copy_ mem, i.e. rip out conversion code
2) kudu client needs to support mem xfer
3) kudu scan node updated to use mem xfer instead of copying
(KUDU-1695)


-- 
To view, visit http://gerrit.cloudera.org:8080/4673
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6510c75d841bddafa6638f1bd2ac6731a7053f6
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4259: build Impala without any test cluster setup.

2016-10-12 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4259: build Impala without any test cluster setup.
..


Patch Set 4: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4685
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I429da7bc6681b16c07fe58bb3efac6d1a8579137
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment

2016-10-12 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment
..


Patch Set 4: Code-Review+1

(1 comment)

Carry David's +1

http://gerrit.cloudera.org:8080/#/c/4674/3/bin/bootstrap_development.sh
File bin/bootstrap_development.sh:

PS3, Line 24: # The intended user is a person who wants to start contributing 
code to Impala. This
: # script serves as an executable reference point for how to get 
started.
> I guess I would also add, in some fashion, the following assumptions that t
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4674
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If166a8a286d7559af547da39f6cc09e723f34c7e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment

2016-10-12 Thread Jim Apple (Code Review)
Hello David Knupp,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4674

to look at the new patch set (#4).

Change subject: IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment
..

IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment

This script bootstraps an Impala dev environment on Ubuntu 14.04. It
is not hermetic -- it changes some config files for the user and for
the OS.

It is green on Jenkins, and it runs in about 6.5 hours. The intention
is to have this script run in a CI tool for post-commit testing, with
the hope that this will make it easier for new developers to get a
working development environment. Previously, the new developer
workflow lived on wiki pages and tended to bit-rot.

Still left to do: migrating the install script into the official
Impala repo.

Change-Id: If166a8a286d7559af547da39f6cc09e723f34c7e
---
A bin/bootstrap_development.sh
1 file changed, 80 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/4674/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4674
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If166a8a286d7559af547da39f6cc09e723f34c7e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-3943: Do not throw scan errors for empty Parquet files.

2016-10-12 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3943: Do not throw scan errors for empty Parquet files.
..


IMPALA-3943: Do not throw scan errors for empty Parquet files.

For Parquet files with no row groups but with num_rows=0 in the
file footer the Parquet scanner returns an error indicating
that the file is invalid. This behavior is a regression from
previous Impala versions which used to accept such files.

This patch restores the previous behavior and adds tests.

Change-Id: I50ac3df6ff24bc5c384ef22e0f804a5132adb62e
Reviewed-on: http://gerrit.cloudera.org:8080/4693
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M be/src/exec/hdfs-parquet-scanner.cc
M testdata/data/README
A testdata/data/zero_rows_one_row_group.parquet
A testdata/data/zero_rows_zero_row_groups.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-zero-rows.test
M tests/query_test/test_scanners.py
6 files changed, 65 insertions(+), 1 deletion(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I50ac3df6ff24bc5c384ef22e0f804a5132adb62e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-3943: Do not throw scan errors for empty Parquet files.

2016-10-12 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3943: Do not throw scan errors for empty Parquet files.
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I50ac3df6ff24bc5c384ef22e0f804a5132adb62e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4274: hang in buffered-block-mgr-test

2016-10-12 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4274: hang in buffered-block-mgr-test
..


IMPALA-4274: hang in buffered-block-mgr-test

We started seeing hangs in CreateDestroyMulti() where a
thread was recursively acquiring static_block_mgrs_lock_.
This is only possible because a shared_ptr is destroyed
while holding the lock.

The fix is to reset the shared_ptr only after releasing
the lock.

Testing:
I was unable to reproduce the hang locally, but the
callstack in the JIRA was a strong enough smoking gun
to feel confident that this should fix the hang.

Change-Id: I21f3da1d09cdd101a28ee850f46f24acd361b604
Reviewed-on: http://gerrit.cloudera.org:8080/4690
Reviewed-by: Alex Behm 
Reviewed-by: Marcel Kornacker 
Tested-by: Internal Jenkins
---
M be/src/runtime/buffered-block-mgr.cc
1 file changed, 7 insertions(+), 3 deletions(-)

Approvals:
  Marcel Kornacker: Looks good to me, approved
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, but someone else must approve



-- 
To view, visit http://gerrit.cloudera.org:8080/4690
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I21f3da1d09cdd101a28ee850f46f24acd361b604
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-4274: hang in buffered-block-mgr-test

2016-10-12 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4274: hang in buffered-block-mgr-test
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4690
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I21f3da1d09cdd101a28ee850f46f24acd361b604
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] Reduce LLVM module's preparation time by lazily creating the IRFunction::Type to LLVM::Function* mappings.

2016-10-12 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: Reduce LLVM module's preparation time by lazily creating the 
IRFunction::Type to LLVM::Function* mappings.
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4691/1/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

Line 206:   for (int i = IRFunction::FN_START; i < IRFunction::FN_END; ++i) {
> Maybe should be a new function like ValidateFunctionMappings()?
IMHO, this seems pretty readable  and self-contained without factoring it into 
a separate function. Will add some comments though.


Line 209: if (init_codegen->module_->getFunction(fn_name) == NULL) {
> We shouldn't hit this unless we messed up the build or the function names r
Or if the in memory copy of the bitcode is corrupted somehow. That said, it can 
get corrupted any time since this check and GetFunction() will return NULL if 
the bitcode is corrupted somehow so may be okay to keep it as a DCHECK.


PS1, Line 407: StringRef
> Should be able to construct the StringRef from the std::string directly wit
Done


Line 668: StringRef fn_name(FN_MAPPINGS[ir_type].fn_name.c_str());
> This conversion shouldn't be necessary either.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4691
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I61ab9fa8cca5a0909bb716c3c62819da3e3b3041
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Reduce LLVM module's preparation time by lazily creating the IRFunction::Type to LLVM::Function* mappings.

2016-10-12 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new patch set (#2).

Change subject: Reduce LLVM module's preparation time by lazily creating the 
IRFunction::Type to LLVM::Function* mappings.
..

Reduce LLVM module's preparation time by lazily
creating the IRFunction::Type to LLVM::Function*
mappings.

Previously, when creating a LlvmCodeGen object, we
run an O(mn) algorithm to map the IRFunction::Type
to the actual LLVM::Function object in the module.
m is the size of IRFunction::Type enum and n is
the total number of functions in the module. This
is a waste of time if we only use few functions
from the module.

This change reduces the preparation time of a simple
query from 23ms to 10ms.

select count(*) from tpch100_parquet.lineitem where l_orderkey > 20;

Change-Id: I61ab9fa8cca5a0909bb716c3c62819da3e3b3041
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
2 files changed, 175 insertions(+), 123 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/4691/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4691
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61ab9fa8cca5a0909bb716c3c62819da3e3b3041
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3983/IMPALA-3974: Delete function jar resources after load

2016-10-12 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-3983/IMPALA-3974: Delete function jar resources after 
load
..


Patch Set 5:

Henry, do you have any further comments on this one? Thanks.

-- 
To view, visit http://gerrit.cloudera.org:8080/4617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5f9dedb5b342415380c83e61a72eb497371a8199
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()

2016-10-12 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new patch set (#3).

Change subject: IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()
..

IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()

This patch is mostly mechanical move of codegen related logic
from each exec node's Prepare() to its Codegen() function.
After this change, code generation will no longer happen in
Prepare(). Instead, it will happen after Prepare() completes in
PlanFragmentExecutor. This is an intermediate step towards the
final goal of sharing compiled code among fragment instances in
multi-threading.

As part of the clean up, this change also removes the logic for
lazy codegen object creation. In other words, if codegen is enabled,
the codegen object will always be created. This simplifies some
of the logic in ScalarFnCall::Prepare() and various Codegen()
functions by reducing error checking needed. This change also
removes the logic added for tackling IMPALA-1755 as it's not
needed anymore after the clean up.

The clean up also rectifies a not so well documented situation.
Previously, even if a user explicitly sets DISABLE_CODEGEN to true,
we may still codegen a UDF if it was written in LLVM IR or if it
has more than 8 arguments. This patch enforces the query option
by failing the query in both cases. To run the query, the user
must enable codegen. This change also extends the number of
arguments supported in the interpretation path of ScalarFn to 20.

Change-Id: I207566bc9f4c6a159271ecdbc4bbdba3d78c6651
---
M be/src/exec/aggregation-node.cc
M be/src/exec/aggregation-node.h
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/old-hash-table.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/exprs/case-expr.cc
M be/src/exprs/case-expr.h
M be/src/exprs/compound-predicates.cc
M be/src/exprs/compound-predicates.h
M be/src/exprs/conditional-functions.cc
M be/src/exprs/conditional-functions.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
M be/src/exprs/is-not-empty-predicate.cc
M be/src/exprs/is-not-empty-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/literal.h
M be/src/exprs/null-literal.cc
M be/src/exprs/null-literal.h
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/exprs/slot-ref.cc
M be/src/exprs/slot-ref.h
M be/src/exprs/tuple-is-null-predicate.cc
M be/src/exprs/tuple-is-null-predicate.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M be/src/runtime/tuple.cc
M be/src/runtime/tuple.h
M be/src/service/fe-support.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M be/src/util/tuple-row-compare.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/workloads/functional-query/queries/QueryTest/udf-errors.test
M testdata/workloads/functional-query/queries/QueryTest/udf.test
M tests/query_test/test_udfs.py
69 files changed, 597 insertions(+), 673 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/4651/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4651
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I207566bc9f4c6a159271ecdbc4bbdba3d78c6651
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong