[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others

2016-10-13 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all 
others
..


Patch Set 13: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4402/12/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 1522:   // prevent others from cancelling a second time
> I think it's ok the cancel fragments without the query being in error. The 
good explanation. could you put that somewhere, i agree that there's a bit of 
confusion around query states


http://gerrit.cloudera.org:8080/#/c/4402/12/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

Line 321: class HS2RowOrientedResultSet : public QueryResultSet {
> I think I'll do that separately, if you agree. Although it's mostly mechani
fine by me, but please make sure to do it right after this goes in.


http://gerrit.cloudera.org:8080/#/c/4402/12/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

Line 705: while ((num_rows < max_rows || max_rows <= 0)
where is this being done now?


-- 
To view, visit http://gerrit.cloudera.org:8080/4402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4123: Fast bit unpacking

2016-10-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4123: Fast bit unpacking
..


Patch Set 9:

> (11 comments)

It looks like you might have missed my comments on Base and PS7 in this group 
of comments.

-- 
To view, visit http://gerrit.cloudera.org:8080/4494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.

2016-10-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() 
function.
..


Patch Set 8:

(24 comments)

http://gerrit.cloudera.org:8080/#/c/4474/8//COMMIT_MSG
Commit Message:

Line 14: All syntaxes confirm the standard SQL syntax (Core SQL feature ID
Please describe the existing syntax, too, and say that it already works.


PS8, Line 18: trailing
Please capitalize these.


PS8, Line 18: leading
Please try to match our existing formatting choices. This includes breaking 
lines at the 70-character mark, not arbitrarily early as in line 17.


PS8, Line 19:   
Do not indent lines after the first one in a paragraph. Please use git log to 
study what previous commit messages looked like.


PS8, Line 23: option
this word is not needed.


PS8, Line 22: empty
:   argument("" or '')
just say "the empty string"


PS8, Line 24: Syntax
Why is this word capitalized, here and below


PS8, Line 24: target
"target" was not used earlier. Do you mean string_to_be_trimmed?


PS8, Line 31: (standard space)
You don't have to say "standard space" here or below. I was asking about what 
the standard says. Are you sure it just says " "? What do postgres and mysql do?


PS8, Line 43: abcdefg
Make the parameter strings even shorter


PS8, Line 46: Syntax #2:
Please separate these sections by blank lines.


PS8, Line 51: "
What is this doing here? Same above. Please try to be more consciencious about 
these things.


http://gerrit.cloudera.org:8080/#/c/4474/8/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

PS8, Line 1977:  
Make the test cases even smaller.


http://gerrit.cloudera.org:8080/#/c/4474/8/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

PS8, Line 799: static
static in an unnamed namespace is redundant.


PS8, Line 817: static
Put this in the unnamed namespace


PS8, Line 817: *
&


PS8, Line 818: begin
Is this always 0? Here and below


PS8, Line 838: int begin = 0;
 :   begin = 
Make this one line


PS8, Line 849: StringVal
Does this do the right thing is is_null is true but ptr and len are set?


PS8, Line 862: option.ptr[i]
incorrect argument type


PS8, Line 865: trim_option
This is still not done at parsing/analysis time. That work happens in the 
front-end.


Line 893:   if (trim_option == TrimOption::LEADING || trim_option == 
TrimOption::BOTH) {
trim_option | TrimOption::LEADING


http://gerrit.cloudera.org:8080/#/c/4474/8/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

Line 429:   [['trim'], 'STRING', ['STRING'], 'impala::StringFunctions::Trim'],
Is this line still needed, or is it covered by the new grammar rules?


Line 430:   [['trim'], 'STRING', ['STRING', 'STRING', 'STRING'], 
'impala::StringFunctions::TrimStringWithOption',
Did you try changing this name and making it invisible? What happens?


-- 
To view, visit http://gerrit.cloudera.org:8080/4474
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Youwei Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Youwei Wang 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4270: Gracefully fail unsupported queries with mt dop > 0.

2016-10-13 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4270: Gracefully fail unsupported queries with mt_dop > 
0.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4677/1/testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test:

Line 5:  PLAN
> We need the current mode for executing joins in subplans where the build wi
the build registration, etc., isn't expected to get in before code freeze, so 
we won't be able to run this. i don't see an alternative to returning an error 
here.


-- 
To view, visit http://gerrit.cloudera.org:8080/4677
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I91a60ea7b6e3ae4ee44be856615ddd3cd0af476d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4123: Fast bit unpacking

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4123: Fast bit unpacking
..


Patch Set 7:

(3 comments)

Sorry about that - I wish there was a good way to see whether I've responded to 
all of the comments on different patchset verisons.

http://gerrit.cloudera.org:8080/#/c/4494/7/be/src/util/bit-packing-test.cc
File be/src/util/bit-packing-test.cc:

PS7, Line 42: misaligned
> I think it would help the reader here to spell out that misaligned is with 
Done


PS7, Line 132: min(length, 1)
> Still, if we're only allowing those two you could switch it to a bool 'misa
Made the change, then switched to 'aligned' to avoid having to think about a 
double negative.


http://gerrit.cloudera.org:8080/#/c/4494/7/be/src/util/bit-packing.inline.h
File be/src/util/bit-packing.inline.h:

PS7, Line 69: in_bytes -= batch_size * (BIT_WIDTH / CHAR_BIT);
> Should there be an additional test that would demonstrate that bug?
Running the test under ASAN caught it once I fixed the bug in the test code. 
The bug was exactly the kind of bug I was trying to catch with the test so we 
do have the coverage. It seems weird (although not entirely without merit) to 
add a regression test for a test bug.


-- 
To view, visit http://gerrit.cloudera.org:8080/4494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4123: Fast bit unpacking

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#10).

Change subject: IMPALA-4123: Fast bit unpacking
..

IMPALA-4123: Fast bit unpacking

Adds utility functions for fast unpacking of batches of bit-packed
values. These support reading batches of any number of values provided
that the start of the batch is aligned to a byte boundary. Callers that
want to read smaller batches that don't align to byte boundaries will
need to implement their own buffering.

The unpacking code uses only portable C++ and no SIMD intrinsics, but is
fairly efficient because unpacking a full batch of 32 values compiles
down to 32-bit loads, shifts by constants, masks by constants, bitwise
ors when a value straddles 32-bit words and stores. Further speedups
should be possible using SIMD intrinsics.

Testing:
Added unit tests for unpacking, exhaustively covering different
bitwidths with additional test dimensions (memory alignment, various
input sizes, etc).

Tested under ASAN to ensure the bit unpacking doesn't read past the end
of buffers.

Perf:
Added microbenchmark that shows on average an 8-9x speedup over the
existing BitReader code.

Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/bit-packing-benchmark.cc
M be/src/benchmarks/bswap-benchmark.cc
A be/src/testutil/mem-util.h
M be/src/util/CMakeLists.txt
A be/src/util/bit-packing-test.cc
A be/src/util/bit-packing.h
A be/src/util/bit-packing.inline.h
M be/src/util/bit-stream-utils.h
M be/src/util/bit-stream-utils.inline.h
10 files changed, 884 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/4494/10
-- 
To view, visit http://gerrit.cloudera.org:8080/4494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Bump Bzip2 version

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4716

Change subject: Bump Bzip2 version
..

Bump Bzip2 version

This picks up the latest toolchain version. The only change is that
some symlinks in the previous version were broken.

Change-Id: I0c5e9ef10984fc8c6840acf285a04e472fc8b304
---
M bin/impala-config.sh
1 file changed, 2 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/4716/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4716
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0c5e9ef10984fc8c6840acf285a04e472fc8b304
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] Bump Bzip2 version

2016-10-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: Bump Bzip2 version
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4716
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0c5e9ef10984fc8c6840acf285a04e472fc8b304
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] Bump Bzip2 version

2016-10-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: Bump Bzip2 version
..


Patch Set 1: Code-Review+1

Whoops, I had completely forgotten about fixing this on this end. Thanks.

-- 
To view, visit http://gerrit.cloudera.org:8080/4716
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0c5e9ef10984fc8c6840acf285a04e472fc8b304
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-13 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new patch set (#6).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to use external Docker volumes as a workaround to KUDU-1419. This
patch introduces a series of environment variables a user may tweak in
order to help with that purpose. The patch assumes a viable, reasonable
Docker container based on a standard Linux distribution like Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need the testdata off the container
onto the host running Docker Engine. To do that we suggest a strategy
using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 199 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..


Patch Set 6:

Patch set 6 is a rebase, and conflicts with the commit of 
https://gerrit.cloudera.org/#/c/4187/ were fixed.

-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-13 Thread Michael Brown (Code Review)
Michael Brown has uploaded a new patch set (#7).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to mount an external Docker volume as a workaround to KUDU-1419.
This patch introduces a series of environment variables a user may tweak
in order to help with that purpose. The patch assumes a viable,
reasonable Docker container based on a standard Linux distribution like
Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need to copy the testdata off the
container onto the host running Docker Engine. To do that we suggest a
strategy using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 210 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..


Patch Set 5:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4678/5//COMMIT_MSG
Commit Message:

PS5, Line 10: use
> Would "mount" be an appropriate verb. If so, I think it's more descriptive.
Done


PS5, Line 19: need the
> "need to move the"?
Done, but it's not a "move"; it's a copy.


PS5, Line 21: rsync
> Why rsync versus, say, just cp or mv?
This is over a networking virtual layer, so cp won't work. Again, we're not 
moving data, we're copying it. rsync is more efficient than scp once you've 
warmed a volume once. It's also convenient to be able to use rsync -a, --chown, 
and the usage is a lot better than SCP. If you want to do any serious 
transferring of data, rsync is the better tool.


http://gerrit.cloudera.org:8080/#/c/4678/5/tests/comparison/leopard/README
File tests/comparison/leopard/README:

PS5, Line 17: Basic Configuration
> It's a nit, but it might be nicer if this line, and "External Volume Config
Done


PS5, Line 43: use
> "mount external Docker volumes that contain the necessary testdata"?
Done


PS5, Line 54: path on TARGET_HOST where the
: external volume will reside
> This is just a directory, right?
Yes, and Leopard is responsible for creating it.


http://gerrit.cloudera.org:8080/#/c/4678/5/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

Line 298: if os.environ.get('KUDU_IS_SUPPORTED') == 'true':
> Does it makes sense to reference KUDU-1419 here, or at least the README, to
Done


Line 308: 'rsync -e "ssh -i {priv_key} -o StrictHostKeyChecking=no '
> I'm just curious -- how long does this process take?
I timed it last week and it was about 20 minutes in the cold case; almost 
instantly in the very hot case.


-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()

2016-10-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()
..


Patch Set 3:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exec/hash-join-node.cc
File be/src/exec/hash-join-node.cc:

Line 172:   Function* codegen_process_probe_batch_fn = 
CodegenProcessProbeBatch(codegen, hash_fn);
> Long line.
Done


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 622: return Status("Disabled by query option.");
> Dead code I think.
Done


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

Line 153: Status HdfsScanNodeBase::Prepare(RuntimeState* state) {
> It doesn't look like we add the codegen disabled message hre.
Done


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

Line 700: return Status("Disabled by query option.");
> Dead code?
Done


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

PS3, Line 49: #define MAX_INTERP_ARGS 20
> Comments added in the next patch.
Done


Line 562: #define ARG_DECL_LIST(n) \
> Thanks for doing this.
Done


PS3, Line 574: MAX_INTERP_ARGS
> MAX_INTERP_ARGS+1
Done


PS3, Line 592: MAX_INTERP_ARGS
> MAX_INTERP_ARGS+1
Done


http://gerrit.cloudera.org:8080/#/c/4651/3/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

Line 104:   state.CreateCodegen(), env, JniUtil::internal_exc_class(), 
result_bytes);
> Tab?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4651
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I207566bc9f4c6a159271ecdbc4bbdba3d78c6651
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()

2016-10-13 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new patch set (#4).

Change subject: IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()
..

IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()

This patch is mostly mechanical move of codegen related logic
from each exec node's Prepare() to its Codegen() function.
After this change, code generation will no longer happen in
Prepare(). Instead, it will happen after Prepare() completes in
PlanFragmentExecutor. This is an intermediate step towards the
final goal of sharing compiled code among fragment instances in
multi-threading.

As part of the clean up, this change also removes the logic for
lazy codegen object creation. In other words, if codegen is enabled,
the codegen object will always be created. This simplifies some
of the logic in ScalarFnCall::Prepare() and various Codegen()
functions by reducing error checking needed. This change also
removes the logic added for tackling IMPALA-1755 as it's not
needed anymore after the clean up.

The clean up also rectifies a not so well documented situation.
Previously, even if a user explicitly sets DISABLE_CODEGEN to true,
we may still codegen a UDF if it was written in LLVM IR or if it
has more than 8 arguments. This patch enforces the query option
by failing the query in both cases. To run the query, the user
must enable codegen. This change also extends the number of
arguments supported in the interpretation path of ScalarFn to 20.

Change-Id: I207566bc9f4c6a159271ecdbc4bbdba3d78c6651
---
M be/src/exec/aggregation-node.cc
M be/src/exec/aggregation-node.h
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-join-node.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/old-hash-table.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/exprs/case-expr.cc
M be/src/exprs/case-expr.h
M be/src/exprs/compound-predicates.cc
M be/src/exprs/compound-predicates.h
M be/src/exprs/conditional-functions.cc
M be/src/exprs/conditional-functions.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
M be/src/exprs/is-not-empty-predicate.cc
M be/src/exprs/is-not-empty-predicate.h
M be/src/exprs/literal.cc
M be/src/exprs/literal.h
M be/src/exprs/null-literal.cc
M be/src/exprs/null-literal.h
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/exprs/slot-ref.cc
M be/src/exprs/slot-ref.h
M be/src/exprs/tuple-is-null-predicate.cc
M be/src/exprs/tuple-is-null-predicate.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M be/src/runtime/tuple.cc
M be/src/runtime/tuple.h
M be/src/service/fe-support.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M be/src/util/tuple-row-compare.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/workloads/functional-query/queries/QueryTest/udf-errors.test
M testdata/workloads/functional-query/queries/QueryTest/udf.test
M tests/query_test/test_udfs.py
69 files changed, 639 insertions(+), 686 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/4651/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4651
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I207566bc9f4c6a159271ecdbc4bbdba3d78c6651
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()
..


Patch Set 4: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4651
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I207566bc9f4c6a159271ecdbc4bbdba3d78c6651
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others

2016-10-13 Thread Henry Robinson (Code Review)
Hello Marcel Kornacker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4402

to look at the new patch set (#14).

Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all 
others
..

IMPALA-2905: Handle coordinator fragment lifecycle like all others

The plan-root fragment instance that runs on the coordinator should be
handled like all others: started via RPC and run asynchronously. Without
this, the fragment requires special-case code throughout the
coordinator, and does not show up in system metrics etc.

This patch adds a new sink type, PlanRootSink, to the root fragment
instance so that the coordinator can pull row batches that are pushed by
the root instance. The coordinator signals completion to the fragment
instance via closing the consumer side of the sink, whereupon the
instance is free to complete.

Since the root instance now runs asynchronously wrt to the coordinator,
we add several coordination methods to allow the coordinator to wait for
a point in the instance's execution to be hit - e.g. to wait until the
instance has been opened.

Done in this patch:

* Add PlanRootSink
* Add coordination to PFE to allow coordinator to observe lifecycle
* Make FragmentMgr a singleton
* Removed dead code from Coordinator::Wait() and elsewhere.
* Moved result output exprs out of QES and into PlanRootSink.
* Remove special-case limit-based teardown of coordinator fragment, and
  supporting functions in PlanFragmentExecutor.
* Simplified lifecycle of PlanFragmentExecutor by separating Open() into
  OpenPlan() and Exec(), the latter of which drives the sink by reading
  rows from the plan tree.
* Add child profile to PlanFragmentExecutor to measure time spent in
  each lifecycle phase.
* Removed dependency between InitExecProfiles() and starting root fragment.

Not yet done:

* Fix planner tests to reflect new sink added at root.

Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
---
M be/src/exec/CMakeLists.txt
M be/src/exec/data-sink.cc
A be/src/exec/plan-root-sink.cc
A be/src/exec/plan-root-sink.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler.cc
M be/src/service/fragment-exec-state.cc
M be/src/service/fragment-exec-state.h
M be/src/service/fragment-mgr.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
A be/src/service/query-result-set.h
M be/src/testutil/in-process-servers.cc
M common/thrift/DataSinks.thrift
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
A fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/complex-types-file-formats.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/disable-preaggregations.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/distinct-estimate.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mem-limit-broadcast-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-loop-join.test
M testdata/workloads/functional-pl

[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others

2016-10-13 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all 
others
..


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4402/12/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 1522:   // should explicitly pass Status::OK()). See IMPALA-4279.
> good explanation. could you put that somewhere, i agree that there's a bit 
Done


http://gerrit.cloudera.org:8080/#/c/4402/12/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

Line 705:   // higher in the call-stack.
> where is this being done now?
Changed the DCHECK to return an error Status, since we can't control the 
clients.


-- 
To view, visit http://gerrit.cloudera.org:8080/4402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others

2016-10-13 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all 
others
..


Patch Set 15: Code-Review+2

Rebase, carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-13 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..


Patch Set 7: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3002/1473: Cardinality observability cleanup

2016-10-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3002/1473: Cardinality observability cleanup
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4679/2//COMMIT_MSG
Commit Message:

PS2, Line 7: 1473
nit: can you separate this into separate IMPALA-1473, sometimes ppl are 
grepping for a full JIRA reference.


http://gerrit.cloudera.org:8080/#/c/4679/2/shell/impala_client.py
File shell/impala_client.py:

PS2, Line 145: if node.is_broadcast:
I think you need to update impala_beeswax.py which duplicates this code 
(unfortunately) as well. Can you add a comment in the fn header that 
impala_beeswax.py copies this and changes should be made in both places.

I actually thought the test case you have would have exercised the fix but if 
you hadn't changed the impala_beeswax location then maybe not. It may need to 
be exercised with a different query.


http://gerrit.cloudera.org:8080/#/c/4679/2/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

PS2, Line 20: TestObservability
A bit odd but I don't see a great home for this test in an existing class. 
Maybe someone else will have a suggestion.


PS2, Line 25: test_merge_exchange_with_limit
test_merge_exchange_num_rows


PS2, Line 26: IMPALA-1473 
I think this tests both 1473 and IMPALA-3002. Can you verify that with a print 
in the impala_beeswax.py code you changed, and update this message if that's 
the case?


-- 
To view, visit http://gerrit.cloudera.org:8080/4679
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR](hadoop-next) IMPALA-4277: bump Hadoop component versions except for Hadoop itself

2016-10-13 Thread Charlie Helin (Code Review)
Charlie Helin has posted comments on this change.

Change subject: IMPALA-4277: bump Hadoop component versions except for Hadoop 
itself
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4698/1/bin/impala-config.sh
File bin/impala-config.sh:

Line 299
Should this not also be bumped?

Also I would make these configurable using something like:

export FOO_VERSION=${FOO_VERSION:-2.0.0-...}


-- 
To view, visit http://gerrit.cloudera.org:8080/4698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0be6bbe76d929ceaeb2fa9ac4ebba1820c4dab7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: hadoop-next
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Charlie Helin 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR](hadoop-next) IMPALA-4277: bump Hadoop component versions except for Hadoop itself

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4277: bump Hadoop component versions except for Hadoop 
itself
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4698/1/bin/impala-config.sh
File bin/impala-config.sh:

Line 299
> Should this not also be bumped?
It's bumped in https://gerrit.cloudera.org/#/c/4701/ - I wanted that patch to 
be separate because it adds some nasty hacks.

I'll look at making that change on master, I think it's a good suggestion but 
we really need to sort out other issues before we can build against arbitrary 
versions without doing gymnastics to build component tarballs.


-- 
To view, visit http://gerrit.cloudera.org:8080/4698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0be6bbe76d929ceaeb2fa9ac4ebba1820c4dab7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: hadoop-next
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Charlie Helin 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4123: Fast bit unpacking

2016-10-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4123: Fast bit unpacking
..


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4494/10/be/src/util/bit-packing-test.cc
File be/src/util/bit-packing-test.cc:

PS10, Line 39: it
them


PS10, Line 39: Both
Both buffers


Line 60:   uint8_t* packed = storage.data() + aligned;
If aligned is true, then packed is not aligned


http://gerrit.cloudera.org:8080/#/c/4494/10/be/src/util/bit-packing.inline.h
File be/src/util/bit-packing.inline.h:

Line 41: #define UNPACK_VALUES_CASE(ignore1, i, ignore2) \
Here is a way to do this without any macros, but it's a bit heavyweight from a 
syntax POV. I found it was roughly the same speed as the macro approach:

--- a/be/src/util/bit-packing.inline.h
+++ b/be/src/util/bit-packing.inline.h
@@ -31,10 +31,33 @@
 
 namespace impala {
 
+template 
+constexpr std::array), sizeof...(I)> MakeArray(
+std::integer_sequence) {
+  return {&T::template f...};
+}
+
+template 
+struct BitPackStruct {
+  template 
+  static auto f(const uint8_t* __restrict__ in, int64_t in_bytes, int64_t 
num_values,
+  OutType* __restrict__ out) {
+return BitPacking::UnpackValues(in, in_bytes, num_values, 
out);
+  }
+};
+
 template 
 std::pair BitPacking::UnpackValues(int bit_width,
 const uint8_t* __restrict__ in, int64_t in_bytes, int64_t num_values,
 OutType* __restrict__ out) {
+  static constexpr auto UNPACKERS =
+  MakeArray>(std::make_integer_sequence());
+  if (bit_width < 0 || bit_width > 32) {
+DCHECK(false);
+return {nullptr, 0};
+  }
+  return UNPACKERS[bit_width](in, in_bytes, num_values, out);

Similar things can be done below. Line 146 can be done without 
std::make_integer_sequence.


-- 
To view, visit http://gerrit.cloudera.org:8080/4494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3943: Address post-merge comments.

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3943: Address post-merge comments.
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4696
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I72ccf00191afddb8583ac961f1eaf11e5eb28791
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3342, IMPALA-3920: Adding thread counters to obtain thread stats per scanner thread, and to measure time spent in per-fragment-executor Open() and ProcessBuildInputAsync calls

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3342, IMPALA-3920:  Adding  thread counters to obtain 
thread stats per scanner thread, and to measure time spent in 
per-fragment-executor Open() and ProcessBuildInputAsync calls
..


Patch Set 1:

Is this superseded by https://gerrit.cloudera.org/#/c/4633/?

-- 
To view, visit http://gerrit.cloudera.org:8080/4561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1b148eafb76319e727e8d0ef33c49b300b0c887f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Reduce LLVM module's preparation time by lazily creating the IRFunction::Type to LLVM::Function* mappings.

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Reduce LLVM module's preparation time by lazily creating the 
IRFunction::Type to LLVM::Function* mappings.
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4691
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I61ab9fa8cca5a0909bb716c3c62819da3e3b3041
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4080, IMPALA-3638: Introduce ExecNode::Codegen()
..


Patch Set 4:

It would be good to have an additional pair of eyes since it's a significant 
behaviour change.

-- 
To view, visit http://gerrit.cloudera.org:8080/4651
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I207566bc9f4c6a159271ecdbc4bbdba3d78c6651
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4291: Reduce LLVM module's preparation time

2016-10-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4291: Reduce LLVM module's preparation time
..


Patch Set 4: Code-Review+2

Carry +2 forward.

-- 
To view, visit http://gerrit.cloudera.org:8080/4691
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I61ab9fa8cca5a0909bb716c3c62819da3e3b3041
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4291: Reduce LLVM module's preparation time

2016-10-13 Thread Michael Ho (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4691

to look at the new patch set (#4).

Change subject: IMPALA-4291: Reduce LLVM module's preparation time
..

IMPALA-4291: Reduce LLVM module's preparation time

Previously, when creating a LlvmCodeGen object, we
run an O(mn) algorithm to map the IRFunction::Type
to the actual LLVM::Function object in the module.
m is the size of IRFunction::Type enum and n is
the total number of functions in the module. This
is a waste of time if we only use few functions
from the module.

This change reduces the preparation time of a simple
query from 23ms to 10ms.

select count(*) from tpch100_parquet.lineitem where l_orderkey > 20;

Change-Id: I61ab9fa8cca5a0909bb716c3c62819da3e3b3041
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
2 files changed, 175 insertions(+), 123 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/4691/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4691
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I61ab9fa8cca5a0909bb716c3c62819da3e3b3041
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4231: fix codegen time regression

2016-10-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4231: fix codegen time regression
..


Patch Set 3: Code-Review+2

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4623/3/be/src/exec/partitioned-hash-join-builder-ir.cc
File be/src/exec/partitioned-hash-join-builder-ir.cc:

PS3, Line 35: if (!status->ok())
This path seems to be hot enough to warrant if (UNLIKELY(!status->ok()).


Line 51: if (!AppendRow(null_aware_partition_->build_rows(), build_row, 
&status)) {
UNLIKELY ?


Line 72: if (!AppendRow(partition->build_rows(), build_row, &status)) {
UNLIKELY ?


http://gerrit.cloudera.org:8080/#/c/4623/3/be/src/exec/partitioned-hash-join-node-ir.cc
File be/src/exec/partitioned-hash-join-node-ir.cc:

Line 286:   if (!AppendProbeRow(null_probe_rows_.get(), 
current_probe_row_, status)) {
May warrant an UNLIKELY here.


Line 312:   if (!AppendProbeRow(probe_rows, current_probe_row_, 
status)) {
May warrant an UNLIKELY here.


http://gerrit.cloudera.org:8080/#/c/4623/3/be/src/runtime/raw-value.inline.h
File be/src/runtime/raw-value.inline.h:

PS3, Line 215: RawValue::PrintValue
It appears that this really shouldn't be inlined either.


-- 
To view, visit http://gerrit.cloudera.org:8080/4623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idf0fdedabd488550b6db90167a30c582949d608d
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3342, IMPALA-3920: Adding thread counters to obtain thread stats per scanner thread, and to measure time spent in per-fragment-executor Open() and ProcessBuildInputAsync calls

2016-10-13 Thread anujphadke (Code Review)
anujphadke has abandoned this change.

Change subject: IMPALA-3342, IMPALA-3920:  Adding  thread counters to obtain 
thread stats per scanner thread, and to measure time spent in 
per-fragment-executor Open() and ProcessBuildInputAsync calls
..


Abandoned

Code review moved here -https://gerrit.cloudera.org/#/c/4633/

-- 
To view, visit http://gerrit.cloudera.org:8080/4561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I1b148eafb76319e727e8d0ef33c49b300b0c887f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1702: Enforce single-table consistency in query analysis.

2016-10-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1702: Enforce single-table consistency in query analysis.
..


Patch Set 15:

Marcel, any more comments?

-- 
To view, visit http://gerrit.cloudera.org:8080/4349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifad648b72684ae495ec387590ab1bc58ce5b39e2
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Huaisi Xu 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4231: fix codegen time regression

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4231: fix codegen time regression
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4623/3/be/src/exec/partitioned-hash-join-builder-ir.cc
File be/src/exec/partitioned-hash-join-builder-ir.cc:

PS3, Line 35: if (!status->ok())
> This path seems to be hot enough to warrant if (UNLIKELY(!status->ok()).
Done.


Line 51: if (!AppendRow(null_aware_partition_->build_rows(), build_row, 
&status)) {
> UNLIKELY ?
Done


Line 72: if (!AppendRow(partition->build_rows(), build_row, &status)) {
> UNLIKELY ?
Done


http://gerrit.cloudera.org:8080/#/c/4623/3/be/src/exec/partitioned-hash-join-node-ir.cc
File be/src/exec/partitioned-hash-join-node-ir.cc:

Line 286:   if (!AppendProbeRow(null_probe_rows_.get(), 
current_probe_row_, status)) {
> May warrant an UNLIKELY here.
Done


Line 312:   if (!AppendProbeRow(probe_rows, current_probe_row_, 
status)) {
> May warrant an UNLIKELY here.
Done


http://gerrit.cloudera.org:8080/#/c/4623/3/be/src/runtime/raw-value.inline.h
File be/src/runtime/raw-value.inline.h:

PS3, Line 215: RawValue::PrintValue
> It appears that this really shouldn't be inlined either.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idf0fdedabd488550b6db90167a30c582949d608d
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4231: fix codegen time regression

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4231: fix codegen time regression
..


Patch Set 5: Code-Review+2

Carry Michael's +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idf0fdedabd488550b6db90167a30c582949d608d
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4231: fix codegen time regression

2016-10-13 Thread Tim Armstrong (Code Review)
Hello Michael Ho,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4623

to look at the new patch set (#4).

Change subject: IMPALA-4231: fix codegen time regression
..

IMPALA-4231: fix codegen time regression

The commit "IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder"
slightly increased codegen time, which caused TPC-H Q2 to sometimes
regress significantly because of races in runtime filter arrival.

This patch attempts to fix the regression by improving codegen time in a
few places.

* Revert to using the old bool/Status return pattern. The regular Status
  return pattern results in significantly more complex IR because it has
  to emit code to copy and free statuses. I spent some time trying to
  convince it to optimise the extra code out, but didn't have much success.
* Remove some code that cannot be specialized from cross-compilation.
* Add noexcept to some functions that are used from the IR to ensure
  exception-handling IR is not emitted. This is less important after the
  first change but still should help produce cleaner IR.

Performance:
I was able to reproduce a regression locally, which is fixed by this
patch. I'm in the process of trying to verify the fix on a cluster.

Change-Id: Idf0fdedabd488550b6db90167a30c582949d608d
---
M be/src/common/status.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/buffered-tuple-stream.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M be/src/util/bloom-filter.cc
M be/src/util/bloom-filter.h
19 files changed, 382 insertions(+), 368 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/4623/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idf0fdedabd488550b6db90167a30c582949d608d
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all 
others
..


Patch Set 14: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/284/

-- 
To view, visit http://gerrit.cloudera.org:8080/4402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3718: Add test cancellation tests for Kudu

2016-10-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3718: Add test_cancellation tests for Kudu
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4700/1/tests/query_test/test_cancellation.py
File tests/query_test/test_cancellation.py:

Line 94:   pytest.skip()
> What happens if we don't skip this? Are those edge cases potentially intere
We'll skip a few of the queries for Kudu, e.g. the one  with 'select 
count(l_returnflag)', but I don't think it's that concerning since these tests 
are going to exercise the cancellation of a kudu table sink and I don't think 
it matters too much if there was an agg in there as well.


-- 
To view, visit http://gerrit.cloudera.org:8080/4700
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icf3d3853e7075991f6d12f125407ebdbe6a287e2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Bump Bzip2 version

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: Bump Bzip2 version
..


Bump Bzip2 version

This picks up the latest toolchain version. The only change is that
some symlinks in the previous version were broken.

Change-Id: I0c5e9ef10984fc8c6840acf285a04e472fc8b304
Reviewed-on: http://gerrit.cloudera.org:8080/4716
Reviewed-by: Alex Behm 
Reviewed-by: Michael Brown 
Tested-by: Internal Jenkins
---
M bin/impala-config.sh
1 file changed, 2 insertions(+), 3 deletions(-)

Approvals:
  Michael Brown: Looks good to me, but someone else must approve
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4716
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0c5e9ef10984fc8c6840acf285a04e472fc8b304
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] Bump Bzip2 version

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: Bump Bzip2 version
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4716
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0c5e9ef10984fc8c6840acf285a04e472fc8b304
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: WIP: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4670/1/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS1, Line 225:   // TODO: instead of adding them here and subtracting errors in 
CheckForErrors,
 :   // would it be better to _set_ the value to the total number 
of written rows
 :   // minus the errors minus the current buffer size?
> Yeah, this is a bit weird as it is. I'd vote to compute this as you describ
Todd: Can we even get the current buffer size?

I was looking at CountBufferedOperations() but that doesn't appear to be what 
we'd want here. (Doc says it applies to MANUAL flush only.)


-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-13 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4678/7/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

PS7, Line 142: volume_ops = '-v ' + volume_ops
I feel like it's not elegant that we have to add another -v here. How about 
changing the template in the line above to '-v {host_path}:{container_path}' 
and removing '-v' from the line 138? This line can then be removed.


PS7, Line 171: ):
It's interesting that the closing brace is on a separate line. Is this some new 
style?


PS7, Line 312: '' 
it's a little confusing here, why are there two single quotes followed by a 
bunch of spaces?


-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4277: allow overriding of Hive/Hadoop versions/locations

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4720

Change subject: IMPALA-4277: allow overriding of Hive/Hadoop versions/locations
..

IMPALA-4277: allow overriding of Hive/Hadoop versions/locations

This is to help with IMPALA-4277 to make it easier to build against
Hadoop/Hive distributions where the directory layout doesn't exactly
match our current CDH dependencies, or where we may want to
temporarily override a version without making a source change.

Change-Id: I7da10e38f9c4309f2d193dc25f14a6ea308c9639
---
M bin/impala-config.sh
M cmake_modules/FindHDFS.cmake
M common/thrift/CMakeLists.txt
3 files changed, 23 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/4720/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4720
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7da10e38f9c4309f2d193dc25f14a6ea308c9639
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4277: bump Hadoop component versions except for Hadoop itself

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4721

Change subject: IMPALA-4277: bump Hadoop component versions except for Hadoop 
itself
..

IMPALA-4277: bump Hadoop component versions except for Hadoop itself

The Hive version bump required some minor changes to deal with renames
and added function arguments. The other components required no changes
to build the frontend against their API (I haven't been able to test
the JNI backend stuff).

IMPALA-4172 is blocking the hadoop version bump.

This builds if I manually put hive_metastore.thrift in the right place
and set SKIP_TOOLCHAIN_BOOTSTRAP=false. Tests can't run since we don't
have the special tarballs available that are used for the test cluster.

I manually constructed the Hive build dependency by taking
hive_metastore.thrift and putting it at
$CDH_COMPONENTS_HOME/hive-2.1.0-cdh6.x-SNAPSHOT/src/metastore/if/hive_metastore.thrift

Change-Id: Ia0be6bbe76d929ceaeb2fa9ac4ebba1820c4dab7
---
M bin/impala-config.sh
M common/thrift/TCLIService.thrift
M common/thrift/cli_service.thrift
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/TableName.java
M fe/src/main/java/org/apache/impala/analysis/TypeDef.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
16 files changed, 38 insertions(+), 40 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/4721/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4721
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia0be6bbe76d929ceaeb2fa9ac4ebba1820c4dab7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4277: temporary hack to avoid compile errors for HDFS block location API

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4722

Change subject: IMPALA-4277: temporary hack to avoid compile errors for HDFS 
block location API
..

IMPALA-4277: temporary hack to avoid compile errors for HDFS block location API

This is a temporary workaround to get Impala building against HDFS 3.0
that can be undone once IMPALA-4172 is committed.

This builds if I put together a directory with the minimum files required to
build the backend against HDFS:

hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libgplcompression.a
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libhdfs.so.0.0.0
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libgplcompression.so
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libgplcompression.so.0.0.0
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libhdfs.so
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libgplcompression.la
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libhdfs.a
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/lib/native/libgplcompression.so.0
hadoop-3.0.0-alpha1-cdh6.x-SNAPSHOT/include/hdfs.h

Change-Id: Ice765b1be62c0c3d1b99282d8c652dd35d3d766d
---
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
3 files changed, 26 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/4722/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4722
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ice765b1be62c0c3d1b99282d8c652dd35d3d766d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4277: bump Hadoop component versions except for Hadoop itself

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change.

Change subject: IMPALA-4277: bump Hadoop component versions except for Hadoop 
itself
..


Abandoned

wrong branch

-- 
To view, visit http://gerrit.cloudera.org:8080/4721
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Ia0be6bbe76d929ceaeb2fa9ac4ebba1820c4dab7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4277: temporary hack to avoid compile errors for HDFS block location API

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change.

Change subject: IMPALA-4277: temporary hack to avoid compile errors for HDFS 
block location API
..


Abandoned

wrong branch

-- 
To view, visit http://gerrit.cloudera.org:8080/4722
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Ice765b1be62c0c3d1b99282d8c652dd35d3d766d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others

2016-10-13 Thread Henry Robinson (Code Review)
Hello Marcel Kornacker, Internal Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4402

to look at the new patch set (#16).

Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all 
others
..

IMPALA-2905: Handle coordinator fragment lifecycle like all others

The plan-root fragment instance that runs on the coordinator should be
handled like all others: started via RPC and run asynchronously. Without
this, the fragment requires special-case code throughout the
coordinator, and does not show up in system metrics etc.

This patch adds a new sink type, PlanRootSink, to the root fragment
instance so that the coordinator can pull row batches that are pushed by
the root instance. The coordinator signals completion to the fragment
instance via closing the consumer side of the sink, whereupon the
instance is free to complete.

Since the root instance now runs asynchronously wrt to the coordinator,
we add several coordination methods to allow the coordinator to wait for
a point in the instance's execution to be hit - e.g. to wait until the
instance has been opened.

Done in this patch:

* Add PlanRootSink
* Add coordination to PFE to allow coordinator to observe lifecycle
* Make FragmentMgr a singleton
* Removed dead code from Coordinator::Wait() and elsewhere.
* Moved result output exprs out of QES and into PlanRootSink.
* Remove special-case limit-based teardown of coordinator fragment, and
  supporting functions in PlanFragmentExecutor.
* Simplified lifecycle of PlanFragmentExecutor by separating Open() into
  OpenPlan() and Exec(), the latter of which drives the sink by reading
  rows from the plan tree.
* Add child profile to PlanFragmentExecutor to measure time spent in
  each lifecycle phase.
* Removed dependency between InitExecProfiles() and starting root fragment.

Not yet done:

* Fix planner tests to reflect new sink added at root.

Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
---
M be/src/exec/CMakeLists.txt
M be/src/exec/data-sink.cc
A be/src/exec/plan-root-sink.cc
A be/src/exec/plan-root-sink.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler.cc
M be/src/service/fragment-exec-state.cc
M be/src/service/fragment-exec-state.h
M be/src/service/fragment-mgr.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
A be/src/service/query-result-set.h
M be/src/testutil/in-process-servers.cc
M common/thrift/DataSinks.thrift
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
A fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/complex-types-file-formats.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/disable-preaggregations.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/distinct-estimate.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mem-limit-broadcast-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-loop-join.test
M testdata/workl

[Impala-ASF-CR] IMPALA-4123: Fast bit unpacking

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#11).

Change subject: IMPALA-4123: Fast bit unpacking
..

IMPALA-4123: Fast bit unpacking

Adds utility functions for fast unpacking of batches of bit-packed
values. These support reading batches of any number of values provided
that the start of the batch is aligned to a byte boundary. Callers that
want to read smaller batches that don't align to byte boundaries will
need to implement their own buffering.

The unpacking code uses only portable C++ and no SIMD intrinsics, but is
fairly efficient because unpacking a full batch of 32 values compiles
down to 32-bit loads, shifts by constants, masks by constants, bitwise
ors when a value straddles 32-bit words and stores. Further speedups
should be possible using SIMD intrinsics.

Testing:
Added unit tests for unpacking, exhaustively covering different
bitwidths with additional test dimensions (memory alignment, various
input sizes, etc).

Tested under ASAN to ensure the bit unpacking doesn't read past the end
of buffers.

Perf:
Added microbenchmark that shows on average an 8-9x speedup over the
existing BitReader code.

Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/bit-packing-benchmark.cc
M be/src/benchmarks/bswap-benchmark.cc
A be/src/testutil/mem-util.h
M be/src/util/CMakeLists.txt
A be/src/util/bit-packing-test.cc
A be/src/util/bit-packing.h
A be/src/util/bit-packing.inline.h
M be/src/util/bit-stream-utils.h
M be/src/util/bit-stream-utils.inline.h
10 files changed, 886 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/4494/11
-- 
To view, visit http://gerrit.cloudera.org:8080/4494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-13 Thread Michael Brown (Code Review)
Hello David Knupp,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4678

to look at the new patch set (#8).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to mount an external Docker volume as a workaround to KUDU-1419.
This patch introduces a series of environment variables a user may tweak
in order to help with that purpose. The patch assumes a viable,
reasonable Docker container based on a standard Linux distribution like
Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need to copy the testdata off the
container onto the host running Docker Engine. To do that we suggest a
strategy using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 209 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/8
-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4123: Fast bit unpacking

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4123: Fast bit unpacking
..


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4494/10/be/src/util/bit-packing-test.cc
File be/src/util/bit-packing-test.cc:

PS10, Line 39: it
> them
Done


PS10, Line 39: Both
> Both buffers
Done


Line 60:   uint8_t* packed = storage.data() + aligned;
> If aligned is true, then packed is not aligned
Done


http://gerrit.cloudera.org:8080/#/c/4494/10/be/src/util/bit-packing.inline.h
File be/src/util/bit-packing.inline.h:

Line 41: #define UNPACK_VALUES_CASE(ignore1, i, ignore2) \
> Here is a way to do this without any macros, but it's a bit heavyweight fro
IMO the macro is easier to understand because the macro bears some relation to 
the code being stamped out. Also you have a better chance of debugging it by 
looking at the preprocessed output. I think the macro makes sense in this case 
since we're really just trying to stamp out repetitive code instead of build an 
abstraction.


-- 
To view, visit http://gerrit.cloudera.org:8080/4494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I12db69409483d208cd4c0f41c27a78aeb6cd3622
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..


Patch Set 8:

(3 comments)

Thanks for the review, Taras. Please see patch set 8.

http://gerrit.cloudera.org:8080/#/c/4678/7/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

PS7, Line 142: art_command += (
> I feel like it's not elegant that we have to add another -v here. How about
Done


PS7, Line 171:   
> It's interesting that the closing brace is on a separate line. Is this some
It satisfies flake8 to switch to this style. If I join this line up to the line 
above and run flake8, I get an error like this:

  impala_docker_env.py:169:9: E125 continuation line with same indent as next 
logical line


PS7, Line 312: '--delete -
> it's a little confusing here, why are there two single quotes followed by a
I'm gonna call line this a Casey-ism: It's so I can align the ssh options under 
the ssh command, not include extra white space in the rendered Fabric command, 
and satisfy flake8

This:

  '   -o UserKnownHostsFile=/dev/null -p {ssh_port}" '

...causes a bunch of white space to exist in the rendered command; it can be 
seen when you are running ps.

And this:

  'rsync -e "ssh [etc]" '
 '-o UserKnownHostsFile [etc]" '

... is seen by flake8 as an over-indented line and produces en error.

This seemed the easiest way to satisfy all of the above.


-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-13 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..


Patch Set 8: Code-Review+1

Carry David's +1

-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] Buffer pool: Add basic counters to buffer pool client

2016-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4714

Change subject: Buffer pool: Add basic counters to buffer pool client
..

Buffer pool: Add basic counters to buffer pool client

Change-Id: I9a5a57b7cfccf67ee498e68964f1e077075ee325
---
A be/src/bufferpool/buffer-pool-counters.h
M be/src/bufferpool/buffer-pool-test.cc
M be/src/bufferpool/buffer-pool.cc
M be/src/bufferpool/buffer-pool.h
M be/src/bufferpool/reservation-tracker-test.cc
5 files changed, 102 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/4714/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4714
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9a5a57b7cfccf67ee498e68964f1e077075ee325
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3943: Address post-merge comments.

2016-10-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3943: Address post-merge comments.
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4696
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I72ccf00191afddb8583ac961f1eaf11e5eb28791
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3739: Enable stress tests on Kudu

2016-10-13 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-3739: Enable stress tests on Kudu
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4327/5/testdata/datasets/tpcds/tpcds_kudu_template.sql
File testdata/datasets/tpcds/tpcds_kudu_template.sql:

PS5, Line 469: 'kudu.table_name'='customer',
> Here's one kudu table called 'customer'. Can this (and others) be expanded 
Done


http://gerrit.cloudera.org:8080/#/c/4327/5/testdata/datasets/tpch/tpch_kudu_template.sql
File testdata/datasets/tpch/tpch_kudu_template.sql:

PS5, Line 185:   'kudu.table_name' = 'customer',
> Here's another kudu table called "customer".
Added "{target_db_name}_" to all the Kudu table names.


-- 
To view, visit http://gerrit.cloudera.org:8080/4327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3739: Enable stress tests on Kudu

2016-10-13 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new patch set (#6).

Change subject: IMPALA-3739: Enable stress tests on Kudu
..

IMPALA-3739: Enable stress tests on Kudu

This commit modifies the stress test framework to run TPC-H and TPC-DS
workloads against Kudu. The follwing changes are included in this
commit:
1. Created template files with DDL and DML statements for loading TPC-H and
   TPC-DS data in Kudu
2. Created a script (load-tpc-kudu.py) to load data in Kudu. The
   script is invoked by the stress test runner to load test data in an
   existing Impala/Kudu cluster (both local and CM-managed clusters are
   supported).
3. Created SQL files with TPC-DS queries to be executed in Kudu. SQL
   files with TPC-H queries for Kudu were added in a previous patch.
4. Modified the stress test runner to take additional parameters
   specific to Kudu (e.g. kudu master addr)

The stress test runner for Kudu was tested on EC2 clusters for both TPC-H
and TPC-DS workloads.

Missing functionality:
* No CRUD operations in the existing TPC-H/TPC-DS workloads for Kudu.
* Not all supported TPC-DS queries are included. Currently, only the
  TPC-DS queries from the testdata/workloads/tpcds/queries directory
  were modified to run against Kudu.

Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
---
A testdata/bin/load-tpc-kudu.py
A testdata/datasets/tpcds/tpcds_kudu_template.sql
A testdata/datasets/tpch/tpch_kudu_template.sql
A testdata/workloads/tpcds/queries/tpcds-kudu-q19.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q27.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q3.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q34.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q42.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q43.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q46.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q47.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q52.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q53.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q55.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q59.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q6.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q61.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q63.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q65.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q68.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q7.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q73.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q79.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q8.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q88.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q89.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q96.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q98.test
M tests/comparison/db_connection.py
M tests/stress/concurrent_select.py
30 files changed, 2,470 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/4327/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4270: Gracefully fail unsupported queries with mt dop > 0.

2016-10-13 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#4).

Change subject: IMPALA-4270: Gracefully fail unsupported queries with mt_dop > 
0.
..

IMPALA-4270: Gracefully fail unsupported queries with mt_dop > 0.

MT_DOP > 0 is only supported for plans without distributed joins
or table sinks. Adds validation to fail unsupported queries
gracefully in planning.

For scans in queries that are executable with MT_DOP > 0 we either
use the optimized MT scan node BE implementation (only Parquet), or
we use the conventional scan node with num_scanner_threads=1.

TODO: Still need to add end-to-end tests.

Change-Id: I91a60ea7b6e3ae4ee44be856615ddd3cd0af476d
---
M be/src/exec/exec-node.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
A 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
10 files changed, 308 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/4677/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4677
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I91a60ea7b6e3ae4ee44be856615ddd3cd0af476d
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-4270: Gracefully fail unsupported queries with mt dop > 0.

2016-10-13 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4270: Gracefully fail unsupported queries with mt_dop > 
0.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4677/1/testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test:

Line 5:  PLAN
> the build registration, etc., isn't expected to get in before code freeze, 
Ahh right. Done.


-- 
To view, visit http://gerrit.cloudera.org:8080/4677
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I91a60ea7b6e3ae4ee44be856615ddd3cd0af476d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

2016-10-13 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
..


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4678/7/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

PS7, Line 312: '--delete -
> I'm gonna call line this a Casey-ism: It's so I can align the ssh options u
Sounds good. I think it would be good to pick a standard and move to it slowly. 
(If flake8 does not like big indents we should stop making them).


-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment

2016-10-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment
..


Patch Set 5: Code-Review+2

Carry +1 into +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4674
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If166a8a286d7559af547da39f6cc09e723f34c7e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Match .clang-format more closely to actual practice.

2016-10-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Match .clang-format more closely to actual practice.
..


Patch Set 3: Code-Review+2

rebase, carry +1 into +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iccaec6c1673c3e08d2c39200b0c84437af629aed
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Match .clang-format more closely to actual practice.

2016-10-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Match .clang-format more closely to actual practice.
..


Patch Set 3: Verified+1

Can't break tests with this change, so +1 verify manually

-- 
To view, visit http://gerrit.cloudera.org:8080/4590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iccaec6c1673c3e08d2c39200b0c84437af629aed
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Match .clang-format more closely to actual practice.

2016-10-13 Thread Jim Apple (Code Review)
Jim Apple has submitted this change and it was merged.

Change subject: Match .clang-format more closely to actual practice.
..


Match .clang-format more closely to actual practice.

In order to attempt to get code like

double VeryLongFunctionNames(double x1, double x2, double x3,
double x4) {
  return 1.0;
}

rather than

double VeryLongFunctionNames(
double x1, double x2, double x3, double x4) {
  return 1.0;
}

I wrote a small set of programs to infer which .clang-format params
fit the current Impala codebase most closely; this patch is the
result.

This patch is the best the inferencer found (while maintaining certain
enforced parameters, like 90-character lines). It is about 10% closer
to Impala's current code base than the .clang-format that is checked
in at the moment, as measured by number of lines in the diff.

Change-Id: Iccaec6c1673c3e08d2c39200b0c84437af629aed
Reviewed-on: http://gerrit.cloudera.org:8080/4590
Reviewed-by: Jim Apple 
Tested-by: Jim Apple 
---
M .clang-format
1 file changed, 35 insertions(+), 14 deletions(-)

Approvals:
  Jim Apple: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iccaec6c1673c3e08d2c39200b0c84437af629aed
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] DO NOT SUBMIT: Example diffs of http://gerrit.cloudera.org/#/c/4590/

2016-10-13 Thread Jim Apple (Code Review)
Jim Apple has abandoned this change.

Change subject: DO NOT SUBMIT: Example diffs of 
http://gerrit.cloudera.org/#/c/4590/
..


Abandoned

http://gerrit.cloudera.org/#/c/4590/ submitted

-- 
To view, visit http://gerrit.cloudera.org:8080/4591
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Ifedb15e9c50ffc2d764a54c168a7956a1adc0328
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] Match .clang-format more closely to actual practice.

2016-10-13 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Match .clang-format more closely to actual practice.
..


Patch Set 4:

When did we decide that the change in the comment was one we wanted? For one, I 
prefer having args on one line where possible.

-- 
To view, visit http://gerrit.cloudera.org:8080/4590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iccaec6c1673c3e08d2c39200b0c84437af629aed
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: WIP: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..


Patch Set 1:

(3 comments)

I'm taking over the patch and will update it soon.

http://gerrit.cloudera.org:8080/#/c/4670/1/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS1, Line 225:   // TODO: instead of adding them here and subtracting errors in 
CheckForErrors,
 :   // would it be better to _set_ the value to the total number 
of written rows
 :   // minus the errors minus the current buffer size?
> Todd: Can we even get the current buffer size?
On second though, I don't think we care about updating this constantly. We can 
just update it once in FinalFlush with total num written - total num errors.


PS1, Line 234:   if (session_->CountPendingErrors() == 0) {
 : return Status::OK();
 :   }
> nit: 1 line
Done


PS1, Line 281:   // TODO: these counters don't make so much sense anymore.
 :   SCOPED_TIMER(kudu_flush_timer_);
 :   COUNTER_ADD(kudu_flush_counter_, 1);
> can you remove them then?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-13 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#3).

Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..

IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink

Improves performance of writes to Kudu.

Testing: Manual cluster testing (which is where the default
mutation buffer value of 100mb was determined).

Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
---
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
2 files changed, 50 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/4670/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-4134: Use AUTO FLUSH BACKGROUND for Kudu sink

2016-10-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: IMPALA-4134: Use AUTO_FLUSH_BACKGROUND for Kudu sink
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4670/3/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

Line 156: DCHECK(sink_action_ == TSinkAction::DELETE) << "Sink type not 
supported: "
spurious change here


Line 163:   SCOPED_TIMER(profile()->total_time_counter());
I wonder if we should still keep some kind of timer which is specifically 
referring to the time spent in Kudu. We can't specifically measure the 
flushing, but we could do something like have a 
vector> which we fill up by looping over the row 
batch, and then in a SCOPED_TIMER(write_time_counter) apply them all? That way 
if we are writing faster than Kudu can absorb the writes, we'll see the 
blocking on the Apply() call register as significant time, and can still 
distinguish this time from time spent in output expr computation.


Line 301:   
(*state->per_partition_status())[ROOT_PARTITION_KEY].num_appended_rows =
if we don't update this to the end, does the query summary page on the web UI 
still show the number of rows written by the sink? (I don't know if that 
counter was the output side or the input side of the sink, actually)


-- 
To view, visit http://gerrit.cloudera.org:8080/4670
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I665ce82d1caf64cf5bdf4bc0e15edbf9f1dc5ba0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Match .clang-format more closely to actual practice.

2016-10-13 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Match .clang-format more closely to actual practice.
..


Patch Set 4:

> When did we decide that the change in the comment was one we
 > wanted? For one, I prefer having args on one line where possible.

The change the commit comment? That was suggested by Marcel to me in a private 
email. I didn't question it, as it seemed to me this matched the way the code 
is currently styled. For better or worse, I couldn't get clang-format to 
actually follow that style, so we're stuck with (or "stuck with") the style you 
prefer.

-- 
To view, visit http://gerrit.cloudera.org:8080/4590
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iccaec6c1673c3e08d2c39200b0c84437af629aed
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4231: fix codegen time regression

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4231: fix codegen time regression
..


Patch Set 5: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idf0fdedabd488550b6db90167a30c582949d608d
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4231: fix codegen time regression

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4231: fix codegen time regression
..


IMPALA-4231: fix codegen time regression

The commit "IMPALA-3567 Part 2, IMPALA-3899: factor out PHJ builder"
slightly increased codegen time, which caused TPC-H Q2 to sometimes
regress significantly because of races in runtime filter arrival.

This patch attempts to fix the regression by improving codegen time in a
few places.

* Revert to using the old bool/Status return pattern. The regular Status
  return pattern results in significantly more complex IR because it has
  to emit code to copy and free statuses. I spent some time trying to
  convince it to optimise the extra code out, but didn't have much success.
* Remove some code that cannot be specialized from cross-compilation.
* Add noexcept to some functions that are used from the IR to ensure
  exception-handling IR is not emitted. This is less important after the
  first change but still should help produce cleaner IR.

Performance:
I was able to reproduce a regression locally, which is fixed by this
patch. I'm in the process of trying to verify the fix on a cluster.

Change-Id: Idf0fdedabd488550b6db90167a30c582949d608d
Reviewed-on: http://gerrit.cloudera.org:8080/4623
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/common/status.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
M be/src/exec/partitioned-hash-join-node-ir.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/buffered-tuple-stream.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M be/src/util/bloom-filter.cc
M be/src/util/bloom-filter.h
19 files changed, 382 insertions(+), 368 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Idf0fdedabd488550b6db90167a30c582949d608d
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4291: Reduce LLVM module's preparation time

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4291: Reduce LLVM module's preparation time
..


IMPALA-4291: Reduce LLVM module's preparation time

Previously, when creating a LlvmCodeGen object, we
run an O(mn) algorithm to map the IRFunction::Type
to the actual LLVM::Function object in the module.
m is the size of IRFunction::Type enum and n is
the total number of functions in the module. This
is a waste of time if we only use few functions
from the module.

This change reduces the preparation time of a simple
query from 23ms to 10ms.

select count(*) from tpch100_parquet.lineitem where l_orderkey > 20;

Change-Id: I61ab9fa8cca5a0909bb716c3c62819da3e3b3041
Reviewed-on: http://gerrit.cloudera.org:8080/4691
Reviewed-by: Michael Ho 
Tested-by: Internal Jenkins
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
2 files changed, 175 insertions(+), 123 deletions(-)

Approvals:
  Michael Ho: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4691
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I61ab9fa8cca5a0909bb716c3c62819da3e3b3041
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4291: Reduce LLVM module's preparation time

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4291: Reduce LLVM module's preparation time
..


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4691
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I61ab9fa8cca5a0909bb716c3c62819da3e3b3041
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all 
others
..


Patch Set 16: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/287/

-- 
To view, visit http://gerrit.cloudera.org:8080/4402
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4288: Separate conjunct registration from analysis.

2016-10-13 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#7).

Change subject: IMPALA-4288: Separate conjunct registration from analysis.
..

IMPALA-4288: Separate conjunct registration from analysis.

Our existing analyze() of statemens registers information
with the Analyzer needed for predicate assignment in
plan generation. This flow makes it difficult to rewrite
expressions after they have been analyzed because we'd
need to update all registered analysis state, too.

This change makes the registration of state needed for
predicate assignment a separate phase which may be nvoked
after analysis. This is a first step to introducing an
expression rewrite phase after analysis but before
conjunct registration.

Change-Id: I3c4a49f81bbcae999dda32fefbf34a0c9e032793
---
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StatementBase.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
19 files changed, 209 insertions(+), 137 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4718/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3c4a49f81bbcae999dda32fefbf34a0c9e032793
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-3943: Address post-merge comments.

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3943: Address post-merge comments.
..


IMPALA-3943: Address post-merge comments.

Adds code comments and issues a warning for Parquet files
with num_rows=0 but at least one non-empty row group.

Change-Id: I72ccf00191afddb8583ac961f1eaf11e5eb28791
Reviewed-on: http://gerrit.cloudera.org:8080/4696
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M be/src/exec/hdfs-parquet-scanner.cc
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/parquet-zero-rows.test
3 files changed, 29 insertions(+), 2 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4696
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I72ccf00191afddb8583ac961f1eaf11e5eb28791
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3943: Address post-merge comments.

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3943: Address post-merge comments.
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4696
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I72ccf00191afddb8583ac961f1eaf11e5eb28791
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment
..


Patch Set 5: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4674
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If166a8a286d7559af547da39f6cc09e723f34c7e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment

2016-10-13 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment
..


IMPALA-3971, IMPALA-3229: Bootstrap an Impala dev environment

This script bootstraps an Impala dev environment on Ubuntu 14.04. It
is not hermetic -- it changes some config files for the user and for
the OS.

It is green on Jenkins, and it runs in about 6.5 hours. The intention
is to have this script run in a CI tool for post-commit testing, with
the hope that this will make it easier for new developers to get a
working development environment. Previously, the new developer
workflow lived on wiki pages and tended to bit-rot.

Still left to do: migrating the install script into the official
Impala repo.

Change-Id: If166a8a286d7559af547da39f6cc09e723f34c7e
Reviewed-on: http://gerrit.cloudera.org:8080/4674
Reviewed-by: Jim Apple 
Tested-by: Internal Jenkins
---
A bin/bootstrap_development.sh
1 file changed, 80 insertions(+), 0 deletions(-)

Approvals:
  Jim Apple: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4674
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: If166a8a286d7559af547da39f6cc09e723f34c7e
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others

2016-10-13 Thread Henry Robinson (Code Review)
Hello Marcel Kornacker, Internal Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4402

to look at the new patch set (#17).

Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all 
others
..

IMPALA-2905: Handle coordinator fragment lifecycle like all others

The plan-root fragment instance that runs on the coordinator should be
handled like all others: started via RPC and run asynchronously. Without
this, the fragment requires special-case code throughout the
coordinator, and does not show up in system metrics etc.

This patch adds a new sink type, PlanRootSink, to the root fragment
instance so that the coordinator can pull row batches that are pushed by
the root instance. The coordinator signals completion to the fragment
instance via closing the consumer side of the sink, whereupon the
instance is free to complete.

Since the root instance now runs asynchronously wrt to the coordinator,
we add several coordination methods to allow the coordinator to wait for
a point in the instance's execution to be hit - e.g. to wait until the
instance has been opened.

Done in this patch:

* Add PlanRootSink
* Add coordination to PFE to allow coordinator to observe lifecycle
* Make FragmentMgr a singleton
* Removed dead code from Coordinator::Wait() and elsewhere.
* Moved result output exprs out of QES and into PlanRootSink.
* Remove special-case limit-based teardown of coordinator fragment, and
  supporting functions in PlanFragmentExecutor.
* Simplified lifecycle of PlanFragmentExecutor by separating Open() into
  Open() and Exec(), the latter of which drives the sink by reading
  rows from the plan tree.
* Add child profile to PlanFragmentExecutor to measure time spent in
  each lifecycle phase.
* Removed dependency between InitExecProfiles() and starting root
  fragment.
* Removed mostly dead-code handling of LIMIT 0 queries.
* Ensured that SET returns a result set in all cases.

Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
---
M be/src/exec/CMakeLists.txt
M be/src/exec/data-sink.cc
A be/src/exec/plan-root-sink.cc
A be/src/exec/plan-root-sink.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler.cc
M be/src/service/fragment-exec-state.cc
M be/src/service/fragment-exec-state.h
M be/src/service/fragment-mgr.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
A be/src/service/query-result-set.h
M be/src/testutil/in-process-servers.cc
M common/thrift/DataSinks.thrift
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
A fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/complex-types-file-formats.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/disable-preaggregations.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/distinct-estimate.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mem-limit-broadcast-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/n