[Impala-ASF-CR] IMPALA-4335: Don't send 0-row batches to clients

2016-10-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4335: Don't send 0-row batches to clients
..


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4787
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4335: Don't send 0-row batches to clients

2016-10-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4335: Don't send 0-row batches to clients
..


IMPALA-4335: Don't send 0-row batches to clients

This patch restores some behaviour from pre-IMPALA-2905 where we would
not send 0-row batches to the client. Although 0-row batches are legal,
they're not very useful for clients to receive (and clients may not
correctly process them).

No query was found which reliably produced 0-row batches, so no test is
added.

Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Reviewed-on: http://gerrit.cloudera.org:8080/4787
Reviewed-by: Henry Robinson 
Tested-by: Internal Jenkins
---
M be/src/exec/plan-root-sink.cc
M be/src/runtime/coordinator.h
2 files changed, 6 insertions(+), 3 deletions(-)

Approvals:
  Henry Robinson: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4787
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT DOP > 0.

2016-10-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT_DOP 
> 0.
..


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4767
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79c0f6fd2aeb4bc6fa5f87219a485194fef2db1b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4794/1/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

Line 586:   DCHECK_EQ(num_bytes & (num_bytes - 1), 0);
Could you factor this into BitUtil::IsPowerOfTwo() or something along those 
lines. That would be generally useful - e.g. I know some cases in the buffer 
pool code where I'd use it.


Line 591: return ConstantInt::get(context(), APInt(128, vals));
CodegenAnyVal::SetVal(int128_t) should use this approach. There is a TODO in 
there.


http://gerrit.cloudera.org:8080/#/c/4794/1/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

Line 387:   /// than 64-bit), an 128-bit value is formed by setting its upper 
and lower 64-bit set
This behaviour of copying the value into the upper and lower bits is pretty 
confusing - I would expect it to return an int128_t with all the upper bits 0. 
E.g. GetIntConstant(byte_size, 1) should always return 1 regardless of the 
byte_size. Maybe we should have a separate GetInt128Constant(int bytes_size, 
int64_t upper, int64_t lower) method that does this so the behaviour is more 
"obvious".


http://gerrit.cloudera.org:8080/#/c/4794/1/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

Line 597: // fall through to generate 128-bit 'fnv_seed'
May as well just call codegen->GetIntConstant() here, or 
codegen->GetInt128Constant() here. It only saves a line of code.


PS1, Line 608: fnv_seed_float
Thank you! This bugged me when reading the code before. Optional, but you could 
do the search-and-replace too in old-hash-table.cc


http://gerrit.cloudera.org:8080/#/c/4794/1/testdata/workloads/functional-query/queries/QueryTest/joins.test
File testdata/workloads/functional-query/queries/QueryTest/joins.test:

Line 723:  QUERY
I have a test in 
https://gerrit.cloudera.org/#/c/4655/6/tests/query_test/test_aggregation.py 
that disables the check if codegen is enabled for timestamps. Depending on 
which patch lands first we should make sure that's reenabled.


-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4295: XFAIL wildcard SSL test

2016-10-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4295: XFAIL wildcard SSL test
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie809c6c6c967447d527927ebbc6b110095e7320a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4295: XFAIL wildcard SSL test

2016-10-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4295: XFAIL wildcard SSL test
..


IMPALA-4295: XFAIL wildcard SSL test

commit 9f61397fc4d638aa78b37db2cd5b9c35b6deed94 exposed a bug (one
that was latent before the commit). I am XFAILing this now just to
green the build; IMPALA-4295 can be resolved when this issue is fixed
and not just XFAILed.

Change-Id: Ie809c6c6c967447d527927ebbc6b110095e7320a
Reviewed-on: http://gerrit.cloudera.org:8080/4784
Reviewed-by: Jim Apple 
Tested-by: Internal Jenkins
---
M tests/custom_cluster/test_client_ssl.py
1 file changed, 1 insertion(+), 0 deletions(-)

Approvals:
  Jim Apple: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie809c6c6c967447d527927ebbc6b110095e7320a
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3884: Support TYPE TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

2016-10-21 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4794

Change subject: IMPALA-3884: Support TYPE_TIMESTAMP for 
HashTableCtx::CodegenAssignNullValue()
..

IMPALA-3884: Support TYPE_TIMESTAMP for HashTableCtx::CodegenAssignNullValue()

This change implements support for TYPE_TIMESTAMP for
HashTableCtx::CodegenAssignNullValue(). TimestampValue itself
is 16 bytes in size. To match RawValue::Write() in the
interpreted path, CodegenAssignNullValue() emits code to assign
HashUtil::FNV_SEED to both the upper and lower 64-bit of the
destination value. This change also fixes the handling of 128-bit
Decimal16Value in CodegenAssignNullValue() so the emitted code
matches the behavior of the interpreted path.

Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
---
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/hash-table.cc
M testdata/workloads/functional-query/queries/QueryTest/joins.test
4 files changed, 44 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/4794/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0211d38cbef46331e0006fa5ed0680e6e0867bc8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] IMPALA-4241: remove spurious child queries event

2016-10-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4241: remove spurious child queries event
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3881d032622750444d750f161ad6843bdbd16c30
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4241: remove spurious child queries event

2016-10-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4241: remove spurious child queries event
..


IMPALA-4241: remove spurious child queries event

"IMPALA-4037,IMPALA-4038: fix locking during query
cancellation" accidentally added the "Child queries
finished" event unconditionally. We should only do
this if there are actually child queries.

Change-Id: I3881d032622750444d750f161ad6843bdbd16c30
Reviewed-on: http://gerrit.cloudera.org:8080/4768
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/service/query-exec-state.cc
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3881d032622750444d750f161ad6843bdbd16c30
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function

2016-10-21 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function
..


Patch Set 4: Code-Review+2

Carry Dan's +2 forward.

-- 
To view, visit http://gerrit.cloudera.org:8080/4740
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function

2016-10-21 Thread Michael Ho (Code Review)
Hello Matthew Jacobs, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4740

to look at the new patch set (#4).

Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function
..

IMPALA-4120: Incorrect results with LEAD() analytic function

This change fixes a memory management problem with LEAD()/LAG()
analytic functions which led to incorrect result. In particular,
the update functions specified for these analytic functions only
make a shallow copy of StringVal (i.e. copying only the pointer
and the length of the string) without copying the string itself.
This may lead to problem if the string is created from some UDFs
which do local allocations whose buffer may be freed and reused
before the result tuple is copied out. This change fixes the problem
above by allocating a buffer at the Init() functions of these
analytic functions to track the intermediate value. In addition,
when the value is copied out in GetValue(), it will be copied into
the MemPool belonging to the AnalyticEvalNode and attached to the
outgoing row batches. This change also fixes a missing free of
local allocations in QueryMaintenance().

Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f
---
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exprs/agg-fn-evaluator.h
M be/src/exprs/aggregate-functions-ir.cc
M be/src/udf/udf.cc
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
7 files changed, 120 insertions(+), 55 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/4740/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4740
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function

2016-10-21 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4740/3/be/src/exec/analytic-eval-node.cc
File be/src/exec/analytic-eval-node.cc:

Line 403:   }
> so this bug was similar to IMPALA-3311 (see PartitionedAggregationNode::Han
Yes. We need some refactoring to share the code. Probably not worth it.


Line 420:  (next_partition || (fn_scope_ == RANGE && 
window_.__isset.window_end &&
> this indentation seems off. what does clang-format say?
Tried with clang-format but still looks a bit odd. Changed the indentation to 
look closer to the existing code.


http://gerrit.cloudera.org:8080/#/c/4740/3/be/src/exprs/agg-fn-evaluator.h
File be/src/exprs/agg-fn-evaluator.h:

PS3, Line 153: beyond the scope of the call site
> Maybe say "The memory is a local allocation and so needs to be copied out b
Incorporated comments from Dan and Matt.


PS3, Line 153: beyond the scope of the call site
> I'm still not sure this is precise enough... this sounds like it's scoped a
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4740
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Enabling end-to-end tests on a remote cluster

2016-10-21 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#3).

Change subject: Enabling end-to-end tests on a remote cluster
..

Enabling end-to-end tests on a remote cluster

This patch enables data loading and running end-to-end tests on a remote
cluster. The requirements to run the tests on a remote cluster are

  - CDH cluster that is CM managed
  - KMS and KeyTrustee installed and available as service
  - Hive warehouse dir points to /test-warehouse

The new remote_load_data.py script takes a CM host as argument and will
load the test warehouse snapshot on the first cluster managed by this
instance of CM. It will automatically pick the necessary configuration
needed to perform the data load process.

Usage: remote_data_load.py [options] cm_host

Options:
  -h, --helpshow this help message and exit
  --cm-user=CM_USER Cloudera Manager admin user
  --cm-pass=CM_PASS Cloudera Manager admin user password
  --gateway=GATEWAY Gateway host to upload the data from. If not set, uses
the CM host as gateway.
  --ssh-user=SSH_USER   System user on the remote machine with passwordless
SSH configured.
  --no-load Do not try to load the snapshot
  --exploration-strategy=EXPLORATION_STRATEGY
  --testRun end-to-end tests against cluster

Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
---
M bin/load-data.py
A bin/remote_data_load.py
M testdata/bin/compute-table-stats.sh
M testdata/bin/create-load-data.sh
M testdata/bin/create-table-many-blocks.sh
M testdata/bin/generate-schema-statements.py
M testdata/bin/load-test-warehouse-snapshot.sh
M testdata/bin/load_nested.py
M testdata/bin/run-step.sh
M testdata/bin/setup-hdfs-env.sh
10 files changed, 574 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/4769/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Martin Grund 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4335: Don't send 0-row batches to clients

2016-10-21 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4335: Don't send 0-row batches to clients
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4787
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4335: Don't send 0-row batches to clients

2016-10-21 Thread Henry Robinson (Code Review)
Hello Matthew Jacobs, Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4787

to look at the new patch set (#4).

Change subject: IMPALA-4335: Don't send 0-row batches to clients
..

IMPALA-4335: Don't send 0-row batches to clients

This patch restores some behaviour from pre-IMPALA-2905 where we would
not send 0-row batches to the client. Although 0-row batches are legal,
they're not very useful for clients to receive (and clients may not
correctly process them).

No query was found which reliably produced 0-row batches, so no test is
added.

Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
---
M be/src/exec/plan-root-sink.cc
M be/src/runtime/coordinator.h
2 files changed, 6 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/4787/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4787
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4335: Don't send 0-row batches to sink

2016-10-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4335: Don't send 0-row batches to sink
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4787/3//COMMIT_MSG
Commit Message:

Line 7: IMPALA-4335: Don't send 0-row batches to sink
update for the new fix


-- 
To view, visit http://gerrit.cloudera.org:8080/4787
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4335: Don't send 0-row batches to sink

2016-10-21 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4335: Don't send 0-row batches to sink
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4787/2/be/src/exec/plan-root-sink.cc
File be/src/exec/plan-root-sink.cc:

Line 88:   // Don't enter the loop if batch->num_rows() == 0; no point 
triggering the consumer with
> This sounds rather benign. Mention that some clients may not handle empty b
It's benign, but it's true: we don't want to trigger the consumer (why pay a 
context switch to do no work and force another RPC?).


-- 
To view, visit http://gerrit.cloudera.org:8080/4787
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4335: Don't send 0-row batches to sink

2016-10-21 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4335: Don't send 0-row batches to sink
..


Patch Set 2: Code-Review+2

Thanks. I verified it fixed case w/ the KuduScanNode.

-- 
To view, visit http://gerrit.cloudera.org:8080/4787
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches

2016-10-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches
..


Patch Set 9:

(7 comments)

Thanks, the renaming helps a lot.

My feeling is that the separation of attaching resources and flushing resources 
would be clearest if the exec node just used MarkFlushResources() directly 
rather than threading through the parameter.  But if you and Alex feel 
otherwise, I'm okay with leaving it this way.

http://gerrit.cloudera.org:8080/#/c/4448/9/be/src/runtime/buffered-tuple-stream.cc
File be/src/runtime/buffered-tuple-stream.cc:

Line 653: // the caller can pass the rows up the operator tree.
let's make this comment actually be useful.  how about something like:

No more data in this block.  The batch must be immediately returned up the 
operator tree and deep copied so that NextReadBlock() can reuse the read 
block's buffer.


http://gerrit.cloudera.org:8080/#/c/4448/9/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

PS9, Line 77: Different modes for flushing resources
these are really different modes for flushing resources.  isn't it just a flag 
to indicate whether resources need to be flushed or not?


PS9, Line 227: as soon as possible
this is pretty vague.  how about saying what the rule is specifically:

Used by an operator to indicate that it cannot produce more rows until the 
resources that it has attached to the row batch are freed or acquired by an 
ancestor operator.


PS9, Line 242:  
(that have not been attached to the batch)


PS9, Line 243: .
and deep copied.

(to differentiate from flush where the resources can be acquired).


PS9, Line 244: This is used to control memory management for streaming rows
this is kind of misleading and confusing now especially since "flush" controls 
this.


Line 246:   /// are not allowed to accumulate batches with the 
'needs_deep_copy' flag.
how about saying this is deprecated?


-- 
To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2521: Add clustered hint to insert statements

2016-10-21 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#7).

Change subject: IMPALA-2521: Add clustered hint to insert statements
..

IMPALA-2521: Add clustered hint to insert statements

This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.

Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
---
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
8 files changed, 295 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4745/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4745
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-4335: Don't send 0-row batches to sink

2016-10-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4335: Don't send 0-row batches to sink
..


Patch Set 2: Code-Review+1

(1 comment)

I suspect that whatever test we add may become stale pretty easily. The empty 
batch behavior can change with various parameters that we're likely to tune.

http://gerrit.cloudera.org:8080/#/c/4787/2/be/src/exec/plan-root-sink.cc
File be/src/exec/plan-root-sink.cc:

Line 88:   // Don't enter the loop if batch->num_rows() == 0; no point 
triggering the consumer with
This sounds rather benign. Mention that some clients may not handle empty 
batches correctly, so we cannot return such batches.


-- 
To view, visit http://gerrit.cloudera.org:8080/4787
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3718: Add test cancellation tests for Kudu

2016-10-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3718: Add test_cancellation tests for Kudu
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4700
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icf3d3853e7075991f6d12f125407ebdbe6a287e2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3718: Add test cancellation tests for Kudu

2016-10-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3718: Add test_cancellation tests for Kudu
..


IMPALA-3718: Add test_cancellation tests for Kudu

Additional functional tests for Kudu.

Change-Id: Icf3d3853e7075991f6d12f125407ebdbe6a287e2
Reviewed-on: http://gerrit.cloudera.org:8080/4700
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M tests/query_test/test_cancellation.py
1 file changed, 35 insertions(+), 14 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4700
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Icf3d3853e7075991f6d12f125407ebdbe6a287e2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-10-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..


Patch Set 11: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4047/11/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

Line 174: }
else DCHECK that the referenced columns are empty?


-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4335: Don't send 0-row batches to sink

2016-10-21 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4335: Don't send 0-row batches to sink
..


Patch Set 1: Code-Review+1

(1 comment)

Thanks. Not sure why it doesn't repro for you, Michael Brown repro'd this on 
his machine and I was able to repro it. I'll test this on my machine.

http://gerrit.cloudera.org:8080/#/c/4787/1//COMMIT_MSG
Commit Message:

PS1, Line 9: patchr
patch


-- 
To view, visit http://gerrit.cloudera.org:8080/4787
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4335: Don't send 0-row batches to sink

2016-10-21 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4787

Change subject: IMPALA-4335: Don't send 0-row batches to sink
..

IMPALA-4335: Don't send 0-row batches to sink

This patchr restores some behaviour from pre-IMPALA-2905 where the
PlanFragmentExecutor would not send 0-row batches to the sink. Although
0-row batches are legal, they're not very useful for clients to
receive (and clients may not correctly process them).

No query was found which reliably produced 0-row batches, so no test is
added.

Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
---
M be/src/runtime/coordinator.h
M be/src/runtime/plan-fragment-executor.cc
2 files changed, 3 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/4787/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4787
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7d339c1f9a55d9d75fb0e97d16b3176cc34f2171
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] Minor fixes to remove more "cloudera"s from the code.

2016-10-21 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Minor fixes to remove more "cloudera"s from the code.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4774/1/tests/benchmark/report_benchmark_results.py
File tests/benchmark/report_benchmark_results.py:

Line 249:   url = 
'https://api.github.com/repos/apache/incubator-impala/commits/' + commit_sha
> I tried opening this URL in my browser and it did not work:
I don't think that commit is in the upstream repo. Try this:

https://api.github.com/repos/apache/incubator-impala/commits/bf1d9677fc09dabc88ab6c71f82ec99ec59ec164

github lags behind the official repo.

I don't know of any other tooling which needs to be updated at the moment.


-- 
To view, visit http://gerrit.cloudera.org:8080/4774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I27687a3b677def72f1867df54c8e50c93d5cab3a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-10-21 Thread Thomas Tauber-Marshall (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4047

to look at the new patch set (#10).

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..

IMPALA-3725 Support Kudu UPSERT in Impala

This patch introduces a new query statement, UPSERT, for Kudu
tables which operates like an INSERT and uses all of the analysis,
planning, and execution machinery as INSERT, except that if
there's a primary key collision instead of returning an error an
update is performed.

New syntax:
[with_clause] UPSERT INTO [TABLE] table_name [(column list)]
{
  query_stmt
 | VALUES (value [, value...]) [, (value [, (value...)]) ...]
}

where column list must contain all of the key columns in
table_name, if specified, and table_name must be a Kudu table.

This patch also improves the behavior of INSERTing into Kudu
tables without specifying all of the key columns - this now
results in an analysis exception, rather than attempting the
INSERT and receiving an error back from Kudu.

Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
---
M be/src/exec/kudu-table-sink.cc
M common/thrift/DataSinks.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
A fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
17 files changed, 651 insertions(+), 85 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/4047/10
-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-3725 Support Kudu UPSERT in Impala

2016-10-21 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3725 Support Kudu UPSERT in Impala
..


Patch Set 10:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4047/9/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 121:   // If this is an INSERT, will contain one Expr for all 
non-partition columns of the
> ... will contain one Expr for all non-partition columns of the target table
Done


Line 124:   // If this is an UPSERT, will contain one Expr per column mentioned 
in the query and
> will contain one Expr per mentioned column
Done


Line 128: 
> As discussed, please clarify the "if (value == NULL)" part in the BE kudu-t
Done


Line 131:   // Only used for UPSERT, set in prepareExpressions().
> mentionedUpsertColumns_?
Done


Line 664: ArrayList columns = table_.getColumnsInHiveOrder();
> ++col
Done


Line 670:   resultExprs_.add(selectListExprs.get(i));
> single line
Done


Line 754: return TableSink.create(table_, isUpsert_ ? TableSink.Op.UPSERT : 
TableSink.Op.INSERT,
> easier to add a Preconditions.checkState(isUpsert_ || referencedColumns_.is
Done


http://gerrit.cloudera.org:8080/#/c/4047/9/testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_crud.test:

Line 307: select false as valb, 'he' as name, id from tdata where id < 2
> let's also add an upsert test to see what happens when a PK column is NULL
It results in the query getting aborted, which as far as I can tell is not 
something that our query test framework can currently handle.


-- 
To view, visit http://gerrit.cloudera.org:8080/4047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

2016-10-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting 
classes.
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

PS4, Line 378: rewriter_.reset();
 : analysisResult_.stmt_.rewriteExprs(rewriter_);
 : reAnalyze = rewriter_.changed();
> I understand what you're saying. However, as is now, you have a class that 
What's the problem with having the entity performing the change recording 
whether it applied a change? It seems to know best.

Visitor pattern is fine if you have a tree, but our Stmts and Exprs are not all 
ParseNodes.

We discussed making everything ParseNodes, but that is a much more invasive 
change.


-- 
To view, visit http://gerrit.cloudera.org:8080/4746
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4339: ensure coredumps end up in IMPALA HOME

2016-10-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4785

Change subject: IMPALA-4339: ensure coredumps end up in IMPALA_HOME
..

IMPALA-4339: ensure coredumps end up in IMPALA_HOME

Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e
---
M buildall.sh
1 file changed, 3 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/4785/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4785
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibc34d152139653374f940dc3edbca08e749bf55e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4295: XFAIL wildcard SSL test

2016-10-21 Thread Jim Apple (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4784

to look at the new patch set (#2).

Change subject: IMPALA-4295: XFAIL wildcard SSL test
..

IMPALA-4295: XFAIL wildcard SSL test

commit 9f61397fc4d638aa78b37db2cd5b9c35b6deed94 exposed a bug (one
that was latent before the commit). I am XFAILing this now just to
green the build; IMPALA-4295 can be resolved when this issue is fixed
and not just XFAILed.

Change-Id: Ie809c6c6c967447d527927ebbc6b110095e7320a
---
M tests/custom_cluster/test_client_ssl.py
1 file changed, 1 insertion(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/4784/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie809c6c6c967447d527927ebbc6b110095e7320a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

2016-10-21 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting 
classes.
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

PS4, Line 378: rewriter_.reset();
 : analysisResult_.stmt_.rewriteExprs(rewriter_);
 : reAnalyze = rewriter_.changed();
> Don't really agree.
I understand what you're saying. However, as is now, you have a class that 
keeps track of modifications to other classes it acts upon. So now you have to 
make sure that you reset the state of the rewriter before you apply it to a 
different entity. Re #2, I believe there is a way to have each class track 
modifications to its own state but it will probably require some kind of 
visitor pattern. Let me think about it and maybe I can suggest something more 
concrete.


http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

PS4, Line 60: rewrite
> Sorry, missed this.
ha, correct. I was thinking of the action. But rewriter is not ambiguous :)


-- 
To view, visit http://gerrit.cloudera.org:8080/4746
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-21 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..


Patch Set 3:

(9 comments)

even w/o the new section, how does compute stats behave?

http://gerrit.cloudera.org:8080/#/c/4756/3/be/src/scheduling/query-schedule.cc
File be/src/scheduling/query-schedule.cc:

Line 56: request_pool_(""),
was this intentional? i don't see why this is necessary


http://gerrit.cloudera.org:8080/#/c/4756/3/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 900: // TODO-MT: call AdmitQuery()
call AdmitQuery() rather than always setting is_admitted


http://gerrit.cloudera.org:8080/#/c/4756/3/be/src/scheduling/simple-scheduler.h
File be/src/scheduling/simple-scheduler.h:

Line 92:   /// If Schedule() returns OK, schedule::is_admitted will be true.
Sorry, on second thought this doesn't add much. There's a comment in the base 
class where this information might be better suited, but I can't think of a way 
to phrase this right now that I think makes it worthwhile.


http://gerrit.cloudera.org:8080/#/c/4756/3/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

PS3, Line 158:  and had the pool set yet.
, not had the pool set yet, or this is StmtType doesn't go through admission 
control.


http://gerrit.cloudera.org:8080/#/c/4756/3/tests/common/impala_service.py
File tests/common/impala_service.py:

PS3, Line 83:   def get_webpage_json(self, page_name):
: """Returns the json for the given Impala debug webpage, eg. 
'/queries'"""
: return json.loads(self.read_debug_webpage(page_name + 
"?json"))
can you move this up to be under read_debug_webpage and call it 
get_debug_webpage_json?


http://gerrit.cloudera.org:8080/#/c/4756/3/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

PS3, Line 603: the 'admitted' and 'queued' states
outdated


PS3, Line 658:  metrics from the webpage
counts from the queries page


Line 663: 
would be nice to add a fn that gets the queries page json and checks that all 
queries (both in flight and completed) either:

a) have query type QUERY or DML and the pool is self.pool_name
OR
b) has an empty pool name (I think it'd be DDL or SET)

I think the results should be stable to check here (they're not making state 
transitions)


Line 708: 
the function to check the pool names could be called here too


-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4295: 9f61397fc4d638aa78b37db2cd5b9c35b6deed94 exposed a bug

2016-10-21 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4295: 9f61397fc4d638aa78b37db2cd5b9c35b6deed94 exposed a 
bug
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4784/1//COMMIT_MSG
Commit Message:

PS1, Line 7: 9f61397fc4d638aa78b37db2cd5b9c35b6deed94
Prefer "xfail wildcard SSL test", and add this commentary to the commit message.


-- 
To view, visit http://gerrit.cloudera.org:8080/4784
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie809c6c6c967447d527927ebbc6b110095e7320a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches

2016-10-21 Thread Tim Armstrong (Code Review)
Hello Internal Jenkins, Alex Behm, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4448

to look at the new patch set (#9).

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches
..

IMPALA-4023: don't attach buffered tuple streams to batches

This simplifies the memory transfer model by eliminating one category of
resources that can be attached.

This patch also separates the concepts of attaching resources and
flushing resources. Previously RowBatch::AddTupleStream() implicitly
flushed resources from the ExecNode pipeline, which various ExecNodes
relied on to free up memory reservations for subsequent processing. In a
subsequent patch I want the FlushResources() API to become stronger: it
will force streaming ExecNodes to flush their batches or forces
blocking ExecNodes to acquire ownership of the memory resources.
We can't do this right now since we don't have a way to transfer
ownership of BufferedBlockMgr Blocks.

Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
---
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/nested-loop-join-builder.cc
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/row-batch-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
16 files changed, 230 insertions(+), 153 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4448/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4023: don't attach buffered tuple streams to batches

2016-10-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4448/8//COMMIT_MSG
Commit Message:

Did the need_to_return/needs_deep_copy rename


http://gerrit.cloudera.org:8080/#/c/4448/8/be/src/runtime/buffered-tuple-stream.cc
File be/src/runtime/buffered-tuple-stream.cc:

Line 196:   batch->AddBlock(block, flush);
> when would we ever want to have batch != NULL and no flush?  Shouldn't we a
I don't think the BufferedTupleStream abstraction show have to know anything 
about the buffer management algorithm in the ExecNode implementations. Always 
having Close() flush assumes that that's what the ExecNode wants, but I don't 
think that would always be the case. E.g. it could make sense to have an 
ExecNode that flushes only when it needs more memory rather than every time it 
calls a particular BufferedTupleStream method.

This is also to make it explicit in the calling code that a flush is occurring 
(since it's normally an important part of the buffer management algorithm and 
it's pretty obscure if it's hidden in a Close() function).

We also have callsites where we call Close(batch, FlushMode::FLUSH_RESOURCES) 
where batch is sometimes NULL. E.g. Partition::Close(RowBatch* batch). In the 
case it makes sense to ignore just the flag if the batch is NULL.


http://gerrit.cloudera.org:8080/#/c/4448/8/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

Line 828:   output_batch->MarkFlushResources();
> if we need to expose MarkFlushResources() too, how about not also having th
I added the flag mainly as a way to force the caller to think about whether 
they should be flushing resources or not. Happy to remove it if you think it's 
unnecessary.


-- 
To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a27651bc2e8ff
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

2016-10-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting 
classes.
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4746/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

PS4, Line 60: rewrite
> Maybe "rewriter"? Seeing a verb here is kind of weird :)
Sorry, missed this.
'rewrite' is a noun here, similar to the 'analysis' or 'service' package

Doesn't really matter to me, I can call it rewriter.


-- 
To view, visit http://gerrit.cloudera.org:8080/4746
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4134,IMPALA-3704: Kudu INSERT improvements

2016-10-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
..


Patch Set 6:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4728/6/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS6, Line 145: .
why?  (that's the more interesting part for the code reader since the code 
tells us the "what" part as well). 

also, what's the benefit of the user specifying a larger buffer if we're not 
going to fill it?


Line 282:  << "previous write operation might be 
inconsistent.\n";
can't the write operation be inconsistent even if there was no overflow? i.e. 
can't it be inconsistent if there was a single error?


Line 298: error_msg_buffer << "Kudu error(s) reported, first error: " 
<< e.ToString();
why not just create the Status here since we no longer will append to it later? 
and you can just use status.ok() and get rid of first_error/failed.


PS6, Line 330: kudu_error_counter_->value()
this doesn't really work in case of GetPendingErrors() returning overflow, 
right? are we okay with ignoring that case?


http://gerrit.cloudera.org:8080/#/c/4728/6/be/src/exec/kudu-table-sink.h
File be/src/exec/kudu-table-sink.h:

PS6, Line 48: due to a 
can't it return an error for other reasons?


PS6, Line 49: be reported as warnings.
isn't the first Kudu error also reported as the error status?


PS6, Line 113: if the Kudu client may be
 :   /// getting behind.
what does this mean? does the code use it, or is this saying a user could use 
it?  and how can that be determined from this timer?


http://gerrit.cloudera.org:8080/#/c/4728/6/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

PS6, Line 300: Error Kudu table 
should this say "Error in Kudu table ..." or something?


-- 
To view, visit http://gerrit.cloudera.org:8080/4728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

2016-10-21 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#5).

Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting 
classes.
..

IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

Introduces a new phase for rewriting Exprs after analysis and
before subquery rewriting. The transformed Exprs replace the
original ones in analyzed statements. If Exprs were changed,
the whole statement is reset() and re-analyzed, similar to how
subqueries are rewritten. If both Exprs and subqueries are
rewritten there is only one re-analysis of the changed statement.

The following new classes work together to perform transformations:
1. ExprRewriteRule
- base class for Expr transformation rules
2. ExprRewriter
- drives the transformation of Exprs using a list of
  ExprRewriteRules

Statements that have exprs to be rewritten need to implement
a new method rewriteExprs() that accepts an ExprRewriter.

As an example, this patch adds a rule for converting
BetweenPredicates into their equivalent CompoundPredicates.
The BetweenPredicate has been notoriously buggy due to a lack
of such a separate rewrite phase and is now cleaned up.

Testing:
1. Added a new test for checking that the rewrite framework
   covers all relevant statements, clauses and can properly
   handle nested statements and subqueries.
2. There are many existing tests for BetweePredicates and
   they all exercise the new rewrite rule/phase.
3. Ran a private core/hdfs run and it passed.

Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StatementBase.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
A fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
A fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java
A fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
A fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
28 files changed, 589 insertions(+), 214 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/4746/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4746
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4241: remove spurious child queries event

2016-10-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4241: remove spurious child queries event
..


Patch Set 2: Code-Review+2

carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3881d032622750444d750f161ad6843bdbd16c30
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..


Patch Set 6:

Rebased onto the codegen interface changes.

-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#6).

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..

IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

This change enables codegen for all builtin aggregate functions,
e.g. timestamp functions and group_concat.

There are several parts to the change:
* Adding support for generic UDAs. Previous the codegen code did not
  handle multiple input arguments or NULL return values.
* Defaulting to using the UDA interface when there is not a special
  codegen path (we have implementations of all builtin aggregate
  functions for the interpreted path).
* Remove all the logic to disable codegen for the special cases that now
  are supported.

Also fix the generation of code to get/set NULL bits since I needed
to add functionality there anyway.

Testing:
Add tests that check that codegen was enabled for builtin aggregate
functions. Also fix some gaps in the preexisting tests.

Also add tests for UDAs that check input/output nulls are handled
correctly, in anticipation of enabling codegen for arbitrary UDAs.

Perf:
Ran local TPC-H and targeted perf. Spent a lot of time on TPC-H Q1,
since my original approach regressed it ~5%. In the end the problem was
to do with the ordering of loads/stores to the slot and null bit in the
generated code: the previous version of the code exploited some
properties of the particular aggregate function. I ended up replicating
this behaviour to avoid regressing perf.

Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
---
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/aggregation-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/hash-join-node.cc
M be/src/exec/hash-table.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/partitioned-aggregation-node-ir.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/text-converter.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/case-expr.cc
M be/src/exprs/compound-predicates.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr.cc
M be/src/exprs/literal.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/tuple.cc
M be/src/runtime/types.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udfs.cc
M be/src/util/tuple-row-compare.cc
M testdata/workloads/functional-query/queries/QueryTest/uda.test
M tests/common/test_result_verifier.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_udfs.py
39 files changed, 842 insertions(+), 559 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/4655/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4212: Add sink output expressions to explain output

2016-10-21 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4783

Change subject: IMPALA-4212: Add sink output expressions to explain output
..

IMPALA-4212: Add sink output expressions to explain output

* Move output exprs from plan fragment to sink
* Add OUTPUT-EXPRS=(...) to relevant sinks

Test results attached. Some spurious changes not yet removed, will tidy
up once format is agreed.

The explain plan can have very long lines. IMPALA-4337 is the (open)
issue to address this.

Change-Id: I27c3a47d7ee28a9efb1405dd1a6ef9b7a83931f6
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-sink.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/runtime/plan-fragment-executor.cc
M common/thrift/DataSinks.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/DataSink.java
M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/JoinBuildSink.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/complex-types-file-formats.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/disable-preaggregations.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/distinct-estimate.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test
M testdata/workloads/functional-planner/queries/PlannerTest/lineage.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mem-limit-broadcast-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-loop-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-key-scans.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/small-query-opt.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-planner/queries/PlannerTest/values.test
M testdata/workloads/functional-planner/queries/PlannerTest/views.test
M 

[Impala-ASF-CR] IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions

2016-10-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
..


Patch Set 4:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

PS4, Line 242: if (val.is_null) goto null_block;
> Can you please elaborate this comment with code snippet like the following 
Done


http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS4, Line 1065: can ignore the NULL bit of its destination value is
  : // ignored
> ?
Done


PS4, Line 1067: set
> did you mean with NULL bit cleared ?
No it's set - the value should start off as NULL, since sum() of an empty set 
returns NULL.


Line 1599: RETURN_IF_ERROR(agg_expr->GetCodegendComputeFn(state_, 
_expr_fn));
> DCHECK(agg_expr_fn != NULL);
Done


PS4, Line 1620: src.CodegenBranchIfNull(, ret_block);
> It seems that we emit this check for all paths except for the UDA path.  Wh
This is deliberate and a correctness fix that is needed to match the 
interpreted path. In general the UDA may do some processing on NULL values so 
we need to pass them in. E.g. see the test UDA I added that counts the # of 
NULLs in anticipation of using this codegen for arbitrary UDAs.

Filtering out src NULLs before calling the UDA is a special case optimisation 
that we can do for aggregate functions where we know the semantics..


PS4, Line 1628: dst_type.type != TYPE_TIMESTAMP
  :   && !dst_type.IsStringType()
> This seems to be reused in multiple places. May be it's worth factoring thi
I factored this out and made it a positive check so that most of the conditions 
are one line now and it's clearer what cases are actually handled.


PS4, Line 1637: dst_type.type != TYPE_TIMESTAMP
  :   && !dst_type.IsStringType()
> It may be easier to read if we do the check once up front:
I think I prefer this way because the implementation is directly adjacent to 
the conditions for when it should be.

E.g. with the alternative factoring, If I'm looking at case agg_op == SUM, I 
have to look in two places and decode a more complex expression to figure out 
what type/agg_op combinations that code is handling.

Also since the UDA interface is the "generic" approach and the others are 
special-case optimisations, so it makes more sense to me to check for the 
special cases then false back to the generic one.

I did some cleanup and added a comment up the top to make it clearer how it 
should be read.


PS4, Line 1676: src.CodegenBranchIfNull(, ret_block);
> Actually, referring to the comment above again, it seems that we should jus
I find it easier to follow this way since the logic for each case is grouped 
together. E.g. all the logic for UpdateSlot() on a UDA fits on a screen, rather 
than having some of NULL handling down here and some 100 lines up.


PS4, Line 1685: resulting in
> with results stored in
Done


PS4, Line 1725: codegen->GetFunction(symbol);
> Should we consider adding "bool clone" as the second argument similar to th
Done.


PS4, Line 1745: Value* input_lowered_ptr =
  : 
codegen->CreateEntryBlockAlloca(builder->GetInsertBlock()->getParent(),
  : LlvmCodeGen::NamedVariable("input_lowered_ptr",
  : 
input_vals[i].value()->getType()));
  : builder->CreateStore(input_vals[i].value(), 
input_lowered_ptr);
  : Type* input_unlowered_ptr_type = 
CodegenAnyVal::GetUnloweredPtrType(
  : codegen, 
evaluator->input_expr_ctxs()[i]->root()->type());
  : Value* input_unlowered_ptr = builder->CreateBitCast(
  : input_lowered_ptr, input_unlowered_ptr_type, 
"input_unlowered_ptr");
> Do you think it's worth factoring this logic out to a function given it's r
There was actually already a helper in CodegenAnyVal that did most of the work. 
Improved/cleaned that up a bit then used it.


http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/exprs/compound-predicates.cc
File be/src/exprs/compound-predicates.cc:

PS4, Line 68:  
> Please consider removing these extra blank spaces too.
Done


http://gerrit.cloudera.org:8080/#/c/4655/4/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

PS4, Line 603:   Value* tuple_bytes = builder->CreateBitCast(tuple, 
codegen->ptr_type());
 :   Constant* null_byte_offset =
 :   ConstantInt::get(codegen->int_type(), 
null_indicator_offset_.byte_offset);
 :   Value* null_byte_ptr =
 :   builder->CreateInBoundsGEP(tuple_bytes, null_byte_offset, 
"null_byte_ptr");
> What do you think about factoring this code snippet as a utility function f
Done. Also now sharing the code with SlotRef.



[Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function

2016-10-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function
..


Patch Set 3: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4740/3/be/src/exec/analytic-eval-node.cc
File be/src/exec/analytic-eval-node.cc:

Line 403:   }
so this bug was similar to IMPALA-3311 (see 
PartitionedAggregationNode::HandleOutputStrings()), right? but i guess we can't 
share the solution since in this case we're not directly populating the result 
row batch.


Line 420:  (next_partition || (fn_scope_ == RANGE && 
window_.__isset.window_end &&
this indentation seems off. what does clang-format say?


http://gerrit.cloudera.org:8080/#/c/4740/3/be/src/exprs/agg-fn-evaluator.h
File be/src/exprs/agg-fn-evaluator.h:

PS3, Line 153: beyond the scope of the call site
> I'm still not sure this is precise enough... this sounds like it's scoped a
Maybe say "The memory is a local allocation and so needs to be copied out 
before local allocations are freed" or something?  (As much as I dislike the 
terminology "local allocations" it is the terminology we currently have).


-- 
To view, visit http://gerrit.cloudera.org:8080/4740
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-21 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..


Patch Set 3:

(8 comments)

As discussed, eliminated the separate list on the webpage in favor of just 
having the "Queued" event

http://gerrit.cloudera.org:8080/#/c/4756/2/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

PS2, Line 879: Status SimpleScheduler::Schedule
> Can you comment in the header the post-conditions:
Done


PS2, Line 900: // TODO-MT: call AdmitQuery()
> this should also have set_is_admitted(true) until this TODO is resolved.
Done


http://gerrit.cloudera.org:8080/#/c/4756/2/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

PS2, Line 348: _t wait
> now 'waiting' is a bit ambiguous... can you rename to waiting_for_close now
Done


http://gerrit.cloudera.org:8080/#/c/4756/2/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

PS2, Line 157:   /// Resource pool associated with this query, or an empty 
string if the schedule has not
 :   /// been created and had the pool set yet.
 :   std::string request_pool() const {
 : return schedule_ == NULL ? "" : schedule_->request_pool();
 :   }
 :   int num_rows_fetched() const { return num_rows_fetched_; }
 :   void set_fetched_rows() { fetched_rows_ = true; }
 :   bool fetched_rows() const { return fetched_rows_; }
 :   b
> I think this is big enough to move it out of the header.
Done


PS2, Line 166:   const TResultSetMetadata* result_metadata() { return 
_metadata_; }
 :   const TUniqueId& query_id() const { return 
query_ctx_.query_id; }
 :   c
> comment here too, when does this return an empty string?
Done


http://gerrit.cloudera.org:8080/#/c/4756/2/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

PS2, Line 602: _get_queries_page
> the metrics page exposes json too (and 'metrics' is a bit overloaded), so w
Done


PS2, Line 603: the number of queries currently in the 'admit
> the number of queries currently in the 'admitted' and 'queued' states
Done


PS2, Line 710:   if thread.error is not None:
 : raise thread.error
 : 
> here's where it'd be helpful to differentiate between current counts (i.e. 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1169: Admission control info on the queries debug webpage

2016-10-21 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#3).

Change subject: IMPALA-1169: Admission control info on the queries debug webpage
..

IMPALA-1169: Admission control info on the queries debug webpage

This patch adds a new event, 'Queued', to the query event log to
indicate when a query is queued by the admission controller. This
means that queries on the '/queries' page that are currently
queued will display this as their 'Last Event', making it possible
to see which queries are current queued.

It also adds a column to show the resource pool associated with
the queries.

Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/scheduling/simple-scheduler.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.h
M tests/common/impala_service.py
M tests/custom_cluster/test_admission_controller.py
M www/queries.tmpl
11 files changed, 52 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/4756/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4756
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I504e3c829a14318721e3a42de6281bcc578f7283
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-1654: General partition exprs in DDL operations.

2016-10-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-1654: General partition exprs in DDL operations.
..


Patch Set 18:

Congrats, Amos! We are good to merge this patch. Do you have time to rebase it?

-- 
To view, visit http://gerrit.cloudera.org:8080/3942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
Gerrit-PatchSet: 18
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Amos Bird 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Amos Bird 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3788: Support for Kudu 'read-your-writes' consistency

2016-10-21 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#2).

Change subject: IMPALA-3788: Support for Kudu 'read-your-writes' consistency
..

IMPALA-3788: Support for Kudu 'read-your-writes' consistency

Kudu provides an API to get/set a 'latest observed
timestamp' on clients to allow a client which inserts to
capture and send this timestamp to another client before a
read to ensure that data as of that timestamp is visible.
This adds support for this feature _for reads within a
session_ by capturing the latest observed timestamp when the
KuduTableSink is sending its last update to the coordinator.
The timestamp is sent with other post-write information, and
is aggregated (i.e. taking the max) at the coordinator. The
max is stored in the session, and that value is then set in
the Kudu client on future scans.

This is being tested by running the Kudu tests after
removing delays that were introduced to work around the
issue that reads might not be visible after a write. Before
this change, if there were no delay, inconsistent results
could be returned.

Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd
---
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-table-sink.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M common/thrift/ImpalaInternalService.thrift
M tests/common/impala_test_suite.py
10 files changed, 53 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/4779/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4779
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6bcb5fc218ad4ab935343a55b2188441d8c7dfbd
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT DOP > 0.

2016-10-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT_DOP 
> 0.
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4767
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79c0f6fd2aeb4bc6fa5f87219a485194fef2db1b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: No


[Impala-ASF-CR] Minor fixes to remove more "cloudera"s from the code.

2016-10-21 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Minor fixes to remove more "cloudera"s from the code.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4774/1//COMMIT_MSG
Commit Message:

Line 7: Minor fixes to remove more "cloudera"s from the code.
> Is there a Jira we could refer to here?
Not really. IMPALA-3221 and IMPALA-4047 are closely related, though.


-- 
To view, visit http://gerrit.cloudera.org:8080/4774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I27687a3b677def72f1867df54c8e50c93d5cab3a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3718: Add test cancellation tests for Kudu

2016-10-21 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3718: Add test_cancellation tests for Kudu
..


Patch Set 3: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4700
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icf3d3853e7075991f6d12f125407ebdbe6a287e2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] Minor fixes to remove more "cloudera"s from the code.

2016-10-21 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Minor fixes to remove more "cloudera"s from the code.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4774/1//COMMIT_MSG
Commit Message:

Line 7: Minor fixes to remove more "cloudera"s from the code.
Is there a Jira we could refer to here?


-- 
To view, visit http://gerrit.cloudera.org:8080/4774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I27687a3b677def72f1867df54c8e50c93d5cab3a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


Re: [Impala-ASF-CR] IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate

2016-10-21 Thread Alex Behm
Thanks for your contribution!

On Fri, Oct 21, 2016 at 12:45 AM, Internal Jenkins (Code Review) <
ger...@cloudera.org> wrote:

> Internal Jenkins has submitted this change and it was merged.
>
> Change subject: IMPALA-4325: StmtRewrite lost parentheses of
> CompoundPredicate
> ..
>
>
> IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate
>
> StmtRewrite lost parentheses of CompoundPredicate in
> pushNegationToOperands()
> and leads to incorrect toSql() result. Even though this issue would not
> leads
> to incorrect result of query, it makes user confuse of the logical operator
> precedence of predicates shown in EXPLAIN statement.
>
> Change-Id: I79bfc67605206e0e026293bf7032a88227a95623
> Reviewed-on: http://gerrit.cloudera.org:8080/4753
> Reviewed-by: Alex Behm 
> Tested-by: Internal Jenkins
> ---
> M fe/src/main/java/org/apache/impala/analysis/Expr.java
> M testdata/workloads/functional-planner/queries/PlannerTest/
> subquery-rewrite.test
> 2 files changed, 28 insertions(+), 1 deletion(-)
>
> Approvals:
>   Internal Jenkins: Verified
>   Alex Behm: Looks good to me, approved
>
>
>
> --
> To view, visit http://gerrit.cloudera.org:8080/4753
> To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
>
> Gerrit-MessageType: merged
> Gerrit-Change-Id: I79bfc67605206e0e026293bf7032a88227a95623
> Gerrit-PatchSet: 6
> Gerrit-Project: Impala-ASF
> Gerrit-Branch: master
> Gerrit-Owner: Yuanhao Luo 
> Gerrit-Reviewer: Alex Behm 
> Gerrit-Reviewer: Internal Jenkins
> Gerrit-Reviewer: Yuanhao Luo 
>
> --
> You received this message because you are subscribed to the Google Groups
> "impala-cr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to impala-cr+unsubscr...@cloudera.com.
> For more options, visit https://groups.google.com/a/cloudera.com/d/optout.
>


[Impala-ASF-CR] IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT DOP > 0.

2016-10-21 Thread Alex Behm (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4767

to look at the new patch set (#5).

Change subject: IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT_DOP 
> 0.
..

IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT_DOP > 0.

IMPALA-4258: The problem was that there was a reference to
HdfsScanner::batch_ hidden inside WriteEmptyTuples(). The batch_
reference is NULL when the scanner is run with MT_DOP > 1.

IMPALA-4286: When there are no scan ranges HdfsScanNodeBase::Open()
exits early without initializing the reader context. This lead to
a DCHECK in IoMgr::GetNextRange() called from HdfsScanNodeMt.
The fix is to remove that unnecessary short-circuit Open().

I combined these two bugfixes because the new basic test covers
both cases.

Testing: Added a new test_mt_dop.py test. A private code/hdfs
run passed.

Change-Id: I79c0f6fd2aeb4bc6fa5f87219a485194fef2db1b
---
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-rcfile-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
A testdata/workloads/functional-query/queries/QueryTest/mt-dop.test
A tests/query_test/test_mt_dop.py
10 files changed, 73 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/4767/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4767
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79c0f6fd2aeb4bc6fa5f87219a485194fef2db1b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 


[Impala-ASF-CR] IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT DOP > 0.

2016-10-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4285/IMPALA-4286: Fixes for Parquet scanner with MT_DOP 
> 0.
..


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4767/4//COMMIT_MSG
Commit Message:

Line 17: HdfsScanNodeMt::GetNext().
> why do we have this check in HdfsScanNodeBase::Open() in the first place?
Removed the short-circuit in Open() instead.


http://gerrit.cloudera.org:8080/#/c/4767/4/be/src/exec/hdfs-scan-node-mt.cc
File be/src/exec/hdfs-scan-node-mt.cc:

PS4, Line 71: ,
> did you mean to say more or make it a period?
I removed the short-circuit in Open() instead.


Line 72:   if (file_descs_.empty()) {
> this is okay, but why can't we just delete the equivalent check in HdfsScan
No, removed it.


http://gerrit.cloudera.org:8080/#/c/4767/4/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

PS4, Line 230: row
> "row" seems misleading
Not relevant anymore, I also removed the conjunct eval. Tests passed.


PS4, Line 230: an
> and
Done


Line 236:   // Set remaining tuples in row.
> misleading comment
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4767
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79c0f6fd2aeb4bc6fa5f87219a485194fef2db1b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3739: Enable stress tests on Kudu

2016-10-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3739: Enable stress tests on Kudu
..


IMPALA-3739: Enable stress tests on Kudu

This commit modifies the stress test framework to run TPC-H and TPC-DS
workloads against Kudu. The follwing changes are included in this
commit:
1. Created template files with DDL and DML statements for loading TPC-H and
   TPC-DS data in Kudu
2. Created a script (load-tpc-kudu.py) to load data in Kudu. The
   script is invoked by the stress test runner to load test data in an
   existing Impala/Kudu cluster (both local and CM-managed clusters are
   supported).
3. Created SQL files with TPC-DS queries to be executed in Kudu. SQL
   files with TPC-H queries for Kudu were added in a previous patch.
4. Modified the stress test runner to take additional parameters
   specific to Kudu (e.g. kudu master addr)

The stress test runner for Kudu was tested on EC2 clusters for both TPC-H
and TPC-DS workloads.

Missing functionality:
* No CRUD operations in the existing TPC-H/TPC-DS workloads for Kudu.
* Not all supported TPC-DS queries are included. Currently, only the
  TPC-DS queries from the testdata/workloads/tpcds/queries directory
  were modified to run against Kudu.

Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
Reviewed-on: http://gerrit.cloudera.org:8080/4327
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Internal Jenkins
---
A testdata/bin/load-tpc-kudu.py
M testdata/datasets/functional/functional_schema_template.sql
A testdata/datasets/tpcds/tpcds_kudu_template.sql
A testdata/datasets/tpch/tpch_kudu_template.sql
A testdata/workloads/tpcds/queries/tpcds-kudu-q19.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q27.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q3.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q34.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q42.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q43.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q46.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q47.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q52.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q53.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q55.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q59.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q6.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q61.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q63.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q65.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q68.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q7.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q73.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q79.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q8.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q88.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q89.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q96.test
A testdata/workloads/tpcds/queries/tpcds-kudu-q98.test
M tests/comparison/db_connection.py
M tests/stress/concurrent_select.py
31 files changed, 2,459 insertions(+), 6 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dimitris Tsirogiannis: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3c9fc3dae24b761f031ee8e014bd611a49029d34
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate

2016-10-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4325: StmtRewrite lost parentheses of CompoundPredicate
..


Patch Set 5: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4753
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79bfc67605206e0e026293bf7032a88227a95623
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Yuanhao Luo 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Yuanhao Luo 
Gerrit-HasComments: No