[Impala-ASF-CR] Add functional tests for compute stats with mt dop > 0.

2016-10-28 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Add functional tests for compute stats with mt_dop > 0.
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4879
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] Add functional tests for compute stats with mt dop > 0.

2016-10-28 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#3).

Change subject: Add functional tests for compute stats with mt_dop > 0.
..

Add functional tests for compute stats with mt_dop > 0.

Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
---
A 
testdata/workloads/functional-query/queries/QueryTest/mt-dop-compute-stats.test
M tests/query_test/test_mt_dop.py
2 files changed, 21 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/4879/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4879
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] Add functional tests for compute stats with mt dop > 0.

2016-10-28 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: Add functional tests for compute stats with mt_dop > 0.
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4879/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

PS1, Line 52: location
> Maybe add a comment "Create a second table pointing to the same data files"
Done


http://gerrit.cloudera.org:8080/#/c/4879/2/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

PS2, Line 50: tion("
> format
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4879
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add functional tests for compute stats with mt dop > 0.

2016-10-28 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Add functional tests for compute stats with mt_dop > 0.
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4879/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

PS1, Line 51: execute_query_using_client
> That version doesn't switch the DB and extracting the DB manually from the 
Cool, learned something.


PS1, Line 52: location
> I want to just create "clone" of an existing table that has the same format
Maybe add a comment "Create a second table pointing to the same data files"?


http://gerrit.cloudera.org:8080/#/c/4879/2/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

PS2, Line 50: fotmat
format


-- 
To view, visit http://gerrit.cloudera.org:8080/4879
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add functional tests for compute stats with mt dop > 0.

2016-10-28 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: Add functional tests for compute stats with mt_dop > 0.
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4879/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

PS1, Line 51: execute_query_using_client
> I think self.execute_query does the same.
That version doesn't switch the DB and extracting the DB manually from the 
vector is super painful. Added comment.


PS1, Line 52: location
> Why is it necessary to change the location here? I'm mostly curious because
I want to just create "clone" of an existing table that has the same format and 
points to the same files. Create table like does not copy the location, so I 
need to specify it. Alternatives like CTAS don't work because it's not easy to 
specify the format in the CTAS, and we cannot insert into all formats.


Line 54: new_vector = deepcopy(vector)
> can you add a comment why this copy step is needed?
For this test it's actually not needed because in every test we set the mt_dop 
exec option again. I was worried about modifying the vector in place and 
polluting the state of subsequent tests, but all tests use and set mt_dop 
anyway.


-- 
To view, visit http://gerrit.cloudera.org:8080/4879
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add functional tests for compute stats with mt dop > 0.

2016-10-28 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#2).

Change subject: Add functional tests for compute stats with mt_dop > 0.
..

Add functional tests for compute stats with mt_dop > 0.

Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
---
A 
testdata/workloads/functional-query/queries/QueryTest/mt-dop-compute-stats.test
M tests/query_test/test_mt_dop.py
2 files changed, 20 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/4879/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4879
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4377: Add detailed error log when a UdfExecutorTest fails.

2016-10-28 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4881

Change subject: IMPALA-4377: Add detailed error log when a UdfExecutorTest 
fails.
..

IMPALA-4377: Add detailed error log when a UdfExecutorTest fails.

This is not a fix for IMPALA-4377 but should help with diagnosing
the problem if we ever hit the failure again.

Change-Id: Id94130715e4f342e4dd2f6cd137ac1eb7b1ecf2d
---
M fe/src/test/java/org/apache/impala/hive/executor/UdfExecutorTest.java
1 file changed, 106 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/4881/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4881
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id94130715e4f342e4dd2f6cd137ac1eb7b1ecf2d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-28 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3676: Use clang as a static analysis tool
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4758/9/be/src/experiments/bit-stream-utils.8byte.inline.h
File be/src/experiments/bit-stream-utils.8byte.inline.h:

Line 100: if (num_bits - bit_offset_ < size) {
> the dcheck at line 92 contradicts the need for this if-stmt.  is the checke
undone


http://gerrit.cloudera.org:8080/#/c/4758/9/be/src/util/aligned-new.h
File be/src/util/aligned-new.h:

Line 50: };
> given that we only really use this for cache alignment, how did you test it
I ran core tests. I did not add any unit tests.


http://gerrit.cloudera.org:8080/#/c/4758/9/be/src/util/thread-pool.h
File be/src/util/thread-pool.h:

Line 34: class ThreadPool : public CacheLineAligned {
> was this aligned before? if not, why are we aligning it now?
Clang complained that it should have been aligned to meet the invariant we 
asked for on BlockingQueue


-- 
To view, visit http://gerrit.cloudera.org:8080/4758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-28 Thread Jim Apple (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4758

to look at the new patch set (#10).

Change subject: IMPALA-3676: Use clang as a static analysis tool
..

IMPALA-3676: Use clang as a static analysis tool

This patch adds a script to run clang-tidy over the whole code
base. It is a first step towards running clang-tidy over patches as a
tool to help users spot bugs before code review.

Because of the number of clang-tidy checks, this patch only addresses
some of them. In particular, only checks starting with 'clang' are
considered. Many of them which are flaky or not part of our style are
excluded from the analysis. This patch also exlcudes some checks which
are part of our current style but which would be too laborious to fix
over the entire codebase, like using nullptr rather than NULL.

This patch also fixes a number of small bugs found by clang-tidy.

Finally, this patch adds the class AlignedNew, the purpose of which is
to provide correct alignment on heap-allocated data. The global new
operator only guarantees 16-byte alignment. A class that includes a
member variable that must be aligned on a k-byte boundary for k>16 can
inherit from AlignedNew to ensure correct alignment on the heap,
quieting clang's -Wover-aligned warning. (Static and stack allocation
are required by the standard to respect the alignment of the type and
its member variables, so no extra code is needed for allocation in
those places.)

Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
---
A .clang-tidy
M CMakeLists.txt
M be/CMakeLists.txt
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/benchmarks/in-predicate-benchmark.cc
M be/src/benchmarks/parse-timestamp-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/common/compiler-util.h
M be/src/common/init.cc
M be/src/common/logging.h
M be/src/common/status.h
M be/src/exec/delimited-text-parser.cc
M be/src/exec/delimited-text-parser.h
M be/src/exec/exec-node.h
M be/src/exec/external-data-source-executor.h
M be/src/exec/hash-table-test.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-table-writer.h
M be/src/exec/parquet-column-readers.h
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/scanner-context.inline.h
M be/src/exec/write-stream.h
M be/src/experiments/bit-stream-utils.8byte.inline.h
M be/src/experiments/tuple-splitter-test.cc
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M be/src/exprs/compound-predicates.h
M be/src/exprs/expr-test.cc
M be/src/exprs/slot-ref.cc
M be/src/exprs/string-functions-ir.cc
M be/src/rpc/TAcceptQueueServer.h
M be/src/rpc/authentication.cc
M be/src/rpc/authentication.h
M be/src/rpc/thrift-server.h
M be/src/runtime/buffered-block-mgr.h
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/data-stream-sender.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.h
M be/src/runtime/free-pool-test.cc
M be/src/runtime/hbase-table.cc
M be/src/runtime/hdfs-fs-cache.h
M be/src/runtime/multi-precision-test.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/runtime/runtime-filter.h
M be/src/runtime/scoped-buffer.h
M be/src/runtime/string-buffer.h
M be/src/runtime/timestamp-parse-util.cc
M be/src/runtime/timestamp-value.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/simple-scheduler-test.cc
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/service/query-result-set.cc
M be/src/statestore/failure-detector.h
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/testutil/test-udas.cc
M be/src/testutil/test-udas.h
M be/src/transport/TSasl.h
M be/src/transport/TSaslClientTransport.h
M be/src/transport/TSaslServerTransport.h
M be/src/transport/TSaslTransport.h
M be/src/udf/uda-test-harness.h
M be/src/udf/uda-test.cc
M be/src/udf/udf-ir.cc
M be/src/udf/udf.cc
M be/src/udf/udf.h
M be/src/udf_samples/uda-sample.cc
A be/src/util/aligned-new.h
M be/src/util/benchmark-test.cc
M be/src/util/bit-stream-utils.inline.h
M be/src/util/bit-util-test.cc
M be/src/util/blocking-queue-test.cc
M be/src/util/blocking-queue.h
M be/src/util/bloom-filter-test.cc
M b

[Impala-ASF-CR] IMPALA-1286: Extract common conjuncts from disjunctions.

2016-10-28 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-1286: Extract common conjuncts from disjunctions.
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4877/1/fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java
File fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java:

Line 37: public class ExtractCommonConjunctRule implements ExprRewriteRule {
where is this being applied?


http://gerrit.cloudera.org:8080/#/c/4877/1/testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test:

Line 3384:   p_brand = 'Brand#12'
> I'm not really sure how this patch handles a bushy predicate. Do we flatten
i agree, i think we need normalization as a separate transformation step. this 
will be a prerequisite for other transformations as well, and having to bake 
normalization into every single transformation rule doesn't make sense.


Line 3393:   p_brand = 'Brand#23'
additional idea for transformation rule: derive additional, non-essential 
disjunctive scan predicates if the selectivity of the base predicates is 
sufficiently small (for '=', say).

in the example: brand = 12 or brand = 23 or brand = 34 as an extra scan 
predicate for 'parts', if 'brand' has a sufficiently high cardinality.


-- 
To view, visit http://gerrit.cloudera.org:8080/4877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4387: validate decimal type in Avro file schema

2016-10-28 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4387: validate decimal type in Avro file schema
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4876
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I25a326ee2220bc14d3b5f887dc288b4adf859cfc
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3676: Use clang as a static analysis tool

2016-10-28 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3676: Use clang as a static analysis tool
..


Patch Set 9:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4758/9/be/src/exec/hdfs-avro-scanner.cc
File be/src/exec/hdfs-avro-scanner.cc:

Line 933: builder->CreateCondBr(ret_val, end_field_block, bail_out);
this still is buggy. but we have IMPALA-4061 to track that, so okay.


http://gerrit.cloudera.org:8080/#/c/4758/9/be/src/experiments/bit-stream-utils.8byte.inline.h
File be/src/experiments/bit-stream-utils.8byte.inline.h:

Line 100: if (num_bits - bit_offset_ < size) {
the dcheck at line 92 contradicts the need for this if-stmt.  is the checker 
just getting confused, or is there really a path where we can have num_bits > 
size?
if the checker is confused, can we just disable it here? adding this if-stmt 
makes it unclear whether num_bits <= size is really an invariant or not.


http://gerrit.cloudera.org:8080/#/c/4758/9/be/src/util/aligned-new.h
File be/src/util/aligned-new.h:

Line 50: };
given that we only really use this for cache alignment, how did you test it?


http://gerrit.cloudera.org:8080/#/c/4758/9/be/src/util/thread-pool.h
File be/src/util/thread-pool.h:

Line 34: class ThreadPool : public CacheLineAligned {
was this aligned before? if not, why are we aligning it now?


-- 
To view, visit http://gerrit.cloudera.org:8080/4758
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4ed168488cb30ddeccd0087f3840541d858f9c06
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4387: validate decimal type in Avro file schema

2016-10-28 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4387: validate decimal type in Avro file schema
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4876
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I25a326ee2220bc14d3b5f887dc288b4adf859cfc
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4387: validate decimal type in Avro file schema

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4387: validate decimal type in Avro file schema
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4876/2//COMMIT_MSG
Commit Message:

PS2, Line 16: scans
> nit: wrap at 80 chars.
Done


http://gerrit.cloudera.org:8080/#/c/4876/2/be/src/util/avro-util.cc
File be/src/util/avro-util.cc:

Line 109: case AVRO_BYTES:
> maybe add a // fallthrough comment at the end of the line?
It doesn't seem necessary when the case labels are adjacent like this, we don't 
do it in other parts of the codebase.


Line 139:   return Status::OK();
> Wouldn't it be better to return a non-OK status in a release build? If not,
Good point, this value comes from the Avro library so we don't have any 
guarantee it is a supported type. Fixed it.


http://gerrit.cloudera.org:8080/#/c/4876/2/be/src/util/avro-util.h
File be/src/util/avro-util.h:

PS2, Line 84: name
> column_name?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4876
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I25a326ee2220bc14d3b5f887dc288b4adf859cfc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4387: validate decimal type in Avro file schema

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-4387: validate decimal type in Avro file schema
..

IMPALA-4387: validate decimal type in Avro file schema

This patch prevents an invalid decimal type in an Avro file schema from
crashing Impala. Most invalid Avro schemas are caught by the frontend,
but file schemas still need to be validated by the backend.

After this patch files with bad schemas are skipped.

Testing:
This was hit very rarely by the scanner fuzzing. Added a regression test that
scans a file with a bad schema.

Change-Id: I25a326ee2220bc14d3b5f887dc288b4adf859cfc
---
M be/src/exec/hdfs-avro-scanner.cc
M be/src/runtime/decimal-test.cc
M be/src/runtime/types.h
M be/src/util/avro-util.cc
M be/src/util/avro-util.h
M common/thrift/generate_error_codes.py
M testdata/bad_avro_snap/README
A testdata/bad_avro_snap/invalid_decimal_schema.avro
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
11 files changed, 104 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/4876/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4876
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I25a326ee2220bc14d3b5f887dc288b4adf859cfc
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4223: Handle truncated file read from HDFS cache
..


Patch Set 1: Code-Review+1

Change looks good to me, I'm not convinced this was the problem we hit in 
IMPALA-4223 though, because it would not have made it this far when there was 
an error seeking in Open().

I think there's some protection against opening the wrong version of a HDFS 
file because the mtime of the cached file is checked (this check is definitely 
a good idea though)

-- 
To view, visit http://gerrit.cloudera.org:8080/4828
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4387: validate decimal type in Avro file schema

2016-10-28 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4387: validate decimal type in Avro file schema
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4876/2//COMMIT_MSG
Commit Message:

PS2, Line 16: scans
nit: wrap at 80 chars.


http://gerrit.cloudera.org:8080/#/c/4876/2/be/src/util/avro-util.cc
File be/src/util/avro-util.cc:

Line 109: case AVRO_BYTES:
maybe add a // fallthrough comment at the end of the line?


Line 139:   return Status::OK();
Wouldn't it be better to return a non-OK status in a release build? If not, can 
you add a comment explaining why / what the consequences are?


http://gerrit.cloudera.org:8080/#/c/4876/2/be/src/util/avro-util.h
File be/src/util/avro-util.h:

PS2, Line 84: name
column_name?


-- 
To view, visit http://gerrit.cloudera.org:8080/4876
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I25a326ee2220bc14d3b5f887dc288b4adf859cfc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add functional tests for compute stats with mt dop > 0.

2016-10-28 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Add functional tests for compute stats with mt_dop > 0.
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4879/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

PS1, Line 51: execute_query_using_client
I think self.execute_query does the same.


PS1, Line 52: location
Why is it necessary to change the location here? I'm mostly curious because I 
did something similar in a different change and didn't specify a location, so 
the data files ended up in the unique_database's directory.


Line 54: new_vector = deepcopy(vector)
can you add a comment why this copy step is needed?


-- 
To view, visit http://gerrit.cloudera.org:8080/4879
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

2016-10-28 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.
..


Patch Set 5: Code-Review+2

+2 for backend, +1 for frontend.

-- 
To view, visit http://gerrit.cloudera.org:8080/4862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache

2016-10-28 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4223: Handle truncated file read from HDFS cache
..


Patch Set 1:

(1 comment)

> (1 comment)
 > 
 > > (1 comment)
 > >
 > > > (1 comment)
 > > >
 > > > Can we test this by having the test load metadata and then
 > > truncate
 > > > a cached file?
 > >
 > > We can try to do this in a custom cluster test. It needs to
 > follow
 > > the steps outlined here: 
 > > https://issues.cloudera.org/browse/IMPALA-4223?focusedCommentId=212776&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-212776
 > >
 > > They require changes to the system limits to allow for larger
 > > cached files, and to the data nodes to increase the file cache as
 > > well, so they might be rather disruptive. Once these are changed
 > we
 > > can write a custom cluster test to download, truncate, and upload
 > > files and run queries over them, checking that the correct log
 > > messages appear.
 > >
 > > Should we break this out into several Jiras / changes? The limits
 > > should be change in impala-setup, too. The datanode settings
 > change
 > > will be required to integrate this into fuzz testing, too.
 > >
 > > I will also have a look at the scanner fuzz test and see if it is
 > > easy to inject these error there.
 > 
 > Yes, if it's a good amount of work to set up that kind of test,
 > let's save it for another step.  It's probably better to spend time
 > with getting the fuzzer working rather than this one particular
 > test case.

I Opened IMPALA-4394 and IMPALA-4395 to track both adding tests for this change 
and adding Hdfs caching to the fuzz tests.

http://gerrit.cloudera.org:8080/#/c/4828/1/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

PS1, Line 448: disk
> Okay. I'm fine either leaving this one as "disk" or changing both.
Thanks. I'll leave it to keep any tooling that searches for this working.


-- 
To view, visit http://gerrit.cloudera.org:8080/4828
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3823: Add timer to measure Parquet footer reads

2016-10-28 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-3823: Add timer to measure Parquet footer reads
..


Patch Set 11: Code-Review+2

(1 comment)

Carry +2

http://gerrit.cloudera.org:8080/#/c/4371/10/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

Line 209:   SummaryStatsCounter(TUnit::type unit, int32_t total_num_values,
> I'm not sure what this comment means -- this is a constructor, so how can t
Oops, that was meant for SetStats(). I put that here by mistake. I've removed 
this line here now.


-- 
To view, visit http://gerrit.cloudera.org:8080/4371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3823: Add timer to measure Parquet footer reads

2016-10-28 Thread Sailesh Mukil (Code Review)
Hello Henry Robinson, Matthew Jacobs, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4371

to look at the new patch set (#11).

Change subject: IMPALA-3823: Add timer to measure Parquet footer reads
..

IMPALA-3823: Add timer to measure Parquet footer reads

It's been observed that Parquet footer reads perform poorly especially
when reading from S3. This patch adds a timer "FooterProcessingTimer"
which keeps a track of the average time each split of each scan node
spends in reading and processing the parquet footer.

Added a new utility counter called SummaryStatsCounter which keeps
track of the min, max and average values seen so far from a set of
values. This counter is used to calculate the min, max and average
time taken to scan and process Parquet footers per query per node.

The RuntimeProfile has also been updated to keep a track of, display
and serialize this new counter to thrift.

BE tests have been added to verify that this counter works fine.

Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile-test.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M common/thrift/RuntimeProfile.thrift
M tests/query_test/test_scanners.py
8 files changed, 303 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/4371/11
-- 
To view, visit http://gerrit.cloudera.org:8080/4371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present

2016-10-28 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4278: Don't abort Catalog startup quickly if HMS is not 
present
..


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4842/3/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

PS3, Line 41: spend on trying to establish connection to HMS the "
: "first time on startup
"wait to establish an initial connection to the HMS"


PS3, Line 75: init_first
init_first seems redundant, here and everywhere else.

How about "initial_hms_cnxn_timeout_s"


http://gerrit.cloudera.org:8080/#/c/4842/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java
File fe/src/main/java/org/apache/impala/catalog/Catalog.java:

Line 67:   protected final MetaStoreClientPool metaStoreClientPool_ = new 
MetaStoreClientPool(0, 0);
long line


PS3, Line 103: public void initMetaStoreClientPool(long 
initFirstClientTimeoutSeconds) {
 : metaStoreClientPool_.addClients(META_STORE_CLIENT_POOL_SIZE,
 : initFirstClientTimeoutSeconds);
 :   }
Why not call this in the c'tor any more? (if timeout is < 0, add no clients. Or 
maybe pass META_STORE_CLIENT_POOL_SIZE as a parameter to the c'tor).


http://gerrit.cloudera.org:8080/#/c/4842/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

PS3, Line 163: super
don't think you need super., do you?


http://gerrit.cloudera.org:8080/#/c/4842/3/fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java
File fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java:

PS3, Line 156: Add 'numClients' to the client pool.
 :* 'initFirstClientTimeoutSeconds' specifies the time (in 
seconds) spent on trying to
 :* initialize the first MetaStore client before giving up.
 :*/
it might be clearer to call this, initially, as addClients(1, 120), then call 
addClients(9, 10) (or whatever the second timeout should be). The current API 
is a little confusing.


http://gerrit.cloudera.org:8080/#/c/4842/3/tests/experiments/test_catalog_hms_failures.py
File tests/experiments/test_catalog_hms_failures.py:

PS3, Line 61: metadatra
metadata


PS3, Line 109: statestored.service.wait_for_live_subscriber
if this fails, I think the test can leave the HMS not running. That will affect 
other tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/4842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I83c70f939429e1d0d20284a1307f3ee1278ae047
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3823: Add timer to measure Parquet footer reads

2016-10-28 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3823: Add timer to measure Parquet footer reads
..


Patch Set 10: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4371/10/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

Line 209:   // This replaces the existing counter with the new values that are 
passed as arguments.
I'm not sure what this comment means -- this is a constructor, so how can the 
counter be existing already?


-- 
To view, visit http://gerrit.cloudera.org:8080/4371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4223: Handle truncated file read from HDFS cache

2016-10-28 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4223: Handle truncated file read from HDFS cache
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4828
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1e1fdb0211819c5938956abb13b512350a46f1a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4048: Misc. improvements to /sessions

2016-10-28 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#2).

Change subject: IMPALA-4048: Misc. improvements to /sessions
..

IMPALA-4048: Misc. improvements to /sessions

* Make table searchable and sortable
* Fix 'last accessed time' being printed in UTC
* Made table contents more compact so that it mostly fits on screen
* Clarify summary text re: active and inactive sessions
* Include fix in mustache-cpp required to print 64-bit integers
  correctly (see
  
https://github.com/henryr/cpp-mustache/commit/29768bf0e84f5a1e95e006fc64996d375499dbda)

Change-Id: I14edcb6d60cf031a62c5a20b2d2b4d23248633a3
Testing: Visual inspection and manual sorting, searching etc.
---
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/thirdparty/mustache/mustache.cc
M www/sessions.tmpl
6 files changed, 79 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/4880/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4880
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14edcb6d60cf031a62c5a20b2d2b4d23248633a3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-4048: Misc. improvements to /sessions

2016-10-28 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4880

Change subject: IMPALA-4048: Misc. improvements to /sessions
..

IMPALA-4048: Misc. improvements to /sessions

* Make table searchable and sortable
* Fix 'last accessed time' being printed in UTC
* Made table contents more compact so that it mostly fits on screen
* Clarify summary text re: active and inactive sessions
* Include fix in mustache-cpp required to print 64-bit integers
  correctly (see
  
https://github.com/henryr/cpp-mustache/commit/29768bf0e84f5a1e95e006fc64996d375499dbda)

Change-Id: I14edcb6d60cf031a62c5a20b2d2b4d23248633a3
Testing: Visual inspection and manual sorting, searching etc.
---
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/thirdparty/mustache/mustache.cc
M www/sessions.tmpl
6 files changed, 74 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/4880/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4880
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I14edcb6d60cf031a62c5a20b2d2b4d23248633a3
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4878/2/be/src/bufferpool/buffer-allocator.h
File be/src/bufferpool/buffer-allocator.h:

PS2, Line 38: MUST_USE(Status)
> The type attribute would definitely make it easier to apply, I'm ok with ho
I don't think we need to hold off.  Let's just decide on the syntax and then 
revisit once we can apply to the type.


-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4878/2/be/src/bufferpool/buffer-allocator.h
File be/src/bufferpool/buffer-allocator.h:

PS2, Line 38: MUST_USE(Status)
> Good point!
The type attribute would definitely make it easier to apply, I'm ok with 
holding off until we're in a position to do this, or only using it on key APIs. 
It doesn't look like GCC supports it as a type attribute until the unreleased 
version 7.


-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4862/3/be/src/exec/kudu-scanner.cc
File be/src/exec/kudu-scanner.cc:

Line 183: Tuple* kudu_tuple = 
reinterpret_cast(const_cast(krow.cell(0)));
> Logically it's not necessary, but getting rid of it leads to having to also
Ok, no problem then.


-- 
To view, visit http://gerrit.cloudera.org:8080/4862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR](asf-site) Follow Apache Project Branding Requirements

2016-10-28 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Follow Apache Project Branding Requirements
..


Patch Set 2: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4683/2/community.html
File community.html:

Line 147: 
Maybe lose some of this whitespace?


http://gerrit.cloudera.org:8080/#/c/4683/2/downloads.html
File downloads.html:

PS2, Line 161: navbar
Will trust you that this looks correct and is semantically reasonable.


-- 
To view, visit http://gerrit.cloudera.org:8080/4683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I263790d7c9aba7bfc57de7c16a367fe5204c49ff
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4387: validate decimal type in Avro file schema

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-4387: validate decimal type in Avro file schema
..

IMPALA-4387: validate decimal type in Avro file schema

This patch prevents an invalid decimal type in an Avro file schema from
crashing Impala. Most invalid Avro schemas are caught by the frontend,
but file schemas still need to be validated by the backend.

After this patch files with bad schemas are skipped.

Testing:
This was hit very rarely by the scanner fuzzing. Added a regression test that 
scans
a file with a bad schema.

Change-Id: I25a326ee2220bc14d3b5f887dc288b4adf859cfc
---
M be/src/exec/hdfs-avro-scanner.cc
M be/src/runtime/decimal-test.cc
M be/src/runtime/types.h
M be/src/util/avro-util.cc
M be/src/util/avro-util.h
M common/thrift/generate_error_codes.py
M testdata/bad_avro_snap/README
A testdata/bad_avro_snap/invalid_decimal_schema.avro
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
11 files changed, 102 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/4876/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4876
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I25a326ee2220bc14d3b5f887dc288b4adf859cfc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

2016-10-28 Thread Alex Behm (Code Review)
Hello Matthew Jacobs, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4862

to look at the new patch set (#5).

Change subject: IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.
..

IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

Implements additional changes to make the memory layout
of Kudu rows identical to Impala tuples.
In particular, Kudu rows allocate a null bit even for
non-nullable columns, and Impala now does the same
for Kudu scan tuples.

This change exploits the now-identical Kudu and Impala
tuple layouts to avoid the expensive translation.

Perf: Mostafa reported a 50% efficiency gain on full
table scans.

Testing: A private core/hdfs run passed.

TODO:
1) Test cases with nullable/nonnullable non-PK slots.
2) Specify mem layout to client (depends on KUDU-1694)
3) Avoid mem copies (depends on KUDU-1695)

Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-scanner.h
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
5 files changed, 67 insertions(+), 202 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/4862/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

2016-10-28 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4862/4/be/src/exec/kudu-scanner.cc
File be/src/exec/kudu-scanner.cc:

PS4, Line 56: 
: // Maximum size of string values returned from Kudu.
: static constexpr size_t MAX_KUDU_STRING_SIZE = 8 * (1 << 20);
> not needed
Oops. Forgot to remove. Done.


-- 
To view, visit http://gerrit.cloudera.org:8080/4862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

2016-10-28 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.
..


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4862/4/be/src/exec/kudu-scanner.cc
File be/src/exec/kudu-scanner.cc:

PS4, Line 56: 
: // Maximum size of string values returned from Kudu.
: static constexpr size_t MAX_KUDU_STRING_SIZE = 8 * (1 << 20);
not needed


-- 
To view, visit http://gerrit.cloudera.org:8080/4862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4878/2/be/src/bufferpool/buffer-allocator.h
File be/src/bufferpool/buffer-allocator.h:

PS2, Line 38: MUST_USE(Status)
> I guess we'd have to wait for C++17, where in the link Tim sent (http://en.
Good point!

I guess we could pass "c++1z" rather than "c++14", but we might also need to 
upgrade to a newer gcc in the toolchain:

https://gcc.gnu.org/projects/cxx-status.html#cxx1z

Another idea is to switch to clang for all builds. I tested it, and clang 
allows this attribute on types:

struct[[gnu::warn_unused_result]] Foo {
  int bar;
};

Foo Baz() { return {11}; }

int main() {
  Baz();
  const auto qux = Baz();   

  return qux.bar;
}

GCC (4.9.2) warns about using the attribute in the wrong place. Clang 
(3.8.0-p1) warns about the unused return value from the Baz call on the first 
line of main.


-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4314: Standardize on MT-related data structures

2016-10-28 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4314: Standardize on MT-related data structures
..


Patch Set 3: Code-Review+1

(4 comments)

Looked at .cc / .thrift only.

http://gerrit.cloudera.org:8080/#/c/4853/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS3, Line 246: boost
remove boost::


PS3, Line 456: runtime
Will you file a JIRA to fix this?


http://gerrit.cloudera.org:8080/#/c/4853/3/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

Line 116:   void Validate() const;
Still missing mention of how this signals failure.


PS3, Line 201: /// TODO: why are these part of the schedule?
The alternative is that there is a pointer to the enclosing QueryExecState.


-- 
To view, visit http://gerrit.cloudera.org:8080/4853
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4878/2/be/src/bufferpool/buffer-allocator.h
File be/src/bufferpool/buffer-allocator.h:

PS2, Line 38: MUST_USE(Status)
> I would prefer MUST_USE(x) better as an API user, since I feel it gives me 
I guess we'd have to wait for C++17, where in the link Tim sent 
(http://en.cppreference.com/w/cpp/language/attributes) it appears to become 
usable as a type attribute.


-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4878/2/be/src/bufferpool/buffer-allocator.h
File be/src/bufferpool/buffer-allocator.h:

PS2, Line 38: MUST_USE(Status)
> This is going to get pretty noisy if we start tagging all return types like
I would prefer MUST_USE(x) better as an API user, since I feel it gives me 
information about the type like const and volatile do.

Unfortunately, it looks like warn_unused_result is a function attribute, not a 
type attribute: 
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes


-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4878/2//COMMIT_MSG
Commit Message:

PS2, Line 7: IMPALA-3200 (buffer pool)
IMPALA-2615


http://gerrit.cloudera.org:8080/#/c/4878/2/be/src/bufferpool/buffer-allocator.h
File be/src/bufferpool/buffer-allocator.h:

PS2, Line 38: MUST_USE(Status)
This is going to get pretty noisy if we start tagging all return types like 
this.  In the future (perhaps after we fix the warnings over time), will it be 
feasible to mark the Status type itself with this attribute?

And until then, an alternate syntax worth considering is to put the attribute 
after the function decl, like how Kudu does it, e.g.:

Status SetFlushMode(FlushMode m) WARN_UNUSED_RESULT;

To my eyes that's a bit easier to read since it's not so central to the 
declaration. But I don't feel too strongly about it if the consensus prefers 
the current syntax.


-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..


Patch Set 2: Code-Review+1

(2 comments)

+1, not +2, because I'm interested in hearing Dan's thoughts on this as well.

http://gerrit.cloudera.org:8080/#/c/4878/2//COMMIT_MSG
Commit Message:

PS2, Line 9: STATUS_RETURN
MUST_USE now


PS2, Line 10: __attribute__((warn_unused_result))
expands to [[gnu::warn_unused_result]] now


-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4348 / IMPALA-4333: Improve coordinator fragment cancellation

2016-10-28 Thread Henry Robinson (Code Review)
Henry Robinson has submitted this change and it was merged.

Change subject: IMPALA-4348 / IMPALA-4333: Improve coordinator fragment 
cancellation
..


IMPALA-4348 / IMPALA-4333: Improve coordinator fragment cancellation

This patch fixes two problems:

1. If a fragment instance is cancelled (by the coordinator) concurrently
to the coordinator calling GetNext() on its PlanRootSink, GetNext() may
access a destroyed PlanRootSink object and crash (IMPALA-4333).

2. If ExecPlanFragment() times out, but is successful, the coordinator
will never call CloseConsumer() to release its side of the PlanRootSink,
and the fragment instance will never finish (IMPALA-4348).

The following changes address this:

* Make PlanFragmentExecutor::Cancel() call CloseConsumer() on its root
  sink. This ensures that, no matter what the reason, the fragment
  instance will eventually terminate. Otherwise the coordinator may not
  call CloseConsumer() if it could not access the root instance's
  FragmentExecState.

* Ensure that the PlanRootSink has a lifetime at least as long as the
  coordinator object by taking a shared_ptr reference to its parent
  FragmentExecState object. Even if the fragment instance finishes, the
  object will still be valid until the coordinator releases this
  reference. In the future, we expect this to be replaced by an explicit
  reference take / relinquish API.

* Ensure the coordinator tries to cancel fragment instances if any
  attempt was made to start them, not just if the RPC was a
  success. This deals with the timeout case where the RPC was slow, but
  successful.

This patch relies on a fix for IMPALA-4383, which ensures that fragment
instances will always try to send reports, and so will always detect an
error and cancel themselves.

Testing:

* Ran TestRPCTimeout.test_execplanfragment_timeout in a loop for 90
minutes. Patch addresses hangs and failure modes that were observed
previously.

* Added a manual sleep just before Coordinator::GetNext() calls
  root_sink_->GetNext(), and cancelled a query manually in that
  window. Confirmed that query cancels successfully.

Change-Id: I5b4e25c1d658b3929182ba5e56b5c5e881dd394a
Reviewed-on: http://gerrit.cloudera.org:8080/4865
Tested-by: Internal Jenkins
Reviewed-by: Henry Robinson 
---
M be/src/exec/plan-root-sink.cc
M be/src/exec/plan-root-sink.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/plan-fragment-executor.cc
5 files changed, 65 insertions(+), 35 deletions(-)

Approvals:
  Henry Robinson: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4865
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I5b4e25c1d658b3929182ba5e56b5c5e881dd394a
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4383: Ensure plan fragment report thread is always started

2016-10-28 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4383: Ensure plan fragment report thread is always 
started
..


Patch Set 2: Code-Review+2 Verified+1

Exhaustive build passed, GVO passed in conjunction with 
https://gerrit.cloudera.org/#/c/4865/

-- 
To view, visit http://gerrit.cloudera.org:8080/4864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5d3cab4cc5245014758e6b70dec7a706a7fa929b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4348 / IMPALA-4333: Improve coordinator fragment cancellation

2016-10-28 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4348 / IMPALA-4333: Improve coordinator fragment 
cancellation
..


Patch Set 6: Code-Review+2

Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/4865
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5b4e25c1d658b3929182ba5e56b5c5e881dd394a
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4383: Ensure plan fragment report thread is always started

2016-10-28 Thread Henry Robinson (Code Review)
Henry Robinson has submitted this change and it was merged.

Change subject: IMPALA-4383: Ensure plan fragment report thread is always 
started
..


IMPALA-4383: Ensure plan fragment report thread is always started

PlanFragmentExecutor::report_thread_active_ was set during
OpenInternal() after ReportProfile() - run in a different thread -
signalled that it was running.

Howeer, ReportProfile() reads report_thread_active_ to determine if it
should exit its loop. If it runs fast enough, ReportProfile() will never
see report_thread_active_ == true.

This patch moves setting report_thread_active_ to ReportProfile() -
slightly earlier, but visible at almost the same time because it is now
correctly protected by report_thread_lock_.

Testing:
* TestRPCTimeout would hit this in certain configurations. Ran
  test_execplanfragment_timeout() for 90 minutes with no failures;
  previously it would fail within the first five attempts repeatedly.

Change-Id: I5d3cab4cc5245014758e6b70dec7a706a7fa929b
Reviewed-on: http://gerrit.cloudera.org:8080/4864
Reviewed-by: Henry Robinson 
Tested-by: Henry Robinson 
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
2 files changed, 6 insertions(+), 2 deletions(-)

Approvals:
  Henry Robinson: Looks good to me, approved; Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I5d3cab4cc5245014758e6b70dec7a706a7fa929b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

2016-10-28 Thread Alex Behm (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4862

to look at the new patch set (#4).

Change subject: IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.
..

IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

Implements additional changes to make the memory layout
of Kudu rows identical to Impala tuples.
In particular, Kudu rows allocate a null bit even for
non-nullable columns, and Impala now does the same
for Kudu scan tuples.

This change exploits the now-identical Kudu and Impala
tuple layouts to avoid the expensive translation.

Perf: Mostafa reported a 50% efficiency gain on full
table scans.

Testing: A private core/hdfs run passed.

TODO:
1) Test cases with nullable/nonnullable non-PK slots.
2) Specify mem layout to client (depends on KUDU-1694)
3) Avoid mem copies (depends on KUDU-1695)

Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
---
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-scanner.h
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
5 files changed, 70 insertions(+), 202 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/4862/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

2016-10-28 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4862/3//COMMIT_MSG
Commit Message:

Line 17: 
> TODO:
Done


http://gerrit.cloudera.org:8080/#/c/4862/3/be/src/exec/kudu-scanner.cc
File be/src/exec/kudu-scanner.cc:

Line 183: Tuple* kudu_tuple = 
reinterpret_cast(const_cast(krow.cell(0)));
> Is the const_cast necessary? We shouldn't be modifying the memory. Ok to le
Logically it's not necessary, but getting rid of it leads to having to also fix 
these:
* Tuple::DeepCopy() is not a const function
* ExecNode::EvalConjuncts() takes a TupleRow* and not a const TupleRow*
* ExprContext::Get*Val() take TupleRow* and not a const TupleRow*. These are 
called from EvalConjuncts().


Line 198:   *batch_done = true;
> Not your change but I think we should set *batch_done = false up the top of
Done


PS3, Line 215: 8 * (1 << 20)
> Let's make this a named constant just for clarity.
Removed this function in response to Matt's comment.


http://gerrit.cloudera.org:8080/#/c/4862/3/be/src/exec/kudu-scanner.h
File be/src/exec/kudu-scanner.h:

PS3, Line 79:   /// Asserts that all string slots in the given tuple have less 
than 8MB of data.
:   /// Kudu should never return values larger than 8MB.
:   void ValidateVarLenKuduData(const Tuple* tuple) const;
> why do we need to bother checking this since it's their limitation not ours
removed


-- 
To view, visit http://gerrit.cloudera.org:8080/4862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3724: Support Kudu non-covering range partitions

2016-10-28 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3724: Support Kudu non-covering range partitions
..


Patch Set 2:

(10 comments)

Had a pass over the tests.

http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/main/java/org/apache/impala/util/KuduUtil.java
File fe/src/main/java/org/apache/impala/util/KuduUtil.java:

Line 117:* Builds and returns a range partition bound to be used in the 
creation of a Kudu
remove "to be" to disambiguate ("bound" could be a interpreted as noun or 
"bound to be" as a verb")

also use "range-partition bound"


Line 167: TExprNode literal = boundaryVal.getNodes().get(0);
check state that boundaryVal is a literal expr


http://gerrit.cloudera.org:8080/#/c/4856/2/fe/src/test/java/org/apache/impala/analysis/ParserTest.java
File fe/src/test/java/org/apache/impala/analysis/ParserTest.java:

Line 1632: //TestUtils.assumeKuduIsSupported();
I know why this is here :)


Line 2386: ParserError("CREATE TABLE Foo (i int) DISTRIBUTE BY RANGE(i) 
SPLIT ROWS ((1),(2))");
remove (there many things we don't support)


Line 2408: ParserError("CREATE TABLE Foo (a int) DISTRIBUTE BY RANGE (a) 
()");
Test that something like >= does not parse.

We might consider parsing all comparison ops and giving a nicer error message 
during analysis. You can think about it.


Line 2419: ParserError("CREATE TABLE Foo (a int) DISTRIBUTE BY RANGE (a) " +
add tests with PARTITION 1 < VALUE and PARTITION VALUES = 10 (mismatched 
comparison op and VALUE/VALUES)


Line 2561: ParsesOk("CREATE TABLE Foo PRIMARY KEY (a, b) DISTRIBUTE BY HASH 
(b) INTO 2 " +
add one CTAS test with RANGE and some PARTITIONs


http://gerrit.cloudera.org:8080/#/c/4856/2/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_create.test:

Line 91: (partitiion values <= 1, partition 1 < value <= 10, partition 10 < 
values) stored as kudu;
typo: partitiion


http://gerrit.cloudera.org:8080/#/c/4856/2/testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test:

Line 68:   (partition values <= 10, partition 10 < values <=20, partition 20 < 
values <= 30,
nit: space after second <=


Line 127:   primary key (id, name)) distribute by hash(id) into 4 buckets, 
range(id, name)
is there a limit on the number of range partitions?


-- 
To view, visit http://gerrit.cloudera.org:8080/4856
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6799c01a37003f0f4c068d911a13e3f060110a06
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4878/1//COMMIT_MSG
Commit Message:

Line 14: The macro is only used in the buffer pool for now, but we can use it in
> I saw a JIRA you filed about this where you found a likely bug. Maybe refer
Done


http://gerrit.cloudera.org:8080/#/c/4878/1/be/src/common/status.h
File be/src/common/status.h:

PS1, Line 262: __attribute__((warn_unused_result))
> This can be [[gnu::warn_unused_result]] with the new C++11 syntax
Done. Looks like eventually this will be standardised as [[nodiscard]] 
http://en.cppreference.com/w/cpp/language/attributes


Line 262: #define STATUS_RETURN __attribute__((warn_unused_result)) Status
> How would you feel about making this a function-like macro, like MUST_USE(T
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..

IMPALA-3200 (buffer pool): warn if Status is ignored

This introduces a STATUS_RETURN macro that expands to
__attribute__((warn_unused_result)) Status. It can be used in function
declarations in place of Status to issue warnings if a return Status
is ignored. E.g. I found a bug IMPALA-4391 by applying it to some other
places in the code.

The macro is only used in the buffer pool for now, but we can use it in
other places in follow-up patches if we agree on the pattern.

Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
---
M be/src/bufferpool/buffer-allocator.h
M be/src/bufferpool/buffer-pool.h
M be/src/common/status.h
3 files changed, 16 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4878/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 


[Impala-ASF-CR] IMPALA-4379: Fix and test Kudu table type checking

2016-10-28 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4379: Fix and test Kudu table type checking
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4857/2/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test:

Line 46: ImpalaRuntimeException: Table 'simple_new' does not represent a Kudu 
table. Expected storage_handler 'com.cloudera.kudu.hive.KuduStorageHandler' but 
found 'foo'
> Agree, I'm inclined to disallow altering storage handlers altogether. Why n
They have weird behavior that I think is bad too.

It doesn't allow you to use ALTER TABLE for any non-native tables.

For native tables, it allows you to set storage_handler, but that will make the 
table a non-native table and then it can't be changed or even dropped.

hive> alter table i1740_alter set tblproperties ('storage_handler'='');
OK
Time taken: 0.132 seconds
hive> alter table i1740_alter set tblproperties ('storage_handler'='foo');
FAILED: SemanticException [Error 10134]: ALTER TABLE cannot be used for a 
non-native table i1740_alter

After this, nothing works on this table. I think I'll have to go to Mysql to 
drop it.


-- 
To view, visit http://gerrit.cloudera.org:8080/4857
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I475273cbbf4110db8d0f78ddf9a56abfc6221e3e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3342: Add thread counters to monitor plan fragment 
execution
..


Patch Set 7: Code-Review+1

(2 comments)

LGTM once you make a couple of small changes.

http://gerrit.cloudera.org:8080/#/c/4633/7/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS7, Line 137: PlanFragmentThread
Let's call it "PlanFragmentThreads" to make it clear that there may be multiple 
threads.


http://gerrit.cloudera.org:8080/#/c/4633/7/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 365:   /// Total CPU utilization for all threads in this plan fragment.
Let's group it with the other timers - above total_storage_wait_timer_


-- 
To view, visit http://gerrit.cloudera.org:8080/4633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add functional tests for compute stats with mt dop > 0.

2016-10-28 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4879

Change subject: Add functional tests for compute stats with mt_dop > 0.
..

Add functional tests for compute stats with mt_dop > 0.

Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
---
A 
testdata/workloads/functional-query/queries/QueryTest/mt-dop-compute-stats.test
M tests/query_test/test_mt_dop.py
2 files changed, 20 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/4879/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4879
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Icd4e7e44f9f23f66e59ad1fb298e13da76ad817a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] Add all build targets to CMake and speed up builds

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add all build targets to CMake and speed up builds
..


Patch Set 3:

Confirmed I was able to build on all of our supported platforms with the 
change. Will wait until our builds stabilise before merging.

-- 
To view, visit http://gerrit.cloudera.org:8080/4790
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I23617adf13bdeb034c24f6bba14b5ae480e8dd26
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4372: 'Describe formatted' returns types in upper case

2016-10-28 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-4372: 'Describe formatted' returns types in upper case
..


Patch Set 2:

test_describe_formatted did not catch this bug because it was silently failing.

The primary issue there was that it wasn't passing a 'transport' param to the 
jdbc client, and it uses the workload runner infrastructure, which logs errors 
it encounters but doesn't fail tests (in 
query_exec_functions.py:run_query_capture_results). To prevent other tests from 
making the same mistake, I added an assert in JdbcQueryExecConfig that ensures 
that the transport is set.

There was also an incorrect number of parameters being passed to the 
HiveQueryResult constructor in create_exec_result that I fixed.

Finally, it turns out that we handle NULLs differently from Hive. For now, I've 
left our NULL behavior as is, and just modified the test to allow NULLs to 
equal empty strings, which also allowed me to remove an xfail from the test.

-- 
To view, visit http://gerrit.cloudera.org:8080/4861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274b97d4d1247244247fb38a5ca7f4c10bba8d22
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1286: Extract common conjuncts from disjunctions.

2016-10-28 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-1286: Extract common conjuncts from disjunctions.
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4877/1/fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java
File fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java:

Line 42: if (!Expr.IS_OR_PREDICATE.apply(expr)) return expr;
Isn't this check a little weak in the sense that it only checks the root's Op 
type to be OR. IIUC, a case like ((a OR b) OR (a AND c)) still passes and 
returns 'a' as the common conjunct. Is it better to have a check that makes 
sure the expr is in a proper disjunctive normal form?


http://gerrit.cloudera.org:8080/#/c/4877/1/testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test:

Line 3384:   p_brand = 'Brand#12'
I'm not really sure how this patch handles a bushy predicate. Do we flatten 
them somewhere and convert it to a DNF before applying the new rule? (May be I 
misread the code)

For example: ((a or (b and c)) or ((a and b) or (d or f)))

The tests here and elsewhere seem to be handling simple predicates, so I'm 
wondering how cases like above are dealt with.


-- 
To view, visit http://gerrit.cloudera.org:8080/4877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4372: 'Describe formatted' returns types in upper case

2016-10-28 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#2).

Change subject: IMPALA-4372: 'Describe formatted' returns types in upper case
..

IMPALA-4372: 'Describe formatted' returns types in upper case

A recent change caused 'describe formatted' to display the types
in all upper case, but we want 'describe formatted' to match Hive's
'describe' output, which displays the types in lower case.

This patch also fixes several problems with test_describe_formatted,
which was encountering an error but reporting success.

Change-Id: I274b97d4d1247244247fb38a5ca7f4c10bba8d22
---
M fe/src/main/java/org/apache/impala/catalog/Column.java
M testdata/workloads/functional-query/queries/QueryTest/avro-schema-changes.test
M tests/common/impala_test_suite.py
M tests/metadata/test_metadata_query_statements.py
M tests/performance/query_exec_functions.py
M tests/performance/query_executor.py
6 files changed, 43 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/4861/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I274b97d4d1247244247fb38a5ca7f4c10bba8d22
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4878/1//COMMIT_MSG
Commit Message:

Line 14: The macro is only used in the buffer pool for now, but we can use it in
I saw a JIRA you filed about this where you found a likely bug. Maybe reference 
that here?


http://gerrit.cloudera.org:8080/#/c/4878/1/be/src/common/status.h
File be/src/common/status.h:

Line 262: #define STATUS_RETURN __attribute__((warn_unused_result)) Status
How would you feel about making this a function-like macro, like MUST_USE(Type) 
__attribute__((warn_unused_result)) Type?


PS1, Line 262: __attribute__((warn_unused_result))
This can be [[gnu::warn_unused_result]] with the new C++11 syntax


-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

2016-10-28 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4862/3//COMMIT_MSG
Commit Message:

Line 17: 
TODO:
1) Test cases with nullable/nonnullable slots.
2) Specify mem layout to client (depends on KUDU-1694)
3) Avoid mem copies (depends on KUDU-1695)


http://gerrit.cloudera.org:8080/#/c/4862/3/be/src/exec/kudu-scanner.h
File be/src/exec/kudu-scanner.h:

PS3, Line 79:   /// Asserts that all string slots in the given tuple have less 
than 8MB of data.
:   /// Kudu should never return values larger than 8MB.
:   void ValidateVarLenKuduData(const Tuple* tuple) const;
why do we need to bother checking this since it's their limitation not ours? if 
they change their buffer sizes in the future then we have to change this as 
well.


-- 
To view, visit http://gerrit.cloudera.org:8080/4862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3552: make incremental stats max serialized size configurable

2016-10-28 Thread Yonghyun Hwang (Code Review)
Yonghyun Hwang has posted comments on this change.

Change subject: IMPALA-3552: make incremental stats max serialized size 
configurable
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4867/1/be/src/catalog/catalog.cc
File be/src/catalog/catalog.cc:

PS1, Line 78: catalog_class_, catalog_ctor_,
:   load_in_background, num_metadata_loading_threads, 
sentry_config,
:   FlagToTLogLevel(FLAGS_v), 
FlagToTLogLevel(FLAGS_non_impala_java_vlog),
:   auth_to_local, principal
> No for both.
Then, let's include it here. :)


http://gerrit.cloudera.org:8080/#/c/4867/1/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

PS1, Line 83: boolean loadInBackground, int numMetadataLoadingThreads,
:   String sentryServiceConfig, int impalaLogLevel, int 
otherLogLevel,
:   boolean allowAuthToLocal, String kerberosPrincipal
> Its a good ramp-up task and I think we should do it here. This clean-up has
Agreed. However, I think current fix contains enough changes. I will create a 
new jira ticket and take care of the clean up right after this commit. I want 
to minimize the scope of changes for one commit.


-- 
To view, visit http://gerrit.cloudera.org:8080/4867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I33684725a61eabc67237503e61178305d37d3cb5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Yonghyun Hwang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Huaisi Xu 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: Yonghyun Hwang 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input
..


Patch Set 4:

(17 comments)

Thanks for adding the tests, this is looking pretty good - I think it could be 
refined and polished a bit more but no major concerns here.

http://gerrit.cloudera.org:8080/#/c/4863/4/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

Line 268:   // Pass the row batch to the writer. If new_file is returned true 
then the current
I find this comment a little confusing. Maybe explain more the "why", i.e. that 
the rows may be written to multiple files.


Line 272:   do {
I feel like this might be clearer if you write it as while(true) and then break 
out if 'new_file' is false.


Line 288:   // TODO: Should we adapt hdfs writer to accept start,num pair 
instead of a vector?
My guess is that it would only help a little bit. It's nice to use the same 
code path for this and dynamic partitioning at least for now.


Line 296: PartitionPair* next_partition_pair = NULL;
It would be more efficient to keep track of the previous key and comparing 
directly instead of looking up the partition hash table for every row. It's a 
little more complicated but I think would help perf.


PS4, Line 301: Flush
I think it might be clearer to just say "Write rows" instead of "flush". Flush 
makes me think it's writing to the filesystem, but it's really just pushing the 
rows into the HdfsTableWriter, which does its own buffering.


Line 301: // Flush previous partition
I think the comments in here are a little too low level. Here maybe a single 
comment along the lines of "Done with previous partition - write rows and 
close." might be more helpful.


Line 310: // Collect row index
This comment isn't too helpful.


PS4, Line 313: Flush
"Write rows to" is probably clearer than "flush".


PS4, Line 607: WriteRowsOfPartition
WriteRowsToPartition?


http://gerrit.cloudera.org:8080/#/c/4863/4/be/src/exec/hdfs-table-sink.h
File be/src/exec/hdfs-table-sink.h:

PS4, Line 183: rows
Maybe reword to be clearer, took me a while to figure out:
"a vector of indices of the rows in the current batch to insert into the 
partition"


Line 209:   /// Writes all rows of a partition to the partitions writer and 
clears the row index
This could be clearer that it's only writing the rows with indices in 
'partition_pair', it kind-of sounds like it's writing all the rows in 'batch' 
into the partition.


PS4, Line 209: partitions
partition's (missing apostrophe).


PS4, Line 215:  is expected to
This could be more declarative - i.e. "must" instead of "is expected to"


http://gerrit.cloudera.org:8080/#/c/4863/4/tests/query_test/test_insert.py
File tests/query_test/test_insert.py:

PS4, Line 71: text
parquet, right?


http://gerrit.cloudera.org:8080/#/c/4863/4/tests/query_test/test_insert_behaviour.py
File tests/query_test/test_insert_behaviour.py:

Line 476: 
Hmm, I think we could also do with a test that writes some large partitions 
split across files. E.g. do a CTAS of lineitem partitioned by some low-NDV 
column. We could also maybe adjust PARQUET_FILE_SIZE to force it to split it 
into more files.

Depending on how long it takes, might make sense to only run it in exhaustive.


Line 477: 
Extra blank line


Line 501:   assert len(files) == 1
Comment why we only expect one file - it might not be that obvious to people 
looking at this code later.


-- 
To view, visit http://gerrit.cloudera.org:8080/4863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1286: Extract common conjuncts from disjunctions.

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1286: Extract common conjuncts from disjunctions.
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4877/1/fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java
File fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java:

Line 31:  * Extracts common conjuncts from disjunctive CompoundPredicates.
Might be worth commenting on how this handles > 2 disjuncts. I.e. you need to 
apply the rule from the bottom-up.


Line 49:   if (child1Conjuncts.contains(conjunct)) {
This is n^2 so will get very slow if we have long lists of disjuncts, e.g. 
machine-generated queries. This is fine for small n but I think we should 
seriously consider switching to HashMaps for child1Conjuncts and 
commonConjuncts for n > some threshold. The core logic I think is the same if 
you factor it out into a function that operates on Collection.


Line 72: if (!child0Conjuncts.isEmpty()) {
At this point both child0Conjuncts and child1Conjuncts are non-empty because we 
would have returned early otherwise. So I think these branches are unnecessary.


PS1, Line 84: newDisjuncts.size() > 1
See above - this should always be true.


http://gerrit.cloudera.org:8080/#/c/4877/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 146: "(!(int_col=5 or tinyint_col > 9) and double_col = 7) or " +
Slightly tangential, but should we apply De Morgan's first to normalise the 
disjunct into a conjunct? That would help if we had something like:

  (!(int_col=5 or tinyint_col > 9) and double_col = 7) or (!(int_col = 5) and 
!(tinyint_col > 9) and double_col = 8)

I guess in general it would be nice to do some basic normalisation of 
expressions in other cases, e.g. convert !(tinyint_col > 9) to tinyint_col <= 
9, convert 9 < tinyint_col to tinyint_col <= 9, etc.

I think that would make the common conjunct extraction a bit more robust. You 
can rewrite the query but avoiding that is the motivation for this in the first 
place, so it would be nice if it worked in some more simple cases.


Line 153: "(int_col < 10 and id between 5 and 6) or " +
Does common conjunct extraction work if you have a BETWEEN expression and the 
equivalent expression after the rewrite? Should work if we apply the extraction 
after the BETWEEN rewrite, but would be nice to test.


-- 
To view, visit http://gerrit.cloudera.org:8080/4877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution

2016-10-28 Thread anujphadke (Code Review)
anujphadke has posted comments on this change.

Change subject: IMPALA-3342: Add thread counters to monitor plan fragment 
execution
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4633/6//COMMIT_MSG
Commit Message:

Line 9: This change removes the use of total_cpu_timer which incorrectly
> Yes
Done


http://gerrit.cloudera.org:8080/#/c/4633/6/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

Line 138
> Why don't we create the thread counters here like before? This seems much c
Done and I can access the private variable here too.


http://gerrit.cloudera.org:8080/#/c/4633/6/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 242:   /// 'failed_allocation_size' is zero, nothing about the allocation 
size is logged.
> The convention is that the trailing underscore means that the member is pri
Done.
I start the timer in the runtime-state instead of in the plan-fragment-executor 
object.


-- 
To view, visit http://gerrit.cloudera.org:8080/4633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution

2016-10-28 Thread anujphadke (Code Review)
anujphadke has uploaded a new patch set (#7).

Change subject: IMPALA-3342: Add thread counters to monitor plan fragment 
execution
..

IMPALA-3342: Add thread counters to monitor plan fragment execution

This change removes the use of total_cpu_timer which incorrectly
monitors the CPU time. Adding THREAD_COUNTERS to measure the user
and sys time in plan fragment execution. This also accounts for the
time spent in the hdfs/kudu scanner and in a blocking join.

Snippet of a query plan with the newly added PlanFragment
THREAD_COUNTERS:

Fragment F00:
  Instance 22405f2d916ba9e6:41f8692c0009
  .
 - PerHostPeakMemUsage: 978.06 MB (1025568440)
 - PlanFragmentThreadInvoluntaryContextSwitches: 12.85K (12846)
 - PlanFragmentThreadTotalWallClockTime: 5m25s
   - PlanFragmentThreadSysTime: 147.673ms
   - PlanFragmentThreadUserTime: 20s710ms
 - PlanFragmentThreadVoluntaryContextSwitches: 13.43K (13427)

Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300
---
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/kudu-scan-node.cc
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
7 files changed, 19 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/4633/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-3487: validate decimal type in Avro file schema

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4876

Change subject: IMPALA-3487: validate decimal type in Avro file schema
..

IMPALA-3487: validate decimal type in Avro file schema

This patch prevents an invalid decimal type in an Avro file schema from
crashing Impala. Most invalid Avro schemas are caught by the frontend,
but file schemas still need to be validated by the backend.

After this patch files with bad schemas are skipped.

Testing:
This was hit very rarely by the scanner fuzzing. Added a regression test that 
scans
a file with a bad schema.

Change-Id: I25a326ee2220bc14d3b5f887dc288b4adf859cfc
---
M be/src/exec/hdfs-avro-scanner.cc
M be/src/runtime/decimal-test.cc
M be/src/runtime/types.h
M be/src/util/avro-util.cc
M be/src/util/avro-util.h
M common/thrift/generate_error_codes.py
M testdata/bad_avro_snap/README
A testdata/bad_avro_snap/invalid_decimal_schema.avro
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
11 files changed, 102 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/4876/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4876
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I25a326ee2220bc14d3b5f887dc288b4adf859cfc
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3200 (buffer pool): warn if Status is ignored

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4878

Change subject: IMPALA-3200 (buffer pool): warn if Status is ignored
..

IMPALA-3200 (buffer pool): warn if Status is ignored

This introduces a STATUS_RETURN macro that expands to
__attribute__((warn_unused_result)) Status. It can be used in function
declarations in place of Status to issue warnings if a return Status
is ignored.

The macro is only used in the buffer pool for now, but we can use it in
other places in follow-up patches if we agree on the pattern.

Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
---
M be/src/bufferpool/buffer-allocator.h
M be/src/bufferpool/buffer-pool.h
M be/src/common/status.h
3 files changed, 16 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4878/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8c26e7532b5f2c7fe167accc73179e8b72b192bc
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.

2016-10-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3346: DeepCopy() Kudu rows into Impala tuples.
..


Patch Set 3: Code-Review+1

(3 comments)

So much simpler. Just a few minor things.

http://gerrit.cloudera.org:8080/#/c/4862/3/be/src/exec/kudu-scanner.cc
File be/src/exec/kudu-scanner.cc:

Line 183: Tuple* kudu_tuple = 
reinterpret_cast(const_cast(krow.cell(0)));
Is the const_cast necessary? We shouldn't be modifying the memory. Ok to leave 
if there's some reason the types make it difficult to avoid.


Line 198:   *batch_done = true;
Not your change but I think we should set *batch_done = false up the top of the 
function instead of relying on the caller to initialize. That's what we do 
almost everywhere else in the code.


PS3, Line 215: 8 * (1 << 20)
Let's make this a named constant just for clarity.


-- 
To view, visit http://gerrit.cloudera.org:8080/4862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic911e4eff9fe98bf28d8a1bab5c9d7e9ab66d9cb
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1286: Extract common conjuncts from disjunctions.

2016-10-28 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4877

Change subject: IMPALA-1286: Extract common conjuncts from disjunctions.
..

IMPALA-1286: Extract common conjuncts from disjunctions.

Adds a new ExprRewriteRule to extract common conjuncts from
disjunctions.

Examples:
(a AND b AND c) OR (b AND d) ==> b AND ((a AND c) OR (d))
(a AND b) OR (a AND b) ==> a AND b
(a AND b AND c) OR (c) ==> c

Testing: Added a new unit test in ExprRewriteRulesTest.

Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
---
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
A fe/src/main/java/org/apache/impala/rewrite/ExtractCommonConjunctRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
7 files changed, 198 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/4877/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4877
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

2016-10-28 Thread Alex Behm (Code Review)
Hello Marcel Kornacker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4746

to look at the new patch set (#9).

Change subject: IMPALA-4309: Introduce Expr rewrite phase and supporting 
classes.
..

IMPALA-4309: Introduce Expr rewrite phase and supporting classes.

Introduces a new phase for rewriting Exprs after analysis and
before subquery rewriting. The transformed Exprs replace the
original ones in analyzed statements. If Exprs were changed,
the whole statement is reset() and re-analyzed, similar to how
subqueries are rewritten. If both Exprs and subqueries are
rewritten there is only one re-analysis of the changed statement.

The following new classes work together to perform transformations:
1. ExprRewriteRule
- base class for Expr transformation rules
2. ExprRewriter
- drives the transformation of Exprs using a list of
  ExprRewriteRules

Statements that have exprs to be rewritten need to implement
a new method rewriteExprs() that accepts an ExprRewriter.

As an example, this patch adds a rule for converting
BetweenPredicates into their equivalent CompoundPredicates.
The BetweenPredicate has been notoriously buggy due to a lack
of such a separate rewrite phase and is now cleaned up.

Testing:
1. Added a new test for checking that the rewrite framework
   covers all relevant statements, clauses and can properly
   handle nested statements and subqueries.
2. Added a new test for ExprRewriteRules and implemented
   tests for the BetweenPredicate rewrite.
2. There are many existing tests for BetweePredicates and
   they all exercise the new rewrite rule/phase.
3. Ran a private core/hdfs run and it passed.

Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StatementBase.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
A fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
A fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java
A fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
A fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
30 files changed, 726 insertions(+), 216 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/4746/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4746
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4348 / IMPALA-4333: Improve coordinator fragment cancellation

2016-10-28 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4348 / IMPALA-4333: Improve coordinator fragment 
cancellation
..


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4865
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5b4e25c1d658b3929182ba5e56b5c5e881dd394a
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No