[Impala-ASF-CR] IMPALA-8063: Add sleep to ImpalaTestSuite::wait for state() loop

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12207 )

Change subject: IMPALA-8063: Add sleep to ImpalaTestSuite::wait_for_state() loop
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaed0a9b292d431a64e22b460c92f6ef57b6abd1e
Gerrit-Change-Number: 12207
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 10 Jan 2019 07:52:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual consistency

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12203 )

Change subject: IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's 
eventual consistency
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1763/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
Gerrit-Change-Number: 12203
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Comment-Date: Thu, 10 Jan 2019 05:57:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual consistency

2019-01-09 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/12203 )

Change subject: IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's 
eventual consistency
..

IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual 
consistency

This patch is a temporary fix to disable tests which fail due to
S3's eventually consistent behavior. The permanent fix would
involve running tests with S3Guard enabled.

Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
---
M tests/metadata/test_compute_stats.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_permutation.py
M tests/query_test/test_nested_types.py
4 files changed, 12 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/12203/3
--
To view, visit http://gerrit.cloudera.org:8080/12203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
Gerrit-Change-Number: 12203
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 


[Impala-ASF-CR] IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual consistency

2019-01-09 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12203 )

Change subject: IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's 
eventual consistency
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12203/2/tests/query_test/test_insert.py
File tests/query_test/test_insert.py:

http://gerrit.cloudera.org:8080/#/c/12203/2/tests/query_test/test_insert.py@26
PS2, Line 26:
> flake8: E501 line too long (99 > 90 characters)
Done



--
To view, visit http://gerrit.cloudera.org:8080/12203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
Gerrit-Change-Number: 12203
Gerrit-PatchSet: 3
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Comment-Date: Thu, 10 Jan 2019 05:23:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8063: Add sleep to ImpalaTestSuite::wait for state() loop

2019-01-09 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12207 )

Change subject: IMPALA-8063: Add sleep to ImpalaTestSuite::wait_for_state() loop
..


Patch Set 1:

Looks good.
Can you also reduce the timeout in tests/webserver/test_web_pages.py:: 
__run_query_and_get_debug_page. A timeout of 100 seconds is very high for a 
test marked as xfail.


--
To view, visit http://gerrit.cloudera.org:8080/12207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaed0a9b292d431a64e22b460c92f6ef57b6abd1e
Gerrit-Change-Number: 12207
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 10 Jan 2019 05:16:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8063: Add sleep to ImpalaTestSuite::wait for state() loop

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12207 )

Change subject: IMPALA-8063: Add sleep to ImpalaTestSuite::wait_for_state() loop
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1762/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaed0a9b292d431a64e22b460c92f6ef57b6abd1e
Gerrit-Change-Number: 12207
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 10 Jan 2019 04:21:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8063: Add sleep to ImpalaTestSuite::wait for state() loop

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12207 )

Change subject: IMPALA-8063: Add sleep to ImpalaTestSuite::wait_for_state() loop
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3628/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/12207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaed0a9b292d431a64e22b460c92f6ef57b6abd1e
Gerrit-Change-Number: 12207
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Thu, 10 Jan 2019 03:48:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8063: Add sleep to ImpalaTestSuite::wait for state() loop

2019-01-09 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12207


Change subject: IMPALA-8063: Add sleep to ImpalaTestSuite::wait_for_state() loop
..

IMPALA-8063: Add sleep to ImpalaTestSuite::wait_for_state() loop

ImpalaTestSuite::wait_for_state() loops waiting for the state
on the connection handle to reach the expected state. This loop
does not sleep, and the get_state() does some logging, so this
loop is generating a large amount of logging.

This adds a 0.5 second sleep to the loop. Running test_web_pages.py
shows a large reduction in logging.

Change-Id: Iaed0a9b292d431a64e22b460c92f6ef57b6abd1e
---
M tests/common/impala_test_suite.py
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/12207/1
--
To view, visit http://gerrit.cloudera.org:8080/12207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iaed0a9b292d431a64e22b460c92f6ef57b6abd1e
Gerrit-Change-Number: 12207
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-6533: Add min-max filter for decimal types on kudu tables.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12113 )

Change subject: IMPALA-6533: Add min-max filter for decimal types on kudu 
tables.
..

IMPALA-6533: Add min-max filter for decimal types on kudu tables.

The code mimics the code written for other min-max filters.  Decimal data
can be stored using 4 bytes, 8 bytes and 16 bytes.  The code respectively
handles these 3 storage configurations.  The column definition states the
precision and the precision determines the storage size.

The minimum and maximum values are stored in a union.  The precision from
the column will come in as an input.  Based on the precision the size will be
found, and depending on the size appropriate variable will be used.

The code in min-max-filter* follows the general convention of the file, hence
uses macros.

The test includes 24 decimal columns (as listed below) with the following joins:
1.  Inner Join with broadcast (2 tables)
  1a. 1 predicate
  1b. 4 predicates - all results in decimal min-max filter
  1c. 4 predicates - 3 results in decimal min=max filter; 1 doesn't
2.  Inner Join with Shuffle (3 tables)
3.  Right outer join (2 tables)
4.  Left Semi join (2 tables)
5.  Right Semi join (2 tables)

Decimal Columns:
4bytes:
(5,0), (5,1), (5,3), (5,5)
(9,0), (9,1), (9,5), (9,9)
8 bytes:
(14,0), (14,1), (14,7), (14,14)
(18,0), (18,1), (18,9), (18,18)
16 bytes:
(28,0), (28,1), (28,14), (28,28)
(38,0), (38,1), (38,19), (38,38)

The test aggregates the count of probe rows.  This shows that the min-max filter
is exercised, because the number of probe rows is less than the total number
of rows in the probe side table.  The count of probe rows is considered to be
deterministic.  But, it will be beneficial to look out for changes in Kudu that 
can
change the way data is partitioned.  Such a change could change the probe row 
count
and in that case, the test will have to be updated.

impala_test_suite.py and test_result_verifier.py are enhanced to support saving
of aggregation using update_results.

Change-Id: Ib7e7278e902160d7060f8097290bc172d9031f94
Reviewed-on: http://gerrit.cloudera.org:8080/12113
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/filter-context.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/decimal-value.h
M be/src/runtime/decimal-value.inline.h
M be/src/util/min-max-filter-ir.cc
M be/src/util/min-max-filter-test.cc
M be/src/util/min-max-filter.cc
M be/src/util/min-max-filter.h
M bin/rat_exclude_files.txt
M common/thrift/Data.thrift
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M testdata/data/README
A testdata/data/decimal_rtf_tbl.txt
A testdata/data/decimal_rtf_tiny_tbl.txt
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A 
testdata/workloads/functional-query/queries/QueryTest/decimal_min_max_filters.test
M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
M tests/common/impala_test_suite.py
M tests/common/test_result_verifier.py
M tests/query_test/test_runtime_filters.py
22 files changed, 7,321 insertions(+), 23 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/12113
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ib7e7278e902160d7060f8097290bc172d9031f94
Gerrit-Change-Number: 12113
Gerrit-PatchSet: 19
Gerrit-Owner: Janaki Lahorani 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Janaki Lahorani 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6533: Add min-max filter for decimal types on kudu tables.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12113 )

Change subject: IMPALA-6533: Add min-max filter for decimal types on kudu 
tables.
..


Patch Set 18: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12113
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7e7278e902160d7060f8097290bc172d9031f94
Gerrit-Change-Number: 12113
Gerrit-PatchSet: 18
Gerrit-Owner: Janaki Lahorani 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Janaki Lahorani 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 Jan 2019 03:32:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7666: Adding an opaque client identifier to query options.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12130 )

Change subject: IMPALA-7666: Adding an opaque client identifier to query 
options.
..

IMPALA-7666: Adding an opaque client identifier to query options.

We sometimes struggle to identify the client (e.g., a given version of a
JDBC driver, Tableau, Hue, etc.) for a given query. This commit adds a
User-Agent header style, called "Client Identifier", which clients can
set as a Query Option. Nothing is done with this header, but it's
written into logs and query profiles.

This commit includes changes to impala-shell to include the version of
impala shell with an associated test.

A future commit will serialize the name of the py.test being run into
this field, which is handy for figuring out where a query came from.

Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4
Reviewed-on: http://gerrit.cloudera.org:8080/12130
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M shell/impala_shell.py
M tests/shell/test_shell_commandline.py
6 files changed, 30 insertions(+), 5 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/12130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4
Gerrit-Change-Number: 12130
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7666: Adding an opaque client identifier to query options.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12130 )

Change subject: IMPALA-7666: Adding an opaque client identifier to query 
options.
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4
Gerrit-Change-Number: 12130
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 Jan 2019 03:11:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12068 )

Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1761/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
Gerrit-Change-Number: 12068
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 10 Jan 2019 01:54:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-7550 Add documentation to profile counters

2019-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12116 )

Change subject: WIP IMPALA-7550 Add documentation to profile counters
..


Patch Set 2:

(7 comments)

I'm in favour of going forward with this. I guess we'd incrementally document 
counters and as the final step make it impossible to create the counters 
without registering them?

http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/exec/scan-node.cc
File be/src/exec/scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/exec/scan-node.cc@144
PS2, Line 144:   // TODO: What to do with this? We probably shouldn't have 
counters with the same name
It seems like we should have different names. A standard convention would be 
nice though. What you're doing here seems fine. It looks like there's only a 
handful of these time series counters in the codebase to update.


http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/runtime/coordinator.cc@71
PS2, Line 71: unstable
It almost seems like this should be stable. I guess there's some limitations, 
like we don't include CPU time outside of query execution (coordination, 
planning, etc).


http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/util/runtime-profile-counters.h@107
PS2, Line 107: stable
We usually have enum values as all caps - any reason for the difference?


http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/util/runtime-profile-counters.h@132
PS2, Line 132:   const char* name_;
Memory lifetime is non-obvious - document the requirement that the string 
memory has process lifetime?


http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/util/runtime-profile-counters.h@144
PS2, Line 144:   // TODO: live in ExecEnv?
If this approach works out simpler, I think we could make the case that it's 
different from the members of ExecEnv because it's "static" data.


http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/util/runtime-profile-counters.h@149
PS2, Line 149:   void AddPrototype(const ProfileEntryPrototype* prototype) {
I guess this allows duplicate counters for now?

I guess we might get linker errors on duplicate counters if the symbols are 
exported from the compilation units, which I think they are?


http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/util/runtime-profile-counters.h@160
PS2, Line 160: // todo: separate r/w locking
Could skip this - I feel like if we get to the point that this is a bottleneck, 
we've got bigger issues.



--
To view, visit http://gerrit.cloudera.org:8080/12116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idaa0a44f0a996f3487566b545d984d562e6e1588
Gerrit-Change-Number: 12116
Gerrit-PatchSet: 2
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 Jan 2019 01:35:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7986,IMPALA-7987: run daemons in docker containers

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12095 )

Change subject: IMPALA-7986,IMPALA-7987: run daemons in docker containers
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1760/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5975cced33fa93df43101dd47d19b8af12e93d11
Gerrit-Change-Number: 12095
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 Jan 2019 01:36:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

2019-01-09 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/12068 )

Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
..

IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

These two classes evaluate scalar expressions. Previously codegen was
done by calling ScalarExpr::GetCodegendComputeFnWrapper which generates
a static method that calls the scalar expression evaluation method. Make
this more efficient by generating code which is customized using
information available at codegen time.

Add new cross-compiled files is-not-empty-predicate-ir.cc null-literal-
ir.cc slot-ref-ir.cc  valid-tuple-id-ir.cc.

IsNotEmptyPredicate works by getting a CollectionVal object from the
single child Expr node, and counting its tuples. As preparation for
codegen, split out the logic that counts the tuples to a new cross-
compiled method "IsNotEmpty()". At codegen time we know the type and
value of the child node. Generate a call to a node-specific non-virtual
cross-compiled method to get the CollectionVal object from the child.
Then generate a call to return the result of calling IsNotNull() on the
CollectionVal object.

A ValidTupleIdExpr node contains a vector of tuple ids. It works by
probing each row for the tuple ids in the vector to find a non-null
tuple. As preparation for codegen, split out the logic that probes the
row for a non-null tuple id into a new cross-compiled method
"IsTupleFromRowNonNull()".  At codegen time we know the vector of tuple
ids. We unroll the loop through the tuple ids, generating code that
calls "IsTupleFromRowNonNull()", and returning the tuple id if/when a
non-null tuple is found.

IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper() in
TupleIsNullPredicate. In the current Impala code this method is never
called. This is because TupleIsNullPredicate is always wrapped in an
IfExpr. This is always codegen'd by IfExpr's
GetCodegendComputeFnWrapper() method. There is a separate Jira
IMPALA-7655 to improve codegen of IfExpr.

Minor corrections:
  Correct the link to llvm tutorial in LlvmCodegen.

PERFORMANCE:
  I tested performance on a local mini-cluster. I wrote some
  pathological queries to test the new code. The new codegen'd code is
  very similar in performance. ValidTupleIdExpr seems a bit faster, but
  IsNotEmptyPredicate is only very slightly faster or identical.
  Overall these changes are not purely for performance  but to move
  away from GetCodegendComputeFnWrapper.

TESTING:
  The changed scalar expressions are well exercised by current tests.
  Ran exhaustive end-to-end tests.

Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/is-not-empty-predicate-ir.cc
M be/src/exprs/is-not-empty-predicate.cc
M be/src/exprs/is-not-empty-predicate.h
A be/src/exprs/null-literal-ir.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/null-literal.h
M be/src/exprs/scalar-expr.h
A be/src/exprs/slot-ref-ir.cc
M be/src/exprs/slot-ref.cc
M be/src/exprs/slot-ref.h
A be/src/exprs/valid-tuple-id-ir.cc
M be/src/exprs/valid-tuple-id.cc
M be/src/exprs/valid-tuple-id.h
17 files changed, 378 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/5
--
To view, visit http://gerrit.cloudera.org:8080/12068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
Gerrit-Change-Number: 12068
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7986,IMPALA-7987: run daemons in docker containers

2019-01-09 Thread Tim Armstrong (Code Review)
Hello Philip Zeyliger, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12095

to look at the new patch set (#13).

Change subject: IMPALA-7986,IMPALA-7987: run daemons in docker containers
..

IMPALA-7986,IMPALA-7987: run daemons in docker containers

This refactors start-impala-cluster.py to allow multiple implementations
of the minicluster operations like start and stop. There are now
two classes implementing the same set of operations -
MiniClusterOperations and DockerMiniClusterOperations. The docker
versions start and stop the containers added in IMPALA-7948.

With some configuration (see instructions below), the containers can
connect back to services (HDFS, HMS, Kudu, Sentry, etc) running on the
host. Config generation was modified so that services optionally
communicate via the docker bridge network rather than loopback
(the host's loopback interface is not accessible to the containers).

Notes:
* I improved the container build to regenerate containers when cluster
  configs are regenerated (previously the containers could have stale
  configs).
* Switch from CMD to ENTRYPOINT to allow passing in arguments to "docker
  run" without clobbering default args.
* Python 2.6 is not supported for this code path. This only affects
  CentOS 6, which has limited support for docker anyway.
* I deferred implementing wait_for_cluster(), since the existing
  code requires surgery to abstract out assumptions about locating
  processes and web UI ports - see IMPALA-7988.

How to use:
==
Create a docker network to use for internal cluster communication,
e.g.:
  docker network create -d bridge --gateway=172.17.0.1 \
  --subnet=172.17.0.1/16 impala-cluster

Add the gateway address of the docker network you created to
impala-config-local.sh, e.g.:

  export INTERNAL_LISTEN_HOST=172.17.0.1
  export DEFAULT_FS=hdfs://${INTERNAL_LISTEN_HOST}:20500

Regenerate configs and docker images:

  . bin/impala-config.sh
  ./bin/create-test-configuration.sh
  ninja -j $IMPALA_BUILD_THREADS docker_images

Restart the minicluster and Impala services to pick up the config:

  ./testdata/bin/run-all.sh
  start-impala-cluster.py --docker_network impala-cluster

You can connect with impala-shell and run some queries. You will
likely run into issues, particularly if running against an existing
data load, since "localhost" or "127.0.0.1" get baked into HMS
table definitions.

Testing:
Ran exhaustive tests (not using Docker) to make sure I didn't break
anything.

Change-Id: I5975cced33fa93df43101dd47d19b8af12e93d11
---
M bin/create-test-configuration.sh
M bin/impala-config.sh
M bin/start-impala-cluster.py
M docker/CMakeLists.txt
M docker/catalogd/Dockerfile
M docker/impalad/Dockerfile
M docker/statestored/Dockerfile
M fe/CMakeLists.txt
M fe/src/test/resources/authz-policy.ini.template
M fe/src/test/resources/hbase-site.xml.template
M fe/src/test/resources/mysql-hive-site.xml.template
M fe/src/test/resources/postgresql-hive-site.xml.template
M fe/src/test/resources/sentry-site.xml.template
M fe/src/test/resources/sentry-site_no_oo.xml.template
M fe/src/test/resources/sentry-site_oo.xml.template
M fe/src/test/resources/sentry-site_oo_nogrant.xml.template
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
M testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.tmpl
M tests/custom_cluster/test_breakpad.py
21 files changed, 435 insertions(+), 172 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/12095/13
--
To view, visit http://gerrit.cloudera.org:8080/12095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5975cced33fa93df43101dd47d19b8af12e93d11
Gerrit-Change-Number: 12095
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7986,IMPALA-7987: run daemons in docker containers

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12095 )

Change subject: IMPALA-7986,IMPALA-7987: run daemons in docker containers
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1759/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5975cced33fa93df43101dd47d19b8af12e93d11
Gerrit-Change-Number: 12095
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 Jan 2019 00:54:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7986,IMPALA-7987: run daemons in docker containers

2019-01-09 Thread Tim Armstrong (Code Review)
Hello Philip Zeyliger, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12095

to look at the new patch set (#12).

Change subject: IMPALA-7986,IMPALA-7987: run daemons in docker containers
..

IMPALA-7986,IMPALA-7987: run daemons in docker containers

This refactors start-impala-cluster.py to allow multiple implementations
of the minicluster operations like start and stop. There are now
two classes implementing the same set of operations -
MiniClusterOperations and DockerMiniClusterOperations. The docker
versions start and stop the containers added in IMPALA-7948.

With some configuration (see instructions below), the containers can
connect back to services (HDFS, HMS, Kudu, Sentry, etc) running on the
host. Config generation was modified so that services optionally
communicate via the docker bridge network rather than loopback
(the host's loopback interface is not accessible to the containers).

Notes:
* I improved the container build to regenerate containers when cluster
  configs are regenerated (previously the containers could have stale
  configs).
* Switch from CMD to ENTRYPOINT to allow passing in arguments to "docker
  run" without clobbering default args.
* Python 2.6 is not supported for this code path. This only affects
  CentOS 6, which has limited support for docker anyway.
* I deferred implementing wait_for_cluster(), since the existing
  code requires surgery to abstract out assumptions about locating
  processes and web UI ports - see IMPALA-7988.

How to use:
==
Create a docker network to use for internal cluster communication,
e.g.:
  docker network create -d bridge --gateway=172.17.0.1 \
  --subnet=172.17.0.1/16 impala-cluster

Add the gateway address of the docker network you created to
impala-config-local.sh, e.g.:

  export INTERNAL_LISTEN_HOST=172.17.0.1
  export DEFAULT_FS=hdfs://${INTERNAL_LISTEN_HOST}:20500

Regenerate configs and docker images:

  . bin/impala-config.sh
  ./bin/create-test-configuration.sh
  ninja -j $IMPALA_BUILD_THREADS docker_images

Restart the minicluster and Impala services to pick up the config:

  ./testdata/bin/run-all.sh
  start-impala-cluster.py --docker_network impala-cluster

You can connect with impala-shell and run some queries. You will
likely run into issues, particularly if running against an existing
data load, since "localhost" or "127.0.0.1" get baked into HMS
table definitions.

Testing:
Ran exhaustive tests (not using Docker) to make sure I didn't break
anything.

Change-Id: I5975cced33fa93df43101dd47d19b8af12e93d11
---
M bin/create-test-configuration.sh
M bin/impala-config.sh
M bin/start-impala-cluster.py
M docker/CMakeLists.txt
M docker/catalogd/Dockerfile
M docker/impalad/Dockerfile
M docker/statestored/Dockerfile
M fe/CMakeLists.txt
M fe/src/test/resources/authz-policy.ini.template
M fe/src/test/resources/hbase-site.xml.template
M fe/src/test/resources/mysql-hive-site.xml.template
M fe/src/test/resources/postgresql-hive-site.xml.template
M fe/src/test/resources/sentry-site.xml.template
M fe/src/test/resources/sentry-site_no_oo.xml.template
M fe/src/test/resources/sentry-site_oo.xml.template
M fe/src/test/resources/sentry-site_oo_nogrant.xml.template
M testdata/cluster/admin
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
M testdata/cluster/node_templates/common/etc/hadoop/conf/yarn-site.xml.tmpl
M tests/custom_cluster/test_breakpad.py
21 files changed, 433 insertions(+), 171 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/12095/12
--
To view, visit http://gerrit.cloudera.org:8080/12095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5975cced33fa93df43101dd47d19b8af12e93d11
Gerrit-Change-Number: 12095
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7986,IMPALA-7987: run daemons in docker containers

2019-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12095 )

Change subject: IMPALA-7986,IMPALA-7987: run daemons in docker containers
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12095/11/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/12095/11/bin/start-impala-cluster.py@244
PS11, Line 244:   result = ["logbufsecs=5", "-v={0}".format(options.log_level),
I accidentally removed the -, which makes the argument ineffective. This showed 
up as a failure in test_minidump_cleanup(), which had a preexisting race 
between the log maintenance thread and the check for the # of minidumps.

I need to fix this argument and then make that test non-flaky.



--
To view, visit http://gerrit.cloudera.org:8080/12095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5975cced33fa93df43101dd47d19b8af12e93d11
Gerrit-Change-Number: 12095
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 10 Jan 2019 00:22:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6533: Add min-max filter for decimal types on kudu tables.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12113 )

Change subject: IMPALA-6533: Add min-max filter for decimal types on kudu 
tables.
..


Patch Set 18:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3627/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/12113
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7e7278e902160d7060f8097290bc172d9031f94
Gerrit-Change-Number: 12113
Gerrit-PatchSet: 18
Gerrit-Owner: Janaki Lahorani 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Janaki Lahorani 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:32:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6533: Add min-max filter for decimal types on kudu tables.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12113 )

Change subject: IMPALA-6533: Add min-max filter for decimal types on kudu 
tables.
..


Patch Set 18: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12113
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7e7278e902160d7060f8097290bc172d9031f94
Gerrit-Change-Number: 12113
Gerrit-PatchSet: 18
Gerrit-Owner: Janaki Lahorani 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Janaki Lahorani 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:32:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7867 (Part 4): Collection cleanup in catalog

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12131 )

Change subject: IMPALA-7867 (Part 4): Collection cleanup in catalog
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83425201c90966aae4c280d94cf1b427b3d71d1
Gerrit-Change-Number: 12131
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:17:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1758/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:23:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7666: Adding an opaque client identifier to query options.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12130 )

Change subject: IMPALA-7666: Adding an opaque client identifier to query 
options.
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3626/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/12130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4
Gerrit-Change-Number: 12130
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:16:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7867 (Part 4): Collection cleanup in catalog

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12131 )

Change subject: IMPALA-7867 (Part 4): Collection cleanup in catalog
..

IMPALA-7867 (Part 4): Collection cleanup in catalog

Continues the collection clean work to:

* Use collection interfaces for variable and function argument
  declarations,
* Replace generic Guava newArrayList(), etc. calls with the direct use
  of the Java collection classes,
* Clean up unused imports and add override annotations.

This patch focuses on the catalog module and its tests.

Tests: this is purely a code change, no functional change. Reran
existing tests.

Change-Id: Ic83425201c90966aae4c280d94cf1b427b3d71d1
Reviewed-on: http://gerrit.cloudera.org:8080/12131
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java
M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/FeKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/FeTable.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsStorageDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/HiveStorageDescriptorFactory.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladTableUsageTracker.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java
M fe/src/main/java/org/apache/impala/catalog/PrimitiveType.java
M fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java
M fe/src/main/java/org/apache/impala/catalog/StructType.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoadingMgr.java
M fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalHbaseTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalKuduTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsStorageDescriptorTest.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
M fe/src/test/java/org/apache/impala/catalog/TestSchemaUtils.java
46 files changed, 329 insertions(+), 322 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/12131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic83425201c90966aae4c280d94cf1b427b3d71d1
Gerrit-Change-Number: 12131
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-7666: Adding an opaque client identifier to query options.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12130 )

Change subject: IMPALA-7666: Adding an opaque client identifier to query 
options.
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4
Gerrit-Change-Number: 12130
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:16:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8021: Add estimated cardinality to EXPLAIN output

2019-01-09 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12136 )

Change subject: IMPALA-8021: Add estimated cardinality to EXPLAIN output
..


Patch Set 17:

Finally passed the pre-commit builds after latest changes: 
https://jenkins.impala.io/job/pre-review-test/278/

This build shows that removing the cardinality range check works in conjunction 
with only doing cardinality checks on selected test cases.

So, looks like this one is ready for the next commit step.


--
To view, visit http://gerrit.cloudera.org:8080/12136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9aa2d715b04cbb279aaffec8c5692686562d986
Gerrit-Change-Number: 12136
Gerrit-PatchSet: 17
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:07:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual consistency

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12203 )

Change subject: IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's 
eventual consistency
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1756/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
Gerrit-Change-Number: 12203
Gerrit-PatchSet: 1
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:03:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual consistency

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12203 )

Change subject: IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's 
eventual consistency
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1757/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
Gerrit-Change-Number: 12203
Gerrit-PatchSet: 2
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 23:02:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-09 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..


Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/cup/sql-parser.cup@1097
PS3, Line 1097:   KW_ALTER KW_TABLE table_name:table KW_ADD KW_COLUMN 
if_not_exists_val:if_not_exists
> I wonder if this is asking to be refactored to pull out the increasingly-de
Is there anything in particular that you have in mind to be pulled out into the 
common bits?


http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java:

http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java@83
PS3, Line 83:   if (!colNames.add(colName)) {
> Should we be adding lower-case names? If we do a SELECT *, shouldn't we giv
colNames is only used for checking duplicate column names. I believe we 
preserve the case for the column names passed to HMS.


http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java@87
PS3, Line 87:   }
> A bit beyond the scope of your change, but since you are refactoring...
Actually I just realized Kudu does not support ALTER TABLE REPLACE COLUMNS, so 
this whole logic can be removed. The original code combined the two statements 
into one which was confusing.


http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2042
PS3, Line 2042: if (ifNotExists) {
> Shouldn't this check be done at analysis time so that execution is always g
I was initially thinking of doing it in the analysis. However, looking at other 
code in this file with "IF EXISTS" or "IF NOT EXISTS", it seems we handle "IF 
EXISTS" and "IF NOT EXISTS" in this file. I do agree, that may not be ideal.


http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2075
PS3, Line 2075:   msTbl.getParameters().put(sortByKey, alteredColumns);
> Again, seems this should be done at analysis time so that exec just runs th
Similar argument as above.


http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@326
PS3, Line 326: AnalyzesOk("alter table functional.alltypes add column 
NEW_COL int");
> Test with various case combinations? INT_COL, etc.
Done


http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@338
PS3, Line 338: AnalysisError("alter table functional.alltypes add column if 
not exists year int",
> What are the rules for which names are valid? Do we check the name, or leav
We should rely on the host system to do the check. I don't think it's a good 
idea to mimic the rules of the target system since if the host rules change, we 
need to update our rules as well.


http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@368
PS3, Line 368: "ALTER TABLE ADD COLUMNS not currently supported on 
HBase tables.");
> Do we have a test for ...(foo int, bar string, foo int)? That is, the dupli
yeah it's in L383


http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@420
PS3, Line 420:
> Is the column validation code in common with the above case for Add Column?
Yeah ColumnDef.anayze() is the code that does column validation. Unfortunately 
there's no specific test for that class. I can create a follow up JIRA for 
ColumnDef test coverage.


http://gerrit.cloudera.org:8080/#/c/12181/3/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@459
PS3, Line 459: "Kudu tables do not support complex types: c 
STRUCT");
> Overall, nice test case coverage. Thanks!
Thanks! I also added more missing test cases especially for Kudu tables.



--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Re

[Impala-ASF-CR] IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

2019-01-09 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/12181 )

Change subject: IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD 
COLUMN(S)
..

IMPALA-7832: Support for IF NOT EXISTS in ALTER TABLE ADD COLUMN(S)

This patch adds IF NOT EXISTS support in ALTER TABLE ADD COLUMN and
ALTER TABLE ADD COLUMNS. If IF NOT EXISTS is specified and a column
already exists with this name, no error is thrown.

Syntax:
ALTER TABLE tbl ADD COLUMN [IF NOT EXISTS] i int
ALTER TABLE tbl ADD [IF NOT EXISTS] COLUMNS (i int, j int)

Testing:
- Added new FE tests
- Ran all FE tests
- Updated E2E DDL tests
- Ran all E2E DDL tests

Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
R fe/src/main/java/org/apache/impala/analysis/AlterTableAddColsStmt.java
C fe/src/main/java/org/apache/impala/analysis/AlterTableReplaceColsStmt.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
9 files changed, 386 insertions(+), 179 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/12181/4
--
To view, visit http://gerrit.cloudera.org:8080/12181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I60ed22c8a8eefa10e94ad3dedf32fe67c16642d9
Gerrit-Change-Number: 12181
Gerrit-PatchSet: 4
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual consistency

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12203 )

Change subject: IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's 
eventual consistency
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12203/2/tests/query_test/test_insert.py
File tests/query_test/test_insert.py:

http://gerrit.cloudera.org:8080/#/c/12203/2/tests/query_test/test_insert.py@26
PS2, Line 26:
flake8: E501 line too long (99 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/12203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
Gerrit-Change-Number: 12203
Gerrit-PatchSet: 2
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 22:33:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-7550 Add documentation to profile counters

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12116 )

Change subject: WIP IMPALA-7550 Add documentation to profile counters
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1755/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idaa0a44f0a996f3487566b545d984d562e6e1588
Gerrit-Change-Number: 12116
Gerrit-PatchSet: 2
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 22:36:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual consistency

2019-01-09 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/12203 )

Change subject: IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's 
eventual consistency
..

IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual 
consistency

This patch is a temporary fix to disable tests which fail due to
S3's eventually consistent behavior. The permanent fix would
involve running tests with S3Guard enabled.

Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
---
M tests/metadata/test_compute_stats.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_permutation.py
M tests/query_test/test_nested_types.py
4 files changed, 11 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/12203/2
--
To view, visit http://gerrit.cloudera.org:8080/12203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
Gerrit-Change-Number: 12203
Gerrit-PatchSet: 2
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 


[Impala-ASF-CR] IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual consistency

2019-01-09 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12203 )

Change subject: IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's 
eventual consistency
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12203/1/tests/query_test/test_insert.py
File tests/query_test/test_insert.py:

http://gerrit.cloudera.org:8080/#/c/12203/1/tests/query_test/test_insert.py@152
PS1, Line 152: S
> flake8: F821 undefined name 'SkipIfS3'
Done


http://gerrit.cloudera.org:8080/#/c/12203/1/tests/query_test/test_insert_permutation.py
File tests/query_test/test_insert_permutation.py:

http://gerrit.cloudera.org:8080/#/c/12203/1/tests/query_test/test_insert_permutation.py@48
PS1, Line 48:
> flake8: F821 undefined name 'SkipIfS3'
Done



--
To view, visit http://gerrit.cloudera.org:8080/12203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
Gerrit-Change-Number: 12203
Gerrit-PatchSet: 2
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 22:32:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual consistency

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12203 )

Change subject: IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's 
eventual consistency
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12203/1/tests/query_test/test_insert.py
File tests/query_test/test_insert.py:

http://gerrit.cloudera.org:8080/#/c/12203/1/tests/query_test/test_insert.py@152
PS1, Line 152: S
flake8: F821 undefined name 'SkipIfS3'


http://gerrit.cloudera.org:8080/#/c/12203/1/tests/query_test/test_insert_permutation.py
File tests/query_test/test_insert_permutation.py:

http://gerrit.cloudera.org:8080/#/c/12203/1/tests/query_test/test_insert_permutation.py@48
PS1, Line 48: S
flake8: F821 undefined name 'SkipIfS3'



--
To view, visit http://gerrit.cloudera.org:8080/12203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
Gerrit-Change-Number: 12203
Gerrit-PatchSet: 1
Gerrit-Owner: Pooja Nilangekar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pooja Nilangekar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 22:29:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual consistency

2019-01-09 Thread Pooja Nilangekar (Code Review)
Pooja Nilangekar has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12203


Change subject: IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's 
eventual consistency
..

IMPALA-6544/IMPALA-7070: Disable tests which fail due to S3's eventual 
consistency

This patch is a temporary fix to disable tests which fail due to
S3's eventually consistent behavior. The permanent fix would
involve running tests with S3Guard enabled.

Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
---
M tests/metadata/test_compute_stats.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_permutation.py
M tests/query_test/test_nested_types.py
4 files changed, 9 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/12203/1
--
To view, visit http://gerrit.cloudera.org:8080/12203
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I676faa191bec8b156e430661c22ee69242eeba9d
Gerrit-Change-Number: 12203
Gerrit-PatchSet: 1
Gerrit-Owner: Pooja Nilangekar 


[Impala-ASF-CR] IMPALA-6964: Track stats about column and page sizes in Parquet reader

2019-01-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11575 )

Change subject: IMPALA-6964: Track stats about column and page sizes in Parquet 
reader
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11575/16/tests/util/counters.py
File tests/util/counters.py:

http://gerrit.cloudera.org:8080/#/c/11575/16/tests/util/counters.py@19
PS16, Line 19: class SummaryStatsCounter(object):
Yeah, looks like the backend tests use the text profile in many places and we 
should rather revisit that in a future change.

> If we want to stick with the text profile, we should have a namedtuple + an 
> is_empty(counter) method where we need it

Namedtuples are tuples with names for their fields: 
https://docs.python.org/2/library/collections.html#collections.namedtuple

Using TSummaryStatsCounter instead is also good.

Then I'd just have a helper method in test scanners, right where you need it:

  def counter_is_zero(c):
return c.max == c.min == c.avg == c.samples == 0

You could even go that with a lambda:

  counter_is_zero = lambda c: x.max == c.min == c.avg == c.samples

That does the trick in a single line and looks readable enough to me.



--
To view, visit http://gerrit.cloudera.org:8080/11575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I322f9b324b6828df28e5caf79529085c43d7c817
Gerrit-Change-Number: 11575
Gerrit-PatchSet: 16
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 22:18:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-7550 Add documentation to profile counters

2019-01-09 Thread Lars Volker (Code Review)
Hello Philip Zeyliger, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12116

to look at the new patch set (#2).

Change subject: WIP IMPALA-7550 Add documentation to profile counters
..

WIP IMPALA-7550 Add documentation to profile counters

This is a prototype, I'm looking for feedback on the overall approach.

This change adds a singleton registry for runtime profile counters
prototypes, similar to what Kudu does for metrics. That allows us to
print documentation for all profile counters. As an example, this change
adds a debug webpage /profile_docs. With that we can also generate
documentation for all profile counters. In a future change we can then
add tooltips to the profile view directly.

Profile counters are annotated with their stability:

* Stable counters - generally useful to understand query performance,
should only change rarely and if it does we'll make some effort to
notify users. E.g. BytesRead.
* Unstable but useful - useful to understand query performance, but
subject to change, particularly if the implementation changes. E.g.
RowBatchQueuePutWaitTime, MaterializeTupleTimer
* Debugging counters - generally not useful to users of Impala, the main
use case is low-level debugging. Can be hidden to reduce noise for most
consumers of profiles.

The downside is that we'd reduce the comments that currently explain
some of the counters in header files by moving them to the .cc files.
Additionally a (arguably good) limitation is that profile counter names
need to be unique.

Change-Id: I1f088227b03fca19b0efee6c390f30de0b327226

End of prototype

Mention profile_docs

Change-Id: Idaa0a44f0a996f3487566b545d984d562e6e1588
---
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
D be/src/util/debug-counters.h
M be/src/util/default-path-handlers.cc
M be/src/util/runtime-profile-counters.h
M be/src/util/runtime-profile.cc
M www/query_profile.tmpl
9 files changed, 192 insertions(+), 93 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/12116/2
--
To view, visit http://gerrit.cloudera.org:8080/12116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idaa0a44f0a996f3487566b545d984d562e6e1588
Gerrit-Change-Number: 12116
Gerrit-PatchSet: 2
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] WIP IMPALA-7550 Add documentation to profile counters

2019-01-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12116 )

Change subject: WIP IMPALA-7550 Add documentation to profile counters
..


Patch Set 1:

> It's hard to argue with documenting these.
 >
 > The other extreme position is just to do it in the Thrift, e.g. the
 > following diff. In practice we might end up generating the thrift
 > (and the docs).
 >
 > --- common/thrift/RuntimeProfile.thrift
 > +++ common/thrift/RuntimeProfile.thrift
 > @@ -92,6 +92,9 @@ struct TRuntimeProfileNode {
 > 2: required i32 num_children
 > 3: required list counters
 >
 > +  // documentation...
 > +  1000: TCounter bytes_read
 > +
 >
 >
 > If I were starting from scratch, I would definitely do this from
 > Thrift. (This is how we do query options, btw, and I think it's
 > largely fine, though we could do more code generation to avoid the
 > annoyance.) Given where we are, I think your approach is probably
 > more incremental, which is good.

Thanks for the feedback. Having profile counter prototypes properly registered 
would also allow us to generate thrift fields, so in an incremental way we can 
get there without having to throw away work.


--
To view, visit http://gerrit.cloudera.org:8080/12116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idaa0a44f0a996f3487566b545d984d562e6e1588
Gerrit-Change-Number: 12116
Gerrit-PatchSet: 1
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 22:07:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-7550 Add documentation to profile counters

2019-01-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12116 )

Change subject: WIP IMPALA-7550 Add documentation to profile counters
..


Patch Set 1:

> (1 comment)

Thanks for the feedback. I added a stability field, let me know what you think.

We have ~200 counters it seems: git grep -E 
"ADD_TIMER|ADD_COUNTER|ADD_TIME_SERIES_COUNTER|ADD_SUMMARY_STATS_COUNTER|ADD_CHILD_TIMER"
 | wc -l

I can't think of a good way to make this change mechanically, so I'm leaning 
towards doing this in several stages to prevent a large change from rotting 
away in review. With the main plumbing in we can also start to work on the 
documentation side of things, e.g. generate useful docs.

Would you prefer a large, atomic change instead?


--
To view, visit http://gerrit.cloudera.org:8080/12116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idaa0a44f0a996f3487566b545d984d562e6e1588
Gerrit-Change-Number: 12116
Gerrit-PatchSet: 1
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 22:06:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-7550 Add documentation to profile counters

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12116 )

Change subject: WIP IMPALA-7550 Add documentation to profile counters
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/runtime/coordinator.cc@68
PS2, Line 68: PROFILE_DEFINE_COUNTER(NumBackends, stable, TUnit::UNIT, "Number 
of backends running this query");
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/util/runtime-profile-counters.h@94
PS2, Line 94:   ::impala::CounterPrototype PROFILE_##name(#name, 
ProfileEntryPrototype::Stability::stability, desc, unit)
line too long (107 > 90)


http://gerrit.cloudera.org:8080/#/c/12116/2/be/src/util/runtime-profile-counters.h@96
PS2, Line 96:   ::impala::RateCounterPrototype PROFILE_##name(#name, 
ProfileEntryPrototype::Stability::stability, desc, unit)
line too long (111 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12116
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idaa0a44f0a996f3487566b545d984d562e6e1588
Gerrit-Change-Number: 12116
Gerrit-PatchSet: 2
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 22:02:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8060: [DOCS] A typo fix

2019-01-09 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12201 )

Change subject: IMPALA-8060: [DOCS] A typo fix
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12201
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1fc02f9676c46f95f799f5c71e2b34ed30a4899
Gerrit-Change-Number: 12201
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 09 Jan 2019 21:36:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8060: [DOCS] A typo fix

2019-01-09 Thread Alex Rodoni (Code Review)
Alex Rodoni has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12201 )

Change subject: IMPALA-8060: [DOCS] A typo fix
..

IMPALA-8060: [DOCS] A typo fix

Change-Id: Ia1fc02f9676c46f95f799f5c71e2b34ed30a4899
Reviewed-on: http://gerrit.cloudera.org:8080/12201
Tested-by: Impala Public Jenkins 
Reviewed-by: Bikramjeet Vig 
---
M docs/topics/impala_admission.xml
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Bikramjeet Vig: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/12201
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia1fc02f9676c46f95f799f5c71e2b34ed30a4899
Gerrit-Change-Number: 12201
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-6964: Track stats about column and page sizes in Parquet reader

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11575 )

Change subject: IMPALA-6964: Track stats about column and page sizes in Parquet 
reader
..


Patch Set 17:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1754/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I322f9b324b6828df28e5caf79529085c43d7c817
Gerrit-Change-Number: 11575
Gerrit-PatchSet: 17
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 21:21:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8060: [DOCS] A typo fix

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12201 )

Change subject: IMPALA-8060: [DOCS] A typo fix
..


Patch Set 1: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/183/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/12201
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1fc02f9676c46f95f799f5c71e2b34ed30a4899
Gerrit-Change-Number: 12201
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 09 Jan 2019 21:20:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8060: [DOCS] A typo fix

2019-01-09 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12201


Change subject: IMPALA-8060: [DOCS] A typo fix
..

IMPALA-8060: [DOCS] A typo fix

Change-Id: Ia1fc02f9676c46f95f799f5c71e2b34ed30a4899
---
M docs/topics/impala_admission.xml
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/12201/1
--
To view, visit http://gerrit.cloudera.org:8080/12201
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia1fc02f9676c46f95f799f5c71e2b34ed30a4899
Gerrit-Change-Number: 12201
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-8060: [DOCS] A typo fix

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12201 )

Change subject: IMPALA-8060: [DOCS] A typo fix
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/183/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/12201
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1fc02f9676c46f95f799f5c71e2b34ed30a4899
Gerrit-Change-Number: 12201
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 09 Jan 2019 21:09:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8060: [DOCS] Restructured the admission control docs

2019-01-09 Thread Alex Rodoni (Code Review)
Alex Rodoni has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12191 )

Change subject: IMPALA-8060: [DOCS] Restructured the admission control docs
..

IMPALA-8060: [DOCS] Restructured the admission control docs

- Created a new doc category called "Resource Management".
- Moved impala_admission under Resource Management.
- Move the config steps out of impala_admission.xml and
- created impala_admission_config.xml

Change-Id: Id42ae256b215fb267023197c5052f36bedb052a3
Reviewed-on: http://gerrit.cloudera.org:8080/12191
Reviewed-by: Bikramjeet Vig 
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 
---
M docs/impala.ditamap
M docs/impala_keydefs.ditamap
M docs/topics/impala_admission.xml
A docs/topics/impala_admission_config.xml
M docs/topics/impala_dedicated_coordinator.xml
M docs/topics/impala_resource_management.xml
6 files changed, 573 insertions(+), 745 deletions(-)

Approvals:
  Bikramjeet Vig: Looks good to me, but someone else must approve
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/12191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id42ae256b215fb267023197c5052f36bedb052a3
Gerrit-Change-Number: 12191
Gerrit-PatchSet: 3
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6964: Track stats about column and page sizes in Parquet reader

2019-01-09 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11575 )

Change subject: IMPALA-6964: Track stats about column and page sizes in Parquet 
reader
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11575/16/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/11575/16/tests/query_test/test_scanners.py@819
PS16, Line 819: runtime_profile = str(result.runtime_profile)
> is this wrapping in str() necessary?
No, its not. We were doing this in several other places in this file, so I 
removed them all.



--
To view, visit http://gerrit.cloudera.org:8080/11575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I322f9b324b6828df28e5caf79529085c43d7c817
Gerrit-Change-Number: 11575
Gerrit-PatchSet: 16
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 20:46:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6964: Track stats about column and page sizes in Parquet reader

2019-01-09 Thread Sahil Takiar (Code Review)
Hello Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11575

to look at the new patch set (#17).

Change subject: IMPALA-6964: Track stats about column and page sizes in Parquet 
reader
..

IMPALA-6964: Track stats about column and page sizes in Parquet reader

Adds the following new stats:

* ParquetCompressedPageSize - a summary (average, min, max) counter that
tracks the size of compressed pages read, if no compressed pages are
read then this counter is empty
* ParquetUncompressedPageSize - a summary counter that tracks the size
of uncompressed pages read, it is updated in two places: (1) when a
compressed page is de-compressed, and (2) when a page that is not
compressed is read
* ParquetCompressedDataReadPerColumn - a summary counter that tracks the
amount of compressed data read per column for a scan node
* ParquetUncompressedDataReadPerColumn - a summary counter that tracks
the amount of uncompressed data read per column for a scan node

The PerColumn counters are calculated by aggregating the number of bytes
read for each column across all scan ranges processed by a scan node.
Each sample in the counter is the size of a single column.

Here is an example of what the updated HDFS scan profile looks like:

- ParquetCompressedDataReadPerColumn: (Avg: 227.56 KB (233018) ;
Min: 225.14 KB (230540) ; Max: 229.98 KB (235496) ; Number of samples: 2)
- ParquetUncompressedDataReadPerColumn: (Avg: 227.96 KB (233426) ;
Min: 224.91 KB (230306) ; Max: 231.00 KB (236547) ; Number of samples: 2)
- ParquetCompressedPageSize: (Avg: 4.46 KB (4568) ; Min: 3.86 KB (3955) ;
Max: 5.19 KB (5315) ; Number of samples: 102)
- ParquetDecompressedPageSize: (Avg: 4.47 KB (4576) ; Min: 3.86 KB (3950)
 ; Max: 5.22 KB (5349) ; Number of samples: 102)

Testing:
* Added new tests to test_scanners.py that do some basic validation of
the new counters above

Change-Id: I322f9b324b6828df28e5caf79529085c43d7c817
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/util/runtime-profile.cc
M tests/infra/test_utils.py
M tests/query_test/test_scanners.py
A tests/util/counters.py
M tests/util/parse_util.py
10 files changed, 320 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/11575/17
--
To view, visit http://gerrit.cloudera.org:8080/11575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I322f9b324b6828df28e5caf79529085c43d7c817
Gerrit-Change-Number: 11575
Gerrit-PatchSet: 17
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-6964: Track stats about column and page sizes in Parquet reader

2019-01-09 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11575 )

Change subject: IMPALA-6964: Track stats about column and page sizes in Parquet 
reader
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11575/16/tests/util/counters.py
File tests/util/counters.py:

http://gerrit.cloudera.org:8080/#/c/11575/16/tests/util/counters.py@19
PS16, Line 19: class SummaryStatsCounter(object):
> Is there a way that we use the thrift profile directly instead? By now we a
Yeah, thats a good point. We can use TSummaryStatsCounter directly. What about 
changing this class to counter_utils.py and just defining an 
is_empty(TSummaryStatsCounter) method?

Clients of this class would then create a TSummaryStatsCounter directly and 
just call counter_utils::is_empty(my_tsummary_stats_counter)

"If we want to stick with the text profile, we should have a namedtuple + an 
is_empty(counter) method where we need it" - not sure I fully understand what 
you mean?

My understanding of the code is that the Beeswax Python client we are using 
uses the method GetRuntimeProfile 
(https://github.com/apache/impala/blob/master/common/thrift/ImpalaService.thrift#L408)
 to return the Runtime Profile, which returns the Runtime Profile as a string 
(even through there seems to be a Runtime Profile Thrift struct - 
TRuntimeProfileTree).



--
To view, visit http://gerrit.cloudera.org:8080/11575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I322f9b324b6828df28e5caf79529085c43d7c817
Gerrit-Change-Number: 11575
Gerrit-PatchSet: 16
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 20:43:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7666: Adding an opaque client identifier to query options.

2019-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12130 )

Change subject: IMPALA-7666: Adding an opaque client identifier to query 
options.
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4
Gerrit-Change-Number: 12130
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 20:02:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7666: Adding an opaque client identifier to query options.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12130 )

Change subject: IMPALA-7666: Adding an opaque client identifier to query 
options.
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1752/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4
Gerrit-Change-Number: 12130
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 20:08:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7666: Propagate name of test into CLIENT IDENTIFIER.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12177 )

Change subject: IMPALA-7666: Propagate name of test into CLIENT_IDENTIFIER.
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1753/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f685fd16982d73ad3fc0f4a7578c5ad83b9a84c
Gerrit-Change-Number: 12177
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 20:06:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8060: [DOCS] Restructured the admission control docs

2019-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12191 )

Change subject: IMPALA-8060: [DOCS] Restructured the admission control docs
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id42ae256b215fb267023197c5052f36bedb052a3
Gerrit-Change-Number: 12191
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 20:01:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8061: Fix S3 ACCESS VALIDATED unbound variable

2019-01-09 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12193 )

Change subject: IMPALA-8061: Fix S3_ACCESS_VALIDATED unbound variable
..


Patch Set 1:

> Patch Set 1:
>
> Did something recently fail that makes this necessary? Putting the unbound 
> variable inside quotes seems to work as expected on my machine.

buildall.sh has set -u which causes the shell to treat unset variables as an 
error and exit immediately.


--
To view, visit http://gerrit.cloudera.org:8080/12193
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If48866bb71d748f12ceada4fd62822d7d374811c
Gerrit-Change-Number: 12193
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:46:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8060: [DOCS] Restructured the admission control docs

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12191 )

Change subject: IMPALA-8060: [DOCS] Restructured the admission control docs
..


Patch Set 2: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/182/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/12191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id42ae256b215fb267023197c5052f36bedb052a3
Gerrit-Change-Number: 12191
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:44:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8058: Fallback for HBase key scan range estimation

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12192 )

Change subject: IMPALA-8058: Fallback for HBase key scan range estimation
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1751/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce
Gerrit-Change-Number: 12192
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:41:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8060: [DOCS] Restructured the admission control docs

2019-01-09 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12191 )

Change subject: IMPALA-8060: [DOCS] Restructured the admission control docs
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12191/2/docs/topics/impala_admission.xml
File docs/topics/impala_admission.xml:

http://gerrit.cloudera.org:8080/#/c/12191/2/docs/topics/impala_admission.xml@122
PS2, Line 122: It
nit: in



-- 
To view, visit http://gerrit.cloudera.org:8080/12191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id42ae256b215fb267023197c5052f36bedb052a3
Gerrit-Change-Number: 12191
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:32:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6664: Tag log statements with fragment or query ids.

2019-01-09 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12129 )

Change subject: IMPALA-6664: Tag log statements with fragment or query ids.
..


Patch Set 3:

> Does/should this fix up the Java side messages as well? May be some
 > configuration in Log4J that produces a result similar to what
 > you've achieved on the C++ side.

The Java side of Impala logs to Glog via 
fe/src/main/java/org/apache/impala/util/GlogAppender.java which does JNI to 
write to GLOG. As a bonus, the log line below is an example of how we get this 
tagging for free assuming that the Java code isn't creating threads and so on. 
For the planner, it's single-threaded as far as I know, so we don't have to do 
anything special.

  I0108 10:39:16.456627 14752 Frontend.java:1282] 
85420d575b9ff4b9:402b8868] Analysis finished.


--
To view, visit http://gerrit.cloudera.org:8080/12129
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6634ef9d1a7346339f24f2d40a7a3aa36a535da8
Gerrit-Change-Number: 12129
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:18:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8060: [DOCS] Restructured the admission control docs

2019-01-09 Thread Alex Rodoni (Code Review)
Alex Rodoni has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12191 )

Change subject: IMPALA-8060: [DOCS] Restructured the admission control docs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12191/1/docs/topics/impala_admission.xml
File docs/topics/impala_admission.xml:

http://gerrit.cloudera.org:8080/#/c/12191/1/docs/topics/impala_admission.xml@35
PS1, Line 35:   and out-of-memory conditions on busy CDH clusters. The 
admission control
> We should remove references to CDH
Done



--
To view, visit http://gerrit.cloudera.org:8080/12191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id42ae256b215fb267023197c5052f36bedb052a3
Gerrit-Change-Number: 12191
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:16:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8060: [DOCS] Restructured the admission control docs

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12191 )

Change subject: IMPALA-8060: [DOCS] Restructured the admission control docs
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/182/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/12191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id42ae256b215fb267023197c5052f36bedb052a3
Gerrit-Change-Number: 12191
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:15:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8060: [DOCS] Restructured the admission control docs

2019-01-09 Thread Alex Rodoni (Code Review)
Hello Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12191

to look at the new patch set (#2).

Change subject: IMPALA-8060: [DOCS] Restructured the admission control docs
..

IMPALA-8060: [DOCS] Restructured the admission control docs

- Created a new doc category called "Resource Management".
- Moved impala_admission under Resource Management.
- Move the config steps out of impala_admission.xml and
- created impala_admission_config.xml

Change-Id: Id42ae256b215fb267023197c5052f36bedb052a3
---
M docs/impala.ditamap
M docs/impala_keydefs.ditamap
M docs/topics/impala_admission.xml
A docs/topics/impala_admission_config.xml
M docs/topics/impala_dedicated_coordinator.xml
M docs/topics/impala_resource_management.xml
6 files changed, 573 insertions(+), 745 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/12191/2
--
To view, visit http://gerrit.cloudera.org:8080/12191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id42ae256b215fb267023197c5052f36bedb052a3
Gerrit-Change-Number: 12191
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8021: Add estimated cardinality to EXPLAIN output

2019-01-09 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12136 )

Change subject: IMPALA-8021: Add estimated cardinality to EXPLAIN output
..


Patch Set 17:

(1 comment)

Bharath, thanks for the +2. Please hold off kicking off the final build until I 
verify that I've nailed all the test issues. Unfortunately, our pre-commit 
tests only catch one Python test failure per four-hour run, so it is a bit 
tedious to track down tests impacted by an EXPLAIN output change... I'l post 
here when the pre-review tests fully pass.

http://gerrit.cloudera.org:8080/#/c/12136/17/fe/src/test/java/org/apache/impala/testutil/TestUtils.java
File fe/src/test/java/org/apache/impala/testutil/TestUtils.java:

http://gerrit.cloudera.org:8080/#/c/12136/17/fe/src/test/java/org/apache/impala/testutil/TestUtils.java@210
PS17, Line 210:   try (Scanner e = new Scanner(expectedStr);
> Just to be sure, there is no functional change here right?  As I understand
There is a functional change: one we need. The original version consumed two 
tokens per pass. Glance over at the left and look for the two a.next()/e.next() 
calls. Both consume a token. So, the original code didn't do what it claimed to 
do. The new code does work correctly.

I retained this fix even though the original impetus for the fix, adding 
filtering, was removed.



--
To view, visit http://gerrit.cloudera.org:8080/12136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9aa2d715b04cbb279aaffec8c5692686562d986
Gerrit-Change-Number: 12136
Gerrit-PatchSet: 17
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:11:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6964: Track stats about column and page sizes in Parquet reader

2019-01-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11575 )

Change subject: IMPALA-6964: Track stats about column and page sizes in Parquet 
reader
..


Patch Set 16:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11575/16/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/11575/16/tests/query_test/test_scanners.py@819
PS16, Line 819: runtime_profile = str(result.runtime_profile)
is this wrapping in str() necessary?


http://gerrit.cloudera.org:8080/#/c/11575/16/tests/util/counters.py
File tests/util/counters.py:

http://gerrit.cloudera.org:8080/#/c/11575/16/tests/util/counters.py@19
PS16, Line 19: class SummaryStatsCounter(object):
Is there a way that we use the thrift profile directly instead? By now we are 
parsing the string profile into a class that resembles TSummaryStatsCounter and 
will have to be kept in sync.

If we want to stick with the text profile, we should have a namedtuple + an 
is_empty(counter) method where we need it. This would keep things more concise.



--
To view, visit http://gerrit.cloudera.org:8080/11575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I322f9b324b6828df28e5caf79529085c43d7c817
Gerrit-Change-Number: 11575
Gerrit-PatchSet: 16
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:09:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7666: Propagate name of test into CLIENT IDENTIFIER.

2019-01-09 Thread Philip Zeyliger (Code Review)
Hello David Knupp, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12177

to look at the new patch set (#2).

Change subject: IMPALA-7666: Propagate name of test into CLIENT_IDENTIFIER.
..

IMPALA-7666: Propagate name of test into CLIENT_IDENTIFIER.

To facilitate correlating test failures (where we sometimes know things
like fragment id) with the tests that generated those queries, we can
stuff the test name into CLIENT_IDENTIFIER.

The mechanics here are to create a global, tests.common.current_node to
store the current test, create a plugin in conftest to set this when
entering a test, and then configuring connections as they're created
deep in test code.

Change-Id: I2f685fd16982d73ad3fc0f4a7578c5ad83b9a84c
---
M tests/common/impala_connection.py
M tests/conftest.py
M tests/query_test/test_observability.py
3 files changed, 34 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/12177/2
--
To view, visit http://gerrit.cloudera.org:8080/12177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f685fd16982d73ad3fc0f4a7578c5ad83b9a84c
Gerrit-Change-Number: 12177
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7666: Propagate name of test into CLIENT IDENTIFIER.

2019-01-09 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12177 )

Change subject: IMPALA-7666: Propagate name of test into CLIENT_IDENTIFIER.
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12177/1/tests/conftest.py
File tests/conftest.py:

http://gerrit.cloudera.org:8080/#/c/12177/1/tests/conftest.py@595
PS1, Line 595: @pytest.hookimpl(trylast=True)
> flake8: E302 expected 2 blank lines, found 1
Done


http://gerrit.cloudera.org:8080/#/c/12177/1/tests/conftest.py@601
PS1, Line 601: ,
> flake8: E231 missing whitespace after ','
Done


http://gerrit.cloudera.org:8080/#/c/12177/1/tests/conftest.py@601
PS1, Line 601: :
> flake8: E501 line too long (95 > 90 characters)
Done



--
To view, visit http://gerrit.cloudera.org:8080/12177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f685fd16982d73ad3fc0f4a7578c5ad83b9a84c
Gerrit-Change-Number: 12177
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:04:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7666: Adding an opaque client identifier to query options.

2019-01-09 Thread Philip Zeyliger (Code Review)
Hello Lars Volker, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12130

to look at the new patch set (#4).

Change subject: IMPALA-7666: Adding an opaque client identifier to query 
options.
..

IMPALA-7666: Adding an opaque client identifier to query options.

We sometimes struggle to identify the client (e.g., a given version of a
JDBC driver, Tableau, Hue, etc.) for a given query. This commit adds a
User-Agent header style, called "Client Identifier", which clients can
set as a Query Option. Nothing is done with this header, but it's
written into logs and query profiles.

This commit includes changes to impala-shell to include the version of
impala shell with an associated test.

A future commit will serialize the name of the py.test being run into
this field, which is handy for figuring out where a query came from.

Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M shell/impala_shell.py
M tests/shell/test_shell_commandline.py
6 files changed, 30 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/12130/4
--
To view, visit http://gerrit.cloudera.org:8080/12130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4
Gerrit-Change-Number: 12130
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7666: Adding an opaque client identifier to query options.

2019-01-09 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12130 )

Change subject: IMPALA-7666: Adding an opaque client identifier to query 
options.
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12130/3/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/12130/3/be/src/service/query-options.cc@723
PS3, Line 723:
> nit: space
Done


http://gerrit.cloudera.org:8080/#/c/12130/3/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/12130/3/shell/impala_shell.py@1632
PS3, Line 1632:   # support this config option, though a warning is produced.
> I feel like using a newer client to connect to an older impala daemon might
I moved this around.No more warning.



--
To view, visit http://gerrit.cloudera.org:8080/12130
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a7708492f05d33b2bc99fc3a03b461bbb6f3ea4
Gerrit-Change-Number: 12130
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:02:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7867 (Part 4): Collection cleanup in catalog

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12131 )

Change subject: IMPALA-7867 (Part 4): Collection cleanup in catalog
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83425201c90966aae4c280d94cf1b427b3d71d1
Gerrit-Change-Number: 12131
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:01:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7867 (Part 4): Collection cleanup in catalog

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12131 )

Change subject: IMPALA-7867 (Part 4): Collection cleanup in catalog
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3625/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/12131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83425201c90966aae4c280d94cf1b427b3d71d1
Gerrit-Change-Number: 12131
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 09 Jan 2019 19:01:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7087: Read Parquet decimal columns with lower precision/scale

2019-01-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12163 )

Change subject: IMPALA-7087: Read Parquet decimal columns with lower 
precision/scale
..


Patch Set 1:

> > > (1 comment)
 > > >
 > > > Thanks for taking a look @Csaba! I've only responded to your
 > > > comment on overflow handling so far, as I want to get that
 > > behavior
 > > > locked down first.
 > >
 > > Are you sure that the query will be aborted - isn't it only one
 > > warning / file? If there are "normal" files processed before the
 > > one with overflows, then some rows will be returned before
 > running
 > > to the error. In the current implementation it is also possible
 > > that some rows are returned from the file before reaching the
 > > overflowing value.
 > >
 > > I am also not completely sure about the best approach.
 > >
 > > I would prefer the error to be non-fatal, so to print one
 > > warning/file and continue the query.
 > >
 > > If no important use case is lost this way, then I would prefer to
 > > use the formula and skip files based on metadata instead of
 > > checking every row.
 > >
 > > This would make it possible to print a useful error message, so
 > > that the user will know the precision needed to process the file,
 > > and can call ALTER TABLE to increase precision / decrease scale
 > and
 > > make the file usable. If the error is printed based the first
 > row,
 > > then this can become a long iterative process.
 >
 > Thats a fair point. It is throwing an error. The last test in
 > parquet-decimal-precision-and-scale-widening.test has a `CATCH`
 > section, so if I understand the logic in ImpalaTestSuite#run_test_case
 > then that means the query threw an exception. I agree that its
 > possible for some rows to be returned before this error is throw,
 > but I guess thats an issue with many runtime errors in Impala.
 >
 > I'm good with printing a non-fatal error and setting the row to
 > NULL. This is what Hive does, so we will be consistent with Hive.
 >
 > Using the formula will prevent overflows from happening in the
 > first place, so it somewhat orthogonal to the discussion of error
 > handling. The only drawback I can see to using the formula you
 > suggested is that it is not exactly SQL Standard compliant. It's
 > entirely possible that the formula prevents a query from running,
 > even though none of the values in the table would cause an
 > overflow. The standard says that values should be rounded or
 > truncated to the match target precision / scale, but I guess it
 > doesn't explicitly mention overflows so its hard to say.
 >
 > I think that following Hive's behavior might be best here. I've
 > added @Lars to the review to see what he thinks.

I agree that following Hive's behavior would be best. For other errors we also 
usually issue a warning and return null, unless abort_on_error is set.

On a side note: If the parquet file has min/max statistics we should be able to 
tell whether an overflow would happen or not. However, if we want to have 
per-row behavior, that won't help us much.


--
To view, visit http://gerrit.cloudera.org:8080/12163
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafc8efd12379a39756e3e70f022a81a636dadb61
Gerrit-Change-Number: 12163
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 09 Jan 2019 18:42:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8060: [DOCS] Restructured the admission control docs

2019-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12191 )

Change subject: IMPALA-8060: [DOCS] Restructured the admission control docs
..


Patch Set 1:

(1 comment)

LGTM except for the "CDH"s

http://gerrit.cloudera.org:8080/#/c/12191/1/docs/topics/impala_admission.xml
File docs/topics/impala_admission.xml:

http://gerrit.cloudera.org:8080/#/c/12191/1/docs/topics/impala_admission.xml@35
PS1, Line 35:   and out-of-memory conditions on busy CDH clusters. The 
admission control
We should remove references to CDH



--
To view, visit http://gerrit.cloudera.org:8080/12191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id42ae256b215fb267023197c5052f36bedb052a3
Gerrit-Change-Number: 12191
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 18:35:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6533: Add min-max filter for decimal types on kudu tables.

2019-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12113 )

Change subject: IMPALA-6533: Add min-max filter for decimal types on kudu 
tables.
..


Patch Set 17: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/12113
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7e7278e902160d7060f8097290bc172d9031f94
Gerrit-Change-Number: 12113
Gerrit-PatchSet: 17
Gerrit-Owner: Janaki Lahorani 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Janaki Lahorani 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 18:33:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet

2019-01-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Anonymous Coward (359)
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-Comment-Date: Wed, 09 Jan 2019 18:30:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7867 (Part 4): Collection cleanup in catalog

2019-01-09 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12131 )

Change subject: IMPALA-7867 (Part 4): Collection cleanup in catalog
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic83425201c90966aae4c280d94cf1b427b3d71d1
Gerrit-Change-Number: 12131
Gerrit-PatchSet: 3
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 09 Jan 2019 18:12:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: IMPALA-5843: Use page index in Parquet files to skip pages

2019-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12065 )

Change subject: WIP: IMPALA-5843: Use page index in Parquet files to skip pages
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12065/1/be/src/exec/parquet/parquet-bool-decoder.cc
File be/src/exec/parquet/parquet-bool-decoder.cc:

http://gerrit.cloudera.org:8080/#/c/12065/1/be/src/exec/parquet/parquet-bool-decoder.cc@86
PS1, Line 86:   } else {
> I am ok with the current solution, but I think that it is sub-optimal in th
Yeah that is a bit of a weird behaviour. I suspect that the bool decoding is 
fast enough relative to other parts of query execution that we wouldn't see a 
measurable impact from an extra copy, but it might be worth mentioning in a 
comment for future readers to understand.



-- 
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 09 Jan 2019 17:56:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-5843: Use page index in Parquet files to skip pages

2019-01-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12065 )

Change subject: WIP: IMPALA-5843: Use page index in Parquet files to skip pages
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12065/1/be/src/exec/parquet/parquet-bool-decoder.cc
File be/src/exec/parquet/parquet-bool-decoder.cc:

http://gerrit.cloudera.org:8080/#/c/12065/1/be/src/exec/parquet/parquet-bool-decoder.cc@86
PS1, Line 86:   } else {
> I'm not sure I follow.
I am ok with the current solution, but I think that it is sub-optimal in the 
following case:
The whole page is a long literal run, and there are a few values to skip 
(num_values % 8 != 0). ParquetBoolDecoder::DecodeValue will fill its buffer at 
every 128th value, and rle_decoder_ will never become 8 bit aligned again, so 
rle_decoder_'s buffer will be also used when filling unpacked_values_.

Note that RLE handling in ParquetBoolDecoder was implemented (by me) in a 
sub-optimal way from the start to make it simpler (avoiding adding encoding as 
template parameter). So I am ok with keeping as it is and maybe optimizing both 
skipping and decoding logic in the future.



--
To view, visit http://gerrit.cloudera.org:8080/12065
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0cc99f129f2048dbafbe7f5a51d1ea3a5005731a
Gerrit-Change-Number: 12065
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 09 Jan 2019 17:50:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7979: Enhance decoders to support value-skipping

2019-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12172 )

Change subject: IMPALA-7979: Enhance decoders to support value-skipping
..


Patch Set 1:

(9 comments)

Thanks for pulling this out, it was helpful to be able to look at some of the 
subtleties here.

http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/exec/parquet/parquet-bool-decoder.h
File be/src/exec/parquet/parquet-bool-decoder.h:

http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/exec/parquet/parquet-bool-decoder.h@44
PS1, Line 44:   ///TODO: add e2e tests when page filtering is implemented.
Maybe include a JIRA in the TODO?


http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/exec/parquet/parquet-bool-decoder.cc
File be/src/exec/parquet/parquet-bool-decoder.cc:

http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/exec/parquet/parquet-bool-decoder.cc@78
PS1, Line 78:  
doesn't really matter, but "-="?


http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/exec/parquet/parquet-bool-decoder.cc@83
PS1, Line 83:   unpacked_value_idx_ = num_remaining;
This doesn't detect a decode error if UnpackBatch() returns < num_remaining, 
right? It looks like on the RLE branch below we do detect such errors, so we 
should be consistent (I don't generally think that these decoders need to be 
strictly validating, but we should be consistent where possible).


http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/exec/parquet/parquet-bool-decoder.cc@89
PS1, Line 89:   return true;
I'd find it a bit more readable if this returned from within the PLAIN branch.


http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/util/bit-stream-utils.h
File be/src/util/bit-stream-utils.h:

http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/util/bit-stream-utils.h@139
PS1, Line 139:   void SkipBatch(int bit_width, int num_values_to_skip);
Maybe this should return a value to determine whether the skip was successful.

I guess it's not strictly necessary if we're willing to be loose about 
validation since we don't actually read the values, but it's worth documenting 
if we decide to go down that path.


http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/util/bit-stream-utils.inline.h
File be/src/util/bit-stream-utils.inline.h:

http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/util/bit-stream-utils.inline.h@112
PS1, Line 112:   DCHECK_LE(skip_bytes, buffer_end_ - buffer_pos_);
I think this DCHECK could be hit on a corrupt file, e.g. where there was less 
packed data than the reported number of values.


http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/util/dict-encoding.h@557
PS1, Line 557: num_remaining
This is an int64_t. Maybe we should just use int64_t for all the counts in this 
function

(There's some preexisting issues in other functions like this but best to avoid 
them here).


http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/util/dict-encoding.h@567
PS1, Line 567: else {
Could remove level of nesting here.


http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/12172/1/be/src/util/rle-encoding.h@689
PS1, Line 689:   int32_t num_skipped = 0;
I think we could avoid having num_skipped if we just maintain num_remaining 
instaead.

In the above function num_consumed was a bit more useful since it helped 
calculate the output position.



--
To view, visit http://gerrit.cloudera.org:8080/12172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib848f1bd71735fe84e8064daf700417b32589f57
Gerrit-Change-Number: 12172
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 09 Jan 2019 17:48:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8058: Fallback for HBase key scan range estimation

2019-01-09 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12192 )

Change subject: IMPALA-8058: Fallback for HBase key scan range estimation
..


Patch Set 1:

Passed pre-review tests: https://jenkins.impala.io/job/pre-review-test/277/


--
To view, visit http://gerrit.cloudera.org:8080/12192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce
Gerrit-Change-Number: 12192
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 09 Jan 2019 17:35:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8058: Fallback for HBase key scan range estimation

2019-01-09 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12192


Change subject: IMPALA-8058: Fallback for HBase key scan range estimation
..

IMPALA-8058: Fallback for HBase key scan range estimation

HBase provides keys. Impala supports "pushing" of key range predicates
to HBase to read only rows within the target key range. The planner
estimates the cardinality of such scans by sampling the rows within the
range. However, we have seen cases where the predicates are so selective
that no keys fall within the sampling range, and we end up without a
good cardinality estimate. (Specifically, the code does a division by
zero and produces a huge estimate. See the ticket for details.)

This fix:

* Creates a fall-back strategy that uses table cardinality from HMS and
  the selectivity of the key predicates to estimate cardinality when the
  sampling approach fails.
* The fall-back strategy requires tracking the predicates used for HBase
  keys so that they can be applied during fall-back calculations.
* Moved HBase key calculation out of the SingleNodePlanner into the
  HBase scan node as suggested by a "TO DO" in the code. Doing so
  simplified the new code.
* In the spirit of IMPALA-7919, adds the key predicates to the HBase
  scan node in the EXPLAIN output.

Testing:

* Adds a query context option to disable the normal key sampling to
  force the use of the fall-back. Used for testing.
* Adds a new set of HBase test cases that use the new feature to check
  plans with the fall-back approach.
* Reran all existing tests.

Testing will be improved once IMPALA-8021 is available: we will have the
estimated cardinality in both sets of HBase tests and can compare the
two approaches.

Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce
---
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/catalog/FeHBaseTable.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/hbase-no-key-est.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
10 files changed, 485 insertions(+), 124 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/12192/1
--
To view, visit http://gerrit.cloudera.org:8080/12192
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic01147abcb6b184071ba28b55aedc3bc49b322ce
Gerrit-Change-Number: 12192
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 


[Impala-ASF-CR] IMPALA-6742: Profiles of running queries should include execution summary.

2019-01-09 Thread Yongjun Zhang (Code Review)
Yongjun Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11591 )

Change subject: IMPALA-6742: Profiles of running queries should include 
execution summary.
..


Patch Set 6:

> Change has been successfully rebased and submitted as
 > e9652a48dd00c3c076ddccaa940d074b6970b7fc by Joe McDonnell

Many thanks Joe and Tim for the review and commit!

--Yongjun


--
To view, visit http://gerrit.cloudera.org:8080/11591
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc7f714c9427d4b26d4e78cf27ceca2b0b336699
Gerrit-Change-Number: 11591
Gerrit-PatchSet: 6
Gerrit-Owner: Yongjun Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Yongjun Zhang 
Gerrit-Comment-Date: Wed, 09 Jan 2019 17:31:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8061: Fix S3 ACCESS VALIDATED unbound variable

2019-01-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12193 )

Change subject: IMPALA-8061: Fix S3_ACCESS_VALIDATED unbound variable
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/12193
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If48866bb71d748f12ceada4fd62822d7d374811c
Gerrit-Change-Number: 12193
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 09 Jan 2019 17:30:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8061: Fix S3 ACCESS VALIDATED unbound variable

2019-01-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12193 )

Change subject: IMPALA-8061: Fix S3_ACCESS_VALIDATED unbound variable
..


Patch Set 1:

Did something recently fail that makes this necessary? Putting the unbound 
variable inside quotes seems to work as expected on my machine.


--
To view, visit http://gerrit.cloudera.org:8080/12193
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If48866bb71d748f12ceada4fd62822d7d374811c
Gerrit-Change-Number: 12193
Gerrit-PatchSet: 1
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 09 Jan 2019 17:30:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8043: Fix BE test failures related to SystemV timezones.

2019-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12199 )

Change subject: IMPALA-8043: Fix BE test failures related to SystemV timezones.
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/1750/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/12199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9288cd24c8af0c059e55d47c86bd92eaf0075681
Gerrit-Change-Number: 12199
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 09 Jan 2019 15:41:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8043: Fix BE test failures related to SystemV timezones.

2019-01-09 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12199


Change subject: IMPALA-8043: Fix BE test failures related to SystemV timezones.
..

IMPALA-8043: Fix BE test failures related to SystemV timezones.

This is a fix for the following issue:

1. Some BE tests (e.g. ExprTest.TimestampFunctions) use the system's
   local timezone but run against a test timezone db (instead of the
   system's timezone db).
2. On some Linux installations /usr/share/zoneinfo contains symlinks
   to files in the /usr/share/zoneifo/SystemV directory
   (e.g /usr/share/zoneinfo/America/Los_Angeles is a symlink to
   ../SystemV/PST8PDT).
3. The 'SystemV' directory is not part of the test timezone db, since
   it is obsolete and excluded by default.

Consequently, if the system's local timezone is set to
America/Los_Angeles, BE tests won't find the corresponding timezone
file in the test timezone db. BE tests will default to UTC, which will
break some of them.

This change sets local timezone explicitly for failing BE tests, so
they don't depend on the system's local timezone.
It also adds 'SystemV' directory to the test timezone db to avoid
similar issues in the future.

Change-Id: I9288cd24c8af0c059e55d47c86bd92eaf0075681
---
M be/src/exprs/expr-test.cc
M testdata/data/timezoneverification.csv
M testdata/tzdb/2017c.zip
3 files changed, 45 insertions(+), 22 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/12199/1
--
To view, visit http://gerrit.cloudera.org:8080/12199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9288cd24c8af0c059e55d47c86bd92eaf0075681
Gerrit-Change-Number: 12199
Gerrit-PatchSet: 1
Gerrit-Owner: Attila Jeges 


[Impala-ASF-CR] IMPALA-6664: Tag log statements with fragment or query ids.

2019-01-09 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12129 )

Change subject: IMPALA-6664: Tag log statements with fragment or query ids.
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/12129/3/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/12129/3/be/src/common/thread-debug-info.h@a94
PS3, Line 94:
We should copy parent_.thread_name_, or remove it from struct 'ParentInfo'.


http://gerrit.cloudera.org:8080/#/c/12129/2/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/12129/2/be/src/common/thread-debug-info.h@a116
PS2, Line 116:
> Yep, I'll wait for Zoltan's input.
Yeah, the idea was to make it readable when debugging a minidump or core file 
and you can't invoke PrintId() from GDB.
But since PrintId() just prints the hex of members 'hi' and 'lo', it should be 
fine.

But we also need to update 'lib/python/impala_py_lib/gdb/impala-gdb.py' because 
currently it expects 'instance_id_' to be a string.

It also breaks the unit tests in 'thread-debug-info-test.cc'.


http://gerrit.cloudera.org:8080/#/c/12129/3/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/12129/3/be/src/exec/hdfs-scan-node-base.cc@37
PS3, Line 37: #include "common/thread-debug-info.h"
nit: do we need this include? Might be better to include it in hdfs-scan-node.cc


http://gerrit.cloudera.org:8080/#/c/12129/3/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/12129/3/be/src/exec/hdfs-scan-node.cc@325
PS3, Line 325: ThreadDebugInfo* parent_thread_debug_info = 
GetThreadDebugInfo();
 : DCHECK(parent_thread_debug_info != nullptr);
 : auto fn = [this, first_thread, scanner_thread_reservation,
 :  parent_thread_debug_info]() {
 :   
GetThreadDebugInfo()->SetParentInfo(parent_thread_debug_info);
Since we use Thread::Create() later, 
https://github.com/apache/impala/blob/274e96bd147b5d91872c441c3a600fa8d5295bbe/be/src/util/thread.cc#L317
 and 
https://github.com/apache/impala/blob/274e96bd147b5d91872c441c3a600fa8d5295bbe/be/src/util/thread.cc#L351
 should do this automatically.
Doesn't it happen?


http://gerrit.cloudera.org:8080/#/c/12129/3/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/12129/3/be/src/runtime/io/disk-io-mgr.cc@424
PS3, Line 424: // We are now working on behalf of a query, so set thread 
state appropriately.
 : GetThreadDebugInfo()->SetQueryId(worker_context->query_id());
 : 
GetThreadDebugInfo()->SetInstanceId(worker_context->instance_id());
Fyi, IMPALA-6254 and IMPALA-6417 is about to make this automated. Maybe we 
could add a TODO here and mention those Jiras.



--
To view, visit http://gerrit.cloudera.org:8080/12129
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6634ef9d1a7346339f24f2d40a7a3aa36a535da8
Gerrit-Change-Number: 12129
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 09 Jan 2019 11:25:26 +
Gerrit-HasComments: Yes