[Impala-ASF-CR] IMPALA-4865: Reject Expr Rewrite When Appropriate

2019-04-04 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12814 )

Change subject: IMPALA-4865: Reject Expr Rewrite When Appropriate
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
File fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java:

http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java@406
PS6, Line 406:analyzer.getQueryCtx(), 0);
Not sure we want to do it this way. I'd suggest creating another version of 
EvalExprWithoutRow that takes two parameters, and have that turn around and 
call the three-parameter version.


http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
File fe/src/main/java/org/apache/impala/analysis/ColumnDef.java:

http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java@273
PS6, Line 273: FeSupport.MAX_STRING_LEN);
Here, rather than decode the meaning as a parameter, just have another create() 
function, maybe createBounded, that takes a limit.

Note that, here, we are creating a literal. In fact, we are creating a date 
time literals. The date time can never overflow. If we return an expression, 
rather than a literal, "bad things" will happen.


http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
File fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java:

http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java@182
PS6, Line 182:   public static LiteralExpr create(Expr constExpr, TQueryCtx 
queryCtx, int maxResultSize)
There are only two cases: limited or unlimited. Best to have two functions: 
create and createBounded (say) that express those two cases, and encode the 
limits in those new functions.


http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/analysis/RangePartition.java
File fe/src/main/java/org/apache/impala/analysis/RangePartition.java:

http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/analysis/RangePartition.java@186
PS6, Line 186:   literal = LiteralExpr.create(e, analyzer.getQueryCtx(), 
65_536);
Magic constants are generally frowned upon. Make this a declared constant?


http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
File fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java:

http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java@23
PS6, Line 23: import org.apache.impala.authorization.*;
?


http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java
File fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java:

http://gerrit.cloudera.org:8080/#/c/12814/6/fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java@43
PS6, Line 43:   public static final int MAX_STRING_LITERAL_SIZE = 65_536;
This guy should be moved to LiteralExpr: it is a property of the rewrite 
operation, not of this rule.



--
To view, visit http://gerrit.cloudera.org:8080/12814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8b078113ccc1aa49b0cea0c86dff2e02e1dd0e23
Gerrit-Change-Number: 12814
Gerrit-PatchSet: 6
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 04 Apr 2019 23:23:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7971: Add support for insert events in event processor.

2019-04-02 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12889 )

Change subject: IMPALA-7971: Add support for insert events in event processor.
..


Patch Set 5: Code-Review+1

The code here is OK. Discussed in person the issue of cross-system race 
conditions which we agreed to address another time. Would like Vihang to take a 
close look at the notification code itself since he is most knowledgeable.


--
To view, visit http://gerrit.cloudera.org:8080/12889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c48c5ca4bde18d532c582980aebbc25f1bf1c52
Gerrit-Change-Number: 12889
Gerrit-PatchSet: 5
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 02 Apr 2019 22:24:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7718: [DOCS] Additional info in the extended EXPLAIN output

2019-03-13 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12732 )

Change subject: IMPALA-7718: [DOCS] Additional info in the extended EXPLAIN 
output
..


Patch Set 4: Code-Review+2

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/12732
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0ad5794d8a9b62cc7d01d023f56e700dc018f24b
Gerrit-Change-Number: 12732
Gerrit-PatchSet: 4
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 13 Mar 2019 21:25:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8133: [DOCS] Review and update Known Issues for 3.2

2019-03-11 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12718 )

Change subject: IMPALA-8133: [DOCS] Review and update Known Issues for 3.2
..


Patch Set 1: Code-Review+2

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/12718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6b22fc5cf3944ffeeaf8189932e88498a14696e
Gerrit-Change-Number: 12718
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 12 Mar 2019 00:34:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8097: mt dop for all queries via hidden flag

2019-03-11 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12257 )

Change subject: IMPALA-8097: mt_dop for all queries via hidden flag
..


Patch Set 8: Code-Review+1

The code looks good. I'm not super familiar with the nuances of mt_dop, so will 
defer to others for full approval.


--
To view, visit http://gerrit.cloudera.org:8080/12257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I72f0b02a005e8bf22fd17b8fb5aabf8c0d9b6b15
Gerrit-Change-Number: 12257
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 12 Mar 2019 00:18:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8014: Revise FK/PK cardinality estimation

2019-03-08 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12535 )

Change subject: IMPALA-8014: Revise FK/PK cardinality estimation
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java@350
PS4, Line 350:  // The code has decided that the RHS is a dimension (detail, 
FK) table, and
 : // that the LHS is a fact (master, FK) table
> isn't it the opposite?
Done


http://gerrit.cloudera.org:8080/#/c/12535/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java@416
PS4, Line 416: fkCard = Math.max(fkCard, largestKeyCard);
> Sorry, I meant you are multiplying all lhsNdv() for each slot and comparing
Right. Very confusing. This code is only for the compound key case. We are 
dealing with the lack of an NDV for the key as a whole. (This is where we can 
very much use Anurag's FK/PK enhancement.)

Note the step above in which we constrain the fk cardinality to be no greater 
than the master table cardinality. This can occur, as in our tests, when the fk 
consists of (id, int_col), so NDV(compound key) = NDV(id) * NDV(int_col) > 
master tale cardinality. So, we constrain the foreign key to be no greater than 
the master table (based on our M:1 assumption.)

But, then we have to deal with the case that our guess is wrong: the FK 
cardinality really is greater. Even if the two columns are perfectly 
correlated, their joint NDV has to be at least as large as the largest 
individual NDV.

To handle the reality of messy schemas and stats, we:

* Compute the "raw" joint NDV
* Constrain it by the detail table size (can't have more keys than rows)
* Constrain it by the master table size (can't have more FKs than PKs, ideally)
* Constrain it by largest key NDV size (can't have fewer compound keys than 
column NDV, even if that is greater than master table size)

I *think* this is right. The above messy adjustments are the result of 
inspecting tests that changed. But please do think it through to be certain.



--
To view, visit http://gerrit.cloudera.org:8080/12535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da
Gerrit-Change-Number: 12535
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 08 Mar 2019 20:31:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8014: Revise FK/PK cardinality estimation

2019-03-08 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12535

to look at the new patch set (#5).

Change subject: IMPALA-8014: Revise FK/PK cardinality estimation
..

IMPALA-8014: Revise FK/PK cardinality estimation

Impala uses two techniques to estimate join cardinality: one for FK/PK,
another for the "generic" case. (IMPALA-8018 suggests we combine the
two. But, for the purposes of this fix, we leave them separate.)

The code for the FK/PK estimation is mostly right for a key that
consists of a singe column. But, if a key is compound (contains more
than one column), the calculations can produce incorrect results.

This patch modifies the code to use the standard join calculation:
|join| = |left| * |right| / max(|left key|, |right key|).

The math for compound keys had a number of errors. Since no test tables
represent a proper compound key, all tests use unrealistic keys such as
a unique Id along with a second, superfluous column. To make the math
come out OK for these, the code included adjustments that misfired on
real-world keys. This patch fixes those errors. A separate patch
introduces new tables with the required schema.

Tests: Many planner tests were modified to reflect the new join
cardinality estimate. Some plans changed, including some TPC-H plans.
Inspected each change to verify that the numbers are correct using the
formula cited above.

Turns out that the change caused a spill test to fail. The new plan is
more efficient and didn't require as much memory. Shrank the memory
given to the query to force the newer, more efficient plan to also
spill.

Testing was also done using a production query that misfired without
this fix, but that produced correct results with this fix.

Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da
---
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/card-multi-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
17 files changed, 1,920 insertions(+), 1,829 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/12535/5
--
To view, visit http://gerrit.cloudera.org:8080/12535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da
Gerrit-Change-Number: 12535
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8100: Add initial support for Ranger

2019-03-07 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12632 )

Change subject: IMPALA-8100: Add initial support for Ranger
..


Patch Set 9: Code-Review+1

Really like how this turned out. Very clean. Would like Bharath to look at the 
details of semantics.


--
To view, visit http://gerrit.cloudera.org:8080/12632
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8cad9e609d20aae1ff645c84fd58a02afee70276
Gerrit-Change-Number: 12632
Gerrit-PatchSet: 9
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 07 Mar 2019 19:57:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7917 (Part 3): Decouple Sentry from Impala

2019-03-07 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12684 )

Change subject: IMPALA-7917 (Part 3): Decouple Sentry from Impala
..


Patch Set 5: Code-Review+1

Code looks good. Would like Bharath to double-check the logic flow.


--
To view, visit http://gerrit.cloudera.org:8080/12684
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1a5d3e0a3e86ac2b0329b56247357fca93229dd0
Gerrit-Change-Number: 12684
Gerrit-PatchSet: 5
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Thu, 07 Mar 2019 19:52:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7917 (Part 2): Decouple Sentry from Impala

2019-03-06 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12542 )

Change subject: IMPALA-7917 (Part 2): Decouple Sentry from Impala
..


Patch Set 11: Code-Review+2

Very nicely done!


--
To view, visit http://gerrit.cloudera.org:8080/12542
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I051d06451fb14ba2ffb3f7e0d4aae37dee55ab86
Gerrit-Change-Number: 12542
Gerrit-PatchSet: 11
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Austin Nobis 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 06 Mar 2019 19:44:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8266 : Event filtering logic may not filter all the events

2019-03-04 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12641 )

Change subject: IMPALA-8266 : Event filtering logic may not filter all the 
events
..


Patch Set 2: Code-Review+2

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/12641
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaeaa26017ee223cca18344e5e1d6ace87200fd9c
Gerrit-Change-Number: 12641
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 05 Mar 2019 01:22:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8014: Incorrect FK/PK cardinality estimation

2019-03-04 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12535 )

Change subject: IMPALA-8014: Incorrect FK/PK cardinality estimation
..


Patch Set 4:

Here are two references to get you started:

https://pdfs.semanticscholar.org/2735/672262940a23cfce8d4cc1e25c0191de7da7.pdf
See Section 2, equation 1.

http://www.vldb.org/pvldb/vol9/p204-leis.pdf
Section 2.3

The basic formula is pretty standard, as is the math to calculate NDV for 
compound keys. The ad-hoc bits are the adjustments that limit values.

The current version is a bit odd: it uses NDVs to infer PK/FK, then uses PK/FK 
to adjust NDVs. Didn't want to fix too much in a single patch, so let this odd 
bit of reasoning.

We can go over the details in person.


--
To view, visit http://gerrit.cloudera.org:8080/12535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da
Gerrit-Change-Number: 12535
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 05 Mar 2019 00:32:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8258: Realistic star-schema tables

2019-03-04 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12628

to look at the new patch set (#5).

Change subject: IMPALA-8258: Realistic star-schema tables
..

IMPALA-8258: Realistic star-schema tables

The tables in the `functional` db provide many interesting cases. The
tables in TPC-H and TPC-DS simulate a well-behaved application.
We also need some tables that show messy, real-world cases:

- Correlated filters (same filter on multiple tables)
- Compund join (primary, foreign) keys
- Extreme data skew

This patch adds scripts to generate five new "star_" tables. Modifies
functional scripts to load the tables, adds a text description of the
schema, and adds test that use the schema.

Tests: This is a test-only patch. Adds a new PlannerTest that highlights
current limitations in the planner: data skew and wrong plan when
correlated filters are present.

Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/bin/compute-table-stats.sh
A testdata/datasets/functional/build-star.py
M testdata/datasets/functional/functional_schema_template.sql
A testdata/datasets/functional/preload
M testdata/datasets/functional/schema_constraints.csv
A testdata/datasets/functional/star-schema.txt
A 
testdata/workloads/functional-planner/queries/PlannerTest/card-star-schema.test
8 files changed, 589 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/12628/5
--
To view, visit http://gerrit.cloudera.org:8080/12628
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
Gerrit-Change-Number: 12628
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8014: Incorrect FK/PK cardinality estimation

2019-03-04 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12535 )

Change subject: IMPALA-8014: Incorrect FK/PK cardinality estimation
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12535/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12535/4//COMMIT_MSG@32
PS4, Line 32: Turns out that the change caused a spill test to fail. The new 
plan is
> This is my bad, I should have hinted the plan to force a join order and str
I agree that your solution is more stable. Fudging with magic numbers is asking 
for trouble.

Can you recommend the proper fix? How should the test in question be modified 
to achieve the goal you had in mind?



--
To view, visit http://gerrit.cloudera.org:8080/12535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da
Gerrit-Change-Number: 12535
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 04 Mar 2019 20:23:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8258: Realistic star-schema tables

2019-03-04 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12628 )

Change subject: IMPALA-8258: Realistic star-schema tables
..


Patch Set 4:

Passed pre-review tests: https://jenkins.impala.io/job/pre-review-test/328/


--
To view, visit http://gerrit.cloudera.org:8080/12628
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
Gerrit-Change-Number: 12628
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Mon, 04 Mar 2019 20:17:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8258: Realistic star-schema tables

2019-03-04 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12628


Change subject: IMPALA-8258: Realistic star-schema tables
..

IMPALA-8258: Realistic star-schema tables

The tables in the `functional` db provide many interesting cases. The
tables in TPC-H and TPC-DS simulate a well-behaved application.
We also need some tables that show messy, real-world cases:

- Correlated filters (same filter on multiple tables)
- Compund join (primary, foreign) keys
- Extreme data skew

This patch adds scripts to generate five new "star_" tables. Modifies
functional scripts to load the tables, adds a text description of the
schema, and adds test that use the schema.

Tests: This is a test-only patch. Adds a new PlannerTest that highlights
current limitations in the planner: data skew and wrong plan when
correlated filters are present.

Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
---
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/bin/compute-table-stats.sh
A testdata/datasets/functional/build-star.py
M testdata/datasets/functional/functional_schema_template.sql
A testdata/datasets/functional/preload
M testdata/datasets/functional/schema_constraints.csv
A testdata/datasets/functional/star-schema.txt
A 
testdata/workloads/functional-planner/queries/PlannerTest/card-star-schema.test
8 files changed, 579 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/12628/4
--
To view, visit http://gerrit.cloudera.org:8080/12628
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4fadfef9bc1bbcad5ed2bf998fc1d99e1ba2a080
Gerrit-Change-Number: 12628
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 


[Impala-ASF-CR] IMPALA-8014: Incorrect FK/PK cardinality estimation

2019-03-04 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12535 )

Change subject: IMPALA-8014: Incorrect FK/PK cardinality estimation
..


Patch Set 4:

Passed pre-review tests: https://jenkins.impala.io/job/pre-review-test/328/


--
To view, visit http://gerrit.cloudera.org:8080/12535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da
Gerrit-Change-Number: 12535
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Mon, 04 Mar 2019 19:55:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8014: Incorrect FK/PK cardinality estimation

2019-03-04 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12535


Change subject: IMPALA-8014: Incorrect FK/PK cardinality estimation
..

IMPALA-8014: Incorrect FK/PK cardinality estimation

Impala uses two techniques to estimate join cardinality: one for FK/PK,
another for the "generic" case. (IMPALA-8018 suggests we combine the
two. But, for the purposes of this fix, we leave them separate.)

The code for the FK/PK estimation is mostly right for a key that
consists of a singe column. But, if a key is compound (contains more
than one column), the calculations can produce incorrect results.

This patch modifies the code to use the standard join calculation:
|join| = |left| * |right| / max(|left key|, |right key|).

The math for compound keys had a number of errors. Since no test tables
represent a proper compound key, all tests use unrealistic keys such as
a unique Id along with a second, superfluous column. To make the math
come out OK for these, the code included adjustments that misfired on
real-world keys. This patch fixes those errors. A separate patch
introduces new tables with the required schema.

Tests: Many planner tests were modified to reflect the new join
cardinality estimate. Some plans changed, including some TPC-H plans.
Inspected each change to verify that the numbers are correct using the
formula cited above.

Turns out that the change caused a spill test to fail. The new plan is
more efficient and didn't require as much memory. Shrank the memory
given to the query to force the newer, more efficient plan to also
spill.

Testing was also done using a production query that misfired without
this fix, but that produced correct results with this fix.

Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da
---
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/card-multi-join.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
M testdata/workloads/functional-query/queries/QueryTest/spilling.test
17 files changed, 1,909 insertions(+), 1,829 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/12535/4
--
To view, visit http://gerrit.cloudera.org:8080/12535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id0db82564596b0a9271b89b8e3f7a3cf92d306da
Gerrit-Change-Number: 12535
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8185: Abstract out real/mock file system operations

2019-03-01 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/12437 )

Change subject: IMPALA-8185: Abstract out real/mock file system operations
..


Abandoned

Can live without it.
--
To view, visit http://gerrit.cloudera.org:8080/12437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
Gerrit-Change-Number: 12437
Gerrit-PatchSet: 6
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-8106: Cleanup of literal expression nodes

2019-03-01 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/12265 )

Change subject: IMPALA-8106: Cleanup of literal expression nodes
..


Abandoned

Not a priority at the moment.
--
To view, visit http://gerrit.cloudera.org:8080/12265
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I740f877964d5bc677fef98f1ccfa5be32cc8f7ac
Gerrit-Change-Number: 12265
Gerrit-PatchSet: 14
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8156: Add format options to the EXPLAIN statement

2019-03-01 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/12340 )

Change subject: IMPALA-8156: Add format options to the EXPLAIN statement
..


Abandoned

Not a priority at the moment.
--
To view, visit http://gerrit.cloudera.org:8080/12340
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I84a8bc4fbff52b70a747f3a3c08abe6973e37fc1
Gerrit-Change-Number: 12340
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8041, Part 2: Refactor SELECT list

2019-03-01 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/12144 )

Change subject: IMPALA-8041, Part 2: Refactor SELECT list
..


Abandoned

Not a priority at the moment.
--
To view, visit http://gerrit.cloudera.org:8080/12144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I5e173dad717e739c097f1f0cd86a0352648ff886
Gerrit-Change-Number: 12144
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8041, Part 1: Move rewrite rules into expr nodes

2019-03-01 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/12143 )

Change subject: IMPALA-8041, Part 1: Move rewrite rules into expr nodes
..


Abandoned

Not a priority at the moment.
--
To view, visit http://gerrit.cloudera.org:8080/12143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ifda7082abea4f6448e6cfef0d9ef7b8565d73fce
Gerrit-Change-Number: 12143
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8266 : Event filtering logic may not filter all the events

2019-02-28 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12641 )

Change subject: IMPALA-8266 : Event filtering logic may not filter all the 
events
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12641/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12641/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@189
PS1, Line 189:   fromIndex++;
Really awkward and error prone. Do we need to remove items from the list? Can 
we just do:

for (MetastoreEvent : metastoreEvents) {
  if (whatever) {
process
  }
}

If this loop consumes all events, no need to return the list. The GC will get 
rid of it.

Or, if we want to remove some items:

int i = 0;
while (I < metastoreEvents.size()) {
  MetastoreEvent event = metastoreEvents.get();
  if (consumed) metastoreEvents.remove(i);
  else I++;
}



--
To view, visit http://gerrit.cloudera.org:8080/12641
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaeaa26017ee223cca18344e5e1d6ace87200fd9c
Gerrit-Change-Number: 12641
Gerrit-PatchSet: 1
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Fri, 01 Mar 2019 03:24:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8182: Add single-node plan to PlanCtx

2019-02-27 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/12436 )

Change subject: IMPALA-8182: Add single-node plan to PlanCtx
..


Abandoned

Will do as part of an actual test later.
--
To view, visit http://gerrit.cloudera.org:8080/12436
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I983473f259df8af17d8dd38e497fe33e839cd0dc
Gerrit-Change-Number: 12436
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8185: Abstract out real/mock file system operations

2019-02-27 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12437

to look at the new patch set (#6).

Change subject: IMPALA-8185: Abstract out real/mock file system operations
..

IMPALA-8185: Abstract out real/mock file system operations

The recent addition of the test case builder introduced the idea of
running a query in "test case mode" on a system different from the on on
which the test case was created. A number of if-statements were used to
mock file system operations which depend on the file system to exist.

This patch abstracts out the mock vs. real operations into a set of
"facade" classes. The HDFS facade performs the real opeations, the
"mock" facade mocks the operations well enough to fool the planner.
Selection of the facade depends on the recently-added test case mode
query option.

Tests: Reran all tests: both those that depend on a real file system and
the test case builder tests that depend on the mock file system.

Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/FileSystemFacade.java
A fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java
M fe/src/main/java/org/apache/impala/analysis/HdfsUri.java
A fe/src/main/java/org/apache/impala/analysis/MockFileSystemFacade.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
7 files changed, 352 insertions(+), 137 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/12437/6
--
To view, visit http://gerrit.cloudera.org:8080/12437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
Gerrit-Change-Number: 12437
Gerrit-PatchSet: 6
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-7972 Detect self-events to avoid unnecessary invalidates

2019-02-27 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12591 )

Change subject: IMPALA-7972 Detect self-events to avoid unnecessary invalidates
..


Patch Set 4:

(7 comments)

Identified one more potential race condition and suggested a solution.

http://gerrit.cloudera.org:8080/#/c/12591/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12591/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@738
PS4, Line 738:   public Collection getInFlightVersionsForEvents(String 
dbName, String tblName)
Should be a list: evens are ordered and must be ignored in the order that they 
were added to the list.


http://gerrit.cloudera.org:8080/#/c/12591/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@765
PS4, Line 765:   public void removeFromInFlightVersionsForEvents(String dbName, 
String tblName,
This form introduces a race condition. Can you do something like:

void invaliateOrIgnore(int versionNo)
  - lock
  - if list is non-empty, and first entry is versionNo, remove that first entry
  - Else, invalidate the table
  - unlock


http://gerrit.cloudera.org:8080/#/c/12591/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12591/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@889
PS4, Line 889: protected Collection pendingVersionNumbersFromCatalog_ 
= Collections.EMPTY_LIST;
Again, this introduces a race condition if some other thread changes the table 
since we obtained the list.


http://gerrit.cloudera.org:8080/#/c/12591/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@919
PS4, Line 919: protected boolean isSelfEvent() throws CatalogException {
I see. You want to identify outside of a lock if this is a self event. The 
proposed non-racy solution makes self-event detection an integral part of the 
invalidate, avoiding race conditions.

The logic would be:
  if server id not set, or does not match
- invalidate
  else
- invalidOrIgnore(event version)

Can even be simplified to:

  eventVersion = -1
  if serverId is set and matches then evenVersion = version from event
  invalidateOrIgnore(eventVersion)

I think this completely eliminates all race condition paths


http://gerrit.cloudera.org:8080/#/c/12591/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@957
PS4, Line 957:   if (isSelfEvent()) {
To enable this logging, have invalidOrIgnore return true if ignored, false if 
invalidated (or visa-versa, you choose)


http://gerrit.cloudera.org:8080/#/c/12591/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@1100
PS4, Line 1100: protected boolean isSelfEvent() {
With the new system, we don't know if it is a self event until we lock and 
look. All we can tell is that, when we checked, it WAS a self event.


http://gerrit.cloudera.org:8080/#/c/12591/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12591/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@148
PS4, Line 148:  * its a performance penalty not a correctness issue.
Thanks for the explanation!



--
To view, visit http://gerrit.cloudera.org:8080/12591
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6db0d7f7fe465158fc8cb9d6b6b57a321827b353
Gerrit-Change-Number: 12591
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 27 Feb 2019 23:13:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7975 : Improve supportability of the automatic invalidate feature

2019-02-26 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12549 )

Change subject: IMPALA-7975 : Improve supportability of the automatic 
invalidate feature
..


Patch Set 8: Code-Review+2

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/12549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23cb3aa866879eca03c64ab881796eaa9caa0337
Gerrit-Change-Number: 12549
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 27 Feb 2019 04:56:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7972 Detect self-events to avoid unnecessary invalidates

2019-02-26 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12591 )

Change subject: IMPALA-7972 Detect self-events to avoid unnecessary invalidates
..


Patch Set 2:

(17 comments)

Quick initial review. More substantive comments will follow our in-person 
discussion.

http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@891
PS1, Line 891: 
propertyValues.add(tbl.getMetaStoreTable().getParameters().get(propertyKey));
This will add nulls to the list if the property is not set. Is this the 
intention?

Actually, looks like the semantics here is that you have to correlated lists: 
keys in one list, values in another. Is a map a better choice here?


http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@920
PS1, Line 920: propVals.add(hdfsPartition.getParameters().get(key));
Same issue as above.


http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@61
PS1, Line 61:   public static final String CATALOG_SERVICE_ID_PROP_KEY = 
"impala.CatalogServiceId";
Nit: the above props use TitleCase. The one below uses camelCase. Should we be 
consistent?


http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@359
PS1, Line 359:* not. Used by TableInvalidating Events to avoid unnecessary 
invalidatesß
Future readers will greatly benefit form an explanation of the logic used to 
detect self-events. I'm having to reverse engineer it from the implementation: 
a few words of explanation would be greatly appreciated!


http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@360
PS1, Line 360:*/
The #1 item that needs explanation is concurrency.

Change 1: sends to HMS
Change 2: sends to HMS
Change 2: returns from HMS, changes catalog
Change 1: returns from HMS, changes catalog

Notice that the above is deliberately done out of order due to, say, network 
delays. What are the semantics in each case?

Suppose that change 1 is a rename, change 2 is a request to add a column to the 
old table. If change 2 is processed first, things work. If not, things don't 
work.

How is this then reflected in the events and how are these cases synchronized?


http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@363
PS1, Line 363: private final Long lastDDLTime_;
Does this need the overhead of an object? Or, can we use -1 as the "not set" 
indicator and use a primitive long?


http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@368
PS1, Line 368: private final Long catalogVersion_;
As above.


http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@408
PS1, Line 408:  * instance, two SIDs which have different serviceIds cannot 
be compared. This
They can: just compare based on the service IDs. Also, put all null service IDs 
before all events with a service ID. May not mean much, but does impose an 
ordering.


http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@415
PS1, Line 415:  * 1 means we want to process the event represented by this 
SID
Rather than explaining all this, and doing a mapping from partial ordering to 
semantics, why not simply have a method that answers the direct question: 
canIgnore(something)?


http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@420
PS1, Line 420:   if (this.equals(0)) return 0;
Why would this ever equal the number 0? Or did you mean lower case "o"?

The usual practice is to implement equals in terms of compareTo, not the other 
way around...


http://gerrit.cloudera.org:8080/#/c/12591/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@424
PS1, Line 424:   if (this.lastDDLTime_ < o.lastDDLTime_) return -1;
The above assumes both values are non-null, else will crash with NPE. This is a 
good reason to use a primitive long instead of an object Long for the field.

Consider:
int result = Long.compare(lastDDLTime, o.lastDDLTime_);
if (result != 0) return result;



[Impala-ASF-CR] IMPALA-8069: Crash in impala::Sorter::Run::Run

2019-02-26 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12562 )

Change subject: IMPALA-8069: Crash in impala::Sorter::Run::Run
..


Patch Set 5:

Passes pre-review tests: https://jenkins.impala.io/job/pre-review-test/317/

Bharath, please review.


--
To view, visit http://gerrit.cloudera.org:8080/12562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I19a57f3791ce9e41fe0ba79dc5834e762341c74e
Gerrit-Change-Number: 12562
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 26 Feb 2019 17:27:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8069: Crash in impala::Sorter::Run::Run

2019-02-25 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12562

to look at the new patch set (#5).

Change subject: IMPALA-8069: Crash in impala::Sorter::Run::Run
..

IMPALA-8069: Crash in impala::Sorter::Run::Run

Modifies the planner to omit a sorter node in the obscure case we have
an analytic query with a constant sort. The analytic node requires a
buffered tuple. When the sort is omitted, creates a buffered tuple on
top of the node that would have been input to the sort.

Testing:
* Added a PlannerTest case.
* Added an end-to-end test.
* Reran the query that previously crashed: it now produces proper results.

Change-Id: I19a57f3791ce9e41fe0ba79dc5834e762341c74e
---
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
4 files changed, 71 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/12562/5
--
To view, visit http://gerrit.cloudera.org:8080/12562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I19a57f3791ce9e41fe0ba79dc5834e762341c74e
Gerrit-Change-Number: 12562
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7917 (Part 2): Decouple Sentry from Impala

2019-02-25 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12542 )

Change subject: IMPALA-7917 (Part 2): Decouple Sentry from Impala
..


Patch Set 7:

Took a quick spin over the files. I really like how you've completely 
abstracted out Sentry into the auth factory. Very clean. Adding Ranger (or 
anything else) should be quite clean without the (if sentry to this, if ranger 
do that, else do nothing) code of old. Nicely done!


--
To view, visit http://gerrit.cloudera.org:8080/12542
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I051d06451fb14ba2ffb3f7e0d4aae37dee55ab86
Gerrit-Change-Number: 12542
Gerrit-PatchSet: 7
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Austin Nobis (450)
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Tue, 26 Feb 2019 03:53:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7645: Add a query option to set the default table file format

2019-02-25 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12568 )

Change subject: IMPALA-7645: Add a query option to set the default table file 
format
..


Patch Set 6:

Can the file format be set in the sql statement itself? I always consider it a 
bit of a hack to have to set a query option before running a statement: just 
one more thing for the user to remember. And, to remember to set it back 
afterward.


--
To view, visit http://gerrit.cloudera.org:8080/12568
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic857c38076d973ad749a41fecd1b470c7881db5e
Gerrit-Change-Number: 12568
Gerrit-PatchSet: 6
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 26 Feb 2019 03:47:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7975 : Improve supportability of the automatic invalidate feature

2019-02-25 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12549 )

Change subject: IMPALA-7975 : Improve supportability of the automatic 
invalidate feature
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12549/5/fe/src/main/java/org/apache/impala/common/Metrics.java
File fe/src/main/java/org/apache/impala/common/Metrics.java:

http://gerrit.cloudera.org:8080/#/c/12549/5/fe/src/main/java/org/apache/impala/common/Metrics.java@146
PS5, Line 146: .append("\tMean rate: ")
Nit: you can combine these:
.append("\n\tMean rate: ")

Also, tabs are iffy, they display differently in different environments. Spaces 
instead?


http://gerrit.cloudera.org:8080/#/c/12549/5/fe/src/main/java/org/apache/impala/common/Metrics.java@155
PS5, Line 155: .append("\t15min rate: ")
Nit: if this is for human consumption, "15 min. rate" is more legible.



--
To view, visit http://gerrit.cloudera.org:8080/12549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23cb3aa866879eca03c64ab881796eaa9caa0337
Gerrit-Change-Number: 12549
Gerrit-PatchSet: 5
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 26 Feb 2019 03:44:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8243: Fix racy access to nonPartFieldSchemas

2019-02-25 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12572 )

Change subject: IMPALA-8243: Fix racy access to nonPartFieldSchemas_
..


Patch Set 1: Code-Review+2

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/12572
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7d68b54af2ba954cf0ffa7b2533cde7be835be77
Gerrit-Change-Number: 12572
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Tue, 26 Feb 2019 03:39:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8069: Crash in impala::Sorter::Run::Run

2019-02-25 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12562

to look at the new patch set (#4).

Change subject: IMPALA-8069: Crash in impala::Sorter::Run::Run
..

IMPALA-8069: Crash in impala::Sorter::Run::Run

Modifies the planner to omit a sorter node in the obscure case we have
an analytic query with a constant sort. The analytic node requires a
buffered tuple. When the sort is omitted, creates a buffered tuple on
top of the node that would have been input to the sort.

Testing:
* Added a PlannerTest case.
* Added an end-to-end test.
* Reran the query that previously crashed: it now produces proper results.

Change-Id: I19a57f3791ce9e41fe0ba79dc5834e762341c74e
---
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
4 files changed, 71 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/12562/4
--
To view, visit http://gerrit.cloudera.org:8080/12562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I19a57f3791ce9e41fe0ba79dc5834e762341c74e
Gerrit-Change-Number: 12562
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7975 : Improve supportability of the automatic invalidate feature

2019-02-25 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12549 )

Change subject: IMPALA-7975 : Improve supportability of the automatic 
invalidate feature
..


Patch Set 4: Code-Review+1

(2 comments)

LGTM. Bharath should give it a once over as well.

http://gerrit.cloudera.org:8080/#/c/12549/3/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/12549/3/fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java@56
PS3, Line 56:
Nit: Drop "Get", does not add much info.


http://gerrit.cloudera.org:8080/#/c/12549/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/12549/3/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@129
PS3, Line 129: case DROP_DATABASE: return new 
DropDatabaseEvent(catalog_, metrics_, event);
Not for this patch. In the future, might want to provide an EventsProcessor 
object that holds the catalog and metrics so you can pass along just one object.



--
To view, visit http://gerrit.cloudera.org:8080/12549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23cb3aa866879eca03c64ab881796eaa9caa0337
Gerrit-Change-Number: 12549
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 25 Feb 2019 23:17:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8069: Crash in impala::Sorter::Run::Run

2019-02-25 Thread Paul Rogers (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12562

to look at the new patch set (#3).

Change subject: IMPALA-8069: Crash in impala::Sorter::Run::Run
..

IMPALA-8069: Crash in impala::Sorter::Run::Run

Modifies the planner to omit a sorter node in the obscure case we have
an analytic query with a constant sort. The analytic node requires a
buffered tuple. When the sort is omitted, creates a buffered tuple on
top of the node that would have been input to the sort.

Testing:
* Added a PlannerTest case.
* Added an end-to-end test.
* Reran the query that previously crashed: it now produces proper results.

Change-Id: I19a57f3791ce9e41fe0ba79dc5834e762341c74e
---
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test
4 files changed, 71 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/12562/3
--
To view, visit http://gerrit.cloudera.org:8080/12562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I19a57f3791ce9e41fe0ba79dc5834e762341c74e
Gerrit-Change-Number: 12562
Gerrit-PatchSet: 3
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7917 (Part 2): Decouple Sentry from Impala

2019-02-22 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12542 )

Change subject: IMPALA-7917 (Part 2): Decouple Sentry from Impala
..


Patch Set 3:

(7 comments)

First-pass review. Sorry, was started on PS 2; I see you're now on PS 3.

http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/authorization/AuthorizableFactory.java
File fe/src/main/java/org/apache/impala/authorization/AuthorizableFactory.java:

http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/authorization/AuthorizableFactory.java@46
PS2, Line 46:   case SENTRY:
I wonder if there is another way to do this? This class would appear to need to 
enumerate all the auth providers. As such, it breaks abstraction. Can these 
methods be defined within each auth plugin itself? Sentry, say, knows to create 
its own authorizables?


http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/authorization/AuthorizableTable.java
File fe/src/main/java/org/apache/impala/authorization/AuthorizableTable.java:

http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/authorization/AuthorizableTable.java@26
PS2, Line 26: public class AuthorizableTable extends Authorizable {
Looks like there are a series of these classes. Each takes a series of names. I 
wonder, does this generalize? Or, should these be built via composition? An 
auth table takes an auth DB and a table name? An auth column takes an auth 
table and a column name?

I'm not proposing we make the change if it's not needed, just trying to reason 
out the data model.


http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java
File fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java:

http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/authorization/AuthorizationConfig.java@60
PS2, Line 60:   static SentryAuthorizationConfig sentry(String serverName,
Similar to prior comments: should the sentry-specific binding be in the generic 
interface, or should this reside in some sentry-specific driver?


http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java
File fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java:

http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/authorization/AuthorizationPolicy.java@499
PS2, Line 499: Set groupNames = ((SentryAuthorizationChecker) 
fe.getAuthzChecker())
Temporary code? Or, should this be handled by sentry-specific code?


http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/service/Frontend.java@321
PS2, Line 321:   if (authzConfig_.getProvider() == 
AuthorizationProvider.SENTRY) {
Similar to earlier: should Sentry-specific stuff leak out of the abstraction, 
or should this stuff be handled through a generic interface?


http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/service/JniCatalog.java
File fe/src/main/java/org/apache/impala/service/JniCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/service/JniCatalog.java@111
PS2, Line 111: if (BackendConfig.INSTANCE.useSentry()) {
Should this be factored to into a an auth factory rather than inline here in 
this top-level class?


http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java
File fe/src/main/java/org/apache/impala/service/JniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/12542/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@143
PS2, Line 143: if (BackendConfig.INSTANCE.useSentry()) {
Ditto



--
To view, visit http://gerrit.cloudera.org:8080/12542
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I051d06451fb14ba2ffb3f7e0d4aae37dee55ab86
Gerrit-Change-Number: 12542
Gerrit-PatchSet: 3
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Austin Nobis (450)
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Sat, 23 Feb 2019 04:38:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8069: Crash in impala::Sorter::Run::Run

2019-02-22 Thread Paul Rogers (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12562

to look at the new patch set (#2).

Change subject: IMPALA-8069: Crash in impala::Sorter::Run::Run
..

IMPALA-8069: Crash in impala::Sorter::Run::Run

Modifies the planner to omit a sorter node in the obscure case we have
an analytic query with a constant sort. The analytic node requires a
buffered tuple. When the sort is omitted, creates a buffered tuple on
top of the node that would have been input to the sort.

Testing: added a new PlannerTest case for this case. Reran the query
that previously crashed: it now produces proper results.

Change-Id: I19a57f3791ce9e41fe0ba79dc5834e762341c74e
---
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
3 files changed, 32 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/12562/2
--
To view, visit http://gerrit.cloudera.org:8080/12562
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I19a57f3791ce9e41fe0ba79dc5834e762341c74e
Gerrit-Change-Number: 12562
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7560: Improve selectivity estimate for !=

2019-02-21 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/12427 )

Change subject: IMPALA-7560: Improve selectivity estimate for !=
..


Abandoned

Too risky to do now. Will revisit as part of a new patch later.
--
To view, visit http://gerrit.cloudera.org:8080/12427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
Gerrit-Change-Number: 12427
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-7560: Improve selectivity estimate for !=

2019-02-21 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12427 )

Change subject: IMPALA-7560: Improve selectivity estimate for !=
..


Patch Set 5:

(2 comments)

Rebased on master, again (twice). Addressed review comments.

Am re-running pre-review tests after all the recent rebases. Will send a note 
when complete.

http://gerrit.cloudera.org:8080/#/c/12427/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/12427/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@216
PS3, Line 216: // Note: the following case statement is skeletal.
> nit: switch case indentation seems off.
This is Java-style case indents: case at same level as switch. Changed to C++ 
style used elsewhere in the file (case indented one level from switch.)


http://gerrit.cloudera.org:8080/#/c/12427/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@227
PS3, Line 227: }
> nit: I think you could refactor the whole thing like before to make the int
Could, and that is how I started. But, this shape looks ahead to the other 
operators, some of which produce the same estimate regardless of the two 
conditions. Also, there will be the non-single col case in some places. For 
example, for <, <=, >, >=, the answer might always be 1/3 regardless of the 
other factors. For =, we might guess 0.1 without an NDV.

I have deliberately left these as later patches so that we can evaluate each 
operator and each condition one at a time. Too complex (i.e. too hard to 
review) if I lump them all together.

Added a comment to clarify this intent.



--
To view, visit http://gerrit.cloudera.org:8080/12427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
Gerrit-Change-Number: 12427
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Thu, 21 Feb 2019 19:25:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7560: Improve selectivity estimate for !=

2019-02-21 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12427

to look at the new patch set (#5).

Change subject: IMPALA-7560: Improve selectivity estimate for !=
..

IMPALA-7560: Improve selectivity estimate for !=

Impala has historically used a generic selectivity of 0.1 for all
non-equality predicates. However, this can underestimate cardinality
in some cases. The correct value is

sel(c != x) = 1 - sel(c = x)

if c is a column, x is a constant and ndv(c) is known.

Adds the above for the != case and for IS DISTINCT FROM case. There are
many related issues that are left as separate patches.

Tests:
* Used the newly-added expression cardinality tests to highlight the
  change (some formerly broken tests now pass).
* Used the newly-added cardinality tests in PlannerTest to highlight the
  fix.
* Fixed a merge artifact by having the PlannerTest cardinality tests
  actually verify cardinality.

Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
13 files changed, 149 insertions(+), 122 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/12427/5
--
To view, visit http://gerrit.cloudera.org:8080/12427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
Gerrit-Change-Number: 12427
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-7560: Improve selectivity estimate for !=

2019-02-21 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12427

to look at the new patch set (#4).

Change subject: IMPALA-7560: Improve selectivity estimate for !=
..

IMPALA-7560: Improve selectivity estimate for !=

Impala has historically used a generic selectivity of 0.1 for all
non-equality predicates. However, this can underestimate cardinality
in some cases. The correct value is

sel(c != x) = 1 - sel(c = x)

if c is a column, x is a constant and ndv(c) is known.

Adds the above for the != case and for IS DISTINCT FROM case. There are
many related issues that are left as separate patches.

Tests:
* Used the newly-added expression cardinality tests to highlight the
  change (some formerly broken tests now pass).
* Used the newly-added cardinality tests in PlannerTest to highlight the
  fix.
* Fixed a merge artifact by having the PlannerTest cardinality tests
  actually verify cardinality.

Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
13 files changed, 158 insertions(+), 131 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/12427/4
--
To view, visit http://gerrit.cloudera.org:8080/12427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
Gerrit-Change-Number: 12427
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-6900: Fix the min version invariant for invalidate metadata

2019-02-21 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12547 )

Change subject: IMPALA-6900: Fix the min version invariant for invalidate 
metadata
..


Patch Set 3: Code-Review+2

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/12547
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I22ace3f5917b6167af63c3dfd2620e69ce8ec1cb
Gerrit-Change-Number: 12547
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 21 Feb 2019 18:56:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7781: Clean up expr rewriter predicates

2019-02-21 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11875


Change subject: IMPALA-7781: Clean up expr rewriter predicates
..

IMPALA-7781: Clean up expr rewriter predicates

The expression rewriter often checks if an expression is null or a
literal, but does so using ad-hoc "instance of" or similar checks.
Yet, the expression nodes provide predicates that do the same job.
This change uses the defined predicates in place of the ad-hoc tests.

Add new predicates for the most common cases: literal, null literal,
null value, non-null literal, and literal-value. Converted several Expr
class methods to predicates.

The predicates handle cases that the ad-hoc tests do not. For example,
when the constant folding rule produces a NULL result, it casts that
null to some type: CAST(NULL AS INT), say. Later, other rules want to
check if that result is null. A simple check for a null literal won't
work, we need the IS_NULL_VALUE predicate which checks for the CAST
case.

This change adds predicate and revises the rewrite rules to use the
new predicates.

Also adds a number of minor Java cleanups found while doing the fix.

One of which is the toSql() for numeric values: tests showed that
the form of zero printed depended on the precision and scale, with one
value printing as 0E-38. Modified the code to emit "0" for all zero
values (since, unlike Java, SQL attaches no meaning to 0.0 as oppposed
to just 0.)

Detailed testing revealed missed simplification opportunites, especially
in CASE. Added the missing logic. For CASE, this meant a wholesale
refactoring of the code in order to better deal with nuances such as

CASE null WHEN ... END --> NULL
CASE ... ELSE NULL END --> CASE ... END
CASE ... WHEN ... THEN NULL END CASE ... END

The constant folding rule always wrapped the result in a cast,
even if the result was of the correct type. Skipped the unnecessary
cast to avoid repeated passes through this rule.

Testing:

* Refactored conditional tests into a new class, broken down
by type expression type for easier debugging.
* Created a base class for the existing and new expression test
classes.
* Created a test fixture to manage analysis tests to make it easier
to grab and inspect various aspects of the plan, as well as to control
query and other options.
* Added new tests that failed without these changes, but which pass
with them.
* Some of the plans in PlannerTest changed: verified that the new
plans differ only in the added simplifications.

Change-Id: I92d117145e427fcc5c71299096cf0e2d964ab63c
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ExistsPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
M fe/src/main/java/org/apache/impala/analysis/RangePartition.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/common/TreeNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/java/org/apache/impala/rewrite/EqualityDisjunctsToInRule.java
M fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java
M fe/src/main/java/org/apache/impala/rewrite/NormalizeBinaryPredicatesRule.java
M fe/src/main/java/org/apache/impala/rewrite/NormalizeCountStarRule.java
M fe/src/main/java/org/apache/impala/rewrite/NormalizeExprsRule.java
M fe/src/main/java/org/apache/impala/rewrite/RemoveRedundantStringCast.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyDistinctFromRule.java
A fe/src/test/java/org/apache/impala/analysis/AnalysisFixture.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/ExtendedRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalsRuleTest.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M 

[Impala-ASF-CR] IMPALA-7781: Clean up expr rewriter predicates

2019-02-21 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/11875 )

Change subject: IMPALA-7781: Clean up expr rewriter predicates
..


Abandoned

Draft
--
To view, visit http://gerrit.cloudera.org:8080/11875
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I92d117145e427fcc5c71299096cf0e2d964ab63c
Gerrit-Change-Number: 11875
Gerrit-PatchSet: 3
Gerrit-Owner: Paul Rogers 


[Impala-ASF-CR] IMPALA-7968, Part 1: JSON serialization framework

2019-02-21 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/12079 )

Change subject: IMPALA-7968, Part 1: JSON serialization framework
..


Abandoned

Was meant to help with testing, but moving away from that effort.
--
To view, visit http://gerrit.cloudera.org:8080/12079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ie3101c2708bf6cf4bec61af83a5db9b6d70ddd9c
Gerrit-Change-Number: 12079
Gerrit-PatchSet: 8
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-7866: Predicates, helpers for implicit casts, slot refs

2019-02-21 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/11953 )

Change subject: IMPALA-7866: Predicates, helpers for implicit casts, slot refs
..


Abandoned

Lower priority, not worth keeping this open for many months.
--
To view, visit http://gerrit.cloudera.org:8080/11953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Ieaa0aee1b9015e0aed521f2038bf44513d7f8613
Gerrit-Change-Number: 11953
Gerrit-PatchSet: 3
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8185: Abstract out real/mock file system operations

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12437 )

Change subject: IMPALA-8185: Abstract out real/mock file system operations
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12437/4/fe/src/main/java/org/apache/impala/analysis/FileSystemFacade.java
File fe/src/main/java/org/apache/impala/analysis/FileSystemFacade.java:

http://gerrit.cloudera.org:8080/#/c/12437/4/fe/src/main/java/org/apache/impala/analysis/FileSystemFacade.java@64
PS4, Line 64:   ScanAllocation allocateScans(FeFsTable table, TScanRangeSpec 
scanRangeSpecs,
> Where do we document what this does?
Done


http://gerrit.cloudera.org:8080/#/c/12437/4/fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java
File fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java:

http://gerrit.cloudera.org:8080/#/c/12437/4/fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java@102
PS4, Line 102:   public ScanAllocation allocateScans(FeFsTable table, 
TScanRangeSpec scanRangeSpecs,
> Do you know what this is used for?
To be honest, I didn't work out what the method does. I just noticed that this 
code relies on having a real file system; something we don't have in "test case 
mode." The work here simply picks the code up out of HdfsScanNode and plops it 
down here in a way that we can mock for test cases.

That said, if something looks odd, please do file a JIRA to explain the issue 
and perhaps we can figure out a solution as separate task.



--
To view, visit http://gerrit.cloudera.org:8080/12437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
Gerrit-Change-Number: 12437
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 21 Feb 2019 01:18:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8185: Abstract out real/mock file system operations

2019-02-20 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12437

to look at the new patch set (#5).

Change subject: IMPALA-8185: Abstract out real/mock file system operations
..

IMPALA-8185: Abstract out real/mock file system operations

The recent addition of the test case builder introduced the idea of
running a query in "test case mode" on a system different from the on on
which the test case was created. A number of if-statements were used to
mock file system operations which depend on the file system to exist.

This patch abstracts out the mock vs. real operations into a set of
"facade" classes. The HDFS facade performs the real opeations, the
"mock" facade mocks the operations well enough to fool the planner.
Selection of the facade depends on the recently-added test case mode
query option.

Tests: Reran all tests: both those that depend on a real file system and
the test case builder tests that depend on the mock file system.

Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/FileSystemFacade.java
A fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java
M fe/src/main/java/org/apache/impala/analysis/HdfsUri.java
A fe/src/main/java/org/apache/impala/analysis/MockFileSystemFacade.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
9 files changed, 376 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/12437/5
--
To view, visit http://gerrit.cloudera.org:8080/12437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
Gerrit-Change-Number: 12437
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-8181: Abbreviate row counts in EXPLAIN

2019-02-20 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12438

to look at the new patch set (#8).

Change subject: IMPALA-8181: Abbreviate row counts in EXPLAIN
..

IMPALA-8181: Abbreviate row counts in EXPLAIN

A recent fix added node cardinality to the standard EXPLAIN output,
displaying a large number like 123456780 as 123.46M. This patch applies
the same fix to the remaining row count numbers: metadata, extrapolated
rows, etc.

Tests:
* Rebased PlannerTest .test files as needed for the new row count
  format.
* Reran all tests to check for dependencies on the old format.

Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
---
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/custom_cluster/test_stats_extrapolation.py
M tests/metadata/test_stats_extrapolation.py
21 files changed, 552 insertions(+), 532 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/12438/8
--
To view, visit http://gerrit.cloudera.org:8080/12438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
Gerrit-Change-Number: 12438
Gerrit-PatchSet: 8
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8185: Abstract out real/mock file system operations

2019-02-20 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12437

to look at the new patch set (#4).

Change subject: IMPALA-8185: Abstract out real/mock file system operations
..

IMPALA-8185: Abstract out real/mock file system operations

The recent addition of the test case builder introduced the idea of
running a query in "test case mode" on a system different from the on on
which the test case was created. A number of if-statements were used to
mock file system operations which depend on the file system to exist.

This patch abstracts out the mock vs. real operations into a set of
"facade" classes. The HDFS facade performs the real opeations, the
"mock" facade mocks the operations well enough to fool the planner.
Selection of the facade depends on the recently-added test case mode
query option.

Tests: Reran all tests: both those that depend on a real file system and
the test case builder tests that depend on the mock file system.

Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/FileSystemFacade.java
A fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java
M fe/src/main/java/org/apache/impala/analysis/HdfsUri.java
A fe/src/main/java/org/apache/impala/analysis/MockFileSystemFacade.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
9 files changed, 347 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/12437/4
--
To view, visit http://gerrit.cloudera.org:8080/12437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
Gerrit-Change-Number: 12437
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8185: Abstract out real/mock file system operations

2019-02-20 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12437

to look at the new patch set (#3).

Change subject: IMPALA-8185: Abstract out real/mock file system operations
..

IMPALA-8185: Abstract out real/mock file system operations

The recent addition of the test case builder introduced the idea of
running a query in "test case mode" on a system different from the on on
which the test case was created. A number of if-statements were used to
mock file system operations which depend on the file system to exist.

This patch abstracts out the mock vs. real operations into a set of
"facade" classes. The HDFS facade performs the real opeations, the
"mock" facade mocks the operations well enough to fool the planner.
Selection of the facade depends on the recently-added test case mode
query option.

Tests: Reran all tests: both those that depend on a real file system and
the test case builder tests that depend on the mock file system.

Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/FileSystemFacade.java
A fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java
M fe/src/main/java/org/apache/impala/analysis/HdfsUri.java
A fe/src/main/java/org/apache/impala/analysis/MockFileSystemFacade.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
9 files changed, 296 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/12437/3
--
To view, visit http://gerrit.cloudera.org:8080/12437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
Gerrit-Change-Number: 12437
Gerrit-PatchSet: 3
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8185: Abstract out real/mock file system operations

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12437 )

Change subject: IMPALA-8185: Abstract out real/mock file system operations
..


Patch Set 2:

Rebased on master and fix style nagger warnings.


--
To view, visit http://gerrit.cloudera.org:8080/12437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
Gerrit-Change-Number: 12437
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 20 Feb 2019 21:45:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8185: Abstract out real/mock file system operations

2019-02-20 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12437

to look at the new patch set (#2).

Change subject: IMPALA-8185: Abstract out real/mock file system operations
..

IMPALA-8185: Abstract out real/mock file system operations

The recent addition of the test case builder introduced the idea of
running a query in "test case mode" on a system different from the on on
which the test case was created. A number of if-statements were used to
mock file system operations which depend on the file system to exist.

This patch abstracts out the mock vs. real operations into a set of
"facade" classes. The HDFS facade performs the real opeations, the
"mock" facade mocks the operations well enough to fool the planner.
Selection of the facade depends on the recently-added test case mode
query option.

Tests: Reran all tests: both those that depend on a real file system and
the test case builder tests that depend on the mock file system.

Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/FileSystemFacade.java
A fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java
M fe/src/main/java/org/apache/impala/analysis/HdfsUri.java
A fe/src/main/java/org/apache/impala/analysis/MockFileSystemFacade.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
9 files changed, 291 insertions(+), 150 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/12437/2
--
To view, visit http://gerrit.cloudera.org:8080/12437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
Gerrit-Change-Number: 12437
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-7450. Set thread name during refresh/load operations

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11228 )

Change subject: IMPALA-7450. Set thread name during refresh/load operations
..


Patch Set 6:

Fixed typo and rebase on master (again).


--
To view, visit http://gerrit.cloudera.org:8080/11228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
Gerrit-Change-Number: 11228
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 20 Feb 2019 20:37:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7450. Set thread name during refresh/load operations

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded a new patch set (#7) to the change originally created 
by Todd Lipcon. ( http://gerrit.cloudera.org:8080/11228 )

Change subject: IMPALA-7450. Set thread name during refresh/load operations
..

IMPALA-7450. Set thread name during refresh/load operations

This adds a small utility class for annotating the current thread's name
during potentially long-running operations such as refresh/load. With
this change, jstack output now includes useful thread names like:

During startup:
  "main [invalidating metadata - 128/428 dbs complete]"

While loading a fresh table:
  "pool-4-thread-12 [Loading metadata for: foo_db.foo_table] [Loading
   metadata for all partition(s) of foo_db.foo_table]"

Pool refreshing metadata for a particular path:
  "pool-23-thread-5 [Refreshing file metadata for path:
   hdfs://nameservice1/path/to/partdir..."

Tests: Verified the patch manually by jstacking a catalogd while performing
some workload. Also added a simple unit test to verify the thread renaming
behavior.

Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
A fe/src/main/java/org/apache/impala/util/ThreadNameAnnotator.java
A fe/src/test/java/org/apache/impala/util/ThreadNameAnnotatorTest.java
5 files changed, 292 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/11228/7
--
To view, visit http://gerrit.cloudera.org:8080/11228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
Gerrit-Change-Number: 11228
Gerrit-PatchSet: 7
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7450. Set thread name during refresh/load operations

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded a new patch set (#6) to the change originally created 
by Todd Lipcon. ( http://gerrit.cloudera.org:8080/11228 )

Change subject: IMPALA-7450. Set thread name during refresh/load operations
..

IMPALA-7450. Set thread name during refresh/load operations

This adds a small utility class for annotating the current thread's name
during potentially long-running operations such as refresh/load. With
this change, jstack output now includes useful thread names like:

During startup:
  "main [invalidating metadata - 128/428 dbs complete]"

While loading a fresh table:
  "pool-4-thread-12 [Loading metadata for: foo_db.foo_table] [Loading
   metadata for all partition(s) of foo_db.foo_table]"

Pool refreshing metadata for a particular path:
  "pool-23-thread-5 [Refreshing file metadata for path:
   hdfs://nameservice1/path/to/partdir..."

Tests: Verified the patch manually by jstacking a catalogd while performing
some workload. Also added a simple unit test to verify the thread renaming
behavior.

Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
A fe/src/main/java/org/apache/impala/util/ThreadNameAnnotator.java
A fe/src/test/java/org/apache/impala/util/ThreadNameAnnotatorTest.java
5 files changed, 292 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/11228/6
--
To view, visit http://gerrit.cloudera.org:8080/11228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
Gerrit-Change-Number: 11228
Gerrit-PatchSet: 6
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] Logging current database context when logging analyzing queries.

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12301 )

Change subject: Logging current database context when logging analyzing queries.
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12301/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/12301/1/fe/src/main/java/org/apache/impala/service/Frontend.java@1243
PS1, Line 1243: + queryCtx.session.database);
> I'm not using accessor methods because that was the local style.
No problem. Let's leave it as is.



--
To view, visit http://gerrit.cloudera.org:8080/12301
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I70886be4685e0d3ae7ca9f6e57e8159dc39c67c7
Gerrit-Change-Number: 12301
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 20 Feb 2019 19:26:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7450. Set thread name during refresh/load operations

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11228 )

Change subject: IMPALA-7450. Set thread name during refresh/load operations
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11228/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11228/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@356
PS5, Line 356: Attemting
> nit: typo
Done



--
To view, visit http://gerrit.cloudera.org:8080/11228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
Gerrit-Change-Number: 11228
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 20 Feb 2019 19:13:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7450. Set thread name during refresh/load operations

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded a new patch set (#5) to the change originally created 
by Todd Lipcon. ( http://gerrit.cloudera.org:8080/11228 )

Change subject: IMPALA-7450. Set thread name during refresh/load operations
..

IMPALA-7450. Set thread name during refresh/load operations

This adds a small utility class for annotating the current thread's name
during potentially long-running operations such as refresh/load. With
this change, jstack output now includes useful thread names like:

During startup:
  "main [invalidating metadata - 128/428 dbs complete]"

While loading a fresh table:
  "pool-4-thread-12 [Loading metadata for: foo_db.foo_table] [Loading
   metadata for all partition(s) of foo_db.foo_table]"

Pool refreshing metadata for a particular path:
  "pool-23-thread-5 [Refreshing file metadata for path:
   hdfs://nameservice1/path/to/partdir..."

Tests: Verified the patch manually by jstacking a catalogd while performing
some workload. Also added a simple unit test to verify the thread renaming
behavior.

Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
A fe/src/main/java/org/apache/impala/util/ThreadNameAnnotator.java
A fe/src/test/java/org/apache/impala/util/ThreadNameAnnotatorTest.java
5 files changed, 292 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/11228/5
--
To view, visit http://gerrit.cloudera.org:8080/11228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
Gerrit-Change-Number: 11228
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7450. Set thread name during refresh/load operations

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11228 )

Change subject: IMPALA-7450. Set thread name during refresh/load operations
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11228/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11228/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@356
PS3, Line 356: Attemting to
> nit: I think this should be more like, "Attempting to lock table: "...
Trying to keep name terse, but reworded as requested.


http://gerrit.cloudera.org:8080/#/c/11228/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2436
PS3, Line 2436:  +  Catalog.toCatalogObjectKey(req.object_
> I think you should do Catalog.toCatalogObjectKey(req.object_desc);
Done



--
To view, visit http://gerrit.cloudera.org:8080/11228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
Gerrit-Change-Number: 11228
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 20 Feb 2019 18:51:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7450. Set thread name during refresh/load operations

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded a new patch set (#4) to the change originally created 
by Todd Lipcon. ( http://gerrit.cloudera.org:8080/11228 )

Change subject: IMPALA-7450. Set thread name during refresh/load operations
..

IMPALA-7450. Set thread name during refresh/load operations

This adds a small utility class for annotating the current thread's name
during potentially long-running operations such as refresh/load. With
this change, jstack output now includes useful thread names like:

During startup:
  "main [invalidating metadata - 128/428 dbs complete]"

While loading a fresh table:
  "pool-4-thread-12 [Loading metadata for: foo_db.foo_table] [Loading
   metadata for all partition(s) of foo_db.foo_table]"

Pool refreshing metadata for a particular path:
  "pool-23-thread-5 [Refreshing file metadata for path:
   hdfs://nameservice1/path/to/partdir..."

Tests: Verified the patch manually by jstacking a catalogd while performing
some workload. Also added a simple unit test to verify the thread renaming
behavior.

Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
A fe/src/main/java/org/apache/impala/util/ThreadNameAnnotator.java
A fe/src/test/java/org/apache/impala/util/ThreadNameAnnotatorTest.java
5 files changed, 291 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/11228/4
--
To view, visit http://gerrit.cloudera.org:8080/11228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
Gerrit-Change-Number: 11228
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-8181: Abbreviate row counts in EXPLAIN

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12438 )

Change subject: IMPALA-8181: Abbreviate row counts in EXPLAIN
..


Patch Set 7:

Rebased on latest master.


--
To view, visit http://gerrit.cloudera.org:8080/12438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
Gerrit-Change-Number: 12438
Gerrit-PatchSet: 7
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 20 Feb 2019 18:46:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8181: Abbreviate row counts in EXPLAIN

2019-02-20 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12438 )

Change subject: IMPALA-8181: Abbreviate row counts in EXPLAIN
..


Patch Set 6:

Passed pre-review tests after latest changes: 
https://jenkins.impala.io/job/pre-review-test/310/


--
To view, visit http://gerrit.cloudera.org:8080/12438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
Gerrit-Change-Number: 12438
Gerrit-PatchSet: 6
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 20 Feb 2019 17:21:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7450. Set thread name during refresh/load operations

2019-02-19 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded a new patch set (#3) to the change originally created 
by Todd Lipcon. ( http://gerrit.cloudera.org:8080/11228 )

Change subject: IMPALA-7450. Set thread name during refresh/load operations
..

IMPALA-7450. Set thread name during refresh/load operations

This adds a small utility class for annotating the current thread's name
during potentially long-running operations such as refresh/load. With
this change, jstack output now includes useful thread names like:

During startup:
  "main [invalidating metadata - 128/428 dbs complete]"

While loading a fresh table:
  "pool-4-thread-12 [Loading metadata for: foo_db.foo_table] [Loading
   metadata for all partition(s) of foo_db.foo_table]"

Pool refreshing metadata for a particular path:
  "pool-23-thread-5 [Refreshing file metadata for path:
   hdfs://nameservice1/path/to/partdir..."

Tests: Verified the patch manually by jstacking a catalogd while performing
some workload. Also added a simple unit test to verify the thread renaming
behavior.

Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
A fe/src/main/java/org/apache/impala/util/ThreadNameAnnotator.java
A fe/src/test/java/org/apache/impala/util/ThreadNameAnnotatorTest.java
5 files changed, 291 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/11228/3
--
To view, visit http://gerrit.cloudera.org:8080/11228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
Gerrit-Change-Number: 11228
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-7450. Set thread name during refresh/load operations

2019-02-19 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11228 )

Change subject: IMPALA-7450. Set thread name during refresh/load operations
..


Patch Set 2:

(6 comments)

Addressed review comments and rebased.

http://gerrit.cloudera.org:8080/#/c/11228/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11228/2//COMMIT_MSG@24
PS2, Line 24: This patch is tricky to automate tests for, but I verified it 
manually
: by jstacking a catalogd while performing some workload. Also 
added a
: simple unit test to verify the thread renaming behavior
> Can be removed I guess.
Done


http://gerrit.cloudera.org:8080/#/c/11228/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11228/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@301
PS2, Line 301: long end;
> Annotate here? This is one of those common interesting entry points for tab
Done


http://gerrit.cloudera.org:8080/#/c/11228/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2185
PS2, Line 2185:   private TGetPartialCatalogObjectResponse 
doGetPartialCatalogObject(
> This is another interesting entry point RPC for catalog v2 stuff. Add some
Done


http://gerrit.cloudera.org:8080/#/c/11228/2/fe/src/test/java/org/apache/impala/util/ThreadNameAnnotatorTest.java
File fe/src/test/java/org/apache/impala/util/ThreadNameAnnotatorTest.java:

http://gerrit.cloudera.org:8080/#/c/11228/2/fe/src/test/java/org/apache/impala/util/ThreadNameAnnotatorTest.java@24
PS2, Line 24: public class ThreadNameAnnotatorTest {
> great test :-)
Thanks. Handy little item from the bag-o-tricks...


http://gerrit.cloudera.org:8080/#/c/11228/2/fe/src/test/java/org/apache/impala/util/ThreadNameAnnotatorTest.java@44
PS2, Line 44: wait();
> nit: I hope nothing hangs here forever due to a faulty test? Add a largish
Done


http://gerrit.cloudera.org:8080/#/c/11228/2/fe/src/test/java/org/apache/impala/util/ThreadNameAnnotatorTest.java@107
PS2, Line 107:   public void testExternalRename() throws InterruptedException {
> nit: Add a doc of what it does? Probably difficult to understand without an
Done



--
To view, visit http://gerrit.cloudera.org:8080/11228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c
Gerrit-Change-Number: 11228
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 20 Feb 2019 02:12:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8181: Abbreviate row counts in EXPLAIN

2019-02-19 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12438 )

Change subject: IMPALA-8181: Abbreviate row counts in EXPLAIN
..


Patch Set 5:

(2 comments)

Addressed comments. Found one more cardinality field to abbreviate. Because the 
EXPLAIN output changed again, need to rerun all the pre-review tests. Will post 
an update when that completes.

http://gerrit.cloudera.org:8080/#/c/12438/5/fe/src/main/java/org/apache/impala/common/PrintUtils.java
File fe/src/main/java/org/apache/impala/common/PrintUtils.java:

http://gerrit.cloudera.org:8080/#/c/12438/5/fe/src/main/java/org/apache/impala/common/PrintUtils.java@64
PS5, Line 64: if (value == -1) return "unavailable";
> Shouldn't we instead use printCardinality() or some other helper instead of
Done


http://gerrit.cloudera.org:8080/#/c/12438/5/fe/src/main/java/org/apache/impala/common/PrintUtils.java@66
PS5, Line 66: if (value >= TERA) return new 
DecimalFormat(".00T").format(result / TERA);
> Curious if we should be consistent with the backend metrics which print wit
Created an "exact cardinality" method to format the values as suggested. Then, 
I went in search of which methods should use the "estimated" cardinality format 
vs. the "exact" cardinality format. I found that, in the planner, all 
cardinalities are estimates.

This rounding shows at least three digits of precision, so if a number is, say, 
1, 12 or 123, we'll see the exact number. But, if the number is large, we'll 
see an abbreviation, which is about as accurate as the underlying estimate (and 
easier to read.)

Can add the extra field later if we discover a place where we have exactly 
accurate values.



--
To view, visit http://gerrit.cloudera.org:8080/12438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
Gerrit-Change-Number: 12438
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 20 Feb 2019 00:28:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8181: Abbreviate row counts in EXPLAIN

2019-02-19 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12438

to look at the new patch set (#6).

Change subject: IMPALA-8181: Abbreviate row counts in EXPLAIN
..

IMPALA-8181: Abbreviate row counts in EXPLAIN

A recent fix added node cardinality to the standard EXPLAIN output,
displaying a large number like 123456780 as 123.46M. This patch applies
the same fix to the remaining row count numbers: metadata, extrapolated
rows, etc.

Tests:
* Rebased PlannerTest .test files as needed for the new row count
  format.
* Reran all tests to check for dependencies on the old format.

Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
---
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/custom_cluster/test_stats_extrapolation.py
M tests/metadata/test_stats_extrapolation.py
20 files changed, 546 insertions(+), 527 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/12438/6
--
To view, visit http://gerrit.cloudera.org:8080/12438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
Gerrit-Change-Number: 12438
Gerrit-PatchSet: 6
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-7917 (Part 1): Decouple Sentry from Impala

2019-02-19 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12020 )

Change subject: IMPALA-7917 (Part 1): Decouple Sentry from Impala
..


Patch Set 12: Code-Review+2

(1 comment)

Looks great!

http://gerrit.cloudera.org:8080/#/c/12020/12/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/12020/12/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2637
PS12, Line 2637:   
globalState_.maskedPrivilegeReqs.add(Pair.create(privReq, authErrorMsg_));
Nice!



--
To view, visit http://gerrit.cloudera.org:8080/12020
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1fd1df0b38ddd7cfa41299e95f5827f8a9e9c1f
Gerrit-Change-Number: 12020
Gerrit-PatchSet: 12
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Anonymous Coward (402)
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 20 Feb 2019 00:06:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8185: Abstract out real/mock file system operations

2019-02-17 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12437 )

Change subject: IMPALA-8185: Abstract out real/mock file system operations
..


Patch Set 1:

Bharath, this is the code we discussed that abstracts out the file system 
operations into a "facade", then implements "real" and "mock" versions. It is a 
first step toward integrating the text-based dump parser with the framework you 
created in the test case builder.

Pre-review tests passed: https://jenkins.impala.io/job/pre-review-test/308/


--
To view, visit http://gerrit.cloudera.org:8080/12437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
Gerrit-Change-Number: 12437
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Sun, 17 Feb 2019 19:56:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8185: Abstract out real/mock file system operations

2019-02-17 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12437


Change subject: IMPALA-8185: Abstract out real/mock file system operations
..

IMPALA-8185: Abstract out real/mock file system operations

The recent addition of the test case builder introduced the idea of
running a query in "test case mode" on a system different from the on on
which the test case was created. A number of if-statements were used to
mock file system operations which depend on the file system to exist.

This patch abstracts out the mock vs. real operations into a set of
"facade" classes. The HDFS facade performs the real opeations, the
"mock" facade mocks the operations well enough to fool the planner.
Selection of the facade depends on the recently-added test case mode
query option.

Tests: Reran all tests: both those that depend on a real file system and
the test case builder tests that depend on the mock file system.

Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/analysis/FileSystemFacade.java
A fe/src/main/java/org/apache/impala/analysis/HdfsFileSystemFacade.java
M fe/src/main/java/org/apache/impala/analysis/HdfsUri.java
A fe/src/main/java/org/apache/impala/analysis/MockFileSystemFacade.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
7 files changed, 282 insertions(+), 141 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/12437/1
--
To view, visit http://gerrit.cloudera.org:8080/12437
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1a385923b64c9fb59cc6e700ee7ee14919398e6d
Gerrit-Change-Number: 12437
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 


[Impala-ASF-CR] IMPALA-8181: Abbreviate row counts in EXPLAIN

2019-02-16 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12438 )

Change subject: IMPALA-8181: Abbreviate row counts in EXPLAIN
..


Patch Set 5:

This should be a simple one: just cleaning up EXPLAIN format a bit.

Pre-review tests passed: https://jenkins.impala.io/job/pre-review-test/307/


--
To view, visit http://gerrit.cloudera.org:8080/12438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
Gerrit-Change-Number: 12438
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Sun, 17 Feb 2019 02:16:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8181: Abbreviate row counts in EXPLAIN

2019-02-16 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12438


Change subject: IMPALA-8181: Abbreviate row counts in EXPLAIN
..

IMPALA-8181: Abbreviate row counts in EXPLAIN

A recent fix added node cardinality to the standard EXPLAIN output,
displaying a large number like 123456780 as 123.46M. This patch applies
the same fix to the remaining row count numbers: metadata, extrapolated
rows, etc.

Tests:
* Rebased PlannerTest .test files as needed for the new row count
  format.
* Reran all tests to check for dependencies on the old format.

Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
---
M fe/src/main/java/org/apache/impala/common/PrintUtils.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/custom_cluster/test_stats_extrapolation.py
M tests/metadata/test_stats_extrapolation.py
19 files changed, 470 insertions(+), 467 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/12438/5
--
To view, visit http://gerrit.cloudera.org:8080/12438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I08faaa9ad7b5ed42dcd7a15a333e8734bb45f10c
Gerrit-Change-Number: 12438
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 


[Impala-ASF-CR] IMPALA-7976 : Add a flag to disable sync using events at a table level

2019-02-15 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12365 )

Change subject: IMPALA-7976 : Add a flag to disable sync using events at a 
table level
..


Patch Set 11: Code-Review+2

LGTM.


--
To view, visit http://gerrit.cloudera.org:8080/12365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If116edba51bae139bc003be858b3214c3f0104e3
Gerrit-Change-Number: 12365
Gerrit-PatchSet: 11
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 15 Feb 2019 21:13:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7976 : Add a flag to disable sync using events at a table level

2019-02-14 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12365 )

Change subject: IMPALA-7976 : Add a flag to disable sync using events at a 
table level
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12365/8/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12365/8/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@815
PS8, Line 815: .thenReturn(dbFlag);
> Is the concern related to using mockito in general or just to mock our own
No problem. This was more a general observation that using purpose-built 
classes is great when we can do it, but if Mockito is the right tool for the 
job, by all means let's use it.



--
To view, visit http://gerrit.cloudera.org:8080/12365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If116edba51bae139bc003be858b3214c3f0104e3
Gerrit-Change-Number: 12365
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 14 Feb 2019 21:29:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7560: Improve selectivity estimate for !=

2019-02-13 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12427

to look at the new patch set (#3).

Change subject: IMPALA-7560: Improve selectivity estimate for !=
..

IMPALA-7560: Improve selectivity estimate for !=

Impala has historically used a generic selectivity of 0.1 for all
non-equality predicates. However, this can underestimate cardinality
in some cases. The correct value is

sel(c != x) = 1 - sel(c = x)

if c is a column, x is a constant and ndv(c) is known.

Adds the above for the != case and for IS DISTINCT FROM case. There are
many related issues that are left as separate patches.

Tests:
* Used the newly-added expression cardinality tests to highlight the
  change (some formerly broken tests now pass).
* Used the newly-added cardinality tests in PlannerTest to highlight the
  fix.
* Fixed a merge artifact by having the PlannerTest cardinality tests
  actually verify cardinality.

Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
12 files changed, 140 insertions(+), 116 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/12427/3
--
To view, visit http://gerrit.cloudera.org:8080/12427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
Gerrit-Change-Number: 12427
Gerrit-PatchSet: 3
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-7560: Improve selectivity estimate for !=

2019-02-13 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12427 )

Change subject: IMPALA-7560: Improve selectivity estimate for !=
..


Patch Set 3:

(3 comments)

Addressed review comments.

http://gerrit.cloudera.org:8080/#/c/12427/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12427/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-7560
> IMPALA-7560 is another jira for this.
Thanks. marked IMPALA-8039 as a duplicate and changed title to IMPALA-7560.


http://gerrit.cloudera.org:8080/#/c/12427/2/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/12427/2/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@219
PS2, Line 219: se NULL_MATCHING_EQ: // TODO
 :   if (singleCol && distinctValues > 0) {
 :
> I think these else branches are also taken for complex predicates (since di
Fixed. Computing col1 = col2 is possible, but should be done as a separate 
patch.


http://gerrit.cloudera.org:8080/#/c/12427/2/testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test:

http://gerrit.cloudera.org:8080/#/c/12427/2/testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test@1178
PS2, Line 1178:  NOT p_type LIKE 'MEDIUM POLISHED%', p_brand != 'Brand#45'
> Should we use consistent ordering (sort?)
Looks like PlanNode.orderConjunctsByCost() sorts by some combination of 
selectivity and estimated eval cost. Since we changed the selectivity of != 
this seems to have changed the eval order. The list here presents predicates in 
eval order, so it changed also.

The sort code seems to be observing that we'd like to evaluate the most 
expensive predicates last, and the least selective ones last. It tries to 
combine these factors to create a combined ordering.



--
To view, visit http://gerrit.cloudera.org:8080/12427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
Gerrit-Change-Number: 12427
Gerrit-PatchSet: 3
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Thu, 14 Feb 2019 04:09:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8106: Cleanup of literal expression nodes

2019-02-13 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12265 )

Change subject: IMPALA-8106: Cleanup of literal expression nodes
..


Patch Set 14:

(2 comments)

Rebased on master. Passed pre-review tests: 
https://jenkins.impala.io/job/pre-review-test/303/

This one is ready for review. Yet another step toward merging rewrites into the 
analysis step.

http://gerrit.cloudera.org:8080/#/c/12265/13/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
File fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java:

http://gerrit.cloudera.org:8080/#/c/12265/13/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java@233
PS13, Line 233:   throw new InvalidValueException(
> line too long (97 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/12265/13/fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
File fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java:

http://gerrit.cloudera.org:8080/#/c/12265/13/fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java@233
PS13, Line 233: "CREATE TABLE testDb.test1 (i INT) PARTITIONED BY (a 
TINYINT, " +
> line too long (101 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/12265
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I740f877964d5bc677fef98f1ccfa5be32cc8f7ac
Gerrit-Change-Number: 12265
Gerrit-PatchSet: 14
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Thu, 14 Feb 2019 03:42:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8106: Cleanup of literal expression nodes

2019-02-13 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12265

to look at the new patch set (#14).

Change subject: IMPALA-8106: Cleanup of literal expression nodes
..

IMPALA-8106: Cleanup of literal expression nodes

IMPALA-7902 cleaned up numeric literals to simplify cast logic, to
avoid the need to analyze a literal, and so on. This patch does
the same for the other literal expressions which, as a whole, need
less work than the numeric literal did. Specifically:

* Remove the need to analyze any literal expression.
* Clean up the way that code creates and analyzes literals.
* Streamlined code that creates a literal outside of an AST.
* Temporary support for marking a literal with its explicit type.
  This is needed during the reset/analyze part of the process to avoid
  repeated widening of literals. Will be removed in the final patch.
* Began to address the multiple ways that we parse strings to literals:
  for partition keys, in the parser, in a cast, in constant folding.
* Moved the code that parses a string to a number from the string
  literal class to the numeric literal class.

This work is part of the ongoing effort to merge expression
rewrites into a single-pass analysis phase in order to more cleanly
introduce additional rewrite rules.

Tests:
* Reran all FE tests and updated several as needed.
* Added a set of literal expression tests.
* Updated PlannerTests to show the type of NULL values.

Change-Id: I740f877964d5bc677fef98f1ccfa5be32cc8f7ac
---
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java
M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
M fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/common/SqlCastException.java
M fe/src/test/java/org/apache/impala/analysis/LiteralExprTest.java
M fe/src/test/java/org/apache/impala/analysis/NumericLiteralTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M testdata/workloads/functional-query/queries/QueryTest/chars-tmp-tables.test
24 files changed, 816 insertions(+), 155 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/12265/14
--
To view, visit http://gerrit.cloudera.org:8080/12265
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I740f877964d5bc677fef98f1ccfa5be32cc8f7ac
Gerrit-Change-Number: 12265
Gerrit-PatchSet: 14
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8041, Part 2: Refactor SELECT list

2019-02-13 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12144 )

Change subject: IMPALA-8041, Part 2: Refactor SELECT list
..


Patch Set 4:

Rebased on master to resolve merge conflicts. This is one of several steps 
needed to integrate rewrite rules into analysis to improve performance and 
remove the analyze/rewrite/reset/analyze cycle.


--
To view, visit http://gerrit.cloudera.org:8080/12144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5e173dad717e739c097f1f0cd86a0352648ff886
Gerrit-Change-Number: 12144
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Thu, 14 Feb 2019 03:17:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8041, Part 2: Refactor SELECT list

2019-02-13 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Fredy Wijaya, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12144

to look at the new patch set (#4).

Change subject: IMPALA-8041, Part 2: Refactor SELECT list
..

IMPALA-8041, Part 2: Refactor SELECT list

Another step toward combining analysis and rewrites. This step refactors
the SELECT list to:

* Enable each select list item to hold its "source" (before analysis and
  rewrite) SQL.
* Split the select list item into two classes: one for the wildcard
  (which holds no expresion) and the other for actual select
  expressions.
* Moves some analysis steps into the select list item itself.
* Cleans up a few other minor items.

Tests: No functional change, just refactoring. Reran all FE tests.

Change-Id: I5e173dad717e739c097f1f0cd86a0352648ff886
---
M fe/src/main/cup/sql-parser.cup
A fe/src/main/java/org/apache/impala/analysis/AbstractExpression.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
12 files changed, 296 insertions(+), 124 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/12144/4
--
To view, visit http://gerrit.cloudera.org:8080/12144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5e173dad717e739c097f1f0cd86a0352648ff886
Gerrit-Change-Number: 12144
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-4018 Part1: Add FORMAT clause in CAST()

2019-02-13 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12267 )

Change subject: IMPALA-4018 Part1: Add FORMAT clause in CAST()
..


Patch Set 6:

(6 comments)

Great progress. Mostly nits, and one question about tests.

http://gerrit.cloudera.org:8080/#/c/12267/6/fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
File fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java:

http://gerrit.cloudera.org:8080/#/c/12267/6/fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java@93
PS6, Line 93:   return new CastExpr(targetType, this, null);
Rather than changing calls everywhere, would recommend leaving the two-arg 
constructor and adding a three-hard version. The two-arg version can call the 
three-arg version with null in the last position.


http://gerrit.cloudera.org:8080/#/c/12267/6/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

http://gerrit.cloudera.org:8080/#/c/12267/6/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@92
PS6, Line 92:   public CastExpr(TypeDef targetTypeDef, Expr e, String format) {
I suspect only one of these is called from the parser. That one can take the 
third arg. The other one, called only internally, will never pass a format.


http://gerrit.cloudera.org:8080/#/c/12267/6/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/12267/6/fe/src/main/java/org/apache/impala/analysis/Expr.java@1362
PS6, Line 1362: return new CastExpr(targetType, this, null);
See prior comment.


http://gerrit.cloudera.org:8080/#/c/12267/6/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
File fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java:

http://gerrit.cloudera.org:8080/#/c/12267/6/fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java@478
PS6, Line 478:   return new CastExpr(targetType, this, null);
Again.


http://gerrit.cloudera.org:8080/#/c/12267/6/fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
File fe/src/main/java/org/apache/impala/analysis/StringLiteral.java:

http://gerrit.cloudera.org:8080/#/c/12267/6/fe/src/main/java/org/apache/impala/analysis/StringLiteral.java@157
PS6, Line 157:   return new CastExpr(targetType, this, null);
Again.


http://gerrit.cloudera.org:8080/#/c/12267/6/testdata/workloads/functional-query/queries/QueryTest/cast_format.test
File testdata/workloads/functional-query/queries/QueryTest/cast_format.test:

http://gerrit.cloudera.org:8080/#/c/12267/6/testdata/workloads/functional-query/queries/QueryTest/cast_format.test@4
PS6, Line 4: 
The tests here are great. Only suggestion is that we have to ensure we test 
each and every format in the two-and four-digit versions. Would be great of we 
could do that in Java, but it requires the BE, so have to do it here.



--
To view, visit http://gerrit.cloudera.org:8080/12267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia514aaa9e8f5487d396587d5ed24c7348a492697
Gerrit-Change-Number: 12267
Gerrit-PatchSet: 6
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Thu, 14 Feb 2019 03:10:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8106: Cleanup of literal expression nodes

2019-02-13 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12265

to look at the new patch set (#13).

Change subject: IMPALA-8106: Cleanup of literal expression nodes
..

IMPALA-8106: Cleanup of literal expression nodes

IMPALA-7902 cleaned up numeric literals to simplify cast logic, to
avoid the need to analyze a literal, and so on. This patch does
the same for the other literal expressions which, as a whole, need
less work than the numeric literal did. Specifically:

* Remove the need to analyze any literal expression.
* Clean up the way that code creates and analyzes literals.
* Streamlined code that creates a literal outside of an AST.

This work is part of the ongoing effort to merge expression
rewrites into a single-pass analysis phase in order to more cleanly
introduce additional rewrite rules.

Tests:

* Reran all FE tests and updated several as needed.
* Added a set of literal expression tests.
* Updated PlannerTests to show the type of NULL values.

Change-Id: I740f877964d5bc677fef98f1ccfa5be32cc8f7ac
---
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java
M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
M fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/common/SqlCastException.java
M fe/src/test/java/org/apache/impala/analysis/LiteralExprTest.java
M fe/src/test/java/org/apache/impala/analysis/NumericLiteralTest.java
M fe/src/test/java/org/apache/impala/catalog/CatalogObjectToFromThriftTest.java
M fe/src/test/java/org/apache/impala/catalog/HdfsPartitionTest.java
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M testdata/workloads/functional-query/queries/QueryTest/chars-tmp-tables.test
26 files changed, 824 insertions(+), 159 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/12265/13
--
To view, visit http://gerrit.cloudera.org:8080/12265
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I740f877964d5bc677fef98f1ccfa5be32cc8f7ac
Gerrit-Change-Number: 12265
Gerrit-PatchSet: 13
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-8182: Add single-node plan to PlanCtx

2019-02-13 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12436

to look at the new patch set (#2).

Change subject: IMPALA-8182: Add single-node plan to PlanCtx
..

IMPALA-8182: Add single-node plan to PlanCtx

Historically, the Impala planner is a function: AST in one end, Thrift
plan out the other. The planner itself creates a rich intermediate plan
representation, but none of those objects were available for testing,
forcing all tests, no matter how detailed, to run as end-to-end tests:
SQL in one end, verify a (very abbreviated) DESCRIBE output out the
other.

A recent change introduced the PlanCtx that provided access to the
parallelized plan fragments. Doing so allowed adding a variety of
cardinality tests.

It turns out that additional testing requires access to the "single node
plan" in order to verify things like join cardinality.

This patch generalizes the previous patch to provide access to all
intermediate plan structures created during planning. Doing so enables
full unit testing of all planning steps.

Tests: This fix is refactoring only. Reran all FE tests.

Change-Id: I983473f259df8af17d8dd38e497fe33e839cd0dc
---
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
3 files changed, 66 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/12436/2
--
To view, visit http://gerrit.cloudera.org:8080/12436
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I983473f259df8af17d8dd38e497fe33e839cd0dc
Gerrit-Change-Number: 12436
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 


[Impala-ASF-CR] IMPALA-7976 : Add a flag to disable sync using events at a table level

2019-02-13 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12365 )

Change subject: IMPALA-7976 : Add a flag to disable sync using events at a 
table level
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12365/8/fe/pom.xml
File fe/pom.xml:

http://gerrit.cloudera.org:8080/#/c/12365/8/fe/pom.xml@509
PS8, Line 509: 
Mockito is often a poor choice for mocking when we own the code. Is is really 
needed?


http://gerrit.cloudera.org:8080/#/c/12365/8/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
File 
fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java:

http://gerrit.cloudera.org:8080/#/c/12365/8/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@744
PS8, Line 744:   throws TException, CatalogException {
Thanks for adding these tests!


http://gerrit.cloudera.org:8080/#/c/12365/8/fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java@815
PS8, Line 815: .thenReturn(dbFlag);
Here I wonder I we can simply define a test-only catalog service subclass that 
does what we want. We've already got several: one used for unit tests for dummy 
tables, another that Bharath added for a Derby-based local HMS.

Mockito is useful at times, but using it to mock our own code tends to lead to 
tests that are more complex and brittle than if we can define the test-only 
class using plain old Java.

Use your best judgement here: leave this as is if you feel that the solution 
shown here is simpler than the one suggested in the above comment.



--
To view, visit http://gerrit.cloudera.org:8080/12365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If116edba51bae139bc003be858b3214c3f0104e3
Gerrit-Change-Number: 12365
Gerrit-PatchSet: 8
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Bharath Krishna 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 13 Feb 2019 22:01:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7961: Avoid adding unmodified objects to DDL response

2019-02-13 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12428 )

Change subject: IMPALA-7961: Avoid adding unmodified objects to DDL response
..


Patch Set 5: Code-Review+2

(1 comment)

LGTM. One very minor nit which we can skip.

http://gerrit.cloudera.org:8080/#/c/12428/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12428/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2241
PS5, Line 2241: if (LOG.isErrorEnabled()) {
Nit: error should never be disabled, so no harm in doing this one in-line. 
We'll always pay the String.format overhead, so nothing fancy needed.



--
To view, visit http://gerrit.cloudera.org:8080/12428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3e914b70ba796c9b224e9dea559b8c40aa25d83
Gerrit-Change-Number: 12428
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Wed, 13 Feb 2019 20:42:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7927: Enhance Rewritten SQL in test files

2019-02-11 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12033 )

Change subject: IMPALA-7927: Enhance Rewritten SQL in test files
..


Patch Set 2:

Good points. Let me rethink this. The reason for the current approach is that 
the expression can be arbitrarily long: foo + bar /* a0$1 */ -- does the 
reference apply to "bar" or "foo + bar"?

Since this is all pretty obscure, let me find a different way to solve this.


--
To view, visit http://gerrit.cloudera.org:8080/12033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I09ee971a7797c4154bafee5c080ac3c6a1057fc7
Gerrit-Change-Number: 12033
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 12 Feb 2019 01:06:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7927: Enhance Rewritten SQL in test files

2019-02-11 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/12033 )

Change subject: IMPALA-7927: Enhance Rewritten SQL in test files
..


Abandoned

will rethink the approach.
--
To view, visit http://gerrit.cloudera.org:8080/12033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I09ee971a7797c4154bafee5c080ac3c6a1057fc7
Gerrit-Change-Number: 12033
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-8182: Add single-node plan to PlanCtx

2019-02-11 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12436


Change subject: IMPALA-8182: Add single-node plan to PlanCtx
..

IMPALA-8182: Add single-node plan to PlanCtx

Historically, the Impala planner is a function: AST in one end, Thrift
plan out the other. The planner itself creates a rich intermediate plan
representation, but none of those objects were available for testing,
forcing all tests, no matter how detailed, to run as end-to-end tests:
SQL in one end, verify a (very abbreviated) DESCRIBE output out the
other.

A recent change introduced the PlanCtx that provided access to the
parallelized plan fragments. Doing so allowed adding a variety of
cardinality tests.

It turns out that additional testing requires access to the "single node
plan" in order to verify things like join cardinality.

This ticket asks to modify the PlanCtx to capture all the internal plan
nodes: both single node and distributed, to enable full unit testing.

Tests: This fix is refactoring only. Reran all FE tests.

Change-Id: I983473f259df8af17d8dd38e497fe33e839cd0dc
---
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
3 files changed, 37 insertions(+), 22 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/12436/1
--
To view, visit http://gerrit.cloudera.org:8080/12436
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I983473f259df8af17d8dd38e497fe33e839cd0dc
Gerrit-Change-Number: 12436
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 


[Impala-ASF-CR] IMPALA-8182: Add single-node plan to PlanCtx

2019-02-11 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12436 )

Change subject: IMPALA-8182: Add single-node plan to PlanCtx
..


Patch Set 1:

A simple one: just providing access to single plan nodes for testing.

Pre-review tests passed: https://jenkins.impala.io/job/pre-review-test/301/


--
To view, visit http://gerrit.cloudera.org:8080/12436
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I983473f259df8af17d8dd38e497fe33e839cd0dc
Gerrit-Change-Number: 12436
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Tue, 12 Feb 2019 00:21:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7961: Avoid adding unmodified objects to DDL response

2019-02-11 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12428 )

Change subject: IMPALA-7961: Avoid adding unmodified objects to DDL response
..


Patch Set 3:

(5 comments)

Initial, superficial review.

http://gerrit.cloudera.org:8080/#/c/12428/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/12428/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2243
PS3, Line 2243: 
FeCatalogUtils.debugString(result.removed_catalog_objects)));
Nit: There is a trick you can use here. The logger has its own (clunky) 
formatting to avoid the need to format a string that we typically throw away.

LOG.error("Couldn't retrieve .. for {} ... Updated objects: {}, deleted 
objects: {}", args);

Or, since the called functions are rather complex, might be easier to leave the 
code as it is, but enclose it in an if-statement that checks for debug level.


http://gerrit.cloudera.org:8080/#/c/12428/3/fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java
File fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java:

http://gerrit.cloudera.org:8080/#/c/12428/3/fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java@107
PS3, Line 107:   LOG.info(String.format("Topic update log GC started. 
GC-ing topics with versions " +
Here you can use the formatting trick mentioned previously.


http://gerrit.cloudera.org:8080/#/c/12428/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12428/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1765
PS3, Line 1765:* Returns 'true' if a new table has been created with the 
given params, false
Nit: @return true if ...


http://gerrit.cloudera.org:8080/#/c/12428/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1766
PS3, Line 1766:* otherwise.
Might be useful to explain the response. I see a comment saying we are not 
adding an existing table. But, it is not immediately clear why that is or is 
not a good thing. What is the response supposed to represent? How will it be 
used? What guarantees does this method make about the response?


http://gerrit.cloudera.org:8080/#/c/12428/3/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@1782
PS3, Line 1782:   }
Nit: Using the log formatting can avoid the if-statement



--
To view, visit http://gerrit.cloudera.org:8080/12428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3e914b70ba796c9b224e9dea559b8c40aa25d83
Gerrit-Change-Number: 12428
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Mon, 11 Feb 2019 19:27:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8039: Incorrect selectivity estimate for not-equals predicate

2019-02-11 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12427 )

Change subject: IMPALA-8039: Incorrect selectivity estimate for not-equals 
predicate
..


Patch Set 2:

Bharath, this is a first small selectivity fix. Please review at your 
convenience.

Pre-review tests passed: https://jenkins.impala.io/job/pre-review-test/300/


--
To view, visit http://gerrit.cloudera.org:8080/12427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
Gerrit-Change-Number: 12427
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Mon, 11 Feb 2019 17:08:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8039: Incorrect selectivity estimate for not-equals predicate

2019-02-11 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12427


Change subject: IMPALA-8039: Incorrect selectivity estimate for not-equals 
predicate
..

IMPALA-8039: Incorrect selectivity estimate for not-equals predicate

Impala has historically used a generic selectivity of 0.1 for all
non-equality predicates. However, this can underestimate cardinality
in some cases. The correct value is

sel(c != x) = 1 - sel(c = x)

if c is a column, x is a constant and ndv(c) is known.

Adds the above for the != case and for IS DISTINCT FROM case. There are
many related issues that are left as separate patches.

Tests:
* Used the newly-added expression cardinality tests to highlight the
  change (some formerly broken tests now pass).
* Used the newly-added cardinality tests in PlannerTest to highlight the
  fix.
* Fixed a merge artifact by having the PlannerTest cardinality tests
  actually verify cardinality.

Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/card-scan.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-views.test
12 files changed, 136 insertions(+), 114 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/12427/2
--
To view, visit http://gerrit.cloudera.org:8080/12427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8f6013c9ef95a89d55d8b25f0b5433c81582a62f
Gerrit-Change-Number: 12427
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 


[Impala-ASF-CR] Additions to .gitignore

2019-02-10 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12430 )

Change subject: Additions to .gitignore
..


Patch Set 1:

This is a very simple one. Please take a look at your convenience.


--
To view, visit http://gerrit.cloudera.org:8080/12430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia8ea436f5e108cc08389f43d639a4cb7315271c1
Gerrit-Change-Number: 12430
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Mon, 11 Feb 2019 02:36:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] Additions to .gitignore

2019-02-10 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12430


Change subject: Additions to .gitignore
..

Additions to .gitignore

Adds entries for Eclipse-created files and for a couple of temporary
files commonly created during front-end debugging.

Change-Id: Ia8ea436f5e108cc08389f43d639a4cb7315271c1
---
M .gitignore
1 file changed, 10 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/12430/1
--
To view, visit http://gerrit.cloudera.org:8080/12430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia8ea436f5e108cc08389f43d639a4cb7315271c1
Gerrit-Change-Number: 12430
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 


[Impala-ASF-CR] IMPALA-8177: Log DDL failures in coordinator logs

2019-02-10 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12414 )

Change subject: IMPALA-8177: Log DDL failures in coordinator logs
..


Patch Set 4:

Late to the party, but we want to log detailed stack trace on the Java side if 
any unexpected exception occurs. This is work in progress.


--
To view, visit http://gerrit.cloudera.org:8080/12414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie89291ee27156c701e07cea44ad3ee07ec54ab42
Gerrit-Change-Number: 12414
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sun, 10 Feb 2019 21:00:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8095: Detailed expression cardinality tests

2019-02-08 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12248 )

Change subject: IMPALA-8095: Detailed expression cardinality tests
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12248/7/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/12248/7/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java@122
PS7, Line 122:// Bug: NDV of partition columns is -1 though it is listed as
 : // 2 in the shell with: SHOW COLUMN STATS alltypes
 : //verifyTableCol(allTypes, "year", 2, 0);
 : // Bug: When tests are run in Eclipse we get the result 
above.
 : // But, when the same test is run using maven from the 
command line,
 : // we get the result shown below.
 : // Unit test in Eclipse see the above, unit tests run from 
the
 : // Disabling both to avoid a flaky test,
 : // Same issue for the next three tests.
 : //verifyTableCol(allTypes, "year", -1, -1);
 : //verifyTableCol(allTypes, "month", 12, 0);
 : //verifyTableCol(allTypes, "month", -1, -1);
> (This and multiple places below)
We should. We should also dig into the many bugs here.

That said, the purpose of this patch is to create the test baseline, pointing 
out the many things to fix, including this one.

OK to create the baseline first, then do the fixes as separate patches?



--
To view, visit http://gerrit.cloudera.org:8080/12248
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3da58ee9b0beebeffb170b9430bd36d20dcd2401
Gerrit-Change-Number: 12248
Gerrit-PatchSet: 7
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 08 Feb 2019 22:30:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5872: Testcase builder for query planner

2019-02-07 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12221 )

Change subject: IMPALA-5872: Testcase builder for query planner
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/12221/7/fe/src/main/java/org/apache/impala/analysis/HdfsUri.java
File fe/src/main/java/org/apache/impala/analysis/HdfsUri.java:

http://gerrit.cloudera.org:8080/#/c/12221/7/fe/src/main/java/org/apache/impala/analysis/HdfsUri.java@79
PS7, Line 79:   boolean registerPrivReq, boolean throwIfNotExists) throws 
AnalysisException {
> I'm not sure I got your comment. Can you please clarify?
Point was, is helpful for the variable to state the behavior it controls ("the 
file must exist", "the file must not exist") rather than describing 
implementation "throwIfNotExists." It is a nit, but helps future readers.


http://gerrit.cloudera.org:8080/#/c/12221/7/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/12221/7/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@1173
PS7, Line 1173: "supported in a single expression: " + 
conjunct.toSql());
> I think preconditions check failure generally means that an invariant is vi
Agreed. But the wording sounds much like what we'd issue elsewhere with an 
AnalysisException. So, it raises the question, is this the only check? Is this 
message targeted at the user? Should it be an Analysis exception?

Since this is code invariant violation, the message is a) not really needed, or 
b) should be targeted at the developer who got the error: "somehow you got here 
without triggering the analysis exception for this condition: go back and 
figure out what got screwed up."

Again, this is a nit, but helps future readers be certain of the intent.


http://gerrit.cloudera.org:8080/#/c/12221/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/12221/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@411
PS7, Line 411:   if (ret != null) {
> I commented about "clashes" below. The way this is implemented,  we don't h
Then we have an inconsistency. In Catalog.java, the code is:

  public Db addDb(Db db) {
return dbCache_.get().put(db.getName().toLowerCase(), db);
  }

That the code works with both versions suggests that the behavior is actually 
both undefined and unused. Perhaps don't return the DB at all (i.e. change 
method to void) to avoid the need to resolve an issue that we don't actually 
care about.


http://gerrit.cloudera.org:8080/#/c/12221/7/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@424
PS7, Line 424:   t.getLock().lock();
> The lock is actually annoying. We need that because we have an assertion in
Makes sense. Perhaps a comment or two in the code to explain this thinking will 
help future readers.



--
To view, visit http://gerrit.cloudera.org:8080/12221
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec83eeb2dc5136768b70ed581fb8d3ed0335cb52
Gerrit-Change-Number: 12221
Gerrit-PatchSet: 7
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Comment-Date: Thu, 07 Feb 2019 19:22:08 +
Gerrit-HasComments: Yes


  1   2   3   4   5   >