[Impala-ASF-CR] IMPALA-8533: Impala daemon crash on sort

2020-03-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15473 )

Change subject: IMPALA-8533: Impala daemon crash on sort
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15473/1/fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
File fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java:

http://gerrit.cloudera.org:8080/#/c/15473/1/fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java@366
PS1, Line 366:   if(sortInfo.getSortTupleDescriptor().getSlots().size() > 0) {
Should this same check be added for sorts in other types of plans, not only for
analytic functions? E.g. when creating the sort in SingleNodePlanner, either a
total sort or a sort with limit. Do those work correctly when an ORDER BY
occurs after a UNION that projects a constant literal?



--
To view, visit http://gerrit.cloudera.org:8080/15473
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If19303fbf55927c1e1b76b9b22ab354322b21c54
Gerrit-Change-Number: 15473
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 18 Mar 2020 02:52:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8533: Impala daemon crash on sort

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15473 )

Change subject: IMPALA-8533: Impala daemon crash on sort
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5514/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15473

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If19303fbf55927c1e1b76b9b22ab354322b21c54
Gerrit-Change-Number: 15473
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 18 Mar 2020 01:18:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15463 )

Change subject: IMPALA-9530: query option to limit preagg memory
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5515/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15463

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9
Gerrit-Change-Number: 15463
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 18 Mar 2020 01:18:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15463 )

Change subject: IMPALA-9530: query option to limit preagg memory
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15463/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/15463/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@585
PS1, Line 585:   // Aggregations should generally not use significantly more than the max reservation,
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/15463

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9
Gerrit-Change-Number: 15463
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 18 Mar 2020 00:34:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9530: query option to limit preagg memory

2020-03-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15463


Change subject: IMPALA-9530: query option to limit preagg memory
..

IMPALA-9530: query option to limit preagg memory

This adds an advanced PREAGG_BYTES_LIMIT query option that
allows limiting the memory consumption of preaggregation
operators in a query.

It works by setting a maximum reservation on each grouping
aggregator in a preaggregation node. The aggregators switch
to passthrough mode automatically when hitting this limit,
the same as if they were hitting the query memory limit.

The default behaviour is unchanged.
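
A minimal sketch of the passthrough behaviour described above, as a toy model
rather than the patch's code. The class name, its methods and the fixed
per-group byte cost are all assumptions made for illustration; only the idea
of switching to passthrough once a byte limit would be exceeded comes from the
description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only. The class, its methods and BYTES_PER_GROUP are hypothetical;
// this is not Impala's aggregator code.
class PreaggLimitSketch {
  static final long BYTES_PER_GROUP = 16;  // assumed cost of one hash table entry

  private final long bytesLimit;           // stand-in for PREAGG_BYTES_LIMIT
  private final Map<String, Long> counts = new HashMap<>();
  private final List<String> passedThrough = new ArrayList<>();
  private boolean passthrough = false;

  PreaggLimitSketch(long bytesLimit) { this.bytesLimit = bytesLimit; }

  // Counts rows per key until a new group would exceed the limit, then
  // forwards rows for unseen keys downstream without aggregating them.
  void add(String key) {
    if (counts.containsKey(key)) {
      counts.merge(key, 1L, Long::sum);
      return;
    }
    long bytesNeeded = (counts.size() + 1) * BYTES_PER_GROUP;
    if (passthrough || bytesNeeded > bytesLimit) {
      passthrough = true;
      passedThrough.add(key);
      return;
    }
    counts.put(key, 1L);
  }

  boolean isPassthrough() { return passthrough; }
  int groupCount() { return counts.size(); }
  int passedThroughCount() { return passedThrough.size(); }
}
```

With a 32-byte limit in this model, two groups fit and the third distinct key
is passed through unaggregated.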

Testing:
Add a planner test with estimates higher and lower than limit
to ensure that resource estimates correctly reflect the option.

Add an end-to-end test that verifies that the option forces
passthrough when the memory limit is hit.

Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9
---
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/preagg-bytes-limit.test
M testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test
9 files changed, 156 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/15463/1
--
To view, visit http://gerrit.cloudera.org:8080/15463

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I87f7a5c68da93d068e304ef01afbcbb0d56807d9
Gerrit-Change-Number: 15463
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8533: Impala daemon crash on sort

2020-03-17 Thread Kurt Deschler (Code Review)
Kurt Deschler has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15473


Change subject: IMPALA-8533: Impala daemon crash on sort
..

IMPALA-8533: Impala daemon crash on sort

This crash was caused by an empty sort tuple descriptor that was
generated as a result of union substitutions replacing all sort
fields with literals that were subsequently removed from the ordering
spec. There was no check in place to prevent the empty tuple descriptor
from being sent to impalad where it caused a divide-by-zero crash.

This fix avoids inserting a sort node when there are no fields remaining
to sort on. Also added a precondition to the SortNode that will prevent
similar issues from crashing impalad.
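
The shape of the fix described above can be sketched roughly as follows. This
is an illustration under stated assumptions: OrderExpr, SortNode and
maybeCreateSort are hypothetical stand-ins, not the classes touched by the
patch, and a plain IllegalStateException stands in for whatever precondition
mechanism the real SortNode uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins for planner types; not Impala's real classes.
class SortGuardSketch {
  // An ordering expression: a column reference or a constant literal.
  static final class OrderExpr {
    final String text;
    final boolean isConstant;
    OrderExpr(String text, boolean isConstant) {
      this.text = text;
      this.isConstant = isConstant;
    }
  }

  static final class SortNode {
    final List<OrderExpr> keys;
    SortNode(List<OrderExpr> keys) {
      // Precondition: never build a sort with nothing to sort on.
      if (keys.isEmpty()) {
        throw new IllegalStateException("sort node needs at least one ordering expr");
      }
      this.keys = keys;
    }
  }

  // Drops constant ordering exprs and skips the sort when none remain.
  static Optional<SortNode> maybeCreateSort(List<OrderExpr> orderBy) {
    List<OrderExpr> keys = new ArrayList<>();
    for (OrderExpr e : orderBy) {
      if (!e.isConstant) keys.add(e);
    }
    if (keys.isEmpty()) return Optional.empty();
    return Optional.of(new SortNode(keys));
  }
}
```

The guard in maybeCreateSort covers the planner side, while the constructor
check is a second line of defence for any other caller.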

Testing:
Testcase added to PlannerTest/union.test

Change-Id: If19303fbf55927c1e1b76b9b22ab354322b21c54
---
M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
3 files changed, 46 insertions(+), 14 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/15473/1
--
To view, visit http://gerrit.cloudera.org:8080/15473

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If19303fbf55927c1e1b76b9b22ab354322b21c54
Gerrit-Change-Number: 15473
Gerrit-PatchSet: 1
Gerrit-Owner: Kurt Deschler 


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 20: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15096

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 20
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 23:29:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.
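
As an analogy only, the "all threads wait, one thread rebuilds" pattern can be
sketched with the JDK's java.util.concurrent.CyclicBarrier. Impala's
CyclicBarrier is a backend class with its own interface, so the names below
(BarrierRebuildSketch, run) are hypothetical and the code is not the patch's
implementation.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Analogy using the JDK barrier; hypothetical names throughout.
class BarrierRebuildSketch {
  // Starts numThreads "probe" threads that all wait at a barrier.
  // Returns how many times the rebuild step ran.
  static int run(int numThreads) {
    AtomicInteger rebuilds = new AtomicInteger();
    // The barrier action runs exactly once per barrier point, in the last
    // thread to arrive, before any waiting thread is released.
    CyclicBarrier barrier =
        new CyclicBarrier(numThreads, () -> rebuilds.incrementAndGet());

    Thread[] threads = new Thread[numThreads];
    for (int i = 0; i < numThreads; i++) {
      threads[i] = new Thread(() -> {
        try {
          barrier.await();  // block until every prober has arrived
          // From here on the rebuilt partition would be safe to probe.
        } catch (InterruptedException | BrokenBarrierException e) {
          Thread.currentThread().interrupt();
        }
      });
      threads[i].start();
    }
    try {
      for (Thread t : threads) t.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return rebuilds.get();
  }
}
```

In the JDK class, calling reset() on the barrier releases waiting threads with
a BrokenBarrierException, which is loosely comparable to making blocked
threads cancellable.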

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when cancelling
fragments on the backend.

BufferPool now correctly handles multiple threads calling
CleanPages() concurrently, which makes other methods thread-safe.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same way as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.
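
One way to read the scan-instance bound above is sketched below. The formula
is an assumption drawn from the wording of this message (the message does not
show the planner code), and the class and method names are hypothetical.

```java
// Hypothetical helper; the exact formula used by the planner is not shown in
// this message, so treat this as one plausible reading of it.
class InstanceEstimateSketch {
  // Old assumption: mtDop instances on every node.
  // New bound: never more instances than there are input splits.
  // Assumes numNodes >= 1 and mtDop >= 1.
  static long estimateScanInstances(int numNodes, int mtDop, long numInputSplits) {
    long instancesByDop = (long) numNodes * mtDop;
    return Math.min(instancesByDop, numInputSplits);
  }
}
```

For example, 3 nodes with mt_dop=4 and only 5 input splits would be estimated
at 5 instances rather than 12 under this reading.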

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferred until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(30) | parquet / none / none | 6.26    | -15.70%    | 4.63       | -16.16%        |
+----------+-----------------------+---------+------------+------------+----------------+

+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25       | R +7.38%   | 0.51%     | 0.22%          | 5     | R +6.95%       | 2.31    | 27.93  |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79        |   +1.31%   | 1.86%     | 0.36%          | 5     |   +1.88%       | 1.15    | 1.53   |
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28        |   -0.01%   | 1.64%     | 1.63%          | 5     |   -0.11%       | -0.58   | -0.01  |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68        |   -0.94%   | 0.84%     | 1.46%          | 5     |   -0.21%       | -0.87   | -1.25  |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72        |   -0.56%   | 1.29%     | 0.52%          | 5     |   -1.04%       | -1.15   | -0.89  |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80       |   -1.48%   | 0.61%     | 0.60%          | 5     |   -1.39%       | -1.73   | -3.91  |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32        |   -4.92%   | 0.05%     | 0.40%          | 5     |   -4.93%       | -2.31   | -27.46 |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67        | I -5.41%   | 0.81%     | 0.03%          | 5     | I -5.70%       | -2.31   | -15.75 |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14        | I -6.93%   | 3.13%     | 2.62%          | 5     | I -9.31%       | -2.02   | -3.96  |
| TPCH(30) | TPCH-Q9  | parquet / none / none | 15.59

[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..

IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option

- Minor edit

Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
(cherry picked from commit 0f1c87f814b5af4f6b37c5ebf095e887916424ef)
Reviewed-on: http://gerrit.cloudera.org:8080/15457
Tested-by: Impala Public Jenkins 
Reviewed-by: Aman Sinha 
Reviewed-by: Joe McDonnell 
---
M docs/impala.ditamap
A docs/topics/impala_broadcast_bytes_limit.xml
2 files changed, 68 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Aman Sinha: Looks good to me, but someone else must approve
  Joe McDonnell: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/15457

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 6
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15457

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 5
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 23:03:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 5: Code-Review+1

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/15457

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 5
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 23:01:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 5: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/568/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/15457

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 5
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 22:58:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

2020-03-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15462 )

Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive 
normal form
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/15462/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15462/3//COMMIT_MSG@35
PS3, Line 35: Testing:
Maybe add Q13 to 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test so 
we're testing this directly.

Did any of the existing planner tests change at all?


http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
File fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java:

http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@37
PS3, Line 37:  * it can work for single table predicates also (primarily intended for testing).
The single table rewrites might be useful for pushing predicates down into the 
storage (a lot of optimisations like min-max filtering or pushdown into Kudu 
won't work for OR conjuncts). I think this is OK as-is but maybe worth thinking 
about as an extension.


http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@132
PS3, Line 132:   } else if (rhs instanceof CompoundPredicate &&
Consider factoring the logic of the two branches out into a common function, 
since it's essentially the same with lhs and rhs swapped (although I guess the 
order of the expressions in the output is slightly different, so OK to ignore 
if that was your intent.)
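
A rough sketch of what such a common function could look like, using plain
strings instead of Expr objects. The class CnfSketch, the method distribute
and the andOnLeft flag are hypothetical and only illustrate the suggestion;
they are not taken from ConvertToCNFRule.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration with strings; hypothetical names, not ConvertToCNFRule's code.
class CnfSketch {
  // Distributes OR over AND:
  //   (a AND b) OR c  ->  (a OR c) AND (b OR c)     when andOnLeft is true
  //   c OR (a AND b)  ->  (c OR a) AND (c OR b)     when andOnLeft is false
  // `conjuncts` are the operands of the AND side, `other` is the remaining
  // operand. The returned disjuncts are implicitly ANDed together.
  static List<String> distribute(List<String> conjuncts, String other,
      boolean andOnLeft) {
    List<String> result = new ArrayList<>();
    for (String c : conjuncts) {
      String lhs = andOnLeft ? c : other;
      String rhs = andOnLeft ? other : c;
      result.add("(" + lhs + " OR " + rhs + ")");
    }
    return result;
  }
}
```

The flag preserves the operand order of each original branch, which addresses
the caveat about the output order differing between the two cases.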


http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@136
PS3, Line 136: List disjuncts = new ArrayList<>();
I think it would be more concise to use Arrays.asList() to construct the list. 
That would make it significantly more readable IMO by reducing the visual noise.
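
For comparison, the two styles side by side in a small self-contained sketch.
The class and method names are hypothetical and strings stand in for
expressions. One caveat worth keeping in mind: Arrays.asList() returns a
fixed-size list, so it only fits if nothing is added to the list afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper names; strings stand in for expressions.
class DisjunctListSketch {
  // The style under review: create the list, then add each element.
  static List<String> verbose(String lhs, String rhs) {
    List<String> disjuncts = new ArrayList<>();
    disjuncts.add(lhs);
    disjuncts.add(rhs);
    return disjuncts;
  }

  // The suggested style. The result is fixed-size: add() and remove() throw
  // UnsupportedOperationException. Wrap it in new ArrayList<>(...) if the
  // list has to grow later.
  static List<String> concise(String lhs, String rhs) {
    return Arrays.asList(lhs, rhs);
  }
}
```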


http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@156
PS3, Line 156: if (forMultiTablesOnly_) {
It would be good to factor this check out into a common function.



--
To view, visit http://gerrit.cloudera.org:8080/15462

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 22:58:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 5:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/568/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/15457

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 5
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 22:51:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Kristine Hahn (Code Review)
Hello Aman Sinha, Tim Armstrong, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15457

to look at the new patch set (#5).

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..

IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option

- Minor edit

Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
(cherry picked from commit 0f1c87f814b5af4f6b37c5ebf095e887916424ef)
---
M docs/impala.ditamap
A docs/topics/impala_broadcast_bytes_limit.xml
2 files changed, 68 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/15457/5
--
To view, visit http://gerrit.cloudera.org:8080/15457

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 5
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Add support for centos8

2020-03-17 Thread Hector Acosta (Code Review)
Hector Acosta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15465 )

Change subject: Add support for centos8
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15465/1/docker/all/postinstall.sh
File docker/all/postinstall.sh:

http://gerrit.cloudera.org:8080/#/c/15465/1/docker/all/postinstall.sh@31
PS1, Line 31: alternatives --set python /usr/bin/python2
> Just a question: does this alias pip2 to pip as well?
Kinda.

pip doesn't get installed through the OS packages. It gets installed on line 39 
of this script. Since it gets called using python2, we end up with pip running 
under python2.7 (os dependent).


So basically:
dnf install python2 python3

# which python
which: no python in 
(/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)
# which pip
which: no pip in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)
# alternatives --set python /usr/bin/python2
# which python
/usr/bin/python
# which pip
which: no pip in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)
# python get-pip.py
# which pip
/usr/bin/pip


http://gerrit.cloudera.org:8080/#/c/15465/1/docker/redhat8.df
File docker/redhat8.df:

http://gerrit.cloudera.org:8080/#/c/15465/1/docker/redhat8.df@48
PS1, Line 48: RUN postinstall.sh
:
: COPY ./all
> nit: I guess this comment is stale now
Done



--
To view, visit http://gerrit.cloudera.org:8080/15465

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363
Gerrit-Change-Number: 15465
Gerrit-PatchSet: 2
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Hector Acosta 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Tue, 17 Mar 2020 22:49:44 +
Gerrit-HasComments: Yes


[native-toolchain-CR] WIP: Add platform tag for Ubuntu 14.04

2020-03-17 Thread Hector Acosta (Code Review)
Hector Acosta has abandoned this change. ( 
http://gerrit.cloudera.org:8080/15469 )

Change subject: WIP: Add platform tag for Ubuntu 14.04
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/15469

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I32dac434ddf12ec3a2367f6a0974b3504bd137c5
Gerrit-Change-Number: 15469
Gerrit-PatchSet: 2
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Hector Acosta 
Gerrit-Reviewer: Laszlo Gaal 


[native-toolchain-CR] Add support for centos8

2020-03-17 Thread Hector Acosta (Code Review)
Hello Laszlo Gaal,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15465

to look at the new patch set (#2).

Change subject: Add support for centos8
..

Add support for centos8

This commit adds a new centos8 docker image. Most of this is pretty
straightforward, with the exception of having to explicitly set our default
python. This needs to happen early in the postinstall process since other
tools (aws cli) depend on it.

Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363
---
M docker/all/assert-dependencies-present.py
M docker/all/postinstall.sh
A docker/redhat8.df
M in-docker.py
4 files changed, 66 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/65/15465/2
--
To view, visit http://gerrit.cloudera.org:8080/15465

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363
Gerrit-Change-Number: 15465
Gerrit-PatchSet: 2
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Laszlo Gaal 


[native-toolchain-CR] Remove centos6, ubuntu12, ubuntu14 platforms

2020-03-17 Thread Hector Acosta (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15464

to look at the new patch set (#2).

Change subject: Remove centos6, ubuntu12, ubuntu14 platforms
..

Remove centos6, ubuntu12, ubuntu14 platforms

This commit removes unused platform redhat6, and EOL platforms ubuntu12 and
ubuntu14.

Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403
---
M Makefile
D docker/redhat6.df
D docker/ubuntu1404.df
M in-docker.py
4 files changed, 0 insertions(+), 111 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/64/15464/2
--
To view, visit http://gerrit.cloudera.org:8080/15464

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403
Gerrit-Change-Number: 15464
Gerrit-PatchSet: 2
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15457/4/docs/topics/impala_broadcast_bytes_limit.xml
File docs/topics/impala_broadcast_bytes_limit.xml:

http://gerrit.cloudera.org:8080/#/c/15457/4/docs/topics/impala_broadcast_bytes_limit.xml@59
PS4, Line 59: -- Change the default value to 16GB.
Sorry, I should have mentioned in the prior feedback .. the word 'default' can 
be skipped here as well since we cannot change the default through the SET 
command (default is specified in the code).  How about just ' Change the limit 
to 16GB'



--
To view, visit http://gerrit.cloudera.org:8080/15457

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 4
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 22:36:47 +
Gerrit-HasComments: Yes


[native-toolchain-CR] WIP: Add platform tag for Ubuntu 14.04

2020-03-17 Thread Hector Acosta (Code Review)
Hector Acosta has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15469 )

Change subject: WIP: Add platform tag for Ubuntu 14.04
..


Patch Set 1:

> I must be missing something here: I was convinced that the earlier
 > change, https://gerrit.cloudera.org/c/15464/1 removed the
 > Ubuntu-14.04 toolchain build (so artifacts wouldn't be produced
 > either). How come that the build is still producing artifacts then?
 > Wouldn't the right fix be to make sure that Ubuntu-14.04 bits are
 > just not built?

I think I found the problem. The supported platforms are listed in the 
Makefile. I'll remove them from there and abandon this RR.


--
To view, visit http://gerrit.cloudera.org:8080/15469

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32dac434ddf12ec3a2367f6a0974b3504bd137c5
Gerrit-Change-Number: 15469
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Hector Acosta 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Tue, 17 Mar 2020 22:34:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 4: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/567/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/15457

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 4
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 22:25:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5513/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 8
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 17 Mar 2020 22:21:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 4:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/567/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/15457
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 4
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 22:16:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Kristine Hahn (Code Review)
Hello Aman Sinha, Tim Armstrong, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15457

to look at the new patch set (#4).

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..

IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option

- Remove "default", explain statistics basis and benefits.
- Revise examples.

Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
(cherry picked from commit 0f1c87f814b5af4f6b37c5ebf095e887916424ef)
---
M docs/impala.ditamap
A docs/topics/impala_broadcast_bytes_limit.xml
2 files changed, 68 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/15457/4
--
To view, visit http://gerrit.cloudera.org:8080/15457
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 4
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/566/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/15457
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 3
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 21:56:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..


Patch Set 20: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5485/


--
To view, visit http://gerrit.cloudera.org:8080/15132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 20
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 21:48:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Kristine Hahn (Code Review)
Hello Aman Sinha, Tim Armstrong, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15457

to look at the new patch set (#3).

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..

IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option

Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
(cherry picked from commit 0f1c87f814b5af4f6b37c5ebf095e887916424ef)
---
M docs/impala.ditamap
A docs/topics/impala_broadcast_bytes_limit.xml
2 files changed, 65 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/15457/3
--
To view, visit http://gerrit.cloudera.org:8080/15457
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 3
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/566/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/15457
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 3
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 21:48:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol

2020-03-17 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
..

IMPALA-9466: impala-shell client retry for hs2-http protocol

Added retries for idempotent rpcs:
OpenSession, PingImpalaHS2Service, GetResultSetMetadata,
CloseImpalaOperation (non dmls), CancelOperation, GetOperationStatus,
GetRuntimeProfile, GetExecSummary, GetLog

Retries were also added to the 'set all' query execution and subsequent
result fetch in the ImpalaHS2Client._open_session()

The retries are only supported for the hs2-http protocol and are enabled
by default. A failed rpc is tried at most 3 times, with a wait of at
least 2 seconds between tries.

Only rpcs that failed due to an error in the http transport are retried.
If an rpc failed because the server returned an error in the rpc
response, it is not retried.
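
A minimal sketch of the retry policy described above, not the code from the patch. TransportError, call_with_retries and the injectable sleep parameter are hypothetical names used only for illustration. The 3 tries and the 2 second wait come from the description.

```python
import time


class TransportError(Exception):
    """Hypothetical stand-in for an error raised by the http transport."""


def call_with_retries(rpc, max_tries=3, sleep_s=2.0, sleep=time.sleep):
    """Call rpc(), retrying only when the transport itself fails.

    Any other exception (for example an error reported by the server in
    the rpc response) propagates immediately without a retry.
    """
    tries_left = max_tries
    while True:
        tries_left -= 1
        try:
            return rpc()
        except TransportError:
            if tries_left == 0:
                raise  # out of tries, surface the transport error
            sleep(sleep_s)  # wait before the next try
```

Passing sleep as a parameter is only a convenience so that the wait can be stubbed out in a test.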

Improved error diagnostics by dumping stack trace when
ImpalaShell._execute_stmt() gets an 'Unknown Exception'.

Testing:
- Added a custom_cluster test which injects fault into the http transport
and checks expected behavior from the various rpcs. Some of these tests
leave the session in an open state and so these tests are not suitable
for the e2e test framework, which has metric verifiers expecting related
metrics to be 0 at the end of the test.
- Manually tested real world scenarios with impala-shell client
communicating with an impala coordinator via a fault injecting istio mesh.
- Manually tested dropping connections on an nginx ingress gateway by sending
SIGTERM to all worker processes.

Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
---
M shell/impala_client.py
M shell/impala_shell.py
A tests/custom_cluster/test_hs2_fault_injection.py
3 files changed, 461 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/15378/8
--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 8
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol

2020-03-17 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
..


Patch Set 7:

(9 comments)

Addressed the comments so far.

http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@645
PS7, Line 645: max_tries
> nit: document what this variable represents
Done.


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@649
PS7, Line 649: self.retry_sleep_duration_s = 2
> why 2 seconds?
I picked a duration, and yes, this might need some tuning. I think it's 
probably better to wait before retrying than to retry right away in case of 
failures.


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@674
PS7, Line 674: max_tries = self.max_tries
> nit: why is this necessary?
Removed the redundant variable.


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@679
PS7, Line 679: execute_query
> would it make sense to add an option to execute_query that adds the option
I think we still need the retry logic for fetch. And if the fetch rpcs fail, 
we may not have the correct fetch results (even if we retry the fetch). So 
the only way to ensure that we can run the 'set all' query and get its 
results properly is to retry the 'execute' and 'fetch' rpcs together as 
a whole.


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@689
PS7, Line 689: {1}
> nit: 'type={1}'
Done.


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@693
PS7, Line 693:   if set_all_handle is not None:
 : self.close_query(set_all_handle)
> should this be in a finally block?
Done.
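
For illustration only, a sketch of what moving the close into a finally block could look like. FakeClient, fetch and run_set_all are hypothetical names and are not taken from the patch. Only execute_query, close_query and set_all_handle appear in the review.

```python
class FakeClient(object):
    """Hypothetical stand-in for the shell client, used only in this sketch."""

    def __init__(self):
        self.closed = []

    def execute_query(self, stmt):
        return "handle-1"

    def fetch(self, handle):
        # Simulate a failure after the query was opened.
        raise RuntimeError("fetch failed")

    def close_query(self, handle):
        self.closed.append(handle)


def run_set_all(client):
    set_all_handle = None
    try:
        set_all_handle = client.execute_query("set all")
        return client.fetch(set_all_handle)
    finally:
        # Runs on success and on failure, so the handle is never leaked.
        if set_all_handle is not None:
            client.close_query(set_all_handle)
```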


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@934
PS7, Line 934: """Executes the provided 'rpc' callable and tranlates any 
exceptions in the
> nit: document new option, would recommend including some docs explaining ho
Done.


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@960
PS7, Line 960: Num remaining tries: {3}
> i think this is potentially confusing to this for all rpcs, especially for
Good point. I made changes so that we dump 'Num remaining tries <>' only if 
there is a remaining try. In other cases we dump nothing. The output looks like 
the following:
```
Caught exception HTTP code 502: Injected Fault, type= in OpenSession. Num remaining tries: 2
Caught exception HTTP code 502: Injected Fault, type= in OpenSession. Num remaining tries: 1
Caught exception HTTP code 502: Injected Fault, type= in OpenSession.
```
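
A small sketch of message formatting that would produce lines of this shape. format_retry_message is a hypothetical helper, not a function from impala_client.py.

```python
def format_retry_message(error, rpc_name, tries_left):
    """Build the log line; mention remaining tries only when there are any."""
    msg = "Caught exception {0} in {1}.".format(error, rpc_name)
    if tries_left > 0:
        msg += " Num remaining tries: {0}".format(tries_left)
    return msg
```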


http://gerrit.cloudera.org:8080/#/c/15378/7/tests/custom_cluster/test_hs2_fault_injection.py
File tests/custom_cluster/test_hs2_fault_injection.py:

http://gerrit.cloudera.org:8080/#/c/15378/7/tests/custom_cluster/test_hs2_fault_injection.py@123
PS7, Line 123:   @pytest.mark.execute_serially
> why do these all need to be executed serially?
All the tests share the instance variables self.custom_hs2_http_client and 
self.transport, so it's probably not a good idea to run the tests in parallel.



--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 7
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 17 Mar 2020 21:34:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/15457/2/docs/topics/impala_broadcast_bytes_limit.xml
File docs/topics/impala_broadcast_bytes_limit.xml:

http://gerrit.cloudera.org:8080/#/c/15457/2/docs/topics/impala_broadcast_bytes_limit.xml@42
PS2, Line 42: Sets the default limit for the size of the broadcast 
input. Setting such a limit
Remove the word 'default'. Also, this is based on estimated sizes (based on 
statistics), not actual size of the broadcast input.  So, how about something 
like:
'Sets the limit for the size of the broadcast input based on estimated size.  
The Impala planner may in some rare cases make a bad choice to broadcast a 
large table or intermediate result and encounter performance problems due to 
high memory pressure. Setting this limit will make the planner pick a partition 
based hash join instead of broadcast and avoid such performance problems.'


http://gerrit.cloudera.org:8080/#/c/15457/2/docs/topics/impala_broadcast_bytes_limit.xml@51
PS2, Line 51: The default value is 32GB. A value of 0 causes the 
option to be ignored.
Since the value is supposed to be specified in bytes (rather than in MB or GB), 
let's say something like 'The default value is 34359738368 (32 GB)'


http://gerrit.cloudera.org:8080/#/c/15457/2/docs/topics/impala_broadcast_bytes_limit.xml@57
PS2, Line 57: set broadcast_bytes_limit=64;
Similar to the prior comment, since this is specified in bytes, if you want to 
change it to 64 GB you will have to give the expanded value.  Let's use a 
smaller number as example:
-- Change the default value to 16 GB
set broadcast_bytes_limit=17179869184;
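
The expanded values can be checked with a few lines of Python. This is just arithmetic, assuming GB here means 1024^3 bytes, which is what the 34359738368 figure implies. gib_to_bytes is a hypothetical helper name.

```python
GIB = 1024 ** 3  # bytes in one binary gigabyte


def gib_to_bytes(gib):
    """Convert a size in GB (binary) to the byte count the option expects."""
    return gib * GIB


# Print the statement for a 16 GB limit.
print("set broadcast_bytes_limit={0};".format(gib_to_bytes(16)))
```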



--
To view, visit http://gerrit.cloudera.org:8080/15457
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 2
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 21:26:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9517: [DOCS] Document broadcast bytes limit query option

2020-03-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15457 )

Change subject: IMPALA-9517: [DOCS] Document broadcast_bytes_limit query option
..


Patch Set 2:

Can you rebase this change to the latest?


--
To view, visit http://gerrit.cloudera.org:8080/15457
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f7eaacd5a885a7a5292d7694241d58e4f7b6282
Gerrit-Change-Number: 15457
Gerrit-PatchSet: 2
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 20:26:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Add CentOS 8.1 support to bootstrap system.sh

2020-03-17 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15461 )

Change subject: WIP: Add CentOS 8.1 support to bootstrap_system.sh
..


Patch Set 1:

General remark: this change gets the build bootstrap process to the point of 
building the Impala virtualenv.
I have tested it both in a regular Docker container using 
docker/test-with-docker.py (the Centos 8 base container there is a pretty slim 
one) and with a Centos 8.1 container instance on our internal build cloud. The 
bootstrap process progresses, then fails the same way at the same point in both 
environments.


--
To view, visit http://gerrit.cloudera.org:8080/15461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67a58ec007219020e1fb562216d7a0d1ff38b0bd
Gerrit-Change-Number: 15461
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Tue, 17 Mar 2020 20:24:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9029: [DOCS] Impala 3.4 Release Notes

2020-03-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14863 )

Change subject: IMPALA-9029: [DOCS] Impala 3.4 Release Notes
..

IMPALA-9029: [DOCS] Impala 3.4 Release Notes

-Added broadcast_bytes_limit query option

Change-Id: I4385749de35f8379ecf6566fe515ed500b42d6cc
Reviewed-on: http://gerrit.cloudera.org:8080/14863
Tested-by: Impala Public Jenkins 
Reviewed-by: Joe McDonnell 
---
M docs/shared/impala_common.xml
M docs/topics/impala_incompatible_changes.xml
M docs/topics/impala_known_issues.xml
M docs/topics/impala_new_features.xml
M docs/topics/impala_txtfile.xml
5 files changed, 223 insertions(+), 224 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Joe McDonnell: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/14863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4385749de35f8379ecf6566fe515ed500b42d6cc
Gerrit-Change-Number: 14863
Gerrit-PatchSet: 6
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kristine Hahn 


[Impala-ASF-CR] IMPALA-9029: [DOCS] Impala 3.4 Release Notes

2020-03-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14863 )

Change subject: IMPALA-9029: [DOCS] Impala 3.4 Release Notes
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4385749de35f8379ecf6566fe515ed500b42d6cc
Gerrit-Change-Number: 14863
Gerrit-PatchSet: 5
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kristine Hahn 
Gerrit-Comment-Date: Tue, 17 Mar 2020 20:13:32 +
Gerrit-HasComments: No


[native-toolchain-CR] Add support for centos8

2020-03-17 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15465 )

Change subject: Add support for centos8
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15465/1/docker/all/postinstall.sh
File docker/all/postinstall.sh:

http://gerrit.cloudera.org:8080/#/c/15465/1/docker/all/postinstall.sh@31
PS1, Line 31: alternatives --set python /usr/bin/python2
Just a question: does this alias pip2 to pip as well?


http://gerrit.cloudera.org:8080/#/c/15465/1/docker/redhat8.df
File docker/redhat8.df:

http://gerrit.cloudera.org:8080/#/c/15465/1/docker/redhat8.df@48
PS1, Line 48: # We get a newer java-1.8.0-openjdk-devel from centos:7.4.
: # The java-1.8.0-openjdk version shipped with centos:7.2 is 
unable to handle ECDHE
: # ciphers.
nit: I guess this comment is stale now



--
To view, visit http://gerrit.cloudera.org:8080/15465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363
Gerrit-Change-Number: 15465
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Tue, 17 Mar 2020 20:13:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 19:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5512/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 19
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:48:30 +
Gerrit-HasComments: No


[native-toolchain-CR] Remove CYRUS SASL

2020-03-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15466 )

Change subject: Remove CYRUS_SASL
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f043c0c32dd26f3b4b7d7b16749ce310860d9c2
Gerrit-Change-Number: 15466
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:18:55 +
Gerrit-HasComments: No


[native-toolchain-CR] Remove centos6, ubuntu12, ubuntu14 platforms

2020-03-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15464 )

Change subject: Remove centos6, ubuntu12, ubuntu14 platforms
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403
Gerrit-Change-Number: 15464
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:18:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 19: Code-Review+2

Added a DCHECK based on feedback from csaba, fixed the long line.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 19
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:01:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 18: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5484/


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 18
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:02:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 20: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 20
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:02:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 20:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5486/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 20
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:02:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#19).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when cancelling
fragments on the backend.
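
As an analogy only (this is Python's threading.Barrier, not Impala's C++ CyclicBarrier), the sketch below shows the general idea of cancelling a barrier so that a blocked thread wakes up instead of waiting forever. run_demo and worker are hypothetical names.

```python
import threading


def run_demo():
    # Two parties are expected, but only one thread ever arrives, so
    # without cancellation the worker would block in wait() forever.
    barrier = threading.Barrier(2)
    results = []

    def worker():
        try:
            barrier.wait()
            results.append("passed")
        except threading.BrokenBarrierError:
            # Raised in waiting threads once the barrier is aborted.
            results.append("cancelled")

    t = threading.Thread(target=worker)
    t.start()
    barrier.abort()  # "cancel" the barrier, releasing any waiter
    t.join()
    return results
```

If abort() happens before the worker reaches wait(), the wait still raises BrokenBarrierError, so the outcome is the same either way.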

BufferPool now correctly handles multiple threads calling
CleanPages() concurrently, which makes other methods thread-safe.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same way as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.
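
One way to read the scan instance estimate described above, as a rough sketch. The function name and the exact formula are assumptions for illustration and are not taken from the planner code.

```python
def estimate_scan_instances(num_nodes, mt_dop, num_input_splits):
    """Cap the old per-node assumption by the number of input splits."""
    # Old assumption: every node runs mt_dop instances.
    assumed = num_nodes * mt_dop
    # New upper bound: no more instances than there are splits to scan.
    return min(assumed, num_input_splits)
```

With 3 nodes, mt_dop=4 and only 5 input splits, this gives 5 rather than 12.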

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferred until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(30) | parquet / none / none | 6.26    | -15.70%    | 4.63       | -16.16%        |
+----------+-----------------------+---------+------------+------------+----------------+

+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25       | R +7.38%   | 0.51%     | 0.22%          | 5     | R +6.95%       | 2.31    | 27.93  |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79        |   +1.31%   | 1.86%     | 0.36%          | 5     |   +1.88%       | 1.15    | 1.53   |
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28        |   -0.01%   | 1.64%     | 1.63%          | 5     |   -0.11%       | -0.58   | -0.01  |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68        |   -0.94%   | 0.84%     | 1.46%          | 5     |   -0.21%       | -0.87   | -1.25  |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72        |   -0.56%   | 1.29%     | 0.52%          | 5     |   -1.04%       | -1.15   | -0.89  |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80       |   -1.48%   | 0.61%     | 0.60%          | 5     |   -1.39%       | -1.73   | -3.91  |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32        |   -4.92%   | 0.05%     | 0.40%          | 5     |   -4.93%       | -2.31   | -27.46 |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67        | I -5.41%   | 0.81%     | 0.03%          | 5     | I -5.70%       | -2.31   | -15.75 |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14        | I -6.93%   | 3.13%     | 2.62%          | 5     | I -9.31%

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 18: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 18
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 18:49:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15462 )

Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive 
normal form
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5511/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 18:41:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol

2020-03-17 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15378/6/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/15378/6/shell/impala_client.py@933
PS6, Line 933: _do_hs2_rpc
> yeah, it would be good to understand why we still see failed rpcs then. its
I calculated the time for rpcs and in all cases it's much less than 240s. It
probably does depend on a lot of other factors, but in general it should be
much less than 240s. So in most cases where nginx reloads worker processes,
there are no connection drops. I have however seen nginx still drop connections
(and not wait for the graceful timeout of 240s) if there are multiple reloads
in a very short period of time, which is a possible scenario and something
we've seen. I have a test which reproduces this scenario.

I think, in general the assumption that there will always be a stable 
connection isn’t really true for cloud deployments.



--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 17 Mar 2020 18:30:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9029: [DOCS] Impala 3.4 Release Notes

2020-03-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15451 )

Change subject: IMPALA-9029: [DOCS] Impala 3.4 Release Notes
..


Patch Set 1:

You can abandon this change, now that it is addressed in the other review.


--
To view, visit http://gerrit.cloudera.org:8080/15451
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I31e9fcc1e1aa98c784b2c597a6df5aeb75be44c5
Gerrit-Change-Number: 15451
Gerrit-PatchSet: 1
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 17 Mar 2020 18:27:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol

2020-03-17 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
..


Patch Set 7:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@645
PS7, Line 645: max_tries
nit: document what this variable represents


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@649
PS7, Line 649: self.retry_sleep_duration_s = 2
why 2 seconds?


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@674
PS7, Line 674: max_tries = self.max_tries
nit: why is this necessary?


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@679
PS7, Line 679: execute_query
would it make sense to add a 'retry_on_error' option to execute_query and just
pass it through to '_do_hs2_rpc'? that way we don't have to implement the retry
logic twice, once here and again in _do_hs2_rpc


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@689
PS7, Line 689: {1}
nit: 'type={1}'


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@693
PS7, Line 693:   if set_all_handle is not None:
 : self.close_query(set_all_handle)
should this be in a finally block?


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@934
PS7, Line 934: """Executes the provided 'rpc' callable and tranlates any 
exceptions in the
nit: document new option, would recommend including some docs explaining how 
self.max_tries and retry_on_error interact


http://gerrit.cloudera.org:8080/#/c/15378/7/shell/impala_client.py@960
PS7, Line 960: Num remaining tries: {3}
i think it is potentially confusing to print this for all rpcs, especially for
those that can't be retried. from a client perspective, it makes it sound like
the rpc can be retried and the retries were exhausted, when in reality the rpc
was not retried at all


http://gerrit.cloudera.org:8080/#/c/15378/7/tests/custom_cluster/test_hs2_fault_injection.py
File tests/custom_cluster/test_hs2_fault_injection.py:

http://gerrit.cloudera.org:8080/#/c/15378/7/tests/custom_cluster/test_hs2_fault_injection.py@123
PS7, Line 123:   @pytest.mark.execute_serially
why do these all need to be executed serially?



--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 7
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 17 Mar 2020 18:12:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: Add CentOS 8.1 support to bootstrap_system.sh

2020-03-17 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15461 )

Change subject: WIP: Add CentOS 8.1 support to bootstrap_system.sh
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15461/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15461/1//COMMIT_MSG@27
PS1, Line 27: - TOOLCHAIN_ID is bumped to a build that already has CentOS 8 
binaries.
Add a remark about updating the docker-based tests as well


http://gerrit.cloudera.org:8080/#/c/15461/1//COMMIT_MSG@28
PS1, Line 28:
Add description of chronyd vs ntpd for Centos8


http://gerrit.cloudera.org:8080/#/c/15461/1/bin/bootstrap_system.sh
File bin/bootstrap_system.sh:

http://gerrit.cloudera.org:8080/#/c/15461/1/bin/bootstrap_system.sh@262
PS1, Line 262:
fix indentation


http://gerrit.cloudera.org:8080/#/c/15461/1/bin/bootstrap_system.sh@282
PS1, Line 282: redhat8 inycloud sudo alternatives --install /usr/bin/python 
python /usr/bin/python2 90 --slave /usr/bin/pip pip /usr/bin/pip2
line too long



--
To view, visit http://gerrit.cloudera.org:8080/15461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67a58ec007219020e1fb562216d7a0d1ff38b0bd
Gerrit-Change-Number: 15461
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Tue, 17 Mar 2020 18:02:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9042: Milestone 1: properly scan files that has full ACID schema

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15395 )

Change subject: IMPALA-9042: Milestone 1: properly scan files that has full 
ACID schema
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5510/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2e2afec00c9a5cf87f1d61b5fe52b0085844bcb
Gerrit-Change-Number: 15395
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 17 Mar 2020 18:01:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

2020-03-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15462 )

Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive 
normal form
..


Patch Set 3:

> Patch Set 2:
>
> (4 comments)

Fixed


--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 17 Mar 2020 18:00:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

2020-03-17 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15462 )

Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive 
normal form
..


Patch Set 3:

> Patch Set 1:
>
> (11 comments)

Fixed


--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 17 Mar 2020 17:59:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

2020-03-17 Thread Aman Sinha (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15462

to look at the new patch set (#3).

Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive 
normal form
..

IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

Added an expression rewrite rule to convert a disjunctive predicate to
conjunctive normal form (CNF). Converting to CNF enables multi-table
predicates that were only evaluated by a Join operator to be converted
into either single-table conjuncts that are eligible for predicate pushdown
to the scan operator or other multi-table conjuncts that are eligible to
be pushed to a Join below. This helps improve performance for such queries.

Since converting to CNF expands the number of expressions, we place a
limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr)
that are considered. Once the MAX_CNF_EXPRS limit (default is 100) is
exceeded, whatever expression was supplied to the rule is returned without
further transformation. A setting of -1 or 0 allows an unlimited number of
CNF exprs to be created up to int32 max. Another option ENABLE_CNF_REWRITES
enables or disables the entire rewrite. This is False by default until we
have done more thorough functional testing.

Examples of rewrites:
 original: (a AND b) OR c
 rewritten: (a OR c) AND (b OR c)

 original: (a AND b) OR (c AND d)
 rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d)

 original: NOT(a OR b)
 rewritten: NOT(a) AND NOT(b)
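
A minimal sketch of the distribution step behind the rewrites above, written
in Python for brevity. It is an illustration only, not the code in
ConvertToCNFRule.java: the Var/Not/And/Or classes and the to_cnf, distribute,
count_ands, rewrite and show helpers are hypothetical names, only the NOT(OR)
case listed above is handled, and the limit is checked after the conversion
here, which may differ from how the real rule enforces MAX_CNF_EXPRS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Not:
    child: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


def distribute(l, r):
    """Distribute OR over AND for two operands that are already in CNF."""
    if isinstance(l, And):
        return And(distribute(l.left, r), distribute(l.right, r))
    if isinstance(r, And):
        return And(distribute(l, r.left), distribute(l, r.right))
    return Or(l, r)


def to_cnf(e):
    if isinstance(e, Not) and isinstance(e.child, Or):
        # NOT(a OR b) -> NOT(a) AND NOT(b)
        return And(to_cnf(Not(e.child.left)), to_cnf(Not(e.child.right)))
    if isinstance(e, And):
        return And(to_cnf(e.left), to_cnf(e.right))
    if isinstance(e, Or):
        return distribute(to_cnf(e.left), to_cnf(e.right))
    return e  # variables and any other NOT are treated as opaque literals


def count_ands(e):
    """Each AND counts as one CNF expr, as described in the commit message."""
    if isinstance(e, (And, Or)):
        return int(isinstance(e, And)) + count_ands(e.left) + count_ands(e.right)
    if isinstance(e, Not):
        return count_ands(e.child)
    return 0


def rewrite(e, max_cnf_exprs=100):
    """Return the CNF form, or the input unchanged if the limit is exceeded.

    Assumption: a limit of 0 or -1 means unlimited, mirroring the option text.
    """
    result = to_cnf(e)
    if max_cnf_exprs > 0 and count_ands(result) > max_cnf_exprs:
        return e
    return result


def flatten(e, kind):
    if isinstance(e, kind):
        return flatten(e.left, kind) + flatten(e.right, kind)
    return [e]


def show(e):
    """Render a CNF expression as a flat string for inspection."""
    if isinstance(e, Var):
        return e.name
    if isinstance(e, Not):
        return "NOT(" + show(e.child) + ")"
    if isinstance(e, Or):
        return "(" + " OR ".join(show(x) for x in flatten(e, Or)) + ")"
    return " AND ".join(show(x) for x in flatten(e, And))


a, b, c, d = Var("a"), Var("b"), Var("c"), Var("d")
print(show(rewrite(Or(And(a, b), c))))
print(show(rewrite(Or(And(a, b), And(c, d)))))
print(show(rewrite(Not(Or(a, b)))))
```

With a small limit such as rewrite(expr, max_cnf_exprs=2), the second example
is returned unchanged, because its CNF form contains three ANDs.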

Testing:
 - Added new unit tests with variations of disjunctive predicates
   and verified their Explain plans
 - Manually tested the result correctness on impala shell by running
   these queries with ENABLE_CNF_REWRITES enabled and disabled
 - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor
   shows almost 5x improvement:
  Original baseline: 47.5 sec
  With this patch and CNF rewrite enabled: 9.4 sec

Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
---
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
10 files changed, 532 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15462/3
--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol

2020-03-17 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15378/6/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/15378/6/shell/impala_client.py@933
PS6, Line 933: _do_hs2_rpc
> I was able to test connection drops by killing worker processes in nginx.
yeah, it would be good to understand why we still see failed rpcs then. it's
possible some RPCs take longer than 240 seconds under high load, although I'm
not sure.

given my current understanding of the nginx + istio issues, i'm still not sure 
why the retries are necessary. if RPCs finish under 240 seconds, then no 
connection should be dropped. looks like test_connection_drop confirms that the 
connection is reset between rpcs, which is good to know and validate.

regardless, retrying idempotent rpcs is probably good to do in general, so I 
think this patch makes sense. dealing with non-idempotent rpcs will be 
trickier, so if we can find a way to avoid doing that (e.g. is there something 
else in istio / nginx causing the connection drops?), that would be nice.
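
For readers following the thread, a minimal sketch of retrying only idempotent
rpcs might look like the following. It is not the code in impala_client.py:
call_with_retry and FlakyRpc are hypothetical names, the retry_on_error flag
and the 2 second sleep are borrowed from the review comments, and catching
Python's built-in ConnectionError is an assumption about how a dropped
connection would surface.

```python
import time


def call_with_retry(rpc, retry_on_error=False, max_tries=3,
                    retry_sleep_duration_s=2, sleep=time.sleep):
    """Call 'rpc' once, or up to 'max_tries' times if it is safe to retry.

    'retry_on_error' should only be set by callers that know the rpc is
    idempotent; non-idempotent rpcs get exactly one attempt.
    """
    tries = max(1, max_tries) if retry_on_error else 1
    for attempt in range(1, tries + 1):
        try:
            return rpc()
        except ConnectionError:
            if attempt == tries:
                raise  # out of attempts, surface the original error
            sleep(retry_sleep_duration_s)


class FlakyRpc:
    """Test double that drops the connection a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionResetError("connection dropped")
        return "ok"


flaky = FlakyRpc(failures=2)
print(call_with_retry(flaky, retry_on_error=True, sleep=lambda s: None))
```

Keeping the flag at the call site means the decision about idempotency stays
with the caller, and the retry loop exists in one place only.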



--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Tue, 17 Mar 2020 17:50:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..


Patch Set 20:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5509/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 20
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 17:38:51 +
Gerrit-HasComments: No


[native-toolchain-CR] WIP: Add platform tag for Ubuntu 14.04

2020-03-17 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15469 )

Change subject: WIP: Add platform tag for Ubuntu 14.04
..


Patch Set 1:

I must be missing something here: I was convinced that the earlier change, 
https://gerrit.cloudera.org/c/15464/1 removed the Ubuntu-14.04 toolchain build 
(so artifacts wouldn't be produced either). How come the build is still
producing artifacts then? Wouldn't the right fix be to make sure that
Ubuntu-14.04 bits are just not built?


--
To view, visit http://gerrit.cloudera.org:8080/15469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32dac434ddf12ec3a2367f6a0974b3504bd137c5
Gerrit-Change-Number: 15469
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Tue, 17 Mar 2020 17:37:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8361: Propagate predicates of outer-joined InlineView

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15047 )

Change subject: IMPALA-8361: Propagate predicates of outer-joined InlineView
..


Patch Set 12: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c23a45aeb5dd1aa06a95c9aa8628ecbe37ef2c1
Gerrit-Change-Number: 15047
Gerrit-PatchSet: 12
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 17 Mar 2020 17:34:20 +
Gerrit-HasComments: No


[native-toolchain-CR] Remove CYRUS_SASL

2020-03-17 Thread Hector Acosta (Code Review)
Hector Acosta has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15466


Change subject: Remove CYRUS_SASL
..

Remove CYRUS_SASL

CYRUS_SASL requires libdb4 which is not available in rhel8.

Change-Id: I7f043c0c32dd26f3b4b7d7b16749ce310860d9c2
---
M buildall.sh
1 file changed, 0 insertions(+), 9 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/66/15466/1
--
To view, visit http://gerrit.cloudera.org:8080/15466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7f043c0c32dd26f3b4b7d7b16749ce310860d9c2
Gerrit-Change-Number: 15466
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 


[native-toolchain-CR] Remove centos6, ubuntu12, ubuntu14 platforms

2020-03-17 Thread Hector Acosta (Code Review)
Hector Acosta has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15464


Change subject: Remove centos6, ubuntu12, ubuntu14 platforms
..

Remove centos6, ubuntu12, ubuntu14 platforms

This commit removes unused platform redhat6, and EOL platforms ubuntu12 and 
ubuntu14.

Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403
---
D docker/redhat6.df
D docker/ubuntu1404.df
M in-docker.py
3 files changed, 0 insertions(+), 108 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/64/15464/1
--
To view, visit http://gerrit.cloudera.org:8080/15464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403
Gerrit-Change-Number: 15464
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 


[native-toolchain-CR] Fix bison compilation with glibc 2.28

2020-03-17 Thread Hector Acosta (Code Review)
Hector Acosta has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15467


Change subject: Fix bison compilation with glibc 2.28
..

Fix bison compilation with glibc 2.28

Change-Id: Ie07da9fcebde4ae5003885f442d8856537f96f3a
---
M .gitignore
M buildall.sh
A source/bison/bison-3.0.4-patches/110-glibc-change-work-around.patch
3 files changed, 35 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/67/15467/1
--
To view, visit http://gerrit.cloudera.org:8080/15467
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie07da9fcebde4ae5003885f442d8856537f96f3a
Gerrit-Change-Number: 15467
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 


[native-toolchain-CR] WIP: Add platform tag for Ubuntu 14.04

2020-03-17 Thread Hector Acosta (Code Review)
Hector Acosta has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15469


Change subject: WIP: Add platform tag for Ubuntu 14.04
..

WIP: Add platform tag for Ubuntu 14.04

Without this the packages for Ubuntu 14.04 are uploaded with tarball
names ending in 'generic', which makes bootstrap_toolchain.py unable
to find them later, during an Impala build.

Change-Id: I32dac434ddf12ec3a2367f6a0974b3504bd137c5
---
M in-docker.py
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/69/15469/1
--
To view, visit http://gerrit.cloudera.org:8080/15469
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I32dac434ddf12ec3a2367f6a0974b3504bd137c5
Gerrit-Change-Number: 15469
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Laszlo Gaal 


[native-toolchain-CR] Add support for centos8

2020-03-17 Thread Hector Acosta (Code Review)
Hector Acosta has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15465


Change subject: Add support for centos8
..

Add support for centos8

This commit adds a new docker image.

Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363
---
M docker/all/assert-dependencies-present.py
M docker/all/postinstall.sh
A docker/redhat8.df
M in-docker.py
4 files changed, 72 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/65/15465/1
--
To view, visit http://gerrit.cloudera.org:8080/15465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363
Gerrit-Change-Number: 15465
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 


[native-toolchain-CR] Compile thrift using the python in our toolchain.

2020-03-17 Thread Hector Acosta (Code Review)
Hector Acosta has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15468


Change subject: Compile thrift using the python in our toolchain.
..

Compile thrift using the python in our toolchain.

Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60
---
M buildall.sh
M source/python/build.sh
M source/thrift/build.sh
3 files changed, 8 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/68/15468/1
--
To view, visit http://gerrit.cloudera.org:8080/15468
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60
Gerrit-Change-Number: 15468
Gerrit-PatchSet: 1
Gerrit-Owner: Hector Acosta 


[Impala-ASF-CR] IMPALA-9042: Milestone 1: properly scan files that has full ACID schema

2020-03-17 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15395 )

Change subject: IMPALA-9042: Milestone 1: properly scan files that has full 
ACID schema
..


Patch Set 3:

(7 comments)

Thanks for reviewing it. Most of the changes are related to tests.

For new reviewers: I think it's worth starting with PS2 because it contains the
essence of this CR.

http://gerrit.cloudera.org:8080/#/c/15395/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15395/2//COMMIT_MSG@63
PS2, Line 63: TODO:
> Also need tests on column masking since we also have some hacks in path res
Updated test_ranger.py


http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/hdfs-orc-scanner.cc@185
PS2, Line 185:   if (scan_node_->hdfs_table()->IsFullAcid() && 
!schema_resolver_->IsFullAcid()) {
> I think this is too strict. The test on file schema can be false positive.
Done


http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/orc-metadata-utils.cc
File be/src/exec/orc-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/orc-metadata-utils.cc@101
PS2, Line 101: DCHECK(table_idx >= num_part_cols);
> I think this can be a DCHECK since partition columns should be skipped by h
Done


http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/orc-metadata-utils.cc@106
PS2, Line 106:
> Could you add a DCHECK that the resulted index won't overflow?
Done


http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/orc-metadata-utils.cc@234
PS2, Line 234: root_->getSubtype(0)->getKind() != orc::TypeKind::INT ||
 :   root_->getSubtype(1)->getKind() !
> I think we should also check other fields and their types.
Done


http://gerrit.cloudera.org:8080/#/c/15395/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/15395/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@847
PS2, Line 847: HIVEFULLACIDWRITE
> We don't have write support yet. Is this required somewhere?
We need it to create tables.


http://gerrit.cloudera.org:8080/#/c/15395/2/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/15395/2/tests/query_test/test_scanners.py@1297
PS2, Line 1297: self.client.execute("create table %s.%s like tpch.lineitem 
stored as orc" % (db, tbl))
> Do we need this? Doesn't the table being translated to EXTERNAL table?
Removed it.



--
To view, visit http://gerrit.cloudera.org:8080/15395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2e2afec00c9a5cf87f1d61b5fe52b0085844bcb
Gerrit-Change-Number: 15395
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 17 Mar 2020 17:18:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9042: Milestone 1: properly scan files that has full ACID schema

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15395 )

Change subject: IMPALA-9042: Milestone 1: properly scan files that has full 
ACID schema
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15395/3/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

http://gerrit.cloudera.org:8080/#/c/15395/3/testdata/bin/generate-schema-statements.py@319
PS3, Line 319: '
flake8: E129 visually indented line with same indent as next logical line


http://gerrit.cloudera.org:8080/#/c/15395/3/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

http://gerrit.cloudera.org:8080/#/c/15395/3/tests/query_test/test_scanners_fuzz.py@197
PS3, Line 197: .
flake8: E501 line too long (91 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/15395/3/tests/query_test/test_scanners_fuzz.py@282
PS3, Line 282:
flake8: E261 at least two spaces before inline comment


http://gerrit.cloudera.org:8080/#/c/15395/3/tests/query_test/test_scanners_fuzz.py@300
PS3, Line 300: n
flake8: E129 visually indented line with same indent as next logical line



--
To view, visit http://gerrit.cloudera.org:8080/15395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2e2afec00c9a5cf87f1d61b5fe52b0085844bcb
Gerrit-Change-Number: 15395
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 17 Mar 2020 17:15:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9042: Milestone 1: properly scan files that has full ACID schema

2020-03-17 Thread Zoltan Borok-Nagy (Code Review)
Hello Quanlong Huang, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15395

to look at the new patch set (#3).

Change subject: IMPALA-9042: Milestone 1: properly scan files that has full 
ACID schema
..

IMPALA-9042: Milestone 1: properly scan files that has full ACID schema

Full ACID row format looks like this:

{
  "operation": 0,
  "originalTransaction": 1,
  "bucket": 536870912,
  "rowId": 0,
  "currentTransaction": 1,
  "row": {"i": 1}
}

User columns are nested under "row". In the frontend we need to create
slot descriptors that correspond to the file schema. In the catalog we
could mimic the file schema but that would introduce several
complexities and corner cases in column resolution. Also in query
results the heading of the above user column would be "row.i". Star
expansion should also be modified, etc.

Because of that in the Catalog I create the exact opposite of the above
schema:

{
  "row__id":
  {
    "operation": 0,
    "originalTransaction": 1,
    "bucket": 536870912,
    "rowId": 0,
    "currentTransaction": 1
  },
  "i": 1
}

This way very little modification is needed in the frontend. And the
hidden columns can be easily retrieved via 'SELECT row__id.*' when we
need those for debugging/testing.
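
As a rough illustration of the inversion described above, the following Python sketch rearranges one row from the file layout into the catalog layout. The names `to_catalog_layout` and `ACID_META_FIELDS` are hypothetical and exist only for this example. The patch describes a schema-level change; a single row is used here only because it is easier to show.

```python
# Hypothetical helper, for illustration only: not part of the Impala code base.
ACID_META_FIELDS = ("operation", "originalTransaction", "bucket", "rowId",
                    "currentTransaction")


def to_catalog_layout(file_row):
    """Move the ACID metadata under "row__id" and lift the user columns
    out of "row" to the top level."""
    catalog_row = {"row__id": {name: file_row[name] for name in ACID_META_FIELDS}}
    catalog_row.update(file_row["row"])
    return catalog_row


file_row = {
    "operation": 0,
    "originalTransaction": 1,
    "bucket": 536870912,
    "rowId": 0,
    "currentTransaction": 1,
    "row": {"i": 1},
}
catalog_row = to_catalog_layout(file_row)
```

With this layout the user column is addressed as plain `i`, and the hidden fields are reachable through `row__id`.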

We only need to change Path.getAbsolutePath() to return a schema path
that corresponds to the file schema. Also in the backend we need some
extra juggling in OrcSchemaResolver::ResolveColumn() to retrieve the
table schema path from the file schema path.
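
The two-way path translation mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions: it uses column names for readability (the message does not say how schema paths are represented), and `table_path_to_file_path` and `file_path_to_table_path` are hypothetical names, not the actual Path.getAbsolutePath() or OrcSchemaResolver::ResolveColumn() logic.

```python
# Hypothetical names throughout; for illustration only.
ACID_ROW_FIELD = "row"      # user columns are nested under this field in the file
HIDDEN_STRUCT = "row__id"   # hidden ACID columns are nested under this in the catalog


def table_path_to_file_path(table_path):
    """Translate a path in the catalog schema to the matching file-schema path."""
    if table_path and table_path[0] == HIDDEN_STRUCT:
        # Hidden columns sit at the top level of the file schema.
        return list(table_path[1:])
    return [ACID_ROW_FIELD] + list(table_path)


def file_path_to_table_path(file_path):
    """Inverse translation: file-schema path back to the catalog-schema path."""
    if file_path and file_path[0] == ACID_ROW_FIELD:
        return list(file_path[1:])
    return [HIDDEN_STRUCT] + list(file_path)
```

For example, the user column `i` maps to `row.i` in the file, while `row__id.operation` maps to the top-level `operation` field.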

Testing:
I changed data loading to load ORC files in full ACID format by default.
With this change we should be able to scan full ACID tables that are
not minor-compacted, don't have deleted rows, and don't have original
files.

Newly added Tests:
 * specific queries about hidden columns (full-acid-rowid.test)
 * SHOW CREATE TABLE (show-create-table-full-acid.test)
 * INSERT should be forbidden (acid-negative.test)
 * added tests for column masking (
   ranger_column_masking_complex_types.test)

TODO:
 * Currently ALTER TABLE is enabled => assess consequences
 * make all tests green in exhaustive

Change-Id: Ic2e2afec00c9a5cf87f1d61b5fe52b0085844bcb
---
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/orc-metadata-utils.cc
M be/src/exec/orc-metadata-utils.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M common/thrift/CatalogObjects.thrift
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
M testdata/bin/generate-schema-statements.py
M testdata/datasets/README
M testdata/datasets/functional/functional_schema_template.sql
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-query/queries/DataErrorsTest/orc-type-checks.test
M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test
M 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-file-orc.test
A testdata/workloads/functional-query/queries/QueryTest/full-acid-rowid.test
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
A 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test
A 
testdata/workloads/functional-query/queries/QueryTest/show-create-table-full-acid.test
M tests/authorization/test_ranger.py
M tests/metadata/test_show_create_table.py
M tests/query_test/test_acid.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_scanners.py
M tests/query_test/test_scanners_fuzz.py
43 files changed, 1,184 insertions(+), 517 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/15395/3
--
To view, 

[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15462 )

Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive 
normal form
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5508/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 17 Mar 2020 17:11:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9467: [DOCS] live progress enabled by default in interactive mode

2020-03-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15442 )

Change subject: IMPALA-9467: [DOCS] live_progress enabled by default in 
interactive mode
..

IMPALA-9467: [DOCS] live_progress enabled by default in interactive mode

The following documents were impacted by the change:
- impala_live_progress.xml, revised to explain new behavior
- impala_shell_options.xml, added --disable_live_progress option

Change-Id: I94e624b7bb916ecb5aeb4f007c0610807f7b18cf
Reviewed-on: http://gerrit.cloudera.org:8080/15442
Tested-by: Impala Public Jenkins 
Reviewed-by: Alice Fan 
Reviewed-by: Joe McDonnell 
---
M docs/shared/impala_common.xml
M docs/topics/impala_live_progress.xml
M docs/topics/impala_shell_options.xml
3 files changed, 33 insertions(+), 32 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Alice Fan: Looks good to me, but someone else must approve
  Joe McDonnell: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/15442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I94e624b7bb916ecb5aeb4f007c0610807f7b18cf
Gerrit-Change-Number: 15442
Gerrit-PatchSet: 3
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Alice Fan 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9467: [DOCS] live progress enabled by default in interactive mode

2020-03-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15442 )

Change subject: IMPALA-9467: [DOCS] live_progress enabled by default in 
interactive mode
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15442
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94e624b7bb916ecb5aeb4f007c0610807f7b18cf
Gerrit-Change-Number: 15442
Gerrit-PatchSet: 2
Gerrit-Owner: Kristine Hahn 
Gerrit-Reviewer: Alice Fan 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:57:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..


Patch Set 20:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5485/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/15132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 20
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:54:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..


Patch Set 20:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15132/20/bin/impala-shell.sh
File bin/impala-shell.sh:

http://gerrit.cloudera.org:8080/#/c/15132/20/bin/impala-shell.sh@55
PS20, Line 55: PYTHONPATH=${PYTHONPATH} exec "${IMPALA_PYTHON_EXECUTABLE}" 
${SHELL_HOME}/impala_shell.py "$@"
line too long (94 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/15132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 20
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:55:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-17 Thread David Knupp (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15132

to look at the new patch set (#20).

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..

IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

This patch makes the impala-shell code cross-compatible with python 2 and
python 3. The goal is to wind up with a version of the shell that will pass
python e2e tests irrespective of the version of python used to launch the shell,
under the assumption that the test framework itself will continue to run with
python 2.7.x.

There are a few isolated tests that weren't able to pass under both versions,
and the reasons have been documented in comments in the tests themselves.

Notable changes for reviewers to consider:

- With regard to validating the patch, my assumption is that simply passing
  the existing set of e2e shell tests is sufficient to confirm that the shell
  is functioning properly. No new tests were added.

- Many of the simpler changes derive from the fact that a few built-in functions
  and/or types have either been removed or have otherwise changed in python 3.x.
  E.g., xrange and basestring are both gone, dict.iteritems() has been removed,
  dict.items() behaves differently, the unicode() function and the method
  str.decode() have both been removed, etc.

  Also, catching exceptions using "Exception, e" no longer works, and (as most
  know), using print() as a function is required now.
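
A few of the idioms listed above can be written so that the same source runs on both interpreters. The sketch below is illustrative only; `string_types`, `range_fn`, `describe` and `safe_int` are hypothetical names and are not taken from the impala-shell code.

```python
from __future__ import print_function  # makes print() a function on python 2

import sys

# xrange and basestring do not exist in python 3, so alias them once.
if sys.version_info[0] >= 3:
    string_types = (str,)
    range_fn = range
else:
    string_types = (basestring,)  # noqa: F821 (python 2 only)
    range_fn = xrange  # noqa: F821 (python 2 only)


def describe(options):
    """Format a dict of options as sorted "key=value" strings."""
    lines = []
    # dict.items() exists on both versions; dict.iteritems() is python 2 only.
    for key, value in sorted(options.items()):
        if isinstance(value, string_types):
            value = value.strip()
        lines.append("{0}={1}".format(key, value))
    return lines


def safe_int(text):
    try:
        return int(text)
    except ValueError as e:  # "except ValueError, e" is a syntax error on python 3
        print("not an integer:", e)
        return None
```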

- A new pytest command line option was added in conftest.py to enable a user
  to specify a path to an alternate impala-shell executable to test. It's
  possible to use this to point to an instance of the impala-shell that was
  installed as a standalone python package in a separate virtualenv.

  Example usage:
  USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=//bin/impala-shell -sv shell/test_shell_commandline.py

  The target virtualenv may be based on either python3 or python2. However,
  this has no effect on the version of python used to run the test framework,
  which remains tied to python 2.7.x for the foreseeable future.

- The $IMPALA_HOME/bin/impala-shell.sh now sets up the impala-shell python
  environment independently from bin/set-pythonpath.sh. (See IMPALA-9489)

- thrift_sasl.py was updated to match the current public alpha, 0.4a1

- The wording of the header changed a bit to include the python version
  used to run the shell.

Starting Impala Shell with no authentication using Python 3.7.5
Opened TCP connection to localhost:21000
...

OR

Starting Impala Shell with LDAP-based authentication using Python 2.7.12
Opened TCP connection to localhost:21000
...

- By far, the biggest hassle has been juggling str versus unicode versus
  bytes data types. Python 2.x was fairly loose and inconsistent in
  how it dealt with strings. As a quick demo of what I mean:

  Python 2.7.12 (default, Nov 12 2018, 14:36:49)
  [GCC 5.4.0 20160609] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> d = 'like a duck'
  >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8')
  True

  ...and yet there are weird unexpected gotchas.

  >>> d.decode('utf-8') == d.encode('utf-8')
  True
  >>> d.encode('utf-8') == bytearray(d, 'utf-8')
  True
  >>> d.decode('utf-8') == bytearray(d, 'utf-8')   # fails the eq property?
  False

  As a result of this, the way we handled strings in the impala-shell code had
  become equally loose and inconsistent -- mainly in the form of frequent and
  liberal use of str.encode() and str.decode() -- but things still just worked.

  In python3, there's a much clearer distinction between strings and bytes, and
  as such, much tighter type consistency is expected by standard libs like
  subprocess, re, sqlparse, prettytable, etc., which are used throughout the
  shell. Even simple calls that worked in python 2.x:

  >>> import re
  >>> re.findall('foo', b'foobar')
  ['foo']

  ...can throw exceptions in python 3.x:

  >>> import re
  >>> re.findall('foo', b'foobar')
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall
      return _compile(pattern, flags).findall(string)
  TypeError: cannot use a string pattern on a bytes-like object

  Exceptions like this resulted in many, if not most, shell tests failing
  under python 3.

  At first, I tried to go one-by-one to the site of each failure, and correct
  by checking instance type and re-encoding as necessary, but this only led to
  even more str.encode() calls littering the code, which just seemed like a
  code-smell. (Wiki "code smell" if you don't know the term.)
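
The per-site fix described above might look roughly like the following sketch. The helper names `to_text` and `find_all` are hypothetical, and the shell's actual call sites are not shown in this message. The point is only that every such site needs its own type check and decode, which is the clutter being objected to.

```python
import re


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding only if it arrived as bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def find_all(pattern, data):
    # Both arguments are normalised to text first, so the call behaves the
    # same whether data arrived as str or as bytes.
    return re.findall(to_text(pattern), to_text(data))
```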

  What ultimately seemed like a better approach was to try to weed out as many
  existing spurious 

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 17:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5507/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:46:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15462 )

Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive 
normal form
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15462/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
File fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java:

http://gerrit.cloudera.org:8080/#/c/15462/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@122
PS2, Line 122: (CompoundPredicate) 
CompoundPredicate.createDisjunctivePredicate(disjuncts);
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@127
PS2, Line 127: (CompoundPredicate) 
CompoundPredicate.createDisjunctivePredicate(disjuncts);
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@140
PS2, Line 140: (CompoundPredicate) 
CompoundPredicate.createDisjunctivePredicate(disjuncts);
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@145
PS2, Line 145: (CompoundPredicate) 
CompoundPredicate.createDisjunctivePredicate(disjuncts);
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:26:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

2020-03-17 Thread Aman Sinha (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15462

to look at the new patch set (#2).

Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive 
normal form
..

IMPALA-9183: Convert disjunctive predicates to conjunctive normal form

Added an expression rewrite rule to convert a disjunctive predicate to
conjunctive normal form (CNF). Converting to CNF enables multi-table
predicates that were only evaluated by a Join operator to be converted
into either single-table conjuncts that are eligible for predicate pushdown
to the scan operator or other multi-table conjuncts that are eligible to
be pushed to a Join below. This helps improve performance for such queries.

Since converting to CNF expands the number of expressions, we place a
limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr)
that are considered. Once the MAX_CNF_EXPRS limit (default is 100) is
exceeded, whatever expression was supplied to the rule is returned without
further transformation. A setting of -1 or 0 allows an unlimited number of
CNF exprs to be created, up to int32 max. Another option, ENABLE_CNF_REWRITES,
enables or disables the entire rewrite. This is False by default until we
have done more thorough functional testing.
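
To see why a limit is useful, consider how quickly distribution of OR over AND grows. The sketch below counts the OR-clauses produced when a disjunction of conjunctions is fully distributed, assuming no simplification or de-duplication. `cnf_clause_count` is a hypothetical helper, and it says nothing about how the rule itself counts expressions against MAX_CNF_EXPRS.

```python
from functools import reduce
from operator import mul


def cnf_clause_count(conjunct_counts):
    """Number of OR-clauses after distributing OR over AND.

    conjunct_counts[i] is the number of AND-ed terms in the i-th disjunct.
    Each clause picks one term from every disjunct, so the counts multiply.
    """
    return reduce(mul, conjunct_counts, 1)


small = cnf_clause_count([2, 2])       # (a AND b) OR (c AND d)
large = cnf_clause_count([2] * 10)     # ten disjuncts of two terms each
```

Here `small` is 4 and `large` is 1024, so even modest predicates can produce a large number of clauses.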

Examples of rewrites:
 original: (a AND b) OR c
 rewritten: (a OR c) AND (b OR c)

 original: (a AND b) OR (c AND d)
 rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d)

 original: NOT(a OR b)
 rewritten: NOT(a) AND NOT(b)
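
The three rewrites above follow from De Morgan's laws and from distributing OR over AND. The following is a minimal textbook sketch of that transformation on nested tuples. It is not the ConvertToCNFRule implementation, it ignores the MAX_CNF_EXPRS limit, and all function names are hypothetical.

```python
# Expressions are nested tuples: ("and", l, r), ("or", l, r), ("not", e).
# Anything else (here: a string) is treated as an opaque predicate.

def is_op(expr, op):
    return isinstance(expr, tuple) and expr[0] == op


def push_not_inward(expr, negate=False):
    """Apply De Morgan's laws so that NOT only wraps leaf predicates."""
    if is_op(expr, "not"):
        return push_not_inward(expr[1], not negate)
    if is_op(expr, "and") or is_op(expr, "or"):
        op = expr[0]
        if negate:
            op = "or" if op == "and" else "and"
        return (op, push_not_inward(expr[1], negate),
                push_not_inward(expr[2], negate))
    return ("not", expr) if negate else expr


def or_over_and(left, right):
    """Distribute (left OR right) over any AND found on either side."""
    if is_op(left, "and"):
        return ("and", or_over_and(left[1], right), or_over_and(left[2], right))
    if is_op(right, "and"):
        return ("and", or_over_and(left, right[1]), or_over_and(left, right[2]))
    return ("or", left, right)


def _distribute(expr):
    if is_op(expr, "and"):
        return ("and", _distribute(expr[1]), _distribute(expr[2]))
    if is_op(expr, "or"):
        return or_over_and(_distribute(expr[1]), _distribute(expr[2]))
    return expr


def to_cnf(expr):
    return _distribute(push_not_inward(expr))
```

Calling `to_cnf(("or", ("and", "a", "b"), "c"))` yields `("and", ("or", "a", "c"), ("or", "b", "c"))`, which corresponds to the first example.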

Testing:
 - Added new unit tests with variations of disjunctive predicates
   and verified their Explain plans
 - Manually tested the result correctness on impala shell by running
   these queries with ENABLE_CNF_REWRITES enabled and disabled
 - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor
   shows about a 5x improvement:
  Original baseline: 47.5 sec
  With this patch and CNF rewrite enabled: 9.4 sec

Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
---
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
10 files changed, 532 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15462/2
--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 2
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9183: Convert certain disjunctive predicates to conjunctive normal form

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15462 )

Change subject: IMPALA-9183: Convert certain disjunctive predicates to 
conjunctive normal form
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5506/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:23:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 18:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5484/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 18
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:02:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15096/17/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/17/be/src/exec/partitioned-hash-join-builder.cc@188
PS17, Line 188:   // the AddBarrierToCancel() mechanism ensures that 
cancellation happens after the overall
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:02:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 18: Code-Review+1

I'm going to carry as a +1. Just checking with Csaba if he's going to look 
again.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 18
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:01:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#17).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.
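
The "one thread does the work while the others wait" pattern can be illustrated with Python's standard `threading.Barrier`, whose `action` callback runs in exactly one of the waiting threads before any of them is released. This is only an analogy for the behaviour described above; it is not Impala's C++ CyclicBarrier, and `run_rebuild_demo` and its variables are hypothetical.

```python
import threading


def run_rebuild_demo(num_probe_threads=4):
    shared_build = {}
    rebuild_calls = []

    def rebuild_partition():
        # Runs in exactly one waiting thread, before the others are released.
        rebuild_calls.append(threading.current_thread().name)
        shared_build["partition-0"] = "rebuilt"

    barrier = threading.Barrier(num_probe_threads, action=rebuild_partition)
    observed = []

    def probe():
        barrier.wait()  # blocks until every probe thread has arrived
        observed.append(shared_build["partition-0"])

    threads = [threading.Thread(target=probe) for _ in range(num_probe_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(rebuild_calls), observed
```

Every probe thread sees the rebuilt partition, yet the rebuild itself runs once.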

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when cancelling
fragments on the backend.
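
Cancelling a barrier to unblock its waiters can likewise be shown with Python's `threading.Barrier.abort()`, which wakes every waiting thread with `BrokenBarrierError`. Again, this is an analogy using the Python standard library rather than the Impala implementation, and `run_cancel_demo` is a hypothetical name.

```python
import threading
import time


def run_cancel_demo():
    barrier = threading.Barrier(3)  # three parties expected, only two arrive
    outcomes = []

    def probe(name):
        try:
            barrier.wait()
            outcomes.append((name, "released"))
        except threading.BrokenBarrierError:
            outcomes.append((name, "cancelled"))

    threads = [threading.Thread(target=probe, args=("probe-%d" % i,))
               for i in range(2)]
    for t in threads:
        t.start()
    while barrier.n_waiting < 2:  # let both threads block on the barrier
        time.sleep(0.01)
    barrier.abort()  # wakes every waiter with BrokenBarrierError
    for t in threads:
        t.join()
    return sorted(outcomes)
```

Without the `abort()` call both threads would block forever, since the third party never arrives.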

BufferPool now correctly handles multiple threads calling
CleanPages() concurrently, which makes other methods thread-safe.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same way as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.
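
One possible reading of the scan instance estimate is sketched below: the old per-node assumption is capped by the number of input splits. The function name, its parameters, and the treatment of mt_dop=0 as one instance per node are assumptions made for this example; the planner's actual formula is not given in this message.

```python
def estimate_scan_instances(num_nodes, mt_dop, num_input_splits):
    """Cap the naive mt_dop-per-node estimate by the number of input splits."""
    # Assumption: mt_dop of 0 still means one instance per node.
    per_node_estimate = num_nodes * max(1, mt_dop)
    return min(per_node_estimate, num_input_splits)


few_splits = estimate_scan_instances(num_nodes=3, mt_dop=4, num_input_splits=5)
many_splits = estimate_scan_instances(num_nodes=3, mt_dop=4, num_input_splits=100)
```

With only 5 splits the estimate is 5 rather than 12; with 100 splits the per-node figure of 12 applies.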

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferred until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(30) | parquet / none / none | 6.26    | -15.70%    | 4.63       | -16.16%        |
+----------+-----------------------+---------+------------+------------+----------------+

+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25       | R +7.38%   | 0.51%     | 0.22%          | 5     | R +6.95%       | 2.31    | 27.93  |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79        |   +1.31%   | 1.86%     | 0.36%          | 5     |   +1.88%       | 1.15    | 1.53   |
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28        |   -0.01%   | 1.64%     | 1.63%          | 5     |   -0.11%       | -0.58   | -0.01  |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68        |   -0.94%   | 0.84%     | 1.46%          | 5     |   -0.21%       | -0.87   | -1.25  |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72        |   -0.56%   | 1.29%     | 0.52%          | 5     |   -1.04%       | -1.15   | -0.89  |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80       |   -1.48%   | 0.61%     | 0.60%          | 5     |   -1.39%       | -1.73   | -3.91  |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32        |   -4.92%   | 0.05%     | 0.40%          | 5     |   -4.93%       | -2.31   | -27.46 |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67        | I -5.41%   | 0.81%     | 0.03%          | 5     | I -5.70%       | -2.31   | -15.75 |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14        | I -6.93%   | 3.13%     | 2.62%          | 5     | I -9.31%

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 16:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15096/16/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/15096/16/be/src/exec/partitioned-hash-join-builder.h@650
PS16, Line 650: anr
> nit: and
Done


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc@708
PS15, Line 708: probe_barrier_->Wait
> 
Agree this is an issue. I think there are multiple issues like this and I 
wanted to revisit the timing as a separate follow-up JIRA - IMPALA-9422. In 
this specific case I'm not sure if it's better to count the time against the 
builder or if that would be misleading.


http://gerrit.cloudera.org:8080/#/c/15096/16/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/15096/16/fe/src/main/java/org/apache/impala/planner/JoinNode.java@193
PS16, Line 193: fragments
> nit: fragment
Done


http://gerrit.cloudera.org:8080/#/c/15096/16/tests/query_test/test_spilling.py
File tests/query_test/test_spilling.py:

http://gerrit.cloudera.org:8080/#/c/15096/16/tests/query_test/test_spilling.py@150
PS16, Line 150: splitsb
> nit: typo
Done



--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:01:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9183: Convert certain disjunctive predicates to conjunctive normal form

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15462 )

Change subject: IMPALA-9183: Convert certain disjunctive predicates to 
conjunctive normal form
..


Patch Set 1:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/15462/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/15462/1/common/thrift/ImpalaInternalService.thrift@413
PS1, Line 413:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@486
PS1, Line 486:   rules.add(new 
ConvertToCNFRule(queryCtx.getClient_request().getQuery_options().getMax_cnf_exprs(),
line too long (108 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
File fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java:

http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@121
PS1, Line 121: Expr lhs1 = (CompoundPredicate) 
CompoundPredicate.createDisjunctivePredicate(disjuncts);
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@125
PS1, Line 125: Expr rhs1 = (CompoundPredicate) 
CompoundPredicate.createDisjunctivePredicate(disjuncts);
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@126
PS1, Line 126: Predicate newPredicate = (CompoundPredicate) 
CompoundPredicate.createConjunction(lhs1, rhs1);
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@136
PS1, Line 136: Expr lhs1 = (CompoundPredicate) 
CompoundPredicate.createDisjunctivePredicate(disjuncts);
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@140
PS1, Line 140: Expr rhs1 = (CompoundPredicate) 
CompoundPredicate.createDisjunctivePredicate(disjuncts);
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@141
PS1, Line 141: Predicate newPredicate = (CompoundPredicate) 
CompoundPredicate.createConjunction(lhs1, rhs1);
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@165
PS1, Line 165: Predicate newPredicate = (CompoundPredicate) 
CompoundPredicate.createConjunction(lhs1, rhs1);
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java@817
PS1, Line 817: RewritesOk("(int_col > 10 AND float_col < 5.0) OR (int_col < 
20 AND float_col > 15.0)", rule,
line too long (97 > 90)


http://gerrit.cloudera.org:8080/#/c/15462/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java@819
PS1, Line 819: "AND int_col > 10 OR float_col > 15.0 AND 
int_col > 10 OR int_col < 20");
line too long (93 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 17 Mar 2020 15:37:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9183: Convert certain disjunctive predicates to conjunctive normal form

2020-03-17 Thread Aman Sinha (Code Review)
Aman Sinha has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15462


Change subject: IMPALA-9183: Convert certain disjunctive predicates to 
conjunctive normal form
..

IMPALA-9183: Convert certain disjunctive predicates to conjunctive normal form

Added an expression rewrite rule to convert a disjunctive predicate to
conjunctive normal form (CNF). Converting to CNF enables multi-table
predicates that were only evaluated by a Join operator to be converted
into either single-table conjuncts that are eligible for predicate pushdown
to the scan operator or other multi-table conjuncts that are eligible to
be pushed to a Join below. This helps improve performance for such queries.

Since converting to CNF expands the number of expressions, we place a
limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr)
that are considered. Once the MAX_CNF_EXPRS limit (default is 100) is
exceeded, whatever expression was supplied to the rule is returned without
further transformation. A setting of -1 or 0 allows an unlimited number of
CNF exprs to be created, up to int32 max. Another option, ENABLE_CNF_REWRITES,
enables or disables the entire rewrite. This is False by default until we
have done more thorough functional testing.

Examples of rewrites:
 original: (a AND b) OR c
 rewritten: (a OR c) AND (b OR c)

 original: (a AND b) OR (c AND d)
 rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d)

 original: NOT(a OR b)
 rewritten: NOT(a) AND NOT(b)
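
For illustration only, the rewrites above can be sketched in a few lines of
Python. This is not the code in ConvertToCNFRule.java: the tuple encoding and
the names to_cnf, conjuncts, count_ands, rewrite and show are assumptions made
up for this sketch, and the limit is checked after the conversion rather than
while the expression is being expanded.

```python
# Expressions are nested tuples: ("and", l, r), ("or", l, r), ("not", e),
# or a plain string for a leaf predicate. This encoding is hypothetical.

def to_cnf(expr):
    """Push NOT inward (De Morgan) and distribute OR over AND."""
    if isinstance(expr, str):
        return expr
    op = expr[0]
    if op == "not":
        inner = expr[1]
        if isinstance(inner, tuple):
            if inner[0] == "or":   # NOT(a OR b) -> NOT(a) AND NOT(b)
                return ("and", to_cnf(("not", inner[1])), to_cnf(("not", inner[2])))
            if inner[0] == "and":  # NOT(a AND b) -> NOT(a) OR NOT(b)
                return to_cnf(("or", ("not", inner[1]), ("not", inner[2])))
            return to_cnf(inner[1])  # NOT(NOT(a)) -> a
        return expr
    left, right = to_cnf(expr[1]), to_cnf(expr[2])
    if op == "and":
        return ("and", left, right)
    # op == "or": distribute over an AND on either side.
    if isinstance(left, tuple) and left[0] == "and":
        return ("and", to_cnf(("or", left[1], right)), to_cnf(("or", left[2], right)))
    if isinstance(right, tuple) and right[0] == "and":
        return ("and", to_cnf(("or", left, right[1])), to_cnf(("or", left, right[2])))
    return ("or", left, right)


def count_ands(expr):
    """Each AND node counts as one CNF expr, as in the description above."""
    if isinstance(expr, str):
        return 0
    return (1 if expr[0] == "and" else 0) + sum(count_ands(e) for e in expr[1:])


def conjuncts(expr):
    """Flatten nested ANDs into a list of top-level conjuncts."""
    if isinstance(expr, tuple) and expr[0] == "and":
        return conjuncts(expr[1]) + conjuncts(expr[2])
    return [expr]


def rewrite(expr, max_cnf_exprs=100):
    """Return the CNF form, or the input unchanged if the limit is exceeded."""
    result = to_cnf(expr)
    if max_cnf_exprs > 0 and count_ands(result) > max_cnf_exprs:
        return expr
    return result


def show(expr):
    if isinstance(expr, str):
        return expr
    if expr[0] == "not":
        return "NOT(" + show(expr[1]) + ")"
    return "(" + show(expr[1]) + " " + expr[0].upper() + " " + show(expr[2]) + ")"


example = ("or", ("and", "a", "b"), ("and", "c", "d"))
print(" AND ".join(show(c) for c in conjuncts(rewrite(example))))
# (a OR c) AND (a OR d) AND (b OR c) AND (b OR d)
```

With max_cnf_exprs=2 the same input comes back unchanged, because the CNF form
needs three ANDs; a value of 0 or -1 disables the limit in this sketch.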

Testing:
 - Added new unit tests with variations of disjunctive predicates
   and verified their Explain plans
 - Manually tested the result correctness on impala shell by running
   these queries with ENABLE_CNF_REWRITES enabled and disabled
 - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor
   shows roughly a 5x improvement:
  Original baseline: 47.5 sec
  With this patch and CNF rewrite enabled: 9.4 sec

Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
---
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
10 files changed, 522 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15462/1
--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha 


[Impala-ASF-CR] WIP: Add CentOS 8.1 support to bootstrap_system.sh

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15461 )

Change subject: WIP: Add CentOS 8.1 support to bootstrap_system.sh
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5505/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67a58ec007219020e1fb562216d7a0d1ff38b0bd
Gerrit-Change-Number: 15461
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 17 Mar 2020 14:25:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: Add CentOS 8.1 support to bootstrap_system.sh

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15461 )

Change subject: WIP: Add CentOS 8.1 support to bootstrap_system.sh
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/15461/1/bin/bootstrap_system.sh
File bin/bootstrap_system.sh:

http://gerrit.cloudera.org:8080/#/c/15461/1/bin/bootstrap_system.sh@282
PS1, Line 282: redhat8 inycloud sudo alternatives --install /usr/bin/python 
python /usr/bin/python2 90 --slave /usr/bin/pip pip /usr/bin/pip2
line too long (126 > 90)


http://gerrit.cloudera.org:8080/#/c/15461/1/bin/bootstrap_system.sh@296
PS1, Line 296: redhat sudo sha512sum -c - <<< 
'487dbd1d7f678a92924ba884a57e910ccb4fe565c554278795a8fdfc80c4e88d81ebc2ccecb5a8f353f0b2076572bb921499a2cadb064e0f44fc406a3c31da20
  apache-ant-1.9.14-bin.tar.gz'
line too long (191 > 90)


http://gerrit.cloudera.org:8080/#/c/15461/1/bin/bootstrap_system.sh@305
PS1, Line 305:   sudo sha512sum -c - <<< 
'2a803f578f341e164f6753e410413d16ab60fabe31dc491d1fe35c984a5cce696bc71f57757d4538fe7738be04065a216f3ebad4ef7e0ce1bb4c51bc36d6be86
  apache-maven-3.5.4-bin.tar.gz'
line too long (187 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/15461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I67a58ec007219020e1fb562216d7a0d1ff38b0bd
Gerrit-Change-Number: 15461
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 17 Mar 2020 14:20:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: Add CentOS 8.1 support to bootstrap_system.sh

2020-03-17 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15461


Change subject: WIP: Add CentOS 8.1 support to bootstrap_system.sh
..

WIP: Add CentOS 8.1 support to bootstrap_system.sh

CentOS 8.1 is a new major version of the CentOS family.
It is now stable and popular enough to start supporting it for Impala
development.

Prepare a raw CentOS 8.1 system to support Impala development and testing.
This should work on a standalone computer, on a virtual machine,
or inside a Docker container.

Details:
- curl is added to the list of required packages, required by
  IMPALA-9149
- snappy-devel moved to the PowerTools repo, so it needs to get
  installed from there
- CentOS 8 has no default Python version. The bootstrap script installs
  (or configures) Python2 with pip2, then makes them the default via the
  "alternatives" mechanism.
- The toolchain package tag "ec2-centos-8" is added to
  bootstrap_toolchain.py
- TOOLCHAIN_ID is bumped to a build that already has CentOS 8 binaries.

Do not merge this change in the current state! The last change will
most likely break other platforms; it will have to be fixed before
this change can be merged to master.

Change-Id: I67a58ec007219020e1fb562216d7a0d1ff38b0bd
---
M bin/bootstrap_system.sh
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M docker/entrypoint.sh
4 files changed, 99 insertions(+), 15 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/15461/1
--
To view, visit http://gerrit.cloudera.org:8080/15461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I67a58ec007219020e1fb562216d7a0d1ff38b0bd
Gerrit-Change-Number: 15461
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-8361: Propagate predicates of outer-joined InlineView

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15047 )

Change subject: IMPALA-8361: Propagate predicates of outer-joined InlineView
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5503/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c23a45aeb5dd1aa06a95c9aa8628ecbe37ef2c1
Gerrit-Change-Number: 15047
Gerrit-PatchSet: 12
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 17 Mar 2020 13:26:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6505: Min-Max predicate push down in ORC scanner

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15403 )

Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5504/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845
Gerrit-Change-Number: 15403
Gerrit-PatchSet: 2
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 17 Mar 2020 13:16:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6505: Min-Max predicate push down in ORC scanner

2020-03-17 Thread Norbert Luksa (Code Review)
Norbert Luksa has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15403 )

Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner
..


Patch Set 1:

(6 comments)

Thanks Quanlong and Csaba for reviewing. I addressed some comments and will do
the testing with the next patch.
Also, I will play with different predicates since I now see a huge regression
for some queries. E.g. the following query takes around twice as long as before:
select * from lineitem where l_suppkey = 10

http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@875
PS1, Line 875: UtcToUnixTime
> Looks like it's an ORC bug that the ORC writers (both Java and C++ versions
Changed to FloorUtcToUnixTimeMillis.
I see Csaba opened ORC-611 regarding the bug.


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@888
PS1, Line 888: case TYPE_VARCHAR:
 : case TYPE_CHAR: {
> I am not sure about the correctness in case of CHAR(N) and VARCHAR(N)
Added padding.
Maybe we should wait here with ORC-612?


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@902
PS1, Line 902: type.GetByteSize()
> I think this should be "literal_expr->type().GetByteSize()" since it's cons
Done


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@914
PS1, Line 914: static_cast((dv16->value() << 64) >> 64)
> Can we cast int128_t to uint64_t directly?
Looks like we can, done.


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@959
PS1, Line 959: orc::Literal literal =
 : GetLiteralSearchArguments(eval, slot_desc->type(), 
_type);
 : 
 : if (fn_name == "lt") {
> This logic is not enough if there is some mismatch between Impala's and ORC
Same as above, wait for ORC-612?


http://gerrit.cloudera.org:8080/#/c/15403/1/be/src/exec/hdfs-orc-scanner.cc@984
PS1, Line 984:   
row_reader_options_.setSearchArgument(std::move(final_sarg));
> Could you add a VLOG_FILE level logging for 'final_sarg'? It would be helpf
Done



--
To view, visit http://gerrit.cloudera.org:8080/15403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845
Gerrit-Change-Number: 15403
Gerrit-PatchSet: 1
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 17 Mar 2020 12:55:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6505: Min-Max predicate push down in ORC scanner

2020-03-17 Thread Norbert Luksa (Code Review)
Norbert Luksa has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/15403 )

Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner
..

IMPALA-6505: Min-Max predicate push down in ORC scanner

This commit implements min/max predicate pushdown for the ORC scanner
by leveraging the external ORC library's search arguments. We build
the search arguments when we open the scanner, as we do not need to
modify them later.

Also added a query option orc_read_statistics. If the option is set
to true (the default), predicate pushdown takes effect;
otherwise it is skipped.
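
As a rough illustration of the idea, the following Python sketch shows how
min/max statistics let a reader skip data that cannot match a predicate. It
does not use the ORC library's search argument API. The function names, the
operator names other than "lt", the read_statistics parameter and the
(min, max) layout are all assumptions made for this example.

```python
def may_match(op, literal, col_min, col_max):
    """Return False only when [col_min, col_max] proves that no row can match."""
    if op == "eq":
        return col_min <= literal <= col_max
    if op == "lt":
        return col_min < literal
    if op == "le":
        return col_min <= literal
    if op == "gt":
        return col_max > literal
    if op == "ge":
        return col_max >= literal
    return True  # Unknown operator: be conservative and read the data.


def groups_to_read(stats, op, literal, read_statistics=True):
    """stats holds one (min, max) pair per row group (hypothetical layout)."""
    if not read_statistics:
        # Mirrors turning the option off: nothing is skipped.
        return list(range(len(stats)))
    return [i for i, (lo, hi) in enumerate(stats) if may_match(op, literal, lo, hi)]


stats = [(0, 9), (10, 19), (20, 29)]
print(groups_to_read(stats, "eq", 10))  # [1]
```

A selective predicate such as the equality above touches one group out of
three, which is where the speed-up for selective queries would come from; a
predicate that matches every group gains nothing from the statistics.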

Tests:
 - Ran scanner tests on ORC files.
 - Ran the TPCH benchmark; there is no improvement or regression.
   On the other hand, certain selective queries gained a significant
   speed-up.

Further TODO: Bump ORC version since predicate pushdown is not yet
implemented in the upstream ORC lib (in review).

Change-Id: I136622413db21e0941d238ab6aeea901a6464845
---
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-orc-scanner.h
M be/src/exec/orc-metadata-utils.cc
M be/src/exec/orc-metadata-utils.h
M be/src/exprs/scalar-expr.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
10 files changed, 250 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/15403/2
--
To view, visit http://gerrit.cloudera.org:8080/15403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845
Gerrit-Change-Number: 15403
Gerrit-PatchSet: 2
Gerrit-Owner: Norbert Luksa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Norbert Luksa 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output

2020-03-17 Thread Tamas Mate (Code Review)
Tamas Mate has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15199 )

Change subject: IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output
..


Patch Set 7:

I missed updating the AnalysisException message in the unit tests; I will get
back with a new patch soon.


--
To view, visit http://gerrit.cloudera.org:8080/15199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ice4b8df65f0a53fe14b8fbe35d82c9887ab9a041
Gerrit-Change-Number: 15199
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 17 Mar 2020 12:50:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8361: Propagate predicates of outer-joined InlineView

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15047 )

Change subject: IMPALA-8361: Propagate predicates of outer-joined InlineView
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5483/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/15047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c23a45aeb5dd1aa06a95c9aa8628ecbe37ef2c1
Gerrit-Change-Number: 15047
Gerrit-PatchSet: 12
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Tue, 17 Mar 2020 12:47:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8361: Propagate predicates of outer-joined InlineView

2020-03-17 Thread Xianqing He (Code Review)
Xianqing He has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/15047 )

Change subject: IMPALA-8361: Propagate predicates of outer-joined InlineView
..

IMPALA-8361: Propagate predicates of outer-joined InlineView

This is an improvement that tries to propagate predicates on the
nullable side of an outer join into the inline view.

For example:
SELECT *
FROM functional.alltypessmall a
LEFT JOIN (
SELECT id, upper(string_col) AS upper_val,
length(string_col) AS len
FROM functional.alltypestiny
) b ON a.id = b.id
WHERE b.upper_val is NULL and b.len = 0
Before this change, the predicate b.len=0 can't be migrated into the inline
view since it is on the nullable side of an outer join: if the predicate
were evaluated only in the inline view, the NULL rows produced by the join
would not be rejected.
However, we can be more aggressive. In particular, some predicates that
must be evaluated at a join node can also be safely evaluated by the
outer-joined inline view. Such predicates are not marked as assigned.
They are propagated into the inline view and are also evaluated at
a join node.

We can divide predicates into two types. A predicate that satisfies the same
condition as Analyzer#canEvalPredicate can be migrated into the inline view,
and a predicate that satisfies the three conditions below is safe to propagate
into the nullable side of an outer join.
1) The predicate needs to be bound by tupleIds.
2) The predicate is not an ON-clause predicate.
3) The predicate evaluates to false when all its referenced tuples are NULL.

Therefore, 'b.upper_val is NULL' cannot be propagated to the inline view, but
'b.len = 0' can.
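
Condition 3 can be illustrated with a small Python sketch that models SQL NULL
as None. This is only a toy model, not the check the planner performs: the
helper names is_null, eq and safe_to_propagate are made up for this example,
and a NULL result is treated as "not true", the way a WHERE clause treats it.

```python
def is_null(value):
    """SQL 'IS NULL': always returns a definite True or False."""
    return value is None


def eq(left, right):
    """SQL '=': the result is NULL (None) if either operand is NULL."""
    if left is None or right is None:
        return None
    return left == right


def safe_to_propagate(predicate, slot_names):
    """True if the predicate is not satisfied when every referenced slot is NULL.

    Such a predicate rejects the NULL-extended rows of the outer join anyway,
    so also evaluating it inside the inline view cannot change the result.
    """
    null_row = {name: None for name in slot_names}
    return predicate(null_row) is not True


upper_val_is_null = lambda row: is_null(row["upper_val"])
len_equals_zero = lambda row: eq(row["len"], 0)

print(safe_to_propagate(upper_val_is_null, ["upper_val"]))  # False
print(safe_to_propagate(len_equals_zero, ["len"]))          # True
```

The first predicate is true for an all-NULL row, so pushing it below the join
could drop rows that the join would otherwise NULL-extend and keep; the second
is never true for such a row, which is why it is safe to evaluate in both
places.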

Tests:
* Add plan tests in inline-view.test
* One baseline plan in inline-view.test, one in nested-collections.test
and two in predicate-propagation.test had to be updated
* Ran the full set of verifications in Impala Public Jenkins

Change-Id: I6c23a45aeb5dd1aa06a95c9aa8628ecbe37ef2c1
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
7 files changed, 317 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/15047/12
--
To view, visit http://gerrit.cloudera.org:8080/15047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6c23a45aeb5dd1aa06a95c9aa8628ecbe37ef2c1
Gerrit-Change-Number: 15047
Gerrit-PatchSet: 12
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianqing He 


[Impala-ASF-CR] IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15199 )

Change subject: IMPALA-5308: Resolve confusing Kudu SHOW TABLE STATS output
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5481/


--
To view, visit http://gerrit.cloudera.org:8080/15199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ice4b8df65f0a53fe14b8fbe35d82c9887ab9a041
Gerrit-Change-Number: 15199
Gerrit-PatchSet: 7
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 17 Mar 2020 11:53:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..


Patch Set 19: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5482/


--
To view, visit http://gerrit.cloudera.org:8080/15132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 19
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 11:50:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-9042: Milestone 1: properly scan files that has full ACID schema

2020-03-17 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15395 )

Change subject: WIP IMPALA-9042: Milestone 1: properly scan files that has full 
ACID schema
..


Patch Set 2:

(7 comments)

The solution looks good to me. Thanks for putting this together so quickly!

http://gerrit.cloudera.org:8080/#/c/15395/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15395/2//COMMIT_MSG@63
PS2, Line 63:  * make all tests green in exhaustive
We also need tests on column masking, since we have some hacks in path
resolution when the table contains nested columns and has column masking
policies on other primitive columns (IMPALA-9330).


http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/hdfs-orc-scanner.cc@185
PS2, Line 185:   if (scan_node_->hdfs_table()->IsFullAcid() != 
schema_resolver_->IsFullAcid()) {
I think this is too strict. The check on the file schema can yield false
positives. Users who play around with ACID ORC files may use CREATE TABLE xxx
LIKE ORC file to create a non-ACID table, but with this check they won't be
able to read it.


http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/orc-metadata-utils.cc
File be/src/exec/orc-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/orc-metadata-utils.cc@101
PS2, Line 101: if (table_idx >= numPartCols) {
I think this can be a DCHECK since partition columns should be skipped by 
https://github.com/apache/impala/blob/4a8221877cc1782b861184f0ccf86238d002af13/be/src/exec/hdfs-orc-scanner.cc#L364


http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/orc-metadata-utils.cc@106
PS2, Line 106: table_idx - numPartCols
Could you add a DCHECK that the resulting index won't overflow?


http://gerrit.cloudera.org:8080/#/c/15395/2/be/src/exec/orc-metadata-utils.cc@234
PS2, Line 234: (root_->getFieldName(0) != "operation" ||
 :root_->getFieldName(5) != "row")
I think we should also check other fields and their types.


http://gerrit.cloudera.org:8080/#/c/15395/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/15395/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@847
PS2, Line 847: HIVEFULLACIDWRITE
We don't have write support yet. Is this required somewhere?


http://gerrit.cloudera.org:8080/#/c/15395/2/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

http://gerrit.cloudera.org:8080/#/c/15395/2/tests/query_test/test_scanners.py@1297
PS2, Line 1297: self.client.execute("alter table %s.%s set 
tblproperties('transactional'='false')" %
Do we need this? Isn't the table translated to an EXTERNAL table?



--
To view, visit http://gerrit.cloudera.org:8080/15395
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2e2afec00c9a5cf87f1d61b5fe52b0085844bcb
Gerrit-Change-Number: 15395
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 17 Mar 2020 11:41:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-17 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..


Patch Set 19:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15132/19/bin/impala-shell.sh
File bin/impala-shell.sh:

http://gerrit.cloudera.org:8080/#/c/15132/19/bin/impala-shell.sh@55
PS19, Line 55: PYTHONPATH=${PYTHONPATH} exec "${IMPALA_PYTHON_EXECUTABLE}" 
${SHELL_HOME}/impala_shell.py "$@"
> line too long (94 > 90)
Ack



--
To view, visit http://gerrit.cloudera.org:8080/15132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 19
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 07:56:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15132 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..


Patch Set 19:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5482/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/15132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb75e162bac0faeae3e12106c15da39cbfb8b462
Gerrit-Change-Number: 15132
Gerrit-PatchSet: 19
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 07:07:15 +
Gerrit-HasComments: No

