[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 20: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 20
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 23:29:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when cancelling
fragments on the backend.

BufferPool now correctly handles multiple threads calling
CleanPages() concurrently, which makes other methods thread-safe.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same way as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferring until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 6.26| -15.70%| 4.63   | 
-16.16%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25   | R +7.38% 
  |   0.51%   |   0.22%| 5 | R +6.95%   | 2.31| 27.93   |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79|   +1.31% 
  |   1.86%   |   0.36%| 5 |   +1.88%   | 1.15| 1.53|
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28|   -0.01% 
  |   1.64%   |   1.63%| 5 |   -0.11%   | -0.58   | -0.01   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68|   -0.94% 
  |   0.84%   |   1.46%| 5 |   -0.21%   | -0.87   | -1.25   |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72|   -0.56% 
  |   1.29%   |   0.52%| 5 |   -1.04%   | -1.15   | -0.89   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80   |   -1.48% 
  |   0.61%   |   0.60%| 5 |   -1.39%   | -1.73   | -3.91   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32|   -4.92% 
  |   0.05%   |   0.40%| 5 |   -4.93%   | -2.31   | -27.46  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67| I -5.41% 
  |   0.81%   |   0.03%| 5 | I -5.70%   | -2.31   | -15.75  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14| I -6.93% 
  |   3.13%   |   2.62%| 5 | I -9.31%   | -2.02   | -3.96   |
| TPCH(30) | TPCH-Q9  | parquet / none / none | 15.59 

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 19:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5512/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 19
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:48:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 19: Code-Review+2

Added a DCHECK based on feedback from csaba, fixed the long line.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 19
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:01:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 18: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5484/


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 18
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:02:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 20: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 20
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:02:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 20:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5486/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 20
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 19:02:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#19).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when cancelling
fragments on the backend.

BufferPool now correctly handles multiple threads calling
CleanPages() concurrently, which makes other methods thread-safe.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same way as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferring until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 6.26| -15.70%| 4.63   | 
-16.16%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25   | R +7.38% 
  |   0.51%   |   0.22%| 5 | R +6.95%   | 2.31| 27.93   |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79|   +1.31% 
  |   1.86%   |   0.36%| 5 |   +1.88%   | 1.15| 1.53|
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28|   -0.01% 
  |   1.64%   |   1.63%| 5 |   -0.11%   | -0.58   | -0.01   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68|   -0.94% 
  |   0.84%   |   1.46%| 5 |   -0.21%   | -0.87   | -1.25   |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72|   -0.56% 
  |   1.29%   |   0.52%| 5 |   -1.04%   | -1.15   | -0.89   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80   |   -1.48% 
  |   0.61%   |   0.60%| 5 |   -1.39%   | -1.73   | -3.91   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32|   -4.92% 
  |   0.05%   |   0.40%| 5 |   -4.93%   | -2.31   | -27.46  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67| I -5.41% 
  |   0.81%   |   0.03%| 5 | I -5.70%   | -2.31   | -15.75  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14| I -6.93% 
  |   3.13%   |   2.62%| 5 | I -9.31% 

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 18: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 18
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 18:49:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 17:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5507/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:46:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 18:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5484/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 18
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:02:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15096/17/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/17/be/src/exec/partitioned-hash-join-builder.cc@188
PS17, Line 188:   // the AddBarrierToCancel() mechanism ensures that 
cancellation happens after the overall
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 17
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:02:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 18: Code-Review+1

I'm going to carry as a +1. Just checking with Csaba if he's going to look 
again.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 18
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:01:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#17).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when cancelling
fragments on the backend.

BufferPool now correctly handles multiple threads calling
CleanPages() concurrently, which makes other methods thread-safe.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same way as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferring until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 6.26| -15.70%| 4.63   | 
-16.16%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25   | R +7.38% 
  |   0.51%   |   0.22%| 5 | R +6.95%   | 2.31| 27.93   |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79|   +1.31% 
  |   1.86%   |   0.36%| 5 |   +1.88%   | 1.15| 1.53|
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28|   -0.01% 
  |   1.64%   |   1.63%| 5 |   -0.11%   | -0.58   | -0.01   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68|   -0.94% 
  |   0.84%   |   1.46%| 5 |   -0.21%   | -0.87   | -1.25   |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72|   -0.56% 
  |   1.29%   |   0.52%| 5 |   -1.04%   | -1.15   | -0.89   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80   |   -1.48% 
  |   0.61%   |   0.60%| 5 |   -1.39%   | -1.73   | -3.91   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32|   -4.92% 
  |   0.05%   |   0.40%| 5 |   -4.93%   | -2.31   | -27.46  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67| I -5.41% 
  |   0.81%   |   0.03%| 5 | I -5.70%   | -2.31   | -15.75  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14| I -6.93% 
  |   3.13%   |   2.62%| 5 | I -9.31% 

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 16:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15096/16/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/15096/16/be/src/exec/partitioned-hash-join-builder.h@650
PS16, Line 650: anr
> nit: and
Done


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc@708
PS15, Line 708: probe_barrier_->Wait
> 
Agree this is an issue. I think there are multiple issues like this and I 
wanted to revisit the timing as a separate follow-up JIRA - IMPALA-9422. In 
this specific case I'm not sure if it's better to count the time against the 
builder or if that would be misleading.


http://gerrit.cloudera.org:8080/#/c/15096/16/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/15096/16/fe/src/main/java/org/apache/impala/planner/JoinNode.java@193
PS16, Line 193: fragments
> nit: fragment
Done


http://gerrit.cloudera.org:8080/#/c/15096/16/tests/query_test/test_spilling.py
File tests/query_test/test_spilling.py:

http://gerrit.cloudera.org:8080/#/c/15096/16/tests/query_test/test_spilling.py@150
PS16, Line 150: splitsb
> nit: typo
Done



--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 16:01:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-16 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 16: Code-Review+2

(6 comments)

http://gerrit.cloudera.org:8080/#/c/15096/16/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/15096/16/be/src/exec/partitioned-hash-join-builder.h@650
PS16, Line 650: anr
nit: and


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc@708
PS15, Line 708: probe_barrier_->Wait
> maybe add this wait time to the join node's idle/wait time. This would help



http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/runtime/runtime-state.cc@268
PS15, Line 268:  for (CyclicBarrier* cb2 : cancellation_cbs_) {
  : // Don't add if already present.
  : if (cb == cb2) return;
  :   }
> I figured that this would always be small enough that the linear search wou
makes sense


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/util/cyclic-barrier.h
File be/src/util/cyclic-barrier.h:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/util/cyclic-barrier.h@46
PS15, Line 46:  template 
> This is actually a best practice, cause lambda functions have an "un-nameab
Got it. Thanks for the explanation


http://gerrit.cloudera.org:8080/#/c/15096/16/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/15096/16/fe/src/main/java/org/apache/impala/planner/JoinNode.java@193
PS16, Line 193: fragments
nit: fragment


http://gerrit.cloudera.org:8080/#/c/15096/16/tests/query_test/test_spilling.py
File tests/query_test/test_spilling.py:

http://gerrit.cloudera.org:8080/#/c/15096/16/tests/query_test/test_spilling.py@150
PS16, Line 150: splitsb
nit: typo



--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Mar 2020 03:14:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5495/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 14 Mar 2020 01:10:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-13 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15096/16/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/16/be/src/exec/partitioned-hash-join-builder.cc@188
PS16, Line 188:   // the AddBarrierToCancel() mechanism ensures that 
cancellation happens after the overall
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 14 Mar 2020 00:25:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 15:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/15096/15//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15096/15//COMMIT_MSG@42
PS15, Line 42: was
> nit: way
Done


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/join-builder.cc
File be/src/exec/join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/join-builder.cc@123
PS15, Line 123:   // Don't let probe side pick up the builder when we're going 
to clean it up.
  :   // Query cancellation will propagate to the probe finstance.
  :   ready_to_probe_ = !build_side_state->is_cancelled();
  :   VLOG(2) << "JoinBuilder (id=" << join_node_id_ << ")"
  :   << " all probes complete. cancelled=" << 
build_side_state->is_cancelled()
  :   << " outstanding_probes_=" << outstanding_probes_;
> do we still need this?
Done


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.h@648
PS15, Line 648:   /// Calculates the amount of memory to be transferred for 
probe streams when probing
> nit: per probe thread / join node instance
Done


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc@182
PS15, Line 182:
> nit: extra space
Done


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-node.cc@1210
PS15, Line 1210: spilled_partitions_.emplace(
   : build_partition->id(), 
std::move(probe_hash_partitions_[i]));
> nit: might be useful to mention in the comment for spilled_partitions_ in t
Done


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/runtime/runtime-state.cc@268
PS15, Line 268:  for (CyclicBarrier* cb2 : cancellation_cbs_) {
  : // Don't add if already present.
  : if (cb == cb2) return;
  :   }
> nit: would it be worth using an unordered_set? or will that be an overkill?
I figured that this would always be small enough that the linear search 
wouldn't be an issue. Plus it avoids including unordered_set, etc in the header 
and dragging that in.


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/util/cyclic-barrier.h
File be/src/util/cyclic-barrier.h:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/util/cyclic-barrier.h@46
PS15, Line 46:  template 
> can we instead use a typedef to define the function signature in order to e
This is actually a best practice, cause lambda functions have an "un-nameable" 
type that encodes all of the parameters, lambda captures, etc. If we used 
std::function the lambda would get converted to that less-efficient 
representation, which usually requires a heap allocation.

https://www.drdobbs.com/cpp/efficient-use-of-lambda-expressions-and/232500059


http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/JoinNode.java@193
PS15, Line 193: fragments
> nit: fragment instances
Done


http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@371
PS15, Line 371: One instance is scheduled per instance of the fragment
> nit: slightly ambiguous because of the "instance" terminology, how about: "
That's clearer.


http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/PlanNode.java@130
PS15, Line 130:   // Estimated number of instances that the scheduler would 
generate for a fragment with
> nit: across all nodes
Done



--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: 

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-13 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#16).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when cancelling
fragments on the backend.

BufferPool now correctly handles multiple threads calling
CleanPages() concurrently, which makes other methods thread-safe.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same way as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferring until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 6.26| -15.70%| 4.63   | 
-16.16%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25   | R +7.38% 
  |   0.51%   |   0.22%| 5 | R +6.95%   | 2.31| 27.93   |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79|   +1.31% 
  |   1.86%   |   0.36%| 5 |   +1.88%   | 1.15| 1.53|
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28|   -0.01% 
  |   1.64%   |   1.63%| 5 |   -0.11%   | -0.58   | -0.01   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68|   -0.94% 
  |   0.84%   |   1.46%| 5 |   -0.21%   | -0.87   | -1.25   |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72|   -0.56% 
  |   1.29%   |   0.52%| 5 |   -1.04%   | -1.15   | -0.89   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80   |   -1.48% 
  |   0.61%   |   0.60%| 5 |   -1.39%   | -1.73   | -3.91   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32|   -4.92% 
  |   0.05%   |   0.40%| 5 |   -4.93%   | -2.31   | -27.46  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67| I -5.41% 
  |   0.81%   |   0.03%| 5 | I -5.70%   | -2.31   | -15.75  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14| I -6.93% 
  |   3.13%   |   2.62%| 5 | I -9.31% 

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-12 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 15:

(11 comments)

Looks good, mostly nits, have yet to take a look at the tests

http://gerrit.cloudera.org:8080/#/c/15096/15//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15096/15//COMMIT_MSG@42
PS15, Line 42: was
nit: way


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/join-builder.cc
File be/src/exec/join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/join-builder.cc@123
PS15, Line 123:   // Don't let probe side pick up the builder when we're going 
to clean it up.
  :   // Query cancellation will propagate to the probe finstance.
  :   ready_to_probe_ = !build_side_state->is_cancelled();
  :   VLOG(2) << "JoinBuilder (id=" << join_node_id_ << ")"
  :   << " all probes complete. cancelled=" << 
build_side_state->is_cancelled()
  :   << " outstanding_probes_=" << outstanding_probes_;
do we still need this?


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.h@648
PS15, Line 648:   /// Calculates the amount of memory to be transferred for 
probe streams when probing
nit: per probe thread / join node instance


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc@182
PS15, Line 182:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-builder.cc@708
PS15, Line 708: probe_barrier_->Wait
maybe add this wait time to the join node's idle/wait time. This would help 
debug issues where one thread will be holding up others


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/exec/partitioned-hash-join-node.cc@1210
PS15, Line 1210: spilled_partitions_.emplace(
   : build_partition->id(), 
std::move(probe_hash_partitions_[i]));
nit: might be useful to mention in the comment for spilled_partitions_ in the 
header file when partitions in it can be empty.
And/or update the method comment


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/runtime/runtime-state.cc@268
PS15, Line 268:  for (CyclicBarrier* cb2 : cancellation_cbs_) {
  : // Don't add if already present.
  : if (cb == cb2) return;
  :   }
nit: would it be worth using an unordered_set? or will that be an overkill?


http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/util/cyclic-barrier.h
File be/src/util/cyclic-barrier.h:

http://gerrit.cloudera.org:8080/#/c/15096/15/be/src/util/cyclic-barrier.h@46
PS15, Line 46:  template 
can we instead use a typedef to define the function signature in order to 
enforce the constraint mentioned in the method comment?


http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/JoinNode.java@193
PS15, Line 193: fragments
nit: fragment instances


http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@371
PS15, Line 371: One instance is scheduled per instance of the fragment
nit: slightly ambiguous because of the "instance" terminology, how about: "One 
instance is scheduled per node, for all instances of the fragment containing 
the destination join node.


http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/15096/15/fe/src/main/java/org/apache/impala/planner/PlanNode.java@130
PS15, Line 130:   // Estimated number of instances that the scheduler would 
generate for a fragment with
nit: across all nodes



--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 15:

I forgot to mention, but PS15 is a non-WIP, i.e. it has all the tests I planned 
to add and has all the TODOs I had fixed.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 11 Mar 2020 18:32:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 15:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5462/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 09 Mar 2020 22:07:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5460/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 09 Mar 2020 22:02:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-09 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#15).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when cancelling
fragments on the backend.

BufferPool now correctly handles multiple threads calling
CleanPages() concurrently, which makes other methods thread-safe.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same was as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferring until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 6.26| -15.70%| 4.63   | 
-16.16%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25   | R +7.38% 
  |   0.51%   |   0.22%| 5 | R +6.95%   | 2.31| 27.93   |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79|   +1.31% 
  |   1.86%   |   0.36%| 5 |   +1.88%   | 1.15| 1.53|
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28|   -0.01% 
  |   1.64%   |   1.63%| 5 |   -0.11%   | -0.58   | -0.01   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68|   -0.94% 
  |   0.84%   |   1.46%| 5 |   -0.21%   | -0.87   | -1.25   |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72|   -0.56% 
  |   1.29%   |   0.52%| 5 |   -1.04%   | -1.15   | -0.89   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80   |   -1.48% 
  |   0.61%   |   0.60%| 5 |   -1.39%   | -1.73   | -3.91   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32|   -4.92% 
  |   0.05%   |   0.40%| 5 |   -4.93%   | -2.31   | -27.46  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67| I -5.41% 
  |   0.81%   |   0.03%| 5 | I -5.70%   | -2.31   | -15.75  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14| I -6.93% 
  |   3.13%   |   2.62%| 5 | I -9.31% 

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 14:

More testing revealed that the cancellation approach didn't work right in all 
cases - a deadlock could happen with a thread in 
PhjBuilder::HandoffToProbesAndWait() and one or more, but not all of, the probe 
threads blocked on probe_barrier_. An external cancellation will not wake up 
any of these threads.

Switched to using the same cancellation approach uniformly where 
RuntimeState::Cancel() cancels barriers as well as signalling condition 
variables. So if the coordinator cancels the query, it gets propagated. Or if 
one of the other threads on the backend fails, it unblocks the remaining 
threads.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 09 Mar 2020 21:23:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-09 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#14).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when cancelling
fragments on the backend.

BufferPool now correctly handles multiple threads calling
CleanPages() concurrently, which makes other methods thread-safe.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same was as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferring until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 6.26| -15.70%| 4.63   | 
-16.16%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25   | R +7.38% 
  |   0.51%   |   0.22%| 5 | R +6.95%   | 2.31| 27.93   |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79|   +1.31% 
  |   1.86%   |   0.36%| 5 |   +1.88%   | 1.15| 1.53|
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28|   -0.01% 
  |   1.64%   |   1.63%| 5 |   -0.11%   | -0.58   | -0.01   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68|   -0.94% 
  |   0.84%   |   1.46%| 5 |   -0.21%   | -0.87   | -1.25   |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72|   -0.56% 
  |   1.29%   |   0.52%| 5 |   -1.04%   | -1.15   | -0.89   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80   |   -1.48% 
  |   0.61%   |   0.60%| 5 |   -1.39%   | -1.73   | -3.91   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32|   -4.92% 
  |   0.05%   |   0.40%| 5 |   -4.93%   | -2.31   | -27.46  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67| I -5.41% 
  |   0.81%   |   0.03%| 5 | I -5.70%   | -2.31   | -15.75  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14| I -6.93% 
  |   3.13%   |   2.62%| 5 | I -9.31% 

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15096/13/be/src/runtime/bufferpool/buffer-pool.h
File be/src/runtime/bufferpool/buffer-pool.h:

http://gerrit.cloudera.org:8080/#/c/15096/13/be/src/runtime/bufferpool/buffer-pool.h@392
PS13, Line 392:   /// This is safe to call concurrently from multiple threads, 
as long as those threads
Spilling tests are failing intermittently because this isn't strictly true - 
sometimes CleanPages() doesn't clean enough pages if multiple threads call it 
concurrently for the builder. I'm gonna work on a unit test and a fix.



--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 07 Mar 2020 21:49:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 11:

PS9->PS12 adds some tests and fixes some accounting bugs that were discovered 
with the tests, specifically

* THe spilled partitions got out of sync between the join nodes because of the 
empty probe side optimisation. I just disabled that optimisation because i 
don't think it's worth the complexity.
* I didn't multiply the probe reservation by the number of threads in one place
* num_spilled_probe_rows wasn't added together correctly across threads
* NodeDebugString had an invalid dcheck that got hit when I set 
-vmodule=partitioned-hash-join-node=3


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 07 Mar 2020 00:57:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#12).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when closing the
join builder.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same was as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferring until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 6.26| -15.70%| 4.63   | 
-16.16%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25   | R +7.38% 
  |   0.51%   |   0.22%| 5 | R +6.95%   | 2.31| 27.93   |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79|   +1.31% 
  |   1.86%   |   0.36%| 5 |   +1.88%   | 1.15| 1.53|
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28|   -0.01% 
  |   1.64%   |   1.63%| 5 |   -0.11%   | -0.58   | -0.01   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68|   -0.94% 
  |   0.84%   |   1.46%| 5 |   -0.21%   | -0.87   | -1.25   |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72|   -0.56% 
  |   1.29%   |   0.52%| 5 |   -1.04%   | -1.15   | -0.89   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80   |   -1.48% 
  |   0.61%   |   0.60%| 5 |   -1.39%   | -1.73   | -3.91   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32|   -4.92% 
  |   0.05%   |   0.40%| 5 |   -4.93%   | -2.31   | -27.46  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67| I -5.41% 
  |   0.81%   |   0.03%| 5 | I -5.70%   | -2.31   | -15.75  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14| I -6.93% 
  |   3.13%   |   2.62%| 5 | I -9.31%   | -2.02   | -3.96   |
| TPCH(30) | TPCH-Q9  | parquet / none / none | 15.59  | 17.02   | I -8.38% 
  |   0.95%   |   0.43%

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#11).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when closing the
join builder.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same was as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferring until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 6.26| -15.70%| 4.63   | 
-16.16%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25   | R +7.38% 
  |   0.51%   |   0.22%| 5 | R +6.95%   | 2.31| 27.93   |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79|   +1.31% 
  |   1.86%   |   0.36%| 5 |   +1.88%   | 1.15| 1.53|
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28|   -0.01% 
  |   1.64%   |   1.63%| 5 |   -0.11%   | -0.58   | -0.01   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68|   -0.94% 
  |   0.84%   |   1.46%| 5 |   -0.21%   | -0.87   | -1.25   |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72|   -0.56% 
  |   1.29%   |   0.52%| 5 |   -1.04%   | -1.15   | -0.89   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80   |   -1.48% 
  |   0.61%   |   0.60%| 5 |   -1.39%   | -1.73   | -3.91   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32|   -4.92% 
  |   0.05%   |   0.40%| 5 |   -4.93%   | -2.31   | -27.46  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67| I -5.41% 
  |   0.81%   |   0.03%| 5 | I -5.70%   | -2.31   | -15.75  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14| I -6.93% 
  |   3.13%   |   2.62%| 5 | I -9.31%   | -2.02   | -3.96   |
| TPCH(30) | TPCH-Q9  | parquet / none / none | 15.59  | 17.02   | I -8.38% 
  |   0.95%   |   0.43%

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 10:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.h
File be/src/exec/partitioned-hash-join-builder.h:

http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.h@444
PS10, Line 444: void IncrementNumSpilledProbeRows(int64_t count) { 
num_spilled_probe_rows_.Add(count); }
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc
File be/src/exec/partitioned-hash-join-builder.cc:

http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@601
PS10, Line 601:   VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
probe_stream_reservation_=" << probe_stream_reservation_.GetReservation();
line too long (125 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@603
PS10, Line 603:   VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
probe_stream_reservation_=" << probe_stream_reservation_.GetReservation();
line too long (125 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@627
PS10, Line 627:   VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
probe_stream_reservation_=" << probe_stream_reservation_.GetReservation();
line too long (125 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@629
PS10, Line 629:   VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
probe_stream_reservation_=" << probe_stream_reservation_.GetReservation();
line too long (125 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@632
PS10, Line 632: VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
transferring " << probe_reservation << " back to builder.";
line too long (112 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@712
PS10, Line 712: VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
transferring " << probe_reservation << " back to builder.";
line too long (112 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@776
PS10, Line 776: VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
transferring " << probe_reservation << " back to builder.";
line too long (112 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@940
PS10, Line 940:   int64_t saved_probe_reservation = need_probe_buffer ? 
max_row_buffer_size_ * num_probe_threads_ : 0;
line too long (102 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@941
PS10, Line 941:   VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
probe_stream_reservation_=" << probe_stream_reservation_.GetReservation();
line too long (125 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@944
PS10, Line 944:   VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
probe_stream_reservation_=" << probe_stream_reservation_.GetReservation();
line too long (125 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@966
PS10, Line 966:   VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
probe_stream_reservation_=" << probe_stream_reservation_.GetReservation();
line too long (125 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@969
PS10, Line 969:   VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
probe_stream_reservation_=" << probe_stream_reservation_.GetReservation();
line too long (125 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/be/src/exec/partitioned-hash-join-builder.cc@1051
PS10, Line 1051:   VLOG(3) << "PHJ(node_id=" << join_node_id_ << ") 
transferring " << bytes << " back to builder.";
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/15096/10/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@201
PS10, Line 201:   DataStreamSink streamSink = new 
DataStreamSink((ExchangeNode)destNode_, outputPartition_);
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@371
PS10, Line 371:   // join. ParallelPlanner sets the destination fragment 
when adding the JoinBuildSink.
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/15096/10/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@400
PS10, Line 400:   // join. ParallelPlanner sets the destination fragment 
when adding the 

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#10).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when closing the
join builder.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same was as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferring until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 6.26| -15.70%| 4.63   | 
-16.16%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25   | R +7.38% 
  |   0.51%   |   0.22%| 5 | R +6.95%   | 2.31| 27.93   |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79|   +1.31% 
  |   1.86%   |   0.36%| 5 |   +1.88%   | 1.15| 1.53|
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28|   -0.01% 
  |   1.64%   |   1.63%| 5 |   -0.11%   | -0.58   | -0.01   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68|   -0.94% 
  |   0.84%   |   1.46%| 5 |   -0.21%   | -0.87   | -1.25   |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72|   -0.56% 
  |   1.29%   |   0.52%| 5 |   -1.04%   | -1.15   | -0.89   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80   |   -1.48% 
  |   0.61%   |   0.60%| 5 |   -1.39%   | -1.73   | -3.91   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32|   -4.92% 
  |   0.05%   |   0.40%| 5 |   -4.93%   | -2.31   | -27.46  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67| I -5.41% 
  |   0.81%   |   0.03%| 5 | I -5.70%   | -2.31   | -15.75  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14| I -6.93% 
  |   3.13%   |   2.62%| 5 | I -9.31%   | -2.02   | -3.96   |
| TPCH(30) | TPCH-Q9  | parquet / none / none | 15.59  | 17.02   | I -8.38% 
  |   0.95%   |   0.43%

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5449/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 07 Mar 2020 01:28:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5451/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 07 Mar 2020 01:38:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5450/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 07 Mar 2020 01:37:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5454/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 07 Mar 2020 02:43:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Tim Armstrong (Code Review)
Hello Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15096

to look at the new patch set (#13).

Change subject: IMPALA-9156: share broadcast join builds
..

IMPALA-9156: share broadcast join builds

The scheduler will only create one join build finstance per
backend in cases where this is supported.

The builder is aware of the number of finstances executing the
probe and hands off the build data structures to the builders.

Nested loop join requires minimal modifications because the
build data structures are read-only after initial construction.
The only significant change is that memory can't be transferred
to the multiple consumers, so MarkNeedsDeepCopy() needs to be
used instead.

Hash join requires additional synchronisation because the
spilling algorithm mutates build-side data structures. This
patch adds synchronisation so that rebuilding spilled
partitions is done in a thread-safe manner, using a single
thread. This uses the CyclicBarrier added in an earlier patch.

Threads blocked on CyclicBarrier need to be cancellable,
which is handled by cancelling the barrier when closing the
join builder.

Update planner to cost broadcast join and estimate memory
consumption based on a single instance per node.

Planner estimates of number of instances are improved. Instead of
assuming mt_dop instances per node, use the total number of input
splits (also called scan ranges in places) as an upper bound on
the number of instances generated by scans. These instance
estimates from the scan nodes are then propagated up the
plan tree in the same was as the numNodes estimates. The instance
estimate for the join build fragment is fixed to be based on
the destination fragment.

The profile now correctly accounts for time waiting for the
builder, counting it in inactive time and showing it in the
node timeline. Additional improvements/cleanup to the time
accounting are deferring until IMPALA-9422.

Testing:
* Updated planner tests
* Ran a single node stress test with TPC-H and TPC-DS
* Add a targeted test for spilling broadcast joins, both repartitioning
  and not repartitioning.
* Add a targeted test for a spilling broadcast join with empty probe
* Add a targeted test for spilling broadcast join with empty build
  partitions.
* Add a broadcast join to test_cancellation and test_failpoints.

Perf:

I did a single node run on my desktop:
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 6.26| -15.70%| 4.63   | 
-16.16%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.97  | 23.25   | R +7.38% 
  |   0.51%   |   0.22%| 5 | R +6.95%   | 2.31| 27.93   |
| TPCH(30) | TPCH-Q4  | parquet / none / none | 2.83   | 2.79|   +1.31% 
  |   1.86%   |   0.36%| 5 |   +1.88%   | 1.15| 1.53|
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.28   | 1.28|   -0.01% 
  |   1.64%   |   1.63%| 5 |   -0.11%   | -0.58   | -0.01   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.65   | 2.68|   -0.94% 
  |   0.84%   |   1.46%| 5 |   -0.21%   | -0.87   | -1.25   |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.69   | 4.72|   -0.56% 
  |   1.29%   |   0.52%| 5 |   -1.04%   | -1.15   | -0.89   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 10.64  | 10.80   |   -1.48% 
  |   0.61%   |   0.60%| 5 |   -1.39%   | -1.73   | -3.91   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 4.11   | 4.32|   -4.92% 
  |   0.05%   |   0.40%| 5 |   -4.93%   | -2.31   | -27.46  |
| TPCH(30) | TPCH-Q20 | parquet / none / none | 3.47   | 3.67| I -5.41% 
  |   0.81%   |   0.03%| 5 | I -5.70%   | -2.31   | -15.75  |
| TPCH(30) | TPCH-Q17 | parquet / none / none | 7.58   | 8.14| I -6.93% 
  |   3.13%   |   2.62%| 5 | I -9.31%   | -2.02   | -3.96   |
| TPCH(30) | TPCH-Q9  | parquet / none / none | 15.59  | 17.02   | I -8.38% 
  |   0.95%   |   0.43%

[Impala-ASF-CR] IMPALA-9156: share broadcast join builds

2020-03-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15096 )

Change subject: IMPALA-9156: share broadcast join builds
..


Patch Set 13:

Rebased and fix an overly strict dcheck that was triggered by 
TestSpillingNoDebugActionDimensions::test_spilling_naaj_no_deny_reservation


--
To view, visit http://gerrit.cloudera.org:8080/15096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c67e4b2c87ed0fba648f1e1710addb885d66dc7
Gerrit-Change-Number: 15096
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 07 Mar 2020 01:59:10 +
Gerrit-HasComments: No