[Impala-ASF-CR] IMPALA-3825: [WIP]Distributed RuntimeFiltering Work in progress patch for Distributing runtime filtering aggregation.

2018-12-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change. ( 
http://gerrit.cloudera.org:8080/10937 )

Change subject: IMPALA-3825: [WIP]Distributed RuntimeFiltering Work in progress 
patch for Distributing runtime filtering aggregation.
..


Abandoned

https://gerrit.cloudera.org/#/c/11055/
--
To view, visit http://gerrit.cloudera.org:8080/10937
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I0289ee39787d8e9c13a16c3d7d64f471bc554536
Gerrit-Change-Number: 10937
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Rahul Shivu Mahadev 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3825: [WIP]Distributed RuntimeFiltering Work in progress patch for Distributing runtime filtering aggregation.

2018-07-26 Thread Rahul Shivu Mahadev (Code Review)
Rahul Shivu Mahadev has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10937 )

Change subject: IMPALA-3825: [WIP]Distributed RuntimeFiltering Work in progress 
patch for Distributing runtime filtering aggregation.
..


Patch Set 1:

(9 comments)

Comments addressed in new change https://gerrit.cloudera.org/#/c/11055/

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/common/global-flags.cc@132
PS1, Line 132: DEFINE_bool(enable_distributed_filter_aggregation, true,
 : "Enables aggregation of filters"
 : "in a distributed manner, set to false to revert to 
coordinator based aggregation");
> After looking through the code, it seems better to avoid having this flag.
Will keep it until testing


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc@93
PS1, Line 93: x.second.ToThrift();
> Move inside if(), since it's only needed in that case.
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc@94
PS1, Line 94: AggregatorRoutingTable::const_iterator it = 
aggregator_routing_table.find(x.first);
: if (it != aggregator_routing_table.end()) {
:   tfs.__set_aggregator_address(it->second);
:   rpc_params->filter_routing_table.insert(
:   std::pair(x.first, tfs));
: }
> Wouldn't all filters that are present in 'filter_routing_table' be present
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc@101
PS1, Line 101:
> We also don't need to send the filter routing table to the nodes that don't
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator.cc@284
PS1, Line 284:
> We can set the aggregator address in the FilterState here, and not have the
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-exec-mgr.h@22
PS1, Line 22: #include 
> None of these new #includes should be needed since you haven't added any ne
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-state.cc@559
PS1, Line 559:
> Add a TODO here to call PublishFilterFromAggregator() asynchronously using
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/runtime-filter-bank.cc@205
PS1, Line 205:   SendFilterToAggregator(
> What if the aggregator and the producer are on the same node? We can consid
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/10937/1/bin/bootstrap_toolchain.py@430
PS1, Line 430: "flatbuffers", "gcc", "gflags", "glog",
> I'm guessing this was a mistake?
Done



--
To view, visit http://gerrit.cloudera.org:8080/10937
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0289ee39787d8e9c13a16c3d7d64f471bc554536
Gerrit-Change-Number: 10937
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Rahul Shivu Mahadev 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:22:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3825: [WIP]Distributed RuntimeFiltering Work in progress patch for Distributing runtime filtering aggregation.

2018-07-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10937 )

Change subject: IMPALA-3825: [WIP]Distributed RuntimeFiltering Work in progress 
patch for Distributing runtime filtering aggregation.
..


Patch Set 1:

(9 comments)

I added some high level comments. After addressing these high level comments, 
you can clean up the code so we can do a second round of reviews.

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/common/global-flags.cc@132
PS1, Line 132: DEFINE_bool(enable_distributed_filter_aggregation, true,
 : "Enables aggregation of filters"
 : "in a distributed manner, set to false to revert to 
coordinator based aggregation");
After looking through the code, it seems better to avoid having this flag. 
Let's get rid of coordinator-filter-state.* completely, and go with the new 
approach you've done. When we need to compare the 2 approaches for benchmarking 
purposes, we can use 2 branches.


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc@93
PS1, Line 93: x.second.ToThrift();
Move inside if(), since it's only needed in that case.


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc@94
PS1, Line 94: AggregatorRoutingTable::const_iterator it = 
aggregator_routing_table.find(x.first);
: if (it != aggregator_routing_table.end()) {
:   tfs.__set_aggregator_address(it->second);
:   rpc_params->filter_routing_table.insert(
:   std::pair(x.first, tfs));
: }
Wouldn't all filters that are present in 'filter_routing_table' be present in 
'aggregator_routing_table' ?

If that's the case, we're doing the same work multiple times here (once per 
BackendState), and we would probably be better off not having the 
'aggregator_routing_table' at all and just set the aggregator address in 
'FilterState' as part of InitFilterRoutingTable().


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc@101
PS1, Line 101:
We also don't need to send the filter routing table to the nodes that don't 
aggregate or don't produce right? If it's not simple to do, you can leave a 
TODO for now.


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator.cc@284
PS1, Line 284:
We can set the aggregator address in the FilterState here, and not have the 
aggregator_routing_table.


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-exec-mgr.h@22
PS1, Line 22: #include 
None of these new #includes should be needed since you haven't added any new 
code to this file.


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-state.cc@559
PS1, Line 559:
Add a TODO here to call PublishFilterFromAggregator() asynchronously using a 
thread pool.


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/runtime-filter-bank.cc@205
PS1, Line 205:   SendFilterToAggregator(
What if the aggregator and the producer are on the same node? We can consider 
short-circuiting that path. Add a TODO for that.


http://gerrit.cloudera.org:8080/#/c/10937/1/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/10937/1/bin/bootstrap_toolchain.py@430
PS1, Line 430: "flatbuffers", "gcc", "gflags", "glog",
I'm guessing this was a mistake?



--
To view, visit http://gerrit.cloudera.org:8080/10937
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0289ee39787d8e9c13a16c3d7d64f471bc554536
Gerrit-Change-Number: 10937
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 19 Jul 2018 16:42:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3825: [WIP]Distributed RuntimeFiltering Work in progress patch for Distributing runtime filtering aggregation.

2018-07-12 Thread Rahul Shivu Mahadev (Code Review)
Rahul Shivu Mahadev has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10937


Change subject: IMPALA-3825: [WIP]Distributed RuntimeFiltering Work in progress 
patch for Distributing runtime filtering aggregation.
..

IMPALA-3825: [WIP]Distributed RuntimeFiltering
Work in progress patch for Distributing runtime filtering
aggregation.

Change-Id: I0289ee39787d8e9c13a16c3d7d64f471bc554536
---
M be/src/common/global-flags.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
A be/src/runtime/filter-state.cc
A be/src/runtime/filter-state.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter.inline.h
M be/src/service/impala-server.cc
M bin/bootstrap_toolchain.py
M common/thrift/ImpalaInternalService.thrift
18 files changed, 738 insertions(+), 72 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/10937/1
--
To view, visit http://gerrit.cloudera.org:8080/10937
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0289ee39787d8e9c13a16c3d7d64f471bc554536
Gerrit-Change-Number: 10937
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev