Sailesh Mukil has posted comments on this change. (
http://gerrit.cloudera.org:8080/11055 )
Change subject: IMPALA-3825: Distribute Runtime Filtering Aggregation
..
Patch Set 3:
(40 comments)
http://gerrit.cloudera.org:8080/#/c/11055/3//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/11055/3//COMMIT_MSG@14
PS3, Line 14:
Add comments about the following, so that it's easier for reviewers:
- Explain in a para or two how your patch achieves the distribution; i.e.
explain your approach in plain english.
- What kind of new failure modes can happen because of this change, as opposed
to before? (Talk about the Exec() RPC race with the UpdateFilter() RPC)
- How long do we expect the aggregators to be accessible? i.e. what is the
lifetime of an aggregator tied to?
- How are you updating the runtime profile?
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator-backend-state.cc@95
PS3, Line 95: x
naming
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator-backend-state.cc@98
PS3, Line 98: LOG(INFO) << "DebuggingPublishFilter AggregatorAddress " <<
tfs.aggregator_address;
remove
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator-backend-state.cc@95
PS3, Line 95: for (auto const& x : filter_routing_table) {
: TFilterState tfs;
: x.second.ToThrift();
: LOG(INFO) << "DebuggingPublishFilter AggregatorAddress " <<
tfs.aggregator_address;
: rpc_params->filter_routing_table.insert(
: std::pair(x.first, tfs));
: }
Add a comment about what you're doing here.
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator.cc@260
PS3, Line 260: x
naming. Call it 'filter' or something similar
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator.cc@262
PS3, Line 262: num_filters_
Is this needed as a member variable? Doesn't seem to be used anywhere.
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator.cc@690
PS3, Line 690: if (params.__isset.filter_updates_received) {
Add a comment above this line:
"Update aggregator's filter metrics in the runtime profile"
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator.cc@691
PS3, Line 691: if (backend_state->GetNumReceivedFilters() <
params.filter_updates_received) {
Add another comment above this line:
"Make sure not to double count filter updates from the same aggregator."
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator.cc@690
PS3, Line 690: if (params.__isset.filter_updates_received) {
: if (backend_state->GetNumReceivedFilters() <
params.filter_updates_received) {
: filter_updates_received_->Add(
: params.filter_updates_received -
backend_state->GetNumReceivedFilters());
:
backend_state->SetNumReceivedFilters(params.filter_updates_received);
: }
: }
There's a race here. If 2 updates for the same Backend execute in parallel,
you'll end up having an incorrect number of updated filters.
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/coordinator.cc@884
PS3, Line 884: LOG(INFO)
: << "DebuggingPublishFilter Coordinator sending filter
to fragment with id "
: << fragment_idx;
remove
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/filter-state.h
File be/src/runtime/filter-state.h:
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/filter-state.h@141
PS3, Line 141: FilterTarget
const ref
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/filter-state.h@150
PS3, Line 150: /// Need to cast the int type of this class to int32_t of thrift
Still needed?
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/filter-state.h@151
PS3, Line 151: set
Why not unordered?
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/filter-state.h@152
PS3, Line 152: int i
int32_t here and remove the cast in the next line. Also, rename to 'idx'.
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/filter-state.h@167
PS3, Line 167: TFilterTarget
const ref
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/filter-state.h@171
PS3, Line 171: boost
std
http://gerrit.cloudera.org:8080/#/c/11055/3/be/src/runtime/filter-state.h@130
PS3, Line 130: bool disabled() const {
: if (is_bloom_filter()) {
: return bloom_filter_.always_true;
: } else {
: DCHECK(is_min_max_filter());
: return