Sailesh Mukil has posted comments on this change. Change subject: IMPALA-3610: Account for memory used by filters in the coordinator ......................................................................
Patch Set 9: (13 comments) http://gerrit.cloudera.org:8080/#/c/4066/9/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: Line 262: /// A filter may be discarded if an always_true filter update is received, an OOM is hit, > you use both disabled and discarded to mean the same thing (i think). let's Done Line 303: /// Resets bloom_filter_ to NULL, sets pending_count_ to 0, releases any memory > don't reference internal state in the comment of a public function (the rea Done Line 390: // comprehensive logging of all mem-trackers. > add TODO-MT: remove Done Line 718: // Cast the const away to read non-const member functions. No modification is done. > why do you need non-const access? I can't call non-const functions otherwise, specifically state.targets() and state.src_fragment_instance_idxs(). They need to be non-const because they are written to elsewhere. Is there a better way to do this? Line 726: for (const auto& target: *state.targets()) { > no auto Done Line 2048: for (auto& filter: filter_routing_table_) { > why make this a ref? I don't understad, versus? Not having a ref would implicitly call a copy-constructor. Line 2086: DCHECK_GT(state->pending_count(), 0); > from l2086 to l2090 can also go into ApplyUpdate() Done Line 2099: // If the filter was discarded, we should send out an always_true filter immediately. > this comment doesn't describe the following statement Done. Removed the comment. The next line describes this anyway. Line 2105: // No more filters are pending on this filter ID. Create a distribution payload and > "no more updates are..." Done Line 2107: state->set_completion_time(query_events_->ElapsedTime()); > move into ApplyUpdate This has to be set only once we are sure that we aren't going to wait for anymore updates. So this should be set only once after all updates are received or if a filter is disabled. Line 2119: if (state->desc().is_broadcast_join) { > also swap in the filter in ApplyUpdate for broadcast joins, then you don't That would cause broadcast filters having to swap() twice, which is two times the memory copy which could be noticeable for large filters. Also, we would TryConsume() memory for broadcast filters too then. It's probably okay since we process only one update for broadcast filters, but if we're okay with that I'll make the change quickly. Line 2183: pending_count_ = 0L; > does the pending count still matter at this point (and if so, why?) Not really. Setting it to 0 here has no more meaning. I removed this assignment. http://gerrit.cloudera.org:8080/#/c/4066/9/be/src/util/bloom-filter-test.cc File be/src/util/bloom-filter-test.cc: Line 277: BloomFilter::ToThrift(&bf3, &bf3_thrift); > trailing spaces Done -- To view, visit http://gerrit.cloudera.org:8080/4066 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-HasComments: Yes
