Sailesh Mukil has posted comments on this change. Change subject: IMPALA-3610: Account for memory used by filters in the coordinator ......................................................................
Patch Set 5: (14 comments) http://gerrit.cloudera.org:8080/#/c/4066/5/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: Line 284: /// coarsley for every filter update. > remove "coarsley" (sic) Done Line 289: // Apply an update to the aggregated filter with the received filter. > describe role of params Done Line 292: TPublishFilterParams* rpc_params, const TUniqueId& query_id); > in/out params go at end I removed the out params. Done. Line 343: // logging. > what do the scanner threads do with this mem tracker? When a MemLimitExceeded error is hit, MemTracker::LogUsage() is called which touches all the mem-trackers of that query using the parent-child relationship and gives a report of memory usage up to that point. If the coordinator is torn down before the scanner threads (which happens a lot apparently), the scanner threads will end up touching this already destroyed filter. Line 2102: TPublishFilterParams* rpc_params, const TUniqueId& query_id) { > pass in Coordinator* instead of the mem tracker and query_id, that's more c Done Line 2103: filter_updates_received->Add(1); > move this out into the caller, there is no other use of this parameter (and Done Line 2105: if (!desc.is_broadcast_join) { > if (is_broadcast_join) return; Done Line 2115: disabled = true; > why don't you move this into Discard and then check that explicitly in Upda Done Line 2131: // 'bloom_filter' will process subsequent updates otherwise. > the reason this wasn't apparent is that it wasn't stated what a discarded/i I moved disabled_ = true here, so that it's not so subtle what constitutes as a discarded filter. Also updated comments to reflect what a discarded filter looks like. http://gerrit.cloudera.org:8080/#/c/4066/6/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: Line 2066: TBloomFilter* non_const_filter; > rather than defining non_const_filter temporary here, I think it would be c Done PS6, Line 2068: non_const_filter > this shadows the def at L2066, but as I said above, let's get rid of 2066. Done Line 2074: non_const_filter = state->bloom_filter.get(); > And then here: Done Line 2085: state->DiscardFilter(filter_mem_tracker_.get()); > Does this do anything in cases #1 and #3, or is it only for case #2 (aggreg It's essentially a no-op for cases #1 and #3. I thought of moving it to L2077 but I felt it was more readable this way. http://gerrit.cloudera.org:8080/#/c/4066/4/be/src/util/bloom-filter.h File be/src/util/bloom-filter.h: Line 82: void Or(const BloomFilter& other); > in that case, please remove it. Done -- To view, visit http://gerrit.cloudera.org:8080/4066 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-HasComments: Yes
