Marcel Kornacker has posted comments on this change. Change subject: IMPALA-2956: Filters should be able to target multiple scan nodes ......................................................................
Patch Set 4: (13 comments) i think the handling of filters in the coordinator needs to be refined. http://gerrit.cloudera.org:8080/#/c/2932/4/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: Line 151: if (state->query_options().runtime_filter_mode == TRuntimeFilterMode::LOCAL && why do we need this check now? http://gerrit.cloudera.org:8080/#/c/2932/4/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: Line 492: if (filter_mode_ == TRuntimeFilterMode::LOCAL && !filter.has_local_targets) { if a filter doesn't have any remote targets, there's also no need to register it in the coordinator Line 505: // If this is a broadcast join with non-local targets, only build and publish it "with only non-local" Line 518: int target_ndx = it->second; const TRuntimeFilterTargetDesc& target = ..; and then use that Line 2034: vector<unordered_set<int32_t>> target_fragment_instance_idxs; if these are all fragment instance indices, why not just keep them in a single unordered_set? Line 2044: DCHECK(!state->desc.has_local_targets || state->desc.has_remote_targets) dcheck(state->desc.has_remote_targets) Line 2081: if (state->is_local_target[i]) continue; we shouldn't see local targets here http://gerrit.cloudera.org:8080/#/c/2932/4/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: Line 370: std::vector<bool> is_local_target; comment what it's indexed by (presumably mirrors targets, but good to spell out) Line 373: // Index into fragment_instance_states_ for target fragment instances. same here. also, who owns memory? given that you have a lot of parallel vectors, why not have a vector<struct FilterTarget>? http://gerrit.cloudera.org:8080/#/c/2932/4/be/src/runtime/runtime-filter.cc File be/src/runtime/runtime-filter.cc: Line 111 that still shouldn't be the case, no? Line 98: bool has_local_targets = false; i'd still stick with the singular, because it only needs to have one (local or remote) http://gerrit.cloudera.org:8080/#/c/2932/4/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: Line 88: // produced the runtime filter and false == it's a remote target, i assume? Line 104: 4: required map<Types.TPlanNodeId, i32> planid_to_target_ndx why not simply put the node id into TRuntimeFilterTargetDesc? -- To view, visit http://gerrit.cloudera.org:8080/2932 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iad2ce4e579a30616c469312a4e658140d317507b Gerrit-PatchSet: 4 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-HasComments: Yes
