Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14538 )
Change subject: IMPALA-4400: aggregate runtime filters locally ...................................................................... Patch Set 20: (14 comments) I missed the earlier round of comments. Should be addressed now. http://gerrit.cloudera.org:8080/#/c/14538/15//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14538/15//COMMIT_MSG@18 PS15, Line 18: can done > typo Done http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/exec/kudu-scan-node-base.cc File be/src/exec/kudu-scan-node-base.cc: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/exec/kudu-scan-node-base.cc@110 PS15, Line 110: KUDU_RETURN_IF_ERROR(client_->OpenTable(table_desc->table_name(), &table_), > Maybe outside the scope of this patch, but it looks like the hdfs scanners Yeah I'd prefer to avoid this behavioural change (it seems fairly low risk, but this patch is so big already). http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/coordinator-backend-state.h@136 PS15, Line 136: > const Done http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/coordinator-backend-state.h@140 PS15, Line 140: > const Done http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.h@161 PS15, Line 161: > const Done http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.cc@a679 PS15, Line 679: > Looks like maybe this was the only use of 'fragment_map_' anywhere and its Nice catch, removed it http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h File be/src/runtime/runtime-filter-bank.h: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h@68 PS15, Line 68: AllocateScratchBloomFilter() > Not your change, but this is wrong, since the sentence refers to both bloom Done http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h@79 PS15, Line 79: PublishGlobalFilte : /// () > typo Done http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-state.h@63 PS15, Line 63: /// A collection of items that are part of the global state of a query and shared across > Not your change, but this comment seems wrong/confusing to me. RuntimeState Yeah this was very stale. Updated the comment to reflect the current state of things. http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/scheduling/query-schedule.cc File be/src/scheduling/query-schedule.cc: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/scheduling/query-schedule.cc@246 PS15, Line 246: } > Looks like you're relying on the fact that all instances for a particular h I also depend on this for some of the later mt patches, so good call in documenting it. http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/bloom-filter.cc File be/src/util/bloom-filter.cc: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/bloom-filter.cc@251 PS15, Line 251: if (other.AlwaysFalse()) return; > Doesn't look like you're handling the always_true case, maybe add a DCHECK? Good point. It has to be handled externally anyway in RuntimeFilter because of the decision to represent ALWAYS_TRUE_FILTER as a null pointer - we can't convert this to an always true filter in-place and it also isn't valid to pass in a null reference anyway. Anyway, I added a dcheck and comments. http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.h File be/src/util/min-max-filter.h: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.h@74 PS15, Line 74: virtual void SetAlwaysTrue() = 0; > protected? Done http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.cc File be/src/util/min-max-filter.cc: http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.cc@324 PS15, Line 324: // Sets 'always_true_' to true and clears the values of 'min_', 'max_', 'min_buffer_', > Seems pretty clear from the code, so probably unnecessary, or maybe leave i Done http://gerrit.cloudera.org:8080/#/c/14538/15/common/protobuf/data_stream_service.proto File common/protobuf/data_stream_service.proto: http://gerrit.cloudera.org:8080/#/c/14538/15/common/protobuf/data_stream_service.proto@136 PS15, Line 136: // optional int32 dst_fragment_idx = 3; > Any reason not to just completely remove this and renumber? Its completely Not really, did as suggested. Can't remember why I decided to do this. -- To view, visit http://gerrit.cloudera.org:8080/14538 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabeeab5eec869ff2197250ad41c1eb5551704acc Gerrit-Change-Number: 14538 Gerrit-PatchSet: 20 Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Tue, 21 Jan 2020 20:07:06 +0000 Gerrit-HasComments: Yes
