Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13882 )
Change subject: IMPALA-7984: Port runtime filter from Thrift RPC to KRPC ...................................................................... Patch Set 18: (16 comments) Hi Thomas and Michael, I have addressed all the previous comments. Please review my revised patch. Thank you very much! http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.h@48 PS17, Line 48: namespace impala { > don't use 'using' in header files Done http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.cc@986 PS17, Line 986: : : exe > You can leave the formatting as-is Done http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.cc@1041 PS17, Line 1041: > Its unfortunate that this logic is duplicated between here and BloomFilter: Thanks very much for the detailed explanation! As we have discussed, we will adopt the first option since it requires much fewer modifications to the current code. Otherwise, other than 'BloomFilter::Or()' and those places that make use of that 'BloomFilterPB' and its respective directory stored by the class 'Coordinator::FilterState', we also have to make the corresponding changes to 'bloom-filter-test.cc' and 'bloom-filter-benchmark.cc', which might take much more time to implement. http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.cc@1089 PS17, Line 1089: reinterpret_cast<const char*>(&((bloom_filter_directory)[0])), : static_cast<unsigned long>(bloom_filter > Might be cleaner to check has_directory_sidecar_idx() here, which should be Done http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.cc@1096 PS17, Line 1096: Coordinator* coord, RpcContext* context) { > You need to handle this error, i.e. call Disable() Done http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.h File be/src/runtime/runtime-filter-bank.h: http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.h@52 PS17, Line 52: /// RuntimeFilters are produced and consumed by plan nodes at run time to propagate > I don't think this is used anywhere Thanks for pointing this out! I have removed it. http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.h@142 PS17, Line 142: > I think this can be private. Thanks for pointing this out. I have modified the name and made it private. http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.h@154 PS17, Line 154: boost::uno > num_inflight_rpcs_ Done http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc File be/src/runtime/runtime-filter-bank.cc: http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@133 PS17, Line 133: DCHECK_NE(state_->query_options().runtime_filter_mode, TRuntimeFilterMode::OFF) > While thinking about the locking protocol in Close(), I noticed that we don Thanks very much for the detailed explanation! I have double-checked what you have described above and I think your argument is correct. I have also added an additional 'DCHECK(!closed_);' with an additional brief comment as suggested. On the other hand, I also found that RuntimeState::ReleaseResources() will also be called by Java_org_apache_impala_service_FeSupport_NativeEvalExprsWithoutRow() in fe-support.cc. This function will do the work for the frontend to rewrite a given batch of const expressions according to the comments of this function. So I guess in this case we do not have to worry about RuntimeState::ReleaseResources() being called before RuntimeFilterBank::UpdateFilterFromLocal() since RuntimeFilterBank::UpdateFilterFromLocal() will not be called is such queries. http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@187 PS17, Line 187: // Use 'proxy' to send the filter to the coordinator. : std::unique_ptr<DataStreamServiceProxy> proxy; : Status get_proxy_status = : > This needs to be moved down after the check for get_proxy_status.ok() Thanks for pointing this out! I have moved this down after the check for get_proxy_status.ok(). http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@199 PS17, Line 199: // Use 'num_inflight_rpcs_' to keep track of the number of current in-flight > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@238 PS17, Line 238: kudu::Status status = > I don't think that we can just return here, we need to actually handle the Thanks for pointing this out. I have set 'bloom_filter' to 'BloomFilter::ALWAYS_TRUE_FILTER' if 'status.ok()' is false. http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@243 PS17, Line 243: bloom_filter = BloomFilt > This is doing an unnecessary copy, to Slice::ToString() copies all of the d Thanks for the suggestion. I have added a function 'BloomFilter::Init()' that takes a 'const uint8_t*' and a 'size_t' provided by 'sidecar_slice.data()' and 'sidecar_slice.size()', respectively. http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@314 PS17, Line 314: BloomFilter::FalsePositiveProb(observed_ndv, BitUtil::Log2Ceiling64(filter_size)); : return fpp > FLAGS_max_filter_error_rate; : } > This comment is a little weird, maybe rephrase it more like "Wait for all i Thank you for the suggestion. I have rephrased this comment as suggested. http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@317 PS17, Line 317: : void RuntimeFilterBank::Close() { : // Wait for all inflight rpcs to complete before closing the filters. : { > I don't think it technically matters here, but always better not to hold tw Thanks for the explanation! I have put this in its own {} block. http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@324 PS17, Line 324: } > unnecessary extra whitespace Done -- To view, visit http://gerrit.cloudera.org:8080/13882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6b394796d250286510e157ae326882bfc01d387a Gerrit-Change-Number: 13882 Gerrit-PatchSet: 18 Gerrit-Owner: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Comment-Date: Mon, 09 Sep 2019 15:12:11 +0000 Gerrit-HasComments: Yes
