Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13882 )

Change subject: IMPALA-7984: Port runtime filter from Thrift RPC to KRPC
......................................................................


Patch Set 18:

(16 comments)

Hi Thomas and Michael, I have addressed all the previous comments. Please 
review my revised patch. Thank you very much!

http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.h@48
PS17, Line 48: namespace impala {
> don't use 'using' in header files
Done


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.cc@986
PS17, Line 986:
              :
              :   exe
> You can leave the formatting as-is
Done


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.cc@1041
PS17, Line 1041:
> Its unfortunate that this logic is duplicated between here and BloomFilter:
Thanks very much for the detailed explanation!

As we have discussed, we will adopt the first option since it requires much 
fewer modifications to the current code. Otherwise, other than 
'BloomFilter::Or()' and those places that make use of that 'BloomFilterPB' and 
its respective directory stored by the class 'Coordinator::FilterState', we 
also have to make the corresponding changes to 'bloom-filter-test.cc' and 
'bloom-filter-benchmark.cc', which might take much more time to implement.


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.cc@1089
PS17, Line 1089: reinterpret_cast<const char*>(&((bloom_filter_directory)[0])),
               :           static_cast<unsigned long>(bloom_filter
> Might be cleaner to check has_directory_sidecar_idx() here, which should be
Done


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/coordinator.cc@1096
PS17, Line 1096:     Coordinator* coord, RpcContext* context) {
> You need to handle this error, i.e. call Disable()
Done


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.h
File be/src/runtime/runtime-filter-bank.h:

http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.h@52
PS17, Line 52: /// RuntimeFilters are produced and consumed by plan nodes at 
run time to propagate
> I don't think this is used anywhere
Thanks for pointing this out! I have removed it.


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.h@142
PS17, Line 142:
> I think this can be private.
Thanks for pointing this out. I have modified the name and made it private.


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.h@154
PS17, Line 154: boost::uno
> num_inflight_rpcs_
Done


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@133
PS17, Line 133:   DCHECK_NE(state_->query_options().runtime_filter_mode, 
TRuntimeFilterMode::OFF)
> While thinking about the locking protocol in Close(), I noticed that we don
Thanks very much for the detailed explanation! I have double-checked what you 
have described above and I think your argument is correct.

I have also added an additional 'DCHECK(!closed_);' with an additional brief 
comment as suggested.

On the other hand, I also found that RuntimeState::ReleaseResources() will also 
be called by 
Java_org_apache_impala_service_FeSupport_NativeEvalExprsWithoutRow() in 
fe-support.cc. This function will do the work for the frontend to rewrite a 
given batch of const expressions according to the comments of this function. So 
I guess in this case we do not have to worry about 
RuntimeState::ReleaseResources() being called before 
RuntimeFilterBank::UpdateFilterFromLocal() since 
RuntimeFilterBank::UpdateFilterFromLocal() will not be called is such queries.


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@187
PS17, Line 187: // Use 'proxy' to send the filter to the coordinator.
              :     std::unique_ptr<DataStreamServiceProxy> proxy;
              :     Status get_proxy_status =
              :
> This needs to be moved down after the check for get_proxy_status.ok()
Thanks for pointing this out! I have moved this down after the check for 
get_proxy_status.ok().


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@199
PS17, Line 199:     // Use 'num_inflight_rpcs_' to keep track of the number of 
current in-flight
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@238
PS17, Line 238:         kudu::Status status =
> I don't think that we can just return here, we need to actually handle the
Thanks for pointing this out. I have set 'bloom_filter' to 
'BloomFilter::ALWAYS_TRUE_FILTER' if 'status.ok()' is false.


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@243
PS17, Line 243: bloom_filter = BloomFilt
> This is doing an unnecessary copy, to Slice::ToString() copies all of the d
Thanks for the suggestion. I have added a function 'BloomFilter::Init()' that 
takes a 'const uint8_t*' and a 'size_t' provided by 'sidecar_slice.data()' and 
'sidecar_slice.size()', respectively.


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@314
PS17, Line 314:     BloomFilter::FalsePositiveProb(observed_ndv, 
BitUtil::Log2Ceiling64(filter_size));
              :   return fpp > FLAGS_max_filter_error_rate;
              : }
> This comment is a little weird, maybe rephrase it more like "Wait for all i
Thank you for the suggestion. I have rephrased this comment as suggested.


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@317
PS17, Line 317:
              : void RuntimeFilterBank::Close() {
              :   // Wait for all inflight rpcs to complete before closing the 
filters.
              :   {
> I don't think it technically matters here, but always better not to hold tw
Thanks for the explanation! I have put this in its own {} block.


http://gerrit.cloudera.org:8080/#/c/13882/17/be/src/runtime/runtime-filter-bank.cc@324
PS17, Line 324:     }
> unnecessary extra whitespace
Done



--
To view, visit http://gerrit.cloudera.org:8080/13882
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b394796d250286510e157ae326882bfc01d387a
Gerrit-Change-Number: 13882
Gerrit-PatchSet: 18
Gerrit-Owner: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Comment-Date: Mon, 09 Sep 2019 15:12:11 +0000
Gerrit-HasComments: Yes

Reply via email to