Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14975 )

Change subject: IMPALA-9154: Make runtime filter propagation asynchronous
......................................................................


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-backend-state.h@21
PS2, Line 21: #include <condition_variable>
I don't think you need this here


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-backend-state.cc@593
PS2, Line 593:     if (state->num_inflight_rpcs() == 0) {
Its a little subtle why this is correct - since we hold FilterState::lock_ 
while we issue all of the PublishFilter rpcs, for us to be able to get the lock 
again here on line 591 means that all PublishFilters that will be issued for 
this FilterState must have already been issued, this filter has been disabled, 
and if we reach 0 we know that there won't be any more PublishFilters issued.

Can you add a DCHECK(state->disabled()) and a brief comment mentioning this?


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-filter-state.h
File be/src/runtime/coordinator-filter-state.h:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-filter-state.h@58
PS2, Line 58: /// Once a filter is disabled, subsequent updates for that filter 
are ignored.
We should document the thread safety of this class now, eg. if you take my 
advice from some of my other comments it will be something like "This class is 
not thread safe. Callers must always take 'lock()' themselves when calling any 
FilterState functions if thread safety is needed"


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-filter-state.h@92
PS2, Line 92: get_lock()
just 'lock()'


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-filter-state.h@161
PS2, Line 161: update_filter_done_cv_
publish_filter_done_cv_


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator-filter-state.h@178
PS2, Line 178:   SpinLock update_lock;
This isn't used anywhere anymore, right?


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.h@100
PS2, Line 100: /// Lock ordering: (lower-numbered acquired before 
higher-numbered)
Please update this comment to reflect the new protocol - what's listed here as 
filter_lock_ was replaced by FilterRoutingTable::lock in a previous patch so 
you can go ahead and update that too, and filter_update_lock_ has been replaced 
with FilterState::lock_ (and of course check that we are in fact following the 
ordering shown here for those locks)


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.cc@1050
PS2, Line 1050:   for (auto& filter : filter_routing_table_->id_to_filter) {
              :     filter.second.WaitForPublishFilter();
              :   }
              :
              :   for (auto& filter : filter_routing_table_->id_to_filter) {
              :     FilterState* state = &filter.second;
              :     state->DisableAndRelease(filter_mem_tracker_);
              :   }
> I have noticed that this patch failed the core tests in the ASAN build due
Yes - you definitely have to do WaitForPublishFilter() and DisableAndRelease() 
atomically with respect to FilterState::lock_ somehow, since otherwise after 
caling WaitForPublishFilter() you could get another call to UpdateFilter() 
which will attempt to make PublishFilter rpcs even though we think all the 
PublishFilter rpcs for this FilterState are done.

Possibly the most straight forward way of doing that (if you follow my other 
comments) is to not have WaitForPublishFilter() take FilterState::lock_, but 
instead take FilterState::get_lock() ourselves here in the 'for' loop and hold 
it while calling both WaitForPublishFilter() and DisableAndRelease()


http://gerrit.cloudera.org:8080/#/c/14975/2/be/src/runtime/coordinator.cc@1315
PS2, Line 1315:   unique_lock<SpinLock> l(lock_);
I think that this is the only place where you take FilterState::lock_ from 
within a FilterState function instead of using FilterState::get_lock(). Of 
course, usually that sort of thing is better encapsulation, but there isn't 
really a good way to avoid having a FilterState::get_lock() function, eg. 
because its needed in Coordinator::UpdateFilter, so I think it might be better 
to just be consistent and always require that callers take 
FilterState::get_lock() themselves before calling any FilterState functions.



--
To view, visit http://gerrit.cloudera.org:8080/14975
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb6726d349be701f3a0602b2ad5a934082f188a0
Gerrit-Change-Number: 14975
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Fri, 10 Jan 2020 20:09:56 +0000
Gerrit-HasComments: Yes

Reply via email to