Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9718 )

Change subject: IMPALA-6338: Fix flaky test_profile_fragment_instances
......................................................................


Patch Set 1:

(5 comments)

Obviously this is all a complete mess.

I've thought about some ways to clean it up (see e.g. 
https://gerrit.cloudera.org/#/c/9656/) but haven't come up with anything very 
compelling. If anyone has any ideas, I'd love to hear them.

My understanding is that there is some work planned soon around query lifecycle 
and profile reporting as part of the scale project, so hopefully we'll be able 
to make more progress towards making this sane.

http://gerrit.cloudera.org:8080/#/c/9718/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/9718/1/be/src/runtime/coordinator-backend-state.h@119
PS1, Line 119:   boost::mutex* lock() { return &lock_; }
> If we keep this, I think this could help prevent coding errors by handing o
Is there a reasonable way to make that work given that we have a list of 
Backends that need to be iterated over and locked?
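
For what it's worth, here's a rough sketch of the guard-returning accessor I 
think you have in mind, applied to the iterate-and-lock pattern. AcquireLock() 
and VisitBackends() are hypothetical names, and I've substituted std::mutex for 
boost::mutex just to keep the sketch self-contained:

  #include <mutex>
  #include <vector>

  // Simplified stand-in for the real BackendState; not the actual class.
  class BackendState {
   public:
    // Hand out a guard instead of the raw mutex pointer, so the backend stays
    // locked exactly as long as the caller holds the guard.
    std::unique_lock<std::mutex> AcquireLock() {
      return std::unique_lock<std::mutex>(lock_);
    }

   private:
    std::mutex lock_;
  };

  // Iterating over the list of backends still works: the guard is movable and
  // scoped to one loop iteration at a time.
  void VisitBackends(std::vector<BackendState*>& backend_states) {
    for (BackendState* backend_state : backend_states) {
      std::unique_lock<std::mutex> lock = backend_state->AcquireLock();
      // ... read or update state protected by this backend's lock ...
    }
  }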


http://gerrit.cloudera.org:8080/#/c/9718/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/9718/1/be/src/runtime/coordinator-backend-state.cc@237
PS1, Line 237:   return num_remaining_instances_ == 0 || (!status_.ok() && !status_.IsCancelled());
> Is it possible to move the !status_.IsCancelled() to where we expect it to
Sure, that would be equivalent. I don't have a strong feeling either way about 
which is preferable.

I chose to go this route because the behavior of ApplyExecStatusReport() 
returning a bool indicating whether IsDone() became true is already unfortunate 
(to be clear, I wrote that behavior to fix a different coordinator/runtime 
profile bug, so it's my fault), but at least it was tied to something concrete.

I suppose we could just say "ApplyExecStatusReport() returns true if 
'num_remaining_backends_' should be decreased", but then we end up with the 
situation where a backend is still considered to be executing by the 
coordinator even though IsDone() returns true, which also screws up the logic 
of LogFirstInProgress().
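
For reference, the contract we're talking about looks roughly like this 
(heavily simplified, with hypothetical members and signatures; not the actual 
Coordinator/BackendState code):

  #include <mutex>

  class BackendState {
   public:
    // Returns IsDone(), i.e. whether the coordinator should stop counting this
    // backend as executing.
    bool ApplyExecStatusReport(/* status report */) {
      // ... apply the per-instance statuses from the report ...
      return IsDone();
    }

   private:
    bool IsDone() const {
      // All instances finished, or a genuine (non-cancellation) error.
      return num_remaining_instances_ == 0 || (!status_ok_ && !status_cancelled_);
    }

    int num_remaining_instances_ = 0;
    bool status_ok_ = true;
    bool status_cancelled_ = false;
  };

  class Coordinator {
   public:
    void UpdateBackendExecStatus(BackendState* backend_state) {
      if (backend_state->ApplyExecStatusReport()) {
        // IsDone() is true: this backend no longer counts as executing, which
        // is also what LogFirstInProgress() relies on.
        std::lock_guard<std::mutex> l(lock_);
        --num_remaining_backends_;
      }
    }

   private:
    std::mutex lock_;
    int num_remaining_backends_ = 0;
  };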


http://gerrit.cloudera.org:8080/#/c/9718/1/be/src/runtime/coordinator-backend-state.cc@249
PS1, Line 249:   for (const TFragmentInstanceExecStatus& instance_exec_status:
> Instead of acquiring all locks in ComputeQuerySummary(), we could acquire a
I didn't want to add any more locking to the profile update path, which may be 
called a very large number of times for queries that are long-running or have 
many fragments.

The relevant root profile would be the one for the FragmentStats of the 
fragment this instance corresponds to, and there could be contention with other 
backends trying to update different instances of the same fragment.

Adding the extra locking in ComputeQuerySummary() seemed better because it 
should only be called once at the end of the query.
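
To illustrate where that leaves the extra locking (simplified types, and 
ComputeExecSummary() is a hypothetical helper; not the real coordinator code), 
it lands on the once-per-query path rather than the per-report path:

  #include <mutex>
  #include <vector>

  class BackendState {
   public:
    std::mutex* lock() { return &lock_; }
    // Merges this backend's per-fragment stats into the query summary; called
    // once, at end of query, with the lock held by the caller.
    void ComputeExecSummary() {}

   private:
    std::mutex lock_;
  };

  void ComputeQuerySummary(std::vector<BackendState*>& backend_states) {
    for (BackendState* backend_state : backend_states) {
      // One acquisition per backend, once per query, so this doesn't contend
      // with the frequent ApplyExecStatusReport() profile-update path.
      std::lock_guard<std::mutex> l(*backend_state->lock());
      backend_state->ComputeExecSummary();
    }
  }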


http://gerrit.cloudera.org:8080/#/c/9718/1/be/src/runtime/coordinator-backend-state.cc@251
PS1, Line 251: instance_status
> Can we check here that (!instance_status.IsCancelled() || status_.IsCancell
Not quite, since (at least for successfully completed queries where fragments 
are cancelled because returned_all_results_ becomes true) BackendState::status_ 
is only updated as a result of these status reports, so the first time an 
instance reports CANCELLED, status_ won't already be CANCELLED.

What you would need is access to Coordinator::query_status_, which we could 
pass into ApplyExecStatusReport(), except that accessing it requires taking 
Coordinator::lock_, which of course we don't want to hold for the duration of 
ApplyExecStatusReport().

So, I think what we could do is: in UpdateBackendExecStatus(), take 
Coordinator::lock_ and make a copy of Coordinator::query_status_. Release lock_ 
and pass the copy into ApplyExecStatusReport(), along with a copy of the value 
of returned_all_results_. We can then say something like:
DCHECK(!instance_status.IsCancelled() || (!query_status_.ok() || returned_all_results_))
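
In sketch form, that would look roughly like the following (simplified types 
and signatures, not the real Coordinator/BackendState interfaces; the assert 
stands in for Impala's DCHECK):

  #include <cassert>
  #include <mutex>

  // 'Status' here is a stand-in for the real Status class.
  struct Status {
    bool ok_ = true;
    bool cancelled_ = false;
    bool ok() const { return ok_; }
    bool IsCancelled() const { return cancelled_; }
  };

  class BackendState {
   public:
    void ApplyExecStatusReport(const Status& instance_status,
                               const Status& query_status,
                               bool returned_all_results) {
      // An instance should only report CANCELLED if the query as a whole hit
      // an error or has already returned all of its results.
      assert(!instance_status.IsCancelled()
             || !query_status.ok() || returned_all_results);
      // ... apply the report under BackendState's own lock ...
    }
  };

  class Coordinator {
   public:
    void UpdateBackendExecStatus(const Status& instance_status,
                                 BackendState* backend_state) {
      Status query_status;
      bool returned_all_results;
      {
        // Snapshot the fields under Coordinator::lock_, then release it so we
        // don't hold lock_ for the duration of ApplyExecStatusReport().
        std::lock_guard<std::mutex> l(lock_);
        query_status = query_status_;
        returned_all_results = returned_all_results_;
      }
      backend_state->ApplyExecStatusReport(instance_status, query_status,
                                           returned_all_results);
    }

   private:
    std::mutex lock_;
    Status query_status_;
    bool returned_all_results_ = false;
  };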


http://gerrit.cloudera.org:8080/#/c/9718/1/be/src/runtime/coordinator-backend-state.cc@314
PS1, Line 314:   return IsDone();
> The new definition of IsDone() will return false for cancelled backends her
Right, that's the intended behavior.

num_remaining_backends_ is used to decide when to return from 
WaitForBackendCompletion(). There are two cases (sketch after the list below):

- The cancellation is happening because returned_all_results_ became true. By 
not decreasing num_remaining_backends_ immediately, we ensure that we wait for 
all fragments to Finalize() and provide us with final profile info. 
num_remaining_backends_ will still be decreased, but only once 
num_remaining_instances_ reaches 0.

- The cancellation is happening because the user initiated it or because there 
was an error. In this case, WaitForBackendCompletion() will return even if 
num_remaining_backends_ hasn't reached 0, as in either case 
backend_completion_cv_ will be notified in CancelInternal().
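
A rough sketch of those two exit paths (simplified, with standard-library 
stand-ins for Impala's ConditionVariable, and BackendCompleted() as a 
hypothetical helper; not the real Coordinator code):

  #include <condition_variable>
  #include <mutex>

  class Coordinator {
   public:
    // Blocks until every backend is done, or until cancellation wakes us up.
    void WaitForBackendCompletion() {
      std::unique_lock<std::mutex> l(lock_);
      while (num_remaining_backends_ > 0 && !cancelled_) {
        backend_completion_cv_.wait(l);
      }
    }

    // Called when a backend's IsDone() becomes true, i.e. only once all of its
    // instances have finished (covering the returned_all_results_ case, where
    // we wait for every fragment to Finalize() and send its final profile).
    void BackendCompleted() {
      std::lock_guard<std::mutex> l(lock_);
      if (--num_remaining_backends_ == 0) backend_completion_cv_.notify_all();
    }

    // User-initiated cancellation or an error: wake the waiter even though
    // num_remaining_backends_ may not have reached 0.
    void CancelInternal() {
      std::lock_guard<std::mutex> l(lock_);
      cancelled_ = true;
      backend_completion_cv_.notify_all();
    }

   private:
    std::mutex lock_;
    std::condition_variable backend_completion_cv_;
    int num_remaining_backends_ = 0;
    bool cancelled_ = false;
  };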



--
To view, visit http://gerrit.cloudera.org:8080/9718
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f8295e4ec580b476f7668c77aec1348fb2e868c
Gerrit-Change-Number: 9718
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Comment-Date: Tue, 20 Mar 2018 18:01:45 +0000
Gerrit-HasComments: Yes
