Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14973 )

Change subject: IMPALA-9274: cyclic barrier implementation
......................................................................


Patch Set 4:

(1 comment)

Thanks for looking into this in detail, it would be easy for something subtle 
to slip in. I explained why I think it's ok (it took me a little bit to 
convince myself).

http://gerrit.cloudera.org:8080/#/c/14973/4/be/src/util/cyclic-barrier.h
File be/src/util/cyclic-barrier.h:

http://gerrit.cloudera.org:8080/#/c/14973/4/be/src/util/cyclic-barrier.h@61
PS4, Line 61:     barrier_cv_.NotifyAll();
> Shouldn't this (+ the same call in Cancel()) happen while we hold the lock?
Dropping the lock before calling notify is deliberate since the threads can't 
do anything until the mutex is released, this avoid the scenario where the 
threads wake up and then immediately block on the mutex. I remember Zoli 
converted a few places to this pattern.

I think I see what you're saying - there's scenarios like this one:

* The barrier is created for 1 thread.
* thread 1 calls Wait(), sees its the last thread, and executes up to l61, 
where it has dropped the lock
* thread 2 calls Cancel(), acquires the lock, sets cancel_status_, then returns.

The potential problem is that Wait() returned OK after Cancel() returned. The 
function comment doesn't claim that Wait() invocations that return after 
Cancel() returns always return a CANCELLED status - it only says that they must 
returned CANCELLED if they block, or if they are future calls (i.e. that start 
after Cancel() returns).

I don't think that we need the stronger invariant (actually it's kinda 
impossible to guarantee), because the behaviour from the caller's point of view 
is equivalent to the following scenario, which could occur even if you held 
mutex until after the notification occurred:
* thread 1 calls Wait(), sees its the last thread, and executes up to the 
return instruction of Wait(). At this point it has released the lock.
* thread 2 calls Cancel(), acquires the lock, sets cancel_status_, then returns.



--
To view, visit http://gerrit.cloudera.org:8080/14973
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id5c8cc6b3023d2dc03c8116d462440c5e25c3bb0
Gerrit-Change-Number: 14973
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Andrew Sherman <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 07 Jan 2020 06:03:32 +0000
Gerrit-HasComments: Yes

Reply via email to