Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14973 )
Change subject: IMPALA-9274: cyclic barrier implementation ...................................................................... Patch Set 4: (1 comment) Thanks for looking into this in detail, it would be easy for something subtle to slip in. I explained why I think it's ok (it took me a little bit to convince myself). http://gerrit.cloudera.org:8080/#/c/14973/4/be/src/util/cyclic-barrier.h File be/src/util/cyclic-barrier.h: http://gerrit.cloudera.org:8080/#/c/14973/4/be/src/util/cyclic-barrier.h@61 PS4, Line 61: barrier_cv_.NotifyAll(); > Shouldn't this (+ the same call in Cancel()) happen while we hold the lock? Dropping the lock before calling notify is deliberate since the threads can't do anything until the mutex is released, this avoid the scenario where the threads wake up and then immediately block on the mutex. I remember Zoli converted a few places to this pattern. I think I see what you're saying - there's scenarios like this one: * The barrier is created for 1 thread. * thread 1 calls Wait(), sees its the last thread, and executes up to l61, where it has dropped the lock * thread 2 calls Cancel(), acquires the lock, sets cancel_status_, then returns. The potential problem is that Wait() returned OK after Cancel() returned. The function comment doesn't claim that Wait() invocations that return after Cancel() returns always return a CANCELLED status - it only says that they must returned CANCELLED if they block, or if they are future calls (i.e. that start after Cancel() returns). I don't think that we need the stronger invariant (actually it's kinda impossible to guarantee), because the behaviour from the caller's point of view is equivalent to the following scenario, which could occur even if you held mutex until after the notification occurred: * thread 1 calls Wait(), sees its the last thread, and executes up to the return instruction of Wait(). At this point it has released the lock. * thread 2 calls Cancel(), acquires the lock, sets cancel_status_, then returns. -- To view, visit http://gerrit.cloudera.org:8080/14973 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id5c8cc6b3023d2dc03c8116d462440c5e25c3bb0 Gerrit-Change-Number: 14973 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Andrew Sherman <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 07 Jan 2020 06:03:32 +0000 Gerrit-HasComments: Yes
