[GitHub] flink pull request #4364: [FLINK-7216] [distr. coordination] Guard against c...

StephanEwen Thu, 20 Jul 2017 06:24:40 -0700

GitHub user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4364#discussion_r128512830
  
    --- Diff: flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionGraphTestUtils.java ---
    @@ -189,6 +189,35 @@ public static void finishAllVertices(ExecutionGraph eg) {
                }
        }
     
    +   /**
    +    * Turns a newly scheduled execution graph into a state where all vertices run.
    +    * This waits until all executions have reached state 'DEPLOYING' and then switches them to running.
    +    */
    +   public static void waitUntilDeployedAndSwitchToRunning(ExecutionGraph eg, long timeout) throws TimeoutException {
    +           // wait until everything is running
    +           for (ExecutionVertex ev : eg.getAllExecutionVertices()) {
    +                   final Execution exec = ev.getCurrentExecutionAttempt();
    +                   waitUntilExecutionState(exec, ExecutionState.DEPLOYING, timeout);
    +           }
    +
    +           // Note: As ugly as it is, we need this minor sleep, because between switching
    +           // to 'DEPLOYED' and when the 'switchToRunning()' may be called lies a race check
    +           // against concurrent modifications (cancel / fail). We can only switch this to running
    +           // once that check is passed. For the actual runtime, this switch is triggered by a callback
    +           // from the TaskManager, which comes strictly after that. For tests, we use mock TaskManagers
    +           // which cannot easily tell us when that condition has happened, unfortunately.
    +           try {
    +                   Thread.sleep(2);
    --- End diff --
    
    In very rare cases, it might. I want to change the `Execution` a bit on the `master` to make this unnecessary.
    
    However, that is too much surgery in a critical part for a bugfix release, so I decided to be conservative in the runtime code and rather pay this price in the tests.
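
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a poll-until-state helper with a timeout, roughly in the spirit of the `waitUntilExecutionState` call in the diff. The class `StateWaiter`, its `State` enum, and the `waitUntilState` signature are hypothetical stand-ins for illustration, not Flink's actual classes or API. The sketch also shows the limitation the code comment describes: polling can only observe the visible state, not whether an internal race check has already passed, which is why the test falls back to a short sleep.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical stand-in for a test utility; not Flink's actual code. */
final class StateWaiter {

    /** Simplified, hypothetical subset of execution states. */
    public enum State { CREATED, DEPLOYING, RUNNING }

    /**
     * Polls the supplier until it reports the target state, or throws
     * a TimeoutException once timeoutMillis has elapsed.
     * Note: this only observes the externally visible state. It cannot tell
     * whether any internal check that follows the state change has completed.
     */
    public static void waitUntilState(Supplier<State> current, State target, long timeoutMillis)
            throws TimeoutException, InterruptedException {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (current.get() != target) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException(
                        "State " + target + " not reached within " + timeoutMillis + " ms");
            }
            Thread.sleep(1); // back off briefly between polls
        }
    }
}
```

A test would call `StateWaiter.waitUntilState(stateSupplier, StateWaiter.State.DEPLOYING, timeout)` for each execution and only then trigger the switch to running. Replacing the extra sleep would require the mock to signal that the race check has passed, which is the kind of change the comment defers to `master`.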




