Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21898#discussion_r206709985
  
    --- Diff: core/src/main/scala/org/apache/spark/BarrierTaskContext.scala ---
    @@ -27,6 +27,33 @@ trait BarrierTaskContext extends TaskContext {
        * Sets a global barrier and waits until all tasks in this stage hit 
this barrier. Similar to
        * MPI_Barrier function in MPI, the barrier() function call blocks until 
all tasks in the same
        * stage have reached this routine.
    +   *
    +   * This function is expected to be called by EVERY tasks in the same 
barrier stage in the SAME
    +   * pattern, otherwise you may get a SparkException. Some examples of 
misuses listed below:
    --- End diff --
    
    * It is hard to describe the following as "the SAME pattern": 
    
    ~~~scala
    if (context.partitionId() == 0) {
      // do something
      context.barrier()
    } else {
      context.barrier()
    }
    ~~~
    
    I would rephrase it as:
    
    "CAUTION! In a barrier stage, each task must have the same number of 
barrier() calls, in all possible code branches. Otherwise, you may get the job 
hanging or a SparkException after timeout."


---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to