Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18150#discussion_r121089252
  
    --- Diff: 
core/src/main/scala/org/apache/spark/scheduler/ShuffleMapStage.scala ---
    @@ -133,6 +133,28 @@ private[spark] class ShuffleMapStage(
       }
     
       /**
    +   * Removes all shuffle outputs associated with this host. Note that this 
will also remove
    +   * outputs which are served by an external shuffle server (if one 
exists), as they are still
    +   * registered with this execId.
    +   */
    +  def removeOutputsOnHost(host: String): Unit = {
    --- End diff --
    
    This function is pretty overlap in code with `removeOutputsOnExecutor`, How 
about combine them as:
    ```
      def removeOutputsByFilter(f: (BlockManagerId) => Boolean): Unit = {
        ......
          val newList = prevList.filterNot(m => f(m))
        ......
      }
    ```
    And in DAGScheduler, we can simply pass in the filter functions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to