[GitHub] spark pull request: [SPARK-9851] Support submitting map stages ind...

zsxwing Tue, 08 Sep 2015 04:56:51 -0700

Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8180#discussion_r38916643
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/ActiveJob.scala ---
    @@ -23,18 +23,42 @@ import org.apache.spark.TaskContext
     import org.apache.spark.util.CallSite
     
     /**
    - * Tracks information about an active job in the DAGScheduler.
    + * A running job in the DAGScheduler. Jobs can be of two types: a result job, which computes a
    + * ResultStage to execute an action, or a map-stage job, which computes the map outputs for a
    + * ShuffleMapStage before any downstream stages are submitted. The latter is used for adaptive
    + * query planning, to look at map output statistics before submitting later stages. We distinguish
    + * between these two types of jobs using the finalStage field of this class.
    + *
    + * Jobs are only tracked for "leaf" stages that clients directly submitted, through DAGScheduler's
    + * submitJob or submitMapStage methods. However, either type of job may cause the execution of
    + * may other earlier stages (for RDDs in the DAG it depends on), and multiple jobs may share some
    --- End diff --
    
    nit: the `may` in "may other earlier stages" is redundant



