GitHub user ilganeli opened a pull request:

    https://github.com/apache/spark/pull/4708

    [SPARK-4655] Split Stage into ShuffleMapStage and ResultStage subclasses

    Hi all - this patch splits Stage into ShuffleMapStage and ResultStage 
subclasses and updates their usage within DAGScheduler.
    
    I'd like to confirm that it's appropriate to move the outputLocs variable 
and its associated methods into ShuffleMapStage. I'd also like to confirm that, 
within the shuffleMapTask handler (line 1004), the stage is guaranteed to be 
a ShuffleMapStage.
    
    I believe I've split up the functionality of ResultStage and 
ShuffleMapStage appropriately but I wanted to make sure I'm not missing 
something.
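    
    To make the two questions above concrete, here is a rough, simplified 
sketch of the split. It is not the code in this patch: the String host 
representation and the names addOutputLoc, isAvailable, StageSketch and 
recordMapOutput are assumptions chosen for illustration only.

```scala
// Simplified sketch only; these are NOT the actual Spark classes.
abstract class Stage(val id: Int, val numPartitions: Int)

// Assumption: map-output locations are modelled as plain host strings.
final class ShuffleMapStage(stageId: Int, partitions: Int)
    extends Stage(stageId, partitions) {

  // One list of output locations per partition; Nil means "not yet computed".
  val outputLocs: Array[List[String]] =
    Array.fill[List[String]](numPartitions)(Nil)

  // Hypothetical helper: record that `host` holds output for `partition`.
  def addOutputLoc(partition: Int, host: String): Unit =
    outputLocs(partition) = host :: outputLocs(partition)

  // The stage is available once every partition has at least one location.
  def isAvailable: Boolean = outputLocs.forall(_.nonEmpty)
}

// A result stage carries no map-output bookkeeping in this sketch.
final class ResultStage(stageId: Int, partitions: Int)
    extends Stage(stageId, partitions)

object StageSketch {
  // Hypothetical handler: a pattern match makes the "this stage must be a
  // ShuffleMapStage" assumption explicit instead of relying on a cast.
  def recordMapOutput(stage: Stage, partition: Int, host: String): Boolean =
    stage match {
      case sms: ShuffleMapStage =>
        sms.addOutputLoc(partition, host)
        true
      case _ =>
        false // not a shuffle map stage; nothing to record
    }
}
```

    In this sketch, only ShuffleMapStage exposes outputLocs, so any code path 
that touches it has to establish the stage type first. Whether the real 
handler can assume that type unconditionally is exactly the open question.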
    
    Lastly, I suggest opening a JIRA to further improve the readability of 
two functions within the DAGScheduler:
    
    1) The monolithic handleTaskCompletion 
    2) The cleanupStateForJobAndIndependentStages function
    
    This last item is already code-complete in a separate PR, but it should 
really be handled under its own issue.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ilganeli/spark SPARK-4655

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4708.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4708
    
----
commit 83494e94ff45e0f275966d405d777d22cb167800
Author: Ilya Ganelin <[email protected]>
Date:   2015-02-20T21:22:18Z

    Added new Stage classes

commit cfd6f10d4f30d1940c7dbc61b58af22cfd647980
Author: Ilya Ganelin <[email protected]>
Date:   2015-02-20T21:54:29Z

    Updated DAGScheduler to use new ResultStage and ShuffleMapStage classes

----


