[ 
https://issues.apache.org/jira/browse/SPARK-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522919#comment-14522919
 ] 

Patrick Wendell commented on SPARK-7292:
----------------------------------------

Yes, this would mean sacrificing fault tolerance. For certain applications, 
checkpointing is a major performance cost relative to overall runtime (for 
instance, GraphX applications that have extremely long dependency chains), 
so some users have asked for an "unsafe" checkpoint that lets them continue 
while forgoing this guarantee.

> Provide operator to truncate lineage without persisting RDD's
> -------------------------------------------------------------
>
>                 Key: SPARK-7292
>                 URL: https://issues.apache.org/jira/browse/SPARK-7292
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: Patrick Wendell
>
> Checkpointing exists in Spark to truncate a lineage chain. I've heard 
> requests from some users to allow truncation of lineage in a way that is 
> "cheap" and doesn't serialize and persist the RDD. This is possible if the 
> user is willing to forgo fault tolerance for that RDD (for instance, for 
> shorter-running jobs or ones that use a small number of machines). It's 
> pretty easy to allow this, so we should look into it for Spark 1.5.
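
The trade-off described above can be sketched with a toy model. This is plain Python, not Spark code and not a proposed API: `ToyRDD`, `checkpoint`, `unsafe_truncate`, `lose_memory`, and `build_chain` are all hypothetical names invented for illustration. The sketch assumes that a regular checkpoint writes a durable copy before dropping the lineage, while the "cheap" variant drops the lineage and keeps only the in-memory result.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToyRDD:
    """Hypothetical stand-in for an RDD; not part of Spark."""
    parent: Optional["ToyRDD"]
    fn: Optional[Callable[[List[int]], List[int]]]
    cached: Optional[List[int]] = None     # in-memory result, can be lost
    persisted: Optional[List[int]] = None  # durable copy, survives failures

    def map(self, f: Callable[[int], int]) -> "ToyRDD":
        # Each transformation adds one link to the lineage chain.
        return ToyRDD(parent=self, fn=lambda xs: [f(x) for x in xs])

    def compute(self) -> List[int]:
        if self.cached is None:
            if self.persisted is not None:
                # Recover from the durable copy.
                self.cached = list(self.persisted)
            elif self.parent is None:
                # No durable copy and no lineage: nothing to recover from.
                raise RuntimeError("data lost and no lineage to recompute from")
            else:
                # Recover by replaying the lineage.
                self.cached = self.fn(self.parent.compute())
        return self.cached

    def lineage_depth(self) -> int:
        depth, node = 0, self
        while node.parent is not None:
            depth, node = depth + 1, node.parent
        return depth

    def checkpoint(self) -> None:
        # Regular checkpoint: pay for a durable copy, then drop the lineage.
        self.persisted = list(self.compute())
        self.parent, self.fn = None, None

    def unsafe_truncate(self) -> None:
        # "Cheap" variant: drop the lineage but write nothing durable.
        self.compute()
        self.parent, self.fn = None, None

    def lose_memory(self) -> None:
        # Simulate losing the machine that held the in-memory result.
        self.cached = None


def build_chain(steps: int) -> ToyRDD:
    # The source data is assumed to live in durable storage.
    rdd = ToyRDD(parent=None, fn=None, persisted=[1, 2, 3])
    for _ in range(steps):
        rdd = rdd.map(lambda x: x + 1)
    return rdd


safe = build_chain(50)
safe.checkpoint()
safe.lose_memory()
print(safe.compute())  # recovered from the durable copy: [51, 52, 53]

unsafe = build_chain(50)
unsafe.unsafe_truncate()
unsafe.lose_memory()
try:
    unsafe.compute()
except RuntimeError as err:
    print("unrecoverable:", err)
```

In this model both operations reduce the lineage depth to zero, but only the regular checkpoint can recover after the in-memory result is lost. That is the fault-tolerance guarantee the "unsafe" variant gives up in exchange for skipping the write.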



