Hi,

As I understand it, when Spark runs a job it builds a DAG and then splits the DAG into stages. Each stage has a task set, and the TaskScheduler sends the task set to the workers, waiting until the stage completes before starting the next stage.
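As a side note, here is a minimal sketch of how those stage boundaries show up in the lineage, assuming a SparkContext named sc as in the spark-shell (the input path is hypothetical):

    // A small pipeline: the shuffle introduced by reduceByKey is where
    // Spark cuts the DAG into two stages.
    val counts = sc.textFile("hdfs:///input.txt")   // hypothetical path
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // toDebugString prints the lineage; the indentation steps mark the
    // stage boundaries created by shuffle dependencies.
    println(counts.toDebugString)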
If that is correct, I would like to understand what happens when a worker fails: the TaskScheduler can use the lineage information to resubmit the task to another worker. If the input RDD is not on the failed worker node, this works because the new worker can fetch the RDD from the nodes that hold it. However, if the input RDD was on the failed worker node, how does the fault-tolerance mechanism work? Does it use a checkpoint to store the input RDD to a file (such as on HDFS)?

Thanks
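For concreteness, a minimal sketch of the checkpoint API I am asking about, again assuming a SparkContext named sc (the HDFS paths are just examples):

    // Point checkpoints at reliable storage so a lost partition can be
    // read back from HDFS instead of recomputed from lineage.
    sc.setCheckpointDir("hdfs:///spark-checkpoints")   // example path

    val data = sc.textFile("hdfs:///input.txt").map(_.length)
    data.checkpoint()   // marks the RDD; nothing is written yet
    data.count()        // the first action materializes the RDD and
                        // writes its partitions to the checkpoint dir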


