Gentle bump on this topic. We are also interested in both how to test fault tolerance and any previous benchmark results.

Mike
-------- Original message --------
From: 牛兆捷 <nzjem...@gmail.com>
Date: 07-09-2015 04:19 (GMT-05:00)
To: dev@spark.apache.org, u...@spark.apache.org
Subject: Questions about Fault tolerance of Spark

Hi All:

We already know that Spark uses lineage to recompute RDDs when a failure occurs. I want to study the performance of this fault-tolerance approach and have some questions about it.

1) Is there any benchmark (or standard failure model) for testing the fault tolerance of these kinds of in-memory data processing systems?
2) How do you emulate failures when testing Spark? (e.g., kill a computation task, or kill the computation nodes?)

Thanks!!!

--
Regards,
Zhaojie