Thanks so much Matei!

From: Matei Zaharia [mailto:[email protected]]
Sent: February-28-14 10:59 AM
To: [email protected]
Subject: Re: is RDD failure transparent to stream consumer

For output operators like this, the operator will run multiple times, so it needs to be idempotent. However, the built-in save operators (e.g. saveAsTextFile) are automatically idempotent (they only create each output partition once).

Matei

On Feb 28, 2014, at 10:10 AM, Adrian Mocanu <[email protected]> wrote:

Would really like an answer to this. A `yes` or `no` would suffice.
I'm talking about RDD failure in this context:
myStream.foreachRDD(rdd=>rdd.foreach(tuple => println(tuple)))

From: Adrian Mocanu [mailto:[email protected]]
Sent: February-27-14 12:19 PM
To: [email protected]
Subject: is RDD failure transparent to stream consumer

Is RDD failure transparent to a Spark stream consumer, except for the slowdown needed to recreate the RDD? After reading the papers on RDDs and DStreams from the Spark homepage I believe it is, but I'd like a confirmation.

Thanks
-Adrian
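
Editor's sketch of the idempotency point in Matei's reply: a plain-Scala simulation (no Spark dependency) of an output action whose task for one partition gets re-executed. `AppendSink`, `KeyedSink`, `runBatch`, and the `(batchTime, partitionId)` key are hypothetical names invented for illustration; they are not Spark APIs and not necessarily how the built-in save operators are implemented. The assumption is only that a retried task re-emits the same records for the same partition.

```scala
import scala.collection.mutable

// Hypothetical append-only sink (illustration only, not a Spark class).
// Every call adds records, so a re-executed task duplicates its output,
// much like println inside rdd.foreach would print the tuples again.
final class AppendSink {
  val lines = mutable.ArrayBuffer.empty[String]
  def write(records: Seq[String]): Unit = { lines ++= records; () }
}

// Hypothetical keyed sink: each partition's output is stored under
// (batchTime, partitionId). Writing the same key again overwrites the
// previous value, so repeating a write leaves the result unchanged.
final class KeyedSink {
  val parts = mutable.Map.empty[(Long, Int), Seq[String]]
  def write(batchTime: Long, partitionId: Int, records: Seq[String]): Unit =
    parts.update((batchTime, partitionId), records)
  def lines: Seq[String] = parts.toSeq.sortBy(_._1).flatMap(_._2)
}

object IdempotentOutputDemo {
  // Applies the output action to every partition once, then applies it to
  // partition 1 a second time to mimic a task retried after a failure.
  def runBatch(partitions: Seq[Seq[String]])(action: (Int, Seq[String]) => Unit): Unit = {
    partitions.zipWithIndex.foreach { case (recs, id) => action(id, recs) }
    action(1, partitions(1)) // simulated retry
  }

  def main(args: Array[String]): Unit = {
    val batchTime  = 1000L
    val partitions = Seq(Seq("a", "b"), Seq("c"))
    val appendSink = new AppendSink
    val keyedSink  = new KeyedSink

    runBatch(partitions)((_, recs) => appendSink.write(recs))
    runBatch(partitions)((id, recs) => keyedSink.write(batchTime, id, recs))

    println(appendSink.lines.mkString(",")) // a,b,c,c  (record duplicated by the retry)
    println(keyedSink.lines.mkString(","))  // a,b,c    (retry overwrote the same key)
  }
}
```

In a real job the same idea would presumably apply inside foreachRDD: derive a stable key from the batch and the partition, and make the external write an overwrite or upsert on that key rather than an append.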
