Thanks Sean. That's kind of what I figured. Luckily, for my use case writes are idempotent, so map works.
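A minimal sketch of why an idempotent write makes map tolerable here, in plain Scala with no Spark dependency. Everything in it is an assumption for illustration: `Session` is a hypothetical in-memory keyed store standing in for the external resource, the field names on `SomeEntry` and `SomeOutput` are made up, and running the same `map` twice only simulates a task being re-executed after a failure.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for the types in the original question.
final case class SomeEntry(key: String, value: Int)
final case class SomeOutput(key: String, prop: Int)

// Hypothetical external resource: an upsert-by-key store, so writing the
// same entry twice leaves the same state. That is what makes it idempotent.
final class Session {
  private val store = mutable.Map.empty[String, Int]
  var writes: Int = 0 // counts calls, only to show that the retry happened

  def someOp(entry: SomeEntry): Int = {
    writes += 1
    store.update(entry.key, entry.value) // overwrite by key, never append
    entry.value * 2
  }

  def snapshot: Map[String, Int] = store.toMap
}

object IdempotentMapSketch {
  def doSomething(entry: SomeEntry, session: Session): SomeOutput =
    SomeOutput(entry.key, session.someOp(entry))

  val session = new Session
  val entries = List(SomeEntry("a", 1), SomeEntry("b", 2))

  // First attempt.
  val firstRun: List[SomeOutput] = entries.map(doSomething(_, session))
  val stateAfterFirst: Map[String, Int] = session.snapshot

  // Second attempt over the same data, simulating a re-executed task.
  val secondRun: List[SomeOutput] = entries.map(doSomething(_, session))
  val stateAfterSecond: Map[String, Int] = session.snapshot

  def main(args: Array[String]): Unit = {
    println(s"same outputs: ${firstRun == secondRun}")
    println(s"same stored state: ${stateAfterFirst == stateAfterSecond}")
    println(s"write calls: ${session.writes}")
  }
}
```

The write is executed more than once, but the stored state and the returned values do not change. If the write appended instead of overwriting by key, the second attempt would duplicate data, which is the case where foreach followed by a re-read is the safer route.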
> From: so...@cloudera.com
> Date: Fri, 19 Dec 2014 11:06:51 +0000
> Subject: Re: How to run an action and get output?
> To: as...@live.com
> CC: user@spark.apache.org
>
> To really be correct, I think you may have to use the foreach action
> to persist your data, since this isn't idempotent, and then read it
> again in a new RDD. You might get away with map as long as you can
> ensure that your write process is idempotent.
>
> On Fri, Dec 19, 2014 at 10:57 AM, ashic <as...@live.com> wrote:
> > Hi,
> > Say we have an operation that writes something to an external resource and
> > gets some output. For example:
> >
> > def doSomething(entry: SomeEntry, session: Session): SomeOutput = {
> >   val result = session.SomeOp(entry)
> >   SomeOutput(entry.Key, result.SomeProp)
> > }
> >
> > I could use a transformation for rdd.map, but in case of failures, the map
> > would run on another executor for the same rdd. I could do rdd.foreach, but
> > that returns unit. Is there something like a foreach that can return values?
> >
> > Thanks,
> > Ashic.
> >
> > PS: Resending to nabble email due to spam issues.