Thanks, Sean. That's kind of what I figured. Luckily, writes are idempotent in
my use case, so map works.
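
For anyone finding this thread later, here is a rough sketch of why an idempotent write makes map safe. It is plain Scala with no Spark dependency, so it only simulates a retried task by running the same map twice. SomeEntry, SomeOutput and Session are hypothetical stand-ins for the types in my original question, and the in-memory store is an assumption standing in for the real external resource.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical stand-ins for the types in the original question.
final case class SomeEntry(key: String, value: Int)
final case class SomeOutput(key: String, prop: Int)

// Hypothetical in-memory "external resource". The write is an upsert keyed by
// entry.key, so repeating it leaves the store in the same state.
final class Session {
  private val store = TrieMap.empty[String, Int]

  def someOp(entry: SomeEntry): Int = {
    val stored = entry.value * 2
    store.put(entry.key, stored) // overwrite by key, never append
    stored
  }

  def snapshot: Map[String, Int] = store.toMap
}

object IdempotentMapSketch {
  def doSomething(entry: SomeEntry, session: Session): SomeOutput =
    SomeOutput(entry.key, session.someOp(entry))

  // Runs the same map twice against one session, the way a retried task
  // would re-execute the function for the same partition.
  def run(): (Seq[SomeOutput], Seq[SomeOutput], Map[String, Int]) = {
    val entries = List(SomeEntry("a", 1), SomeEntry("b", 2))
    val session = new Session
    val firstAttempt = entries.map(doSomething(_, session))
    val retry = entries.map(doSomething(_, session))
    (firstAttempt, retry, session.snapshot)
  }

  def main(args: Array[String]): Unit = {
    val (firstAttempt, retry, store) = run()
    println(s"same outputs on retry: ${firstAttempt == retry}")
    println(s"store after both attempts: $store")
  }
}
```

In Spark itself the same function would go inside something like rdd.map or rdd.mapPartitions, with the session created on the executor. If the write were an append instead of a keyed overwrite, the retry would leave duplicates behind, which is the case Sean's foreach-then-reread suggestion covers.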

> From: so...@cloudera.com
> Date: Fri, 19 Dec 2014 11:06:51 +0000
> Subject: Re: How to run an action and get output?
> To: as...@live.com
> CC: user@spark.apache.org
> 
> To really be correct, I think you may have to use the foreach action
> to persist your data, since the write isn't idempotent, and then read
> it back into a new RDD. You might get away with map as long as you can
> ensure that your write process is idempotent.
> 
> On Fri, Dec 19, 2014 at 10:57 AM, ashic <as...@live.com> wrote:
> > Hi,
> > Say we have an operation that writes something to an external resource and
> > gets some output. For example:
> >
> > def doSomething(entry: SomeEntry, session: Session): SomeOutput = {
> >     val result = session.SomeOp(entry)
> >     SomeOutput(entry.Key, result.SomeProp)
> > }
> >
> > I could do this in a transformation such as rdd.map, but if a task fails,
> > the map function is re-run on another executor for the same partition. I
> > could use rdd.foreach, but that returns Unit. Is there something like a
> > foreach that can return values?
> >
> > Thanks,
> > Ashic.
> >
> > PS: Resending to nabble email due to spam issues.