Correct me if I'm wrong, but he can actually run this code without
broadcasting the users map; the code will just be less efficient.
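
To make the difference concrete, here is a minimal sketch of the broadcast
variant. The sample data, the object name BroadcastSketch, the local[2]
master and the name usersMapBc are my own assumptions for illustration, not
something from the original code. Without sc.broadcast, usersMap is captured
in the closure and serialized along with every task; with it, the map is
shipped to each executor once and the tasks read it through .value.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastSketch {
  // Pairs every contact with the value looked up in a broadcast copy of the map.
  def joinWithBroadcast(sc: SparkContext): Array[(String, (String, String))] = {
    // Hypothetical sample data standing in for the real contacts RDD.
    val contacts = sc.parallelize(Seq(
      "alice" -> "alice@example.com",
      "bob"   -> "bob@example.com"))

    // Runs a job on the executors and returns the result to the driver.
    val usersMap = contacts.collectAsMap()

    // Ship the map to each executor once instead of once per task.
    val usersMapBc = sc.broadcast(usersMap)

    // The closure now captures only the small broadcast handle.
    contacts.map(v => (v._1, (usersMapBc.value(v._1), v._2))).collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("broadcast-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try joinWithBroadcast(sc).foreach(println)
    finally sc.stop()
  }
}
```

Replacing usersMapBc.value(v._1) with usersMap(v._1) and dropping the
sc.broadcast line gives the non-broadcast version, which should return the
same result but re-sends the map with each task.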

On Thu, 26 Feb 2015 at 12:31 PM, Sean Owen <so...@cloudera.com> wrote:

> Yes, but there is no concept of executors 'deleting' an RDD. And you
> would want to broadcast the usersMap if you're using it this way.
>
> On Thu, Feb 26, 2015 at 11:26 AM, Guillermo Ortiz <konstt2...@gmail.com>
> wrote:
> > One last time, to be sure I got it right: does the execution sequence
> > here go like this?
> >
> > val usersMap = contacts.collectAsMap()
> > // The contacts RDD is collected by the executors and sent to the
> > // driver; the executors delete the RDD
> > contacts.map(v => (v._1, (usersMap(v._1), v._2))).collect()
> > // The usersMap object is sent back to the executors to run the code,
> > // and with collect(), the result is sent back to the driver again
> >
> >
> > 2015-02-26 11:57 GMT+01:00 Sean Owen <so...@cloudera.com>:
> >> Yes, in that code, usersMap has been serialized to every executor.
> >> I thought you were referring to accessing the copy in the driver.
> >>
> >> On Thu, Feb 26, 2015 at 10:47 AM, Guillermo Ortiz <konstt2...@gmail.com> wrote:
> >>> Isn't "contacts.map(v => (v._1, (usersMap(v._1), v._2))).collect()"
> >>> executed in the executors? Why is it executed in the driver?
> >>> contacts is not a local object, right?
