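Use a global atomic boolean (one per executor JVM): the first partition to
run sets it and returns its element, and every later partition returns
nothing because the boolean is already true.

Note that your result won't be deterministic, since which partition runs
first on each executor depends on task scheduling.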
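
A minimal sketch of what I mean, in case it helps. The names OnePerJvm and
firstOnly are made up for illustration, and it assumes one executor JVM per
node; with several executors on a node you would get one element per
executor, not per node. The flag also stays set for the life of the JVM, so
a second job (or a recomputation of the same RDD) would return nothing
unless you cache the result or reset the flag.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical helper: a Scala object is instantiated once per JVM,
// so each executor JVM gets its own copy of this flag.
object OnePerJvm {
  private val claimed = new AtomicBoolean(false)

  // Returns the first element of the partition for the first non-empty
  // partition processed in this JVM, and an empty iterator for all others.
  def firstOnly[T](it: Iterator[T]): Iterator[T] =
    if (it.hasNext && claimed.compareAndSet(false, true)) it.take(1)
    else Iterator.empty
}

// Intended use with Spark (not part of the runnable sketch above):
//   val onePerExecutor = rdd.mapPartitions(it => OnePerJvm.firstOnly(it))
```

The compareAndSet call is what makes this safe when several tasks run
concurrently in the same executor: only one of them can flip the flag.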

On Sep 18, 2015, at 4:11 PM, Ulanov, Alexander <[email protected]>
wrote:

Thank you! How can I guarantee that I have only one element per executor
(per worker, or per physical node)?



*From:* Feynman Liang [mailto:[email protected] <[email protected]>]

*Sent:* Friday, September 18, 2015 4:06 PM
*To:* Ulanov, Alexander
*Cc:* [email protected]
*Subject:* Re: One element per node



rdd.mapPartitions(it => it.take(1))



On Fri, Sep 18, 2015 at 3:57 PM, Ulanov, Alexander <[email protected]>
wrote:

Dear Spark developers,



Is it possible (and if so, how) to pick one element per physical node from
an RDD? Let’s say the first element of any partition on that node. The
result would be an RDD[element] whose element count equals the number of
nodes that hold partitions of the initial RDD.



Best regards, Alexander
