Wait, RDD operations should in fact execute in parallel, right? So if I call
rdd.foreachAsync, that should also execute in parallel, shouldn't it? I guess I am a
little confused about what the difference really is between foreachAsync and
foreachPartitionAsync, besides passing a Tuple versus an Iterator of Tuples to
the lambda, respectively.
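
To make the question concrete, here is a rough plain-Java sketch (no Spark involved) of how I currently picture the two calls. Everything in it is a hypothetical stand-in: the "RDD" is just a list of lists, and AsyncForeachSketch, its two static methods, and the thread pool are names I made up for illustration, not Spark APIs. The assumption I am making is that both calls fan out one task per partition and return a future right away, so the only difference is whether the lambda sees one element or the partition's whole iterator. Please correct me if that assumption is wrong.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Plain-Java model only; none of these names are Spark APIs.
class AsyncForeachSketch {

    // Per-partition variant: submits one task per partition and hands the
    // callback that partition's iterator. Returns a future immediately.
    static <T> CompletableFuture<Void> foreachPartitionAsync(
            List<List<T>> partitions, Consumer<Iterator<T>> f, ExecutorService pool) {
        CompletableFuture<?>[] tasks = partitions.stream()
                .map(p -> CompletableFuture.runAsync(() -> f.accept(p.iterator()), pool))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(tasks);
    }

    // Per-element variant (assumed model): same fan-out over partitions,
    // but the callback is invoked once per element inside each task.
    static <T> CompletableFuture<Void> foreachAsync(
            List<List<T>> partitions, Consumer<T> f, ExecutorService pool) {
        return foreachPartitionAsync(partitions, it -> it.forEachRemaining(f), pool);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2), Arrays.asList(3, 4), Arrays.asList(5, 6));

        AtomicInteger sum = new AtomicInteger();   // updated once per element
        AtomicInteger calls = new AtomicInteger(); // updated once per partition

        // Both calls return before the work is done; join() waits for it.
        CompletableFuture<Void> perElement = foreachAsync(partitions, sum::addAndGet, pool);
        CompletableFuture<Void> perPartition =
                foreachPartitionAsync(partitions, it -> calls.incrementAndGet(), pool);
        perElement.join();
        perPartition.join();

        // prints "sum=21, partitionCalls=3"
        System.out.println("sum=" + sum.get() + ", partitionCalls=" + calls.get());
        pool.shutdown();
    }
}
```

If this picture is right, the per-partition form would mainly matter when the lambda needs some per-partition setup (say, opening one connection per partition instead of one per element), but that is my guess rather than something I found documented.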

On Sun, Apr 2, 2017 at 8:36 PM, kant kodali <kanth...@gmail.com> wrote:

> Hi all,
> What is the difference between foreachAsync and foreachPartitionAsync? I
> couldn't find any comments in the Javadoc. If I were to guess, here is
> what I would say, but please correct me if I am wrong.
> foreachAsync: just iterates through the values from all partitions one by
> one in an async manner.
> foreachPartitionAsync: fans out each partition and runs the lambda for each
> partition in parallel across different workers. The lambda here will
> iterate through the values from that partition one by one in an async manner.
> Is this right, or am I completely wrong?
> Thanks!
