Do you want to transform the RDD, or just produce some side effect with its
contents?  If the latter, you want foreachPartition, not mapPartitions.

On Fri, Jun 26, 2015 at 11:52 AM, Wang, Ningjun (LNG-NPV) <
[email protected]> wrote:

>  In rdd.mapPartition(…) if I try to iterate through the items in the
> partition, everything screw. For example
>
>
>
> *val *rdd = sc.parallelize(1 to 1000, 3)
> val count = rdd.mapPartitions(iter => {
>
> *println(iter.length)   *iter
> }).count()
>
>
>
>
>
> The count is 0. This is incorrect. The count should be 1000. If I just
> comment out the line *println(iter.length)*, then the count become 1000
> correctly.
>
>
>
> Does this mean I cannot iterate through iter in mapPartitions? I want to
> get all items in a partition and compose one request to send to external
> system. How can I achieve that if I am not allowed to iterate through items
> in the partition?
>
>
>
> Ningjun
>
>
>

Reply via email to