In rdd.mapPartition(...) if I try to iterate through the items in the
partition, everything screw. For example
val rdd = sc.parallelize(1 to 1000, 3)
val count = rdd.mapPartitions(iter => {
println(iter.length)
iter
}).count()
The count is 0. This is incorrect. The count should be 1000. If I just comment
out the line println(iter.length), then the count become 1000 correctly.
Does this mean I cannot iterate through iter in mapPartitions? I want to get
all items in a partition and compose one request to send to external system.
How can I achieve that if I am not allowed to iterate through items in the
partition?
Ningjun