I can see the test for ParallelCollectionRDD.slice().
But how to explain the result of my test?
The following is the simple code I used for test
a=[]
for i in range(0,10000):
a.append(i)
print sc.parallelize(a, 16).mapPartitions(lambda x: f(x)).collect()
and the result is [0, 523776, 0, 1572352, 2620928, 0, 3669504, 4718080, 0,
5766656, 0, 6815232, 7863808, 0, 8912384, 7532280]
2013/9/26 Mike <[email protected]>
> > It does in fact attempt to do that, and the tests check for that, but
> > there's no guarantee in its API. Of course "equally" here means +/-
> > one element.
>
> Correction: *except* when the Seq is a NumericRange: then it shorts the
> last partition. E.g., 91 elements split 10 ways -> 10 elements in the
> first 9 partitions, one in the last.
>
--
--
Shangyu, Luo
Department of Computer Science
Rice University
--
Not Just Think About It, But Do It!
--
Success is never final.
--
Losers always whine about their best