I explored these examples a couple of months back. It is a very good link for RDD
operations. See if the explanation below helps, and try to understand the
difference between the two examples below. The initial value in both is "".
Example 1:
val z = sc.parallelize(List("12","23","","345"),2)
z.aggregate("")((x,y) => mat
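The snippet above is cut off, so as a rough sketch of how aggregate behaves in general, here is a plain-Scala simulation without Spark. The helper simulateAggregate and the length-summing functions are hypothetical and are not the functions from the truncated example. It assumes the four strings are split into two partitions as ("12", "23") and ("", "345"). The zero value is used once as the starting accumulator in every partition and once more when the per-partition results are combined.

```scala
object AggregateSketch {
  // Hypothetical helper: mimics the shape of RDD.aggregate on data that is
  // already split into partitions. Not part of the Spark API.
  def simulateAggregate[T, U](partitions: Seq[Seq[T]], zero: U)(
      seqOp: (U, T) => U,
      combOp: (U, U) => U): U = {
    // Step 1: fold each partition on its own, starting from the zero value.
    val perPartition = partitions.map(_.foldLeft(zero)(seqOp))
    // Step 2: merge the per-partition results, again starting from zero.
    // Spark does not promise a merge order; this sketch uses partition order.
    perPartition.foldLeft(zero)(combOp)
  }

  def main(args: Array[String]): Unit = {
    // Assumed partition layout for List("12","23","","345") with 2 partitions.
    val partitions = Seq(Seq("12", "23"), Seq("", "345"))
    // Illustrative functions: sum the string lengths.
    val totalLength = simulateAggregate(partitions, 0)(
      (acc: Int, s: String) => acc + s.length,
      (a: Int, b: Int) => a + b)
    println(totalLength) // 4 from the first partition + 3 from the second = 7
  }
}
```

Because the merge order is not fixed in Spark, a combine function that is not commutative and associative can give different results from run to run.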
Hello Prem -
Thanks for sharing. I also found a similar example at the link
http://homepage.cs.latrobe.edu.au/zhe/ZhenHeSparkRDDAPIExamples.html#aggregate
but I am still trying to understand the actual functionality or behavior.
Thanks & Regards,
Gokula Krishnan (Gokul)
On Tue, Jan 12, 2016 at
Try mapPartitionsWithIndex. Below is an example I used earlier; the myfunc
logic can be modified further as per your need.
val x = sc.parallelize(List(1,2,3,4,5,6,7,8,9), 3)
def myfunc(index: Int, iter: Iterator[Int]) : Iterator[String] = {
  iter.toList.map(x => index + "," + x).iterator
}
x.mapPar
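The call above is cut off, so here is a rough plain-Scala sketch (no Spark) of what tagging each element with its partition index produces. The helper tagWithPartition is a hypothetical stand-in for mapPartitionsWithIndex, and the layout (1,2,3), (4,5,6), (7,8,9) is an assumption about how the nine elements fall into three partitions. On a real RDD, glom().collect() is another way to look at the data grouped by partition.

```scala
object PartitionViewSketch {
  // Same shape as myfunc in the message above: prefix each element
  // with the index of the partition it belongs to.
  def myfunc(index: Int, iter: Iterator[Int]): Iterator[String] =
    iter.map(x => s"$index,$x")

  // Hypothetical stand-in for mapPartitionsWithIndex over local data:
  // apply myfunc to each partition together with its position.
  def tagWithPartition(partitions: Seq[Seq[Int]]): Seq[String] =
    partitions.zipWithIndex.flatMap { case (part, idx) =>
      myfunc(idx, part.iterator).toList
    }

  def main(args: Array[String]): Unit = {
    // Assumed layout of List(1,...,9) across 3 partitions.
    val partitions = Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9))
    tagWithPartition(partitions).foreach(println) // first line printed is "0,1"
  }
}
```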
Hello All -
I'm just trying to understand aggregate() and in the meantime have a
question.
*Is there any way to view the RDD data based on the partition?*
For instance, the following RDD has 2 partitions:
val multi2s = List(2,4,6,8,10,12,14,16,18,20)
val multi2s_RDD = sc.parallelize(multi2s