Hi all,
I want to calculate mean and SD for each RDD. I used the followoing code for
mean and now I have to use this mean for SD, but not sure how to use get these
means for each RDD from the DStream, so I can use it for SD. My sample files is
as
1
2
3
4
5
The code is as
val individualpoints = ssc.textFileStream(args(1))
val sumandcount = individualpoints.map(x => (x.toDouble, 1)).reduce{ (a, b) =>
(a._1 + b._1, a._2 + b._2) }
sumandcount.foreachRDD(rdd => {
rdd.collect.foreach({
case (sum, n) => {
println("********************************************")
println("The Σ(x) is --> " + sum + " with N --> " + n)
mean = sum/n
println("The μ is -->" + mean)
println("*********************************************")
}
})
})
Regards,
Laeeq