Hi,
I've been struggling to understand the statistical theory behind this piece
of code (from
/core/src/main/scala/org/apache/spark/partial/GroupedSumEvaluator.scala)
below, especially with respect to estimating the size of the population
(total tasks) and its variance. Also I'm trying to under
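To make the question concrete, here is a minimal, self-contained sketch of the extrapolation idea as I understand it (this is NOT the actual GroupedSumEvaluator code; the object and parameter names here are made up for illustration): scale the mean per-task sum up to the known total number of tasks, and scale the sample variance accordingly.

```scala
// Hypothetical sketch of extrapolating a partial sum (not Spark's code).
object PartialSumSketch {
  // taskSums:   per-task sums from the tasks that have finished so far
  // totalTasks: the known number of tasks in the whole job
  // Returns (estimated total sum, variance of that estimate).
  def estimate(taskSums: Seq[Double], totalTasks: Int): (Double, Double) = {
    val n = taskSums.size
    val mean = taskSums.sum / n
    // Unbiased sample variance of the per-task sums
    val sampleVar = taskSums.map(s => (s - mean) * (s - mean)).sum / (n - 1)
    // Scale the sample mean up to the full population of tasks
    val sumEstimate = mean * totalTasks
    // Variance of the scaled estimate (no finite-population correction,
    // for simplicity)
    val sumVariance = sampleVar * totalTasks * totalTasks / n
    (sumEstimate, sumVariance)
  }

  def main(args: Array[String]): Unit = {
    // 4 of 10 tasks finished, with these per-task sums
    val (est, variance) = estimate(Seq(10.0, 12.0, 11.0, 9.0), totalTasks = 10)
    println(f"estimated total = $est%.1f, variance = $variance%.2f")
  }
}
```

My reading is that the real evaluator builds a confidence interval on top of an estimate like this, but that is exactly the part I am unsure about.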
Hi,
I'm new to Spark and Scala as well. I understand that we can use foreach to
apply a function to each element of an RDD, like
rdd.foreach(x => println(x)), but I saw that we can also use a for loop to
print each element of an RDD, like
for (x <- rdd) {
  println(x)
}
Does defining the foreach function
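For what it's worth, the equivalence of the two forms can be demonstrated on a plain Scala collection (a sketch using List rather than an RDD; I am assuming the same desugaring rule applies to any type that defines foreach):

```scala
import scala.collection.mutable.ListBuffer

object ForVsForeach {
  def main(args: Array[String]): Unit = {
    val xs = List(1, 2, 3)

    // Explicit foreach call
    val viaForeach = ListBuffer[Int]()
    xs.foreach(x => viaForeach += x)

    // A for comprehension without `yield` desugars to exactly the
    // foreach call above
    val viaFor = ListBuffer[Int]()
    for (x <- xs) viaFor += x

    // Both forms visit the same elements in the same order
    assert(viaForeach.toList == viaFor.toList)
    println(viaForeach.toList)
  }
}
```

In other words, `for (x <- xs) println(x)` is compiled into `xs.foreach(x => println(x))`.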