I have been experimenting with accumulators (despite the possible errors with multiple task attempts). They provide a convenient way to collect some numbers while still performing business logic. I posted some sample code at http://lordjoesoftware.blogspot.com/. Even if accumulators are not perfect today, future versions may improve them, and they are a great way to monitor execution and get a sense of performance on lazily executed systems.
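To make the trade-off concrete, here is a rough sketch in plain Python rather than Spark itself. The `Counter` class and `lazy_parse` function are hypothetical stand-ins for an accumulator and a lazily evaluated transformation; they are not Spark APIs and not the sample code from the blog. The sketch only illustrates the two behaviors mentioned above: the count stays empty until something forces evaluation, and running the same work a second time (as a retried attempt might) inflates the count.

```python
from typing import Iterable, Iterator


class Counter:
    """Hypothetical stand-in for an accumulator: workers may only add to it."""

    def __init__(self) -> None:
        self.value = 0

    def add(self, amount: int = 1) -> None:
        self.value += amount


def lazy_parse(lines: Iterable[str], bad_lines: Counter) -> Iterator[int]:
    """Business logic: parse integers, counting unparseable lines on the side."""
    for line in lines:
        try:
            yield int(line)
        except ValueError:
            bad_lines.add()  # monitoring side effect, not part of the result


data = ["1", "2", "oops", "4", "n/a"]
bad_lines = Counter()

# Building the pipeline runs nothing, so the counter is still untouched.
pipeline = lazy_parse(data, bad_lines)
count_before_action = bad_lines.value

# Summing forces evaluation, which also triggers the side-channel counts.
total = sum(pipeline)
count_after_action = bad_lines.value

# Simulate a second attempt over the same input: the business result is
# unchanged, but the counter now reports each bad line twice.
total_retry = sum(lazy_parse(data, bad_lines))
count_after_retry = bad_lines.value
```

The business result is the same on both runs, while the side-channel count is only trustworthy if each piece of work runs exactly once. That is why such counts are better treated as monitoring hints than as exact results.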