Re: [Spark] Accumulators or count()

2017-03-01 Thread Daniel Siegmann
As you noted, accumulators do not guarantee accurate results except in specific situations. I recommend never using them. This article goes into some detail on the problems with accumulators: http://imranrashid.com/posts/Spark-Accumulators/ On Wed, Mar 1, 2017 at 7:26 AM, Charles O. Bajomo <

[Spark] Accumulators or count()

2017-03-01 Thread Charles O. Bajomo
Hello everyone, I wanted to know if there is any benefit to using an accumulator over just executing a count() on the whole RDD. There seem to be a lot of issues with accumulators during a stage failure, and there also seems to be an issue rebuilding them if the application restarts from a
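
To make the stage-failure concern above concrete, here is a minimal sketch in plain Python (no Spark involved) of how a counter that is bumped inside a task can drift from the true record count once a task is re-run. Everything in it is an assumption for illustration: SimulatedAccumulator, run_task and run_stage are hypothetical names, not Spark APIs, and the "rerun" list only stands in for a recomputation after lost output. It models the case where the update happens inside a transformation, which is where Spark documents that updates may be applied more than once.

```python
# Pure-Python simulation of accumulator over-counting on task re-execution.
# All names below are hypothetical; none of this is Spark's API.

class SimulatedAccumulator:
    """Driver-side counter that merges every update it receives."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n


def run_task(partition, acc):
    """Process one partition, bumping the accumulator once per record."""
    out = []
    for record in partition:
        acc.add(1)          # side effect: repeated if the task runs again
        out.append(record)
    return out


def run_stage(partitions, acc, rerun=()):
    """Run each partition once, then re-run the partitions listed in
    `rerun`, standing in for recomputation after a failure."""
    results = {}
    for i, part in enumerate(partitions):
        results[i] = run_task(part, acc)
    for i in rerun:
        # The recomputed output replaces the old one, but the accumulator
        # has already absorbed the first attempt's updates.
        results[i] = run_task(partitions[i], acc)
    return [r for i in sorted(results) for r in results[i]]


partitions = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
acc = SimulatedAccumulator()

records = run_stage(partitions, acc, rerun=[1])

true_count = len(records)   # what a count() over the final data would see
acc_count = acc.value       # what the simulated accumulator reports
```

In this toy run the final data holds 9 records while the simulated accumulator reports 11, because the two records of the re-run partition were counted twice. A count() is derived from the data itself, so it does not have this failure mode, at the cost of an extra pass over the RDD.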