Question about accumulator
I have a small application like this acc = sc.accumulate(5) def t_f(x,): global acc sleep(5) acc += x def f(x): global acc thread = Thread(target = t_f, args = (x,)) thread.start() # thread.join() # without this it doesn't work rdd = sc.parallelize([1,2,4,1]) rdd.foreach(f) sleep(30) print(acc.value) assert acc.value == 13 The code doesn't work unless I uncomment the thread.join Any idea why?
A question about accumulator
Hi, all There is a discussion about the accumulator in stack overflow: http://stackoverflow.com/questions/27357440/spark-accumalator-value-is-different-when-inside-rdd-and-outside-rdd I comment about this question (from user Tim). As the output I tried, I hava two questions: 1. Why the addInplace function be called twice? 2. Why the order of two ouput is difference ? Any suggestion will be appreciated.