Question about accumulator

2018-01-23 Thread hsy...@gmail.com
I have a small application like this:

from threading import Thread
from time import sleep

# sc is an existing SparkContext (for example, the one in the pyspark shell)
acc = sc.accumulator(5)

def t_f(x):
    global acc
    sleep(5)
    acc += x

def f(x):
    global acc
    thread = Thread(target=t_f, args=(x,))
    thread.start()
    # thread.join() # without this it doesn't work

rdd = sc.parallelize([1, 2, 4, 1])
rdd.foreach(f)
sleep(30)
print(acc.value)
assert acc.value == 13

The code doesn't work unless I uncomment the thread.join() line.

Any idea why?
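
One possible explanation, offered as an assumption rather than a confirmed answer: a task's accumulator updates seem to be collected when the task function returns, so additions made afterwards by a background thread would never reach the driver. The plain-Python sketch below does not use Spark at all; TaskCounter and run_task are hypothetical names that only mimic that "snapshot at task end" behavior, to show why joining the threads changes the result.

```python
import threading
import time


class TaskCounter:
    """Hypothetical stand-in for a per-task accumulator (not a Spark class)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add(self, x):
        # Guard the update so concurrent threads do not lose increments.
        with self._lock:
            self.value += x


def run_task(items, wait_for_threads):
    """Run one simulated task and return the counter as seen when it ends."""
    counter = TaskCounter()
    threads = []
    for x in items:
        def work(v=x):
            time.sleep(0.2)  # simulate the slow work done in t_f
            counter.add(v)

        t = threading.Thread(target=work)
        t.start()
        threads.append(t)

    if wait_for_threads:
        # Equivalent of uncommenting thread.join() in the code above.
        for t in threads:
            t.join()

    # Snapshot taken at task end; additions made later are not included.
    return counter.value


late = run_task([1, 2, 4, 1], wait_for_threads=False)
joined = run_task([1, 2, 4, 1], wait_for_threads=True)
print(late, joined)
```

If the assumption holds, the sleep(30) on the driver cannot help: the updates are made on the worker side after the task's results have already been reported.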


A question about accumulator

2015-11-10 Thread Tan Tim
Hi, all

There is a discussion about accumulators on Stack Overflow:
http://stackoverflow.com/questions/27357440/spark-accumalator-value-is-different-when-inside-rdd-and-outside-rdd

I commented on this question (as user Tim). Based on the output I got when I
tried it, I have two questions:
1. Why is the addInPlace function called twice?
2. Why is the order of the two outputs different?

Any suggestion will be appreciated.