Master node memory usage question

Richard Alex Hofer Thu, 07 May 2015 21:52:07 -0700

Hi,

I'm working on a project in Spark and am trying to understand what's going on. To investigate, we came up with this snippet of code, which very roughly resembles what we're actually doing. When we run it, our master node quickly uses up its memory, even though all of our RDDs are very small. Can someone explain what's going on here and how we can avoid it?


a = sc.parallelize(xrange(100),10)
b = a

for i in xrange(100000):
    a = a.map(lambda x: x + 1)
    if i % 300 == 0:
        # We do this to try and force some of our RDD to evaluate
        a.persist()
        a.foreachPartition(lambda _: None)
        b.unpersist()
        b = a
a.collect()
b.unpersist()

-Richard Hofer
