That is understandable, but what if you stop execution by pressing the cancel button in the notebook? If you do that after you've cached an RDD or broadcast a variable, the cleanup code won't be executed, right?
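For example, suppose a paragraph ends with cleanup like this (a rough sketch in Scala against the Spark 1.x API, using the sc that the Zeppelin Spark interpreter provides; the data and names are made up):

    // hypothetical data, just to make the sketch self-contained
    val lookup    = sc.broadcast(Set("a", "b", "c"))
    val cachedRdd = sc.parallelize(Seq("a", "x", "b")).cache()

    try {
      cachedRdd.filter(x => lookup.value.contains(x)).count()
    } finally {
      // free executor memory explicitly; otherwise the cached blocks stay
      // allocated for as long as the interpreter's SparkContext lives
      cachedRdd.unpersist()
      lookup.destroy()
    }

If the paragraph is cancelled somewhere inside the try block, does the finally still run, or is the interpreter thread simply killed before it gets there?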
On Thu, Dec 3, 2015 at 6:25 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:
> I think that's expected, since Zeppelin keeps the Spark context alive even
> when the notebook is not executing (the idea is that you could be running
> more things). That keeps broadcast data and cached RDDs in memory. You
> should see the same if you run the same code from spark-shell and don't
> exit the shell.
>
> On Thu, Dec 3, 2015 at 9:01 AM -0800, "Jakub Liska" <liska.ja...@gmail.com> wrote:
>
> Hey,
>
> I mentioned that I'm using broadcast variables, but I'm destroying them at
> the end... I'm using Spark 1.7.1 ... I'll let you know later if the
> problem still occurs. So far it seems to have stopped after I started
> destroying them + cachedRdd.unpersist.
>
> On Thu, Dec 3, 2015 at 5:52 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> Do you know what version of Spark you are running?
>
> On Thu, Dec 3, 2015 at 12:52 AM -0800, "Kevin (Sangwoo) Kim" <kevin...@apache.org> wrote:
>
> Do you use broadcast variables? I've found many problems related to
> broadcast variables, so I'm not using them. (It's a Spark problem rather
> than a Zeppelin problem.)
>
> RDDs don't need to be manually unpersisted; Spark does that automatically.
>
> On Thu, Dec 3, 2015 at 5:28 PM, Jakub Liska <liska.ja...@gmail.com> wrote:
>
> Hi,
>
> no, just running it manually. I think I need to unpersist cached RDDs and
> destroy broadcast variables at the end, am I correct? It hasn't crashed
> since I started doing that, though the subsequent runs are always a little
> slower.
>
> On Thu, Dec 3, 2015 at 8:08 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> How are you running jobs? Do you schedule a notebook to run from Zeppelin?
>
> ------------------------------
> Date: Mon, 30 Nov 2015 12:42:16 +0100
> Subject: Spark worker memory not freed up after zeppelin run finishes
> From: liska.ja...@gmail.com
> To: users@zeppelin.incubator.apache.org
>
> Hey,
>
> I'm connecting Zeppelin to a remote Spark standalone cluster (2 worker
> nodes), and I noticed that if I run a job from Zeppelin twice without
> restarting the interpreter, it fails with an OOME. After a Zeppelin job
> finishes successfully I can see all the executor memory still allocated
> on the workers; restarting the interpreter frees the memory, but if I
> don't do that, the next run of the task fails.
>
> Any idea how to deal with this? Currently I always have to restart the
> interpreter between Spark jobs.
>
> Thanks,
> Jakub
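PS: to check whether anything is still cached after a run, something like this should work from a paragraph (Scala; note that getPersistentRDDs is marked @DeveloperApi on SparkContext, so treat this as a diagnostic sketch rather than a stable API):

    // list the RDDs the SparkContext still considers persisted
    sc.getPersistentRDDs.foreach { case (id, rdd) =>
      println(s"RDD $id: storage=${rdd.getStorageLevel.description}")
    }

    // or force-unpersist everything that was left behind
    sc.getPersistentRDDs.values.foreach(_.unpersist())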