That is understandable, but what if you stop execution by pressing the cancel button in the notebook? If you do that after you've cached an RDD or broadcast a variable, the cleanup code won't be executed, right?
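For example, suppose a paragraph ends with cleanup like this (a rough sketch in Scala against the Spark 1.x API, using the sc that the Zeppelin Spark interpreter provides; the data and names are made up):

    // hypothetical data, just to make the sketch self-contained
    val lookup    = sc.broadcast(Set("a", "b", "c"))
    val cachedRdd = sc.parallelize(Seq("a", "x", "b")).cache()

    try {
      cachedRdd.filter(x => lookup.value.contains(x)).count()
    } finally {
      // free executor memory explicitly; otherwise the cached blocks stay
      // allocated for as long as the interpreter's SparkContext lives
      cachedRdd.unpersist()
      lookup.destroy()
    }

If the paragraph is cancelled somewhere inside the try block, does the finally still run, or is the interpreter thread simply killed before it gets there?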
On Thu, Dec 3, 2015 at 6:25 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:
> I think that's expected, since Zeppelin keeps the Spark context alive even
> when the notebook is not executing (the idea is that you could be running
> more things). That keeps broadcast data and cached RDDs in memory. You
> should see the same if you run the same code from spark-shell and don't
> exit the shell.
>
> On Thu, Dec 3, 2015 at 9:01 AM -0800, "Jakub Liska" <liska.ja...@gmail.com> wrote:
>
> Hey,
>
> I mentioned that I'm using broadcast variables, but I'm destroying them at
> the end... I'm using Spark 1.7.1 ... I'll let you know later if the
> problem still occurs. So far it seems to have stopped after I started
> destroying them + cachedRdd.unpersist.
>
> On Thu, Dec 3, 2015 at 5:52 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> Do you know what version of Spark you are running?
>
> On Thu, Dec 3, 2015 at 12:52 AM -0800, "Kevin (Sangwoo) Kim" <kevin...@apache.org> wrote:
>
> Do you use broadcast variables? I've found many problems related to
> broadcast variables, so I'm not using them. (It's a Spark problem rather
> than a Zeppelin problem.)
>
> RDDs don't need to be manually unpersisted; Spark does that automatically.
>
> On Thu, Dec 3, 2015 at 5:28 PM, Jakub Liska <liska.ja...@gmail.com> wrote:
>
> Hi,
>
> no, just running it manually. I think I need to unpersist cached RDDs and
> destroy broadcast variables at the end, am I correct? It hasn't crashed
> since I started doing that, though the subsequent runs are always a little
> slower.
>
> On Thu, Dec 3, 2015 at 8:08 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> How are you running jobs? Do you schedule a notebook to run from Zeppelin?
>
> ------------------------------
> Date: Mon, 30 Nov 2015 12:42:16 +0100
> Subject: Spark worker memory not freed up after zeppelin run finishes
> From: liska.ja...@gmail.com
> To: users@zeppelin.incubator.apache.org
>
> Hey,
>
> I'm connecting Zeppelin to a remote Spark standalone cluster (2 worker
> nodes), and I noticed that if I run a job from Zeppelin twice without
> restarting the interpreter, it fails with an OOME. After a Zeppelin job
> finishes successfully I can see all the executor memory still allocated
> on the workers; restarting the interpreter frees the memory, but if I
> don't do that, the next run of the task fails.
>
> Any idea how to deal with this? Currently I always have to restart the
> interpreter between Spark jobs.
>
> Thanks,
> Jakub
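PS: to check whether anything is still cached after a run, something like this should work from a paragraph (Scala; note that getPersistentRDDs is marked @DeveloperApi on SparkContext, so treat this as a diagnostic sketch rather than a stable API):

    // list the RDDs the SparkContext still considers persisted
    sc.getPersistentRDDs.foreach { case (id, rdd) =>
      println(s"RDD $id: storage=${rdd.getStorageLevel.description}")
    }

    // or force-unpersist everything that was left behind
    sc.getPersistentRDDs.values.foreach(_.unpersist())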