It's possible if the execution is interrupted. Perhaps a good practice is to 
have cleanup code in a separate paragraph?
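
Something like this in its own paragraph, as a rough sketch (assuming the job paragraph defined cachedRdd and broadcastVar - the names here are made up):

%spark
// Run this as a separate paragraph once the job paragraph has finished (or was cancelled),
// so the cached data gets released even if the job never reached its own cleanup code.
cachedRdd.unpersist()   // drop the cached blocks from executor storage memory
broadcastVar.destroy()  // remove the broadcast blocks from the executors as well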

Date: Fri, 4 Dec 2015 10:59:45 +0100
Subject: Re: Spark worker memory not freed up after zeppelin run finishes
From: liska.ja...@gmail.com
To: users@zeppelin.incubator.apache.org

That is understandable, but what if you stop execution by pressing the button in the 
notebook? If you do that after you've cached some RDD or broadcast a variable, 
the cleanup code won't be executed, right?
On Thu, Dec 3, 2015 at 6:25 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:

I think that's expected, since Zeppelin keeps the Spark context alive even when 
the notebook is not executing (the idea is that you could be running more 
things). That keeps broadcast data and cached RDDs in memory. You should see the 
same if you run the same code from spark-shell and don't exit the shell.

On Thu, Dec 3, 2015 at 9:01 AM -0800, "Jakub Liska" <liska.ja...@gmail.com> wrote:

Hey,

I mentioned that I'm using broadcast variables, but I'm destroying them at the 
end... I'm using Spark 1.7.1 ... I'll let you know later if the problem still 
occurs. So far it seems it stopped after I started destroying them and calling 
cachedRdd.unpersist.
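
Roughly what I'm doing now inside the job paragraph, simplified (the data and names here are made up, and sc is the context Zeppelin provides):

%spark
import org.apache.spark.storage.StorageLevel

// hypothetical job: broadcast a small lookup table, cache an RDD, run an action
val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))
val cachedRdd = sc.textFile("hdfs:///tmp/events").persist(StorageLevel.MEMORY_ONLY)
try {
  println(cachedRdd.filter(line => lookup.value.contains(line.take(1))).count())
} finally {
  // cleanup at the end, so the executors don't keep holding the cached blocks
  // and broadcast data until the interpreter is restarted
  cachedRdd.unpersist()
  lookup.destroy()
}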

On Thu, Dec 3, 2015 at 5:52 PM, Felix Cheung <felixcheun...@hotmail.com> wrote:

Do you know which version of Spark you are running?

On Thu, Dec 3, 2015 at 12:52 AM -0800, "Kevin (Sangwoo) Kim" <kevin...@apache.org> wrote:

Do you use broadcast variables? I've run into many problems related to broadcast 
variables, so I'm not using them.
(It's a Spark problem rather than a Zeppelin problem.)

For RDDs, there is no need to unpersist them manually; Spark does it automatically.

On Thu, Dec 3, 2015 at 5:28 PM, Jakub Liska <liska.ja...@gmail.com> wrote:

Hi,

No, just running it manually. I think I need to unpersist cached RDDs and 
destroy broadcast variables at the end, am I correct? It hasn't crashed since I 
started doing that, although the subsequent runs are always a little slower.

On Thu, Dec 3, 2015 at 8:08 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:

How are you running jobs? Do you schedule a notebook to run from Zeppelin?

Date: Mon, 30 Nov 2015 12:42:16 +0100
Subject: Spark worker memory not freed up after zeppelin run finishes
From: liska.ja...@gmail.com
To: users@zeppelin.incubator.apache.org

Hey,

I'm connecting Zeppelin to a remote Spark standalone cluster (2 worker nodes), 
and I noticed that if I run a job from Zeppelin twice without restarting the 
interpreter, it fails with an OOME. After the Zeppelin job successfully finishes 
I can see all the executor memory still allocated on the workers, and restarting 
the interpreter frees the memory... But if I don't do that, it fails when 
running the task again.

Any idea how to deal with this problem? Currently I always have to restart the 
interpreter between Spark jobs.
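
Would a paragraph that just unpersists whatever is still cached be a workable alternative to restarting? Something like this sketch (sc.getPersistentRDDs lists the RDDs currently marked as persistent):

%spark
// Release everything that is still cached on the executors,
// instead of restarting the whole interpreter.
// (Broadcast variables still have to be destroyed from wherever they were created.)
sc.getPersistentRDDs.foreach { case (id, rdd) =>
  println(s"unpersisting RDD $id")
  rdd.unpersist()
}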

Thanks,
Jakub