Hi Archit, I believe it's the last case: 1+3+3.
From what I've seen, it's one JVM per worker per Spark application. You will have multiple threads within that worker JVM working on different partitions concurrently, and the number of partitions a worker handles concurrently appears to be determined by the number of cores you've set the worker (or app) to use. If Spark launched a new JVM per stage or per transformation, it would have to save each RDD to disk and reload it into memory between stages, which is why Spark doesn't do that. (A rough code sketch follows after the quoted message below.)

Roshan

On Jan 5, 2014 1:06 AM, "Archit Thakur" <[email protected]> wrote:

> A JVM reuse doubt.
> Let's say I have a job which has 5 stages.
> Each stage has 10 tasks (10 partitions). Each task has 3 transformations.
> My cluster is size 4 (1 master, 3 workers). How many JVMs will be launched?
>
> 1 master daemon, 3 worker daemons
> JVM = 1+3+10*3*5 (where at a time 10 will be executed in parallel on 3
> machines, but transformations are done sequentially, launching a JVM for
> every transformation in each stage)
> OR
> 1+3+5*10 (where at a time 10 will be executed in parallel on 3 machines,
> but each stage in a different set of JVMs)
> OR
> 1+3+5*3 (so a JVM will be reused for different partitions on a single
> machine, but each stage in a different set of JVMs)
> OR
> 1+3+3 (so one JVM per worker in any case)
> OR
> none
>
> Thx,
> Archit_Thakur.
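Here is a minimal Scala sketch of the setup discussed above. It uses the standard SparkConf/SparkContext API; the master URL, partition count, and class name are just illustrative assumptions, not something from Archit's job. "local[3]" simulates 3 cores in a single JVM; on a standalone cluster you would instead get one executor JVM per worker for the application.

// Minimal sketch, assuming spark-core is on the classpath.
import org.apache.spark.{SparkConf, SparkContext}

object JvmReuseSketch {
  def main(args: Array[String]): Unit = {
    // local[3] = one JVM with 3 task threads, i.e. up to 3 partitions
    // processed concurrently; analogous to giving a worker 3 cores.
    val conf = new SparkConf()
      .setAppName("jvm-reuse-sketch")
      .setMaster("local[3]")
    val sc = new SparkContext(conf)

    // 10 partitions, 3 chained narrow transformations: these are pipelined
    // inside the same task threads, so no new JVM is launched per
    // transformation or per stage.
    val rdd = sc.parallelize(1 to 100, 10)
      .map(_ * 2)
      .filter(_ % 3 == 0)
      .map(_.toString)

    println(rdd.count())
    sc.stop()
  }
}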
