A doubt about JVM reuse. Let's say I have a job with 5 stages. Each stage has 10 tasks (10 partitions), and each task has 3 transformations. My cluster has 4 nodes (1 master, 3 workers). How many JVMs will be launched?
1 master daemon, 3 worker daemons.

JVMs = 1+3+10*3*5 (10 tasks run in parallel across the 3 machines at a time, but the transformations run sequentially, launching a new JVM for every transformation in each stage)
OR 1+3+5*10 (10 tasks run in parallel across the 3 machines at a time, but different stages run in a different set of JVMs)
OR 1+3+5*3 (a JVM is reused for different partitions on a single machine, but different stages run in a different set of JVMs)
OR 1+3+3 (one JVM per worker in every case)
OR none of these?

Thx,
Archit_Thakur.
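
For reference, the totals behind each candidate above can be tallied with a few lines of Python. This is only a sketch of the arithmetic in the question. The variable names and option labels are my own, and it says nothing about which option (if any) matches what Spark actually does.

```python
# Numbers taken directly from the question; names are illustrative only.
stages = 5
tasks_per_stage = 10            # one task per partition
transformations_per_task = 3
workers = 3
daemons = 1 + workers           # 1 master daemon + 3 worker daemons

# Each entry mirrors one of the candidate formulas listed in the question.
candidates = {
    "new JVM per transformation": daemons + tasks_per_stage * transformations_per_task * stages,
    "new JVM per task, per stage": daemons + stages * tasks_per_stage,
    "one JVM per worker, per stage": daemons + stages * workers,
    "one JVM per worker overall": daemons + workers,
}

for label, total in candidates.items():
    print(f"{label}: {total}")
```

So the candidates range from 7 JVMs at one extreme to 154 at the other.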
