Hi Jonathan,

Thank you for the information! Yes, I am using maximizeResourceAllocation. I will try turning it off and just using dynamicAllocation alone.
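For reference, if I later want to keep maximizeResourceAllocation and still re-enable dynamicAllocation as you describe, I understand the configuration classifications would look roughly like this when launching the cluster. This is only a sketch using boto3; the cluster name, instance types, region, and IAM roles below are placeholders, not values from this thread:

    # Sketch: launching an EMR cluster where maximizeResourceAllocation and
    # dynamicAllocation coexist. Setting spark.dynamicAllocation.enabled in the
    # spark-defaults classification tells maximizeResourceAllocation not to pin
    # spark.executor.instances. All names/types/roles here are placeholders.
    import boto3

    emr = boto3.client("emr", region_name="us-east-1")

    response = emr.run_job_flow(
        Name="zeppelin-spark-cluster",          # placeholder cluster name
        ReleaseLabel="emr-5.0.0",               # an EMR release with Spark 2.0 / Zeppelin 0.6.1
        Applications=[{"Name": "Spark"}, {"Name": "Zeppelin"}],
        Configurations=[
            {
                # One coarse executor per slave node, sized to the node
                "Classification": "spark",
                "Properties": {"maximizeResourceAllocation": "true"},
            },
            {
                # Explicitly re-enable dynamicAllocation so executors can
                # spin down when the interpreter is idle
                "Classification": "spark-defaults",
                "Properties": {"spark.dynamicAllocation.enabled": "true"},
            },
        ],
        Instances={
            "MasterInstanceType": "m3.xlarge",  # placeholder
            "SlaveInstanceType": "m3.xlarge",   # placeholder
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )
    print(response["JobFlowId"])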
Regards,
Soonoh

On 4 October 2016 at 11:07, Jonathan Kelly <jonathaka...@gmail.com> wrote:

> On the most recent several releases of EMR, Spark dynamicAllocation is
> automatically enabled, as it allows longer-running apps like Zeppelin's
> Spark interpreter to continue running in the background without taking up
> resources for any executors unless Spark jobs are actively running.
>
> However, if you are seeing resources still being used even after some idle
> time, maybe you are using maximizeResourceAllocation (which makes any Spark
> job use 100% of the resources, with one executor per slave node). If you
> use maximizeResourceAllocation, it effectively disables dynamicAllocation
> because it causes spark.executor.instances to be set. If you still want to
> use dynamicAllocation along with maximizeResourceAllocation, just set
> spark.dynamicAllocation.enabled to true in the spark-defaults
> configuration classification. This will signal to the
> maximizeResourceAllocation feature not to set spark.executor.instances, so
> that dynamicAllocation will be used.
>
> Keep in mind that this might not be the most ideal way to use
> dynamicAllocation, though (especially if you don't have many nodes in the
> cluster), because the maximizeResourceAllocation feature would make the
> executors very coarsely grained, since there is only one per node. It would
> still allow multiple applications to run at once, though, because executors
> from one application could spin down when idle, allowing another
> application to spin up executors.
>
> Hope this helps,
> Jonathan
>
> On Mon, Oct 3, 2016 at 5:38 PM Jung, Soonoh <soonoh.j...@gmail.com> wrote:
>
>> Hi everyone,
>>
>> I am using Zeppelin on AWS EMR (Zeppelin 0.6.1, Spark 2.0 on YARN).
>> Basically, the Zeppelin Spark interpreter's Spark job is not finishing
>> after executing a notebook.
>> It looks like the Spark job is still occupying a lot of memory in my YARN
>> cluster.
>> Is there a way to restart the Spark interpreter automatically (or
>> programmatically) every time I run a notebook, in order to release that
>> memory in my YARN cluster?
>>
>> Regards,
>> Soonoh
>>
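For the original question about restarting the interpreter programmatically: one approach is Zeppelin's interpreter REST API, which can restart the Spark interpreter setting so its YARN application gives up its executors and memory. A minimal sketch, assuming the Zeppelin server is on localhost:8080 and that your Zeppelin version exposes the restart endpoint (please verify both against your setup):

    # Sketch: restart Zeppelin's Spark interpreter via the REST API so its
    # YARN application releases resources. Host/port are placeholders; the
    # endpoint names should be checked against your Zeppelin version's docs.
    import requests

    ZEPPELIN_URL = "http://localhost:8080"  # placeholder Zeppelin server address

    def restart_spark_interpreter():
        # List interpreter settings and find the one in the "spark" group
        settings = requests.get(f"{ZEPPELIN_URL}/api/interpreter/setting").json()["body"]
        spark_setting = next(s for s in settings if s["group"] == "spark")

        # Restart it; Zeppelin stops the interpreter process, which ends its YARN app
        resp = requests.put(
            f"{ZEPPELIN_URL}/api/interpreter/setting/restart/{spark_setting['id']}"
        )
        resp.raise_for_status()

    if __name__ == "__main__":
        restart_spark_interpreter()

A call like this could be run from a scheduler or after a notebook finishes, rather than leaving the interpreter holding YARN memory while idle.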