I assume you're setting these values in spark-defaults.conf. What happens
if you specify them directly to spark-submit, as in
--conf spark.dynamicAllocation.enabled=true?
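
Also, since you're connecting through sparklyr, you could try passing the
settings via spark_config() at connect time instead of relying on
spark-defaults.conf. A rough sketch (the master string and Spark version
are just placeholders for whatever your setup actually uses):

library(sparklyr)

conf <- spark_config()
# same settings as in your mail below, but scoped to this one connection
conf$spark.shuffle.service.enabled <- "true"
conf$spark.dynamicAllocation.enabled <- "true"
conf$spark.dynamicAllocation.executorIdleTimeout <- "120s"  # 120 seconds, unit spelled out
conf$spark.dynamicAllocation.maxExecutors <- 100

sc <- spark_connect(master = "yarn-client",
                    config = conf,
                    version = "2.1.0")

If the idle executors still aren't released with the settings going in this
way, at least you've ruled out spark-defaults.conf as the culprit.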

On Thu, Mar 15, 2018 at 1:47 PM, Florian Dewes <fde...@gmail.com> wrote:

> Hi all,
>
> I am currently trying to enable dynamic resource allocation for a small
> YARN-managed Spark cluster.
> We are using sparklyr to access Spark from R and have multiple jobs that
> should run in parallel, because some of them take several days to complete
> or are still in development.
>
> Everything works so far; the only problem we have is that executors
> are not removed from idle jobs.
>
> Let's say job A is the only running job. It loads a file that is several
> hundred GB in size and then goes idle without disconnecting from Spark. It
> gets 80% of the cluster because I set a maximum value via
> spark.dynamicAllocation.maxExecutors.
>
> When we start another job (B) with the remaining 20% of the cluster
> resources, none of job A's idle executors are freed, and the idle job
> keeps 80% of the cluster's resources, even though
> spark.dynamicAllocation.executorIdleTimeout is set.
>
> Only when we disconnect job A does job B allocate the freed executors.
>
> Configuration settings used:
>
> spark.shuffle.service.enabled = "true"
> spark.dynamicAllocation.enabled = "true"
> spark.dynamicAllocation.executorIdleTimeout = 120
> spark.dynamicAllocation.maxExecutors = 100
>
> with
>
> Spark 2.1.0
> R 3.4.3
> sparklyr 0.6.3
>
>
> Any ideas?
>
>
> Thanks,
>
> Florian
>
>
>
>
>


-- 
http://www.femibyte.com/twiki5/bin/view/Tech/
http://www.nextmatrix.com
"Great spirits have always encountered violent opposition from mediocre
minds." - Albert Einstein.
