[ https://issues.apache.org/jira/browse/SPARK-24786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeffrey Charles updated SPARK-24786:
------------------------------------
    Affects Version/s:     (was: 2.2.1)
                           2.3.0

> Executors not being released after all cached data is unpersisted
> -----------------------------------------------------------------
>
>                 Key: SPARK-24786
>                 URL: https://issues.apache.org/jira/browse/SPARK-24786
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.3.0
>         Environment: Zeppelin in EMR
>            Reporter: Jeffrey Charles
>            Priority: Minor
>
> I'm persisting a dataframe in Zeppelin, which has dynamic allocation
> enabled, to get a sense of how much memory the dataframe takes up. After I
> note the size, I unpersist the dataframe. For some reason, YARN is not
> releasing the executors that were added to Zeppelin. If I don't run the
> persist and unpersist steps, the executors that were added are removed
> about a minute after the paragraphs complete. Looking at the Storage tab in
> the Spark UI for the Zeppelin job, I don't see anything cached. I do not
> want to lower spark.dynamicAllocation.cachedExecutorIdleTimeout, because I
> do not want executors that currently hold cached data to be released; I
> only want executors that previously held cached data, and no longer do, to
> be released.
>
> Steps to reproduce (a runnable sketch follows below):
> # Enable dynamic allocation
> # Set spark.dynamicAllocation.executorIdleTimeout to 60s
> # Set spark.dynamicAllocation.cachedExecutorIdleTimeout to infinity
> # Load a dataset, persist it, run a count on the persisted dataset, then
>   unpersist it
> # Wait a couple of minutes
>
> Expected behaviour:
> All executors are released, as they are no longer caching any data.
>
> Observed behaviour:
> No executors are released.
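A minimal sketch of the reproduction steps, assuming a Spark 2.3.0 session on YARN with the external shuffle service running (required for dynamic allocation). The input path is hypothetical; any dataset large enough to occupy executor storage memory will do. In Zeppelin the session and configs would normally come from the interpreter settings rather than being built inline.

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("SPARK-24786-repro")
  // Step 1: enable dynamic allocation (needs the external shuffle service).
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.shuffle.service.enabled", "true")
  // Step 2: idle executors with no cached data should be released after 60s.
  .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
  // Step 3: executors holding cached data should never be released.
  .config("spark.dynamicAllocation.cachedExecutorIdleTimeout", "infinity")
  .getOrCreate()

// Step 4: load, persist, materialize the cache with a count, then unpersist.
val df = spark.read.parquet("hdfs:///tmp/example-dataset") // hypothetical path
df.persist()
df.count()      // forces the dataset into executor storage memory
df.unpersist()  // after this, the Storage tab shows nothing cached

// Step 5: wait a couple of minutes.
// Expected: executors are released, since nothing is cached any more.
// Observed: the executors acquired for the count are never released.
{code}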