[ https://issues.apache.org/jira/browse/SPARK-30610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated SPARK-30610:
----------------------------------
    Affects Version/s:     (was: 3.0.0)
                           3.1.0

> spark worker graceful shutdown
> ------------------------------
>
>                 Key: SPARK-30610
>                 URL: https://issues.apache.org/jira/browse/SPARK-30610
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler
>    Affects Versions: 3.1.0
>            Reporter: t oo
>            Priority: Minor
>
> I am not talking about Spark Streaming! Just regular batch jobs submitted
> with spark-submit that may read a large CSV (100+ GB) and then write it
> out as Parquet. In an autoscaling cluster it would be nice to be able to
> scale down (i.e. terminate) EC2 instances without slowing down active
> Spark applications.
> For example:
> 1. Start a Spark cluster with 8 EC2 instances.
> 2. Submit 6 Spark apps.
> 3. One Spark app completes, so 5 apps are still running.
> 4. The cluster can now scale down by 1 instance (to save money), but we
> don't want the apps running on the soon-to-be-terminated instance to have
> to restart their CSV reads, RDD processing steps, etc. from the beginning
> on other instances' executors. Instead we want a 'graceful shutdown'
> command so that the 8th instance does not accept new spark-submit apps
> (i.e. no new executors start on it) but finishes the ones already launched
> on it, then exits the worker pid. The EC2 instance can then be terminated.
> I thought stop-slave.sh could do this, but it looks like it just kills
> the pid.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
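The drain behavior the reporter asks for can be modeled as a small state machine: once a worker is marked as draining, it refuses new executor launches, lets running executors finish, and only then reports that it is safe to stop the worker process and terminate the instance. The sketch below is a toy illustration of that contract, not actual Spark Worker code; the class and method names are hypothetical.

```python
import threading


class DrainableWorker:
    """Toy model of a graceful worker shutdown (illustrative only,
    not Spark source). Once draining, new executor launches are
    refused, running executors are allowed to finish, and the worker
    is safe to terminate only when none remain."""

    def __init__(self):
        self._lock = threading.Lock()
        self.draining = False
        self.running_executors = set()

    def launch_executor(self, exec_id):
        """Return True if the executor was accepted, False if refused."""
        with self._lock:
            if self.draining:
                return False  # no new work on a draining worker
            self.running_executors.add(exec_id)
            return True

    def executor_finished(self, exec_id):
        with self._lock:
            self.running_executors.discard(exec_id)

    def drain(self):
        """Equivalent of the requested 'graceful shutdown' command."""
        with self._lock:
            self.draining = True

    def safe_to_terminate(self):
        """True only when draining and no executors are still running."""
        with self._lock:
            return self.draining and not self.running_executors
```

In this model, the autoscaler would call `drain()` on the chosen worker, poll `safe_to_terminate()`, and only then run the equivalent of `stop-slave.sh` and terminate the EC2 instance, so in-flight CSV reads and RDD stages never restart elsewhere.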