Oops - I meant while it is *busy* when I said while it is *idle*. On Tue, Dec 15, 2015 at 11:35 AM Ben Roling <ben.rol...@gmail.com> wrote:
> I'm curious to see the feedback others will provide. My impression is that
> the only way to get Spark to give up resources while it is idle would be to
> use the preemption feature of the scheduler you're using in YARN. When
> another user comes along, the scheduler would preempt one or more Spark
> executors to free the resources that user is entitled to. The question
> becomes how much inefficiency the preemption creates due to lost work that
> has to be redone by the Spark job. I'm not sure of the best way to
> generalize how big of a deal that would be. I imagine it depends on
> several factors.
>
> On Tue, Dec 15, 2015 at 9:31 AM David Fox <dafox7777...@gmail.com> wrote:
>
>> Hello Spark experts,
>>
>> We are currently evaluating Spark on our cluster, which already supports
>> MRv2 over YARN.
>>
>> We have noticed a problem with running jobs concurrently, in particular
>> that a running Spark job will not release its resources until the job is
>> finished. Ideally, if two people run any combination of MRv2 and Spark
>> jobs, the resources should be fairly distributed.
>>
>> I have noticed a feature called "dynamic resource allocation" in Spark
>> 1.2, but it does not seem to solve the problem, because it releases
>> resources only when Spark is IDLE, not while it's BUSY. What I am looking
>> for is an approach similar to MapReduce, where a new user obtains a fair
>> share of resources.
>>
>> I haven't been able to locate any further information on this matter. On
>> the other hand, I feel this must be a pretty common issue for a lot of
>> users.
>>
>> So:
>>
>> 1. What is your experience when dealing with a multitenant (multiple
>> users) Spark cluster with YARN?
>> 2. Is Spark architecturally able to support releasing resources while
>> it's busy? Is this a planned feature, or is it something that conflicts
>> with the idea of Spark executors?
>>
>> Thanks
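
For reference, the dynamic allocation feature mentioned above is enabled through Spark configuration; a minimal sketch along these lines (property values here are illustrative, and the external shuffle service must also be installed on each YARN NodeManager for executors to be released safely):

```properties
# spark-defaults.conf -- sketch of dynamic allocation settings on YARN
# Requires the Spark external shuffle service on every NodeManager,
# so shuffle data survives when an executor is removed.
spark.dynamicAllocation.enabled              true
spark.shuffle.service.enabled                true
# Illustrative bounds -- tune to your cluster:
spark.dynamicAllocation.minExecutors         1
spark.dynamicAllocation.maxExecutors         20
# Idle executors are released after this timeout:
spark.dynamicAllocation.executorIdleTimeout  60s
```

As the thread notes, this only releases *idle* executors. Reclaiming resources from a *busy* job is the scheduler's job: with the Fair Scheduler, preemption is switched on via `yarn.scheduler.fair.preemption` in `yarn-site.xml`, at the cost of redoing whatever work the preempted executors had in flight.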