Re: Mesos/Spark Deadlock

Matei Zaharia Mon, 25 Aug 2014 12:03:29 -0700

BTW, it seems to me that even without that patch, tasks should still get 
launched as long as you leave at least 32 MB of memory free on each machine 
(that is, the sum of the executor memory sizes is not exactly equal to the 
machine's total memory). Mesos will then be able to re-offer that machine 
whenever CPUs free up.
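
A rough way to sanity-check the headroom condition above is sketched below. Only the 32 MB figure comes from this message; the function name and the example machine and executor sizes are made up for illustration, and this is not how Mesos or Spark compute offers internally.

```python
# Sketch only: MIN_FREE_MB is the figure quoted in the message above; the
# helper name and the sample numbers are illustrative assumptions.
MIN_FREE_MB = 32


def leaves_headroom(machine_mb, executor_mbs, min_free_mb=MIN_FREE_MB):
    """Return True if the executors leave at least min_free_mb unallocated."""
    return machine_mb - sum(executor_mbs) >= min_free_mb


if __name__ == "__main__":
    # Two 8 GB executors fill a 16 GB machine completely: no headroom left.
    print(leaves_headroom(16384, [8192, 8192]))
    # Shrinking one executor by 128 MB leaves room to spare.
    print(leaves_headroom(16384, [8192, 8064]))
```

If the check fails for a machine, trimming one executor's memory setting slightly should be enough to satisfy the condition described above.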


Matei

On August 25, 2014 at 5:05:56 AM, Gary Malouf ([email protected]) wrote:

We have not tried the work-around because there are other bugs in there 
that affected our set-up, though it seems it would help. 


On Mon, Aug 25, 2014 at 12:54 AM, Timothy Chen <[email protected]> wrote: 

> +1 to have the work around in. 
> 
> I'll be investigating from the Mesos side too. 
> 
> Tim 
> 
> On Sun, Aug 24, 2014 at 9:52 PM, Matei Zaharia <[email protected]> 
> wrote: 
> > Yeah, Mesos in coarse-grained mode probably wouldn't work here. It's too 
> > bad that this happens in fine-grained mode -- would be really good to fix. 
> > I'll see if we can get the workaround in 
> > https://github.com/apache/spark/pull/1860 into Spark 1.1. Incidentally, 
> > have you tried that? 
> > 
> > Matei 
> > 
> > On August 23, 2014 at 4:30:27 PM, Gary Malouf ([email protected]) wrote: 
> > 
> > Hi Matei, 
> > 
> > We have an analytics team that uses the cluster on a daily basis. They 
> > use two types of 'run modes': 
> > 
> > 1) For running actual queries, they set spark.executor.memory to 
> > something between 4 and 8 GB of RAM per worker. 
> > 
> > 2) A shell that takes a minimal amount of memory on workers (128 MB) for 
> > prototyping a larger query. This lets them avoid taking up RAM on the 
> > cluster when they do not really need it. 
> > 
> > We see the deadlocks when there are a few shells in either case. From 
> > the usage patterns we have, coarse-grained mode would be a challenge, as 
> > we would have to constantly remind people to kill their shells as soon as 
> > their queries finish. 
> > 
> > Am I correct in viewing Mesos in coarse-grained mode as being similar to 
> > Spark Standalone's CPU allocation behavior? 
> > 
> > On Sat, Aug 23, 2014 at 7:16 PM, Matei Zaharia <[email protected]> wrote: 
> > Hey Gary, just as a workaround, note that you can use Mesos in 
> > coarse-grained mode by setting spark.mesos.coarse=true. Then it will hold 
> > onto CPUs for the duration of the job. 
> > 
> > Matei 
> > 
> > On August 23, 2014 at 7:57:30 AM, Gary Malouf ([email protected]) wrote: 
> > 
> > I just wanted to bring up a significant Mesos/Spark issue that makes the 
> > combo difficult to use for teams larger than 4-5 people. It's covered in 
> > https://issues.apache.org/jira/browse/MESOS-1688. My understanding is that 
> > Spark's use of executors in fine-grained mode is very different behavior 
> > from that of many of the other common frameworks for Mesos. 
> > 
>
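
For reference, the two run modes and the coarse-grained workaround mentioned in the quoted thread could be written down as Spark properties roughly as follows. The property names spark.executor.memory and spark.mesos.coarse and the 128 MB and 4 to 8 GB figures come from the thread; the helper name, the mode labels, and the 4g default are illustrative assumptions.

```python
# Sketch only: run_mode_conf, the "query"/"shell" labels and the 4g default
# are hypothetical; the property names and sizes are taken from the thread.
def run_mode_conf(mode, query_memory="4g", coarse=False):
    """Return sorted (property, value) pairs for one of the two run modes."""
    if mode == "query":
        # The thread mentions something between 4 and 8 GB per worker.
        props = {"spark.executor.memory": query_memory}
    elif mode == "shell":
        # Minimal-footprint shell used for prototyping.
        props = {"spark.executor.memory": "128m"}
    else:
        raise ValueError("unknown run mode: %r" % (mode,))
    if coarse:
        # Workaround from the thread: hold CPUs for the duration of the job.
        props["spark.mesos.coarse"] = "true"
    return sorted(props.items())


if __name__ == "__main__":
    for key, value in run_mode_conf("query", query_memory="8g", coarse=True):
        print("%s=%s" % (key, value))
```

Each pair could then be applied with SparkConf.set(key, value) or whatever configuration mechanism the deployment already uses.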
