> Using 0 for spark.mesos.mesosExecutor.cores is better than dynamic
> allocation

Maybe for CPU, but definitely not for memory.  Executors never shut down in
fine-grained mode, which means you only elastically grow and shrink CPU
usage, not memory.
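
For anyone following along, here is a minimal sketch of the coarse-grained +
dynamic allocation alternative (illustrative values only; it also needs the
external shuffle service running on each agent, and the mesos:// master URL
is normally passed via spark-submit):

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setAppName("coarse-grained-dynamic-allocation")
    // Run in coarse-grained mode (required for dynamic allocation).
    .set("spark.mesos.coarse", "true")
    .set("spark.dynamicAllocation.enabled", "true")
    // Dynamic allocation needs the external shuffle service so executors
    // can be removed without losing shuffle data.
    .set("spark.shuffle.service.enabled", "true")
    // Idle executors are released (CPU *and* memory) after this timeout.
    .set("spark.dynamicAllocation.executorIdleTimeout", "60s")
  val sc = new SparkContext(conf)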

On Sat, Dec 24, 2016 at 10:14 PM, Davies Liu <davies....@gmail.com> wrote:

> Using 0 for spark.mesos.mesosExecutor.cores is better than dynamic
> allocation, but you have to pay a little more overhead for launching a
> task, which should be OK if the task is not trivial.
>
> Since the direct result (up to 1 MB by default) will also go through
> Mesos, it's better to tune it lower; otherwise Mesos could become the
> bottleneck.
>
> spark.task.maxDirectResultSize
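>
> A minimal sketch of those two settings (illustrative values only; the
> result size below is just an example, not a recommendation):
>
>   import org.apache.spark.{SparkConf, SparkContext}
>
>   val conf = new SparkConf()
>     .setAppName("fine-grained-tuning")
>     // Don't reserve an extra core for each fine-grained executor.
>     .set("spark.mesos.mesosExecutor.cores", "0")
>     // Results larger than this go through the block manager instead of
>     // being returned directly through Mesos (value in bytes).
>     .set("spark.task.maxDirectResultSize", "262144")
>   val sc = new SparkContext(conf)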
>
> On Mon, Dec 19, 2016 at 3:23 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> wrote:
> > Tim,
> >
> > We will try to run the application in coarse grain mode, and share the
> > findings with you.
> >
> > Regards
> > Sumit Chawla
> >
> >
> > On Mon, Dec 19, 2016 at 3:11 PM, Timothy Chen <tnac...@gmail.com> wrote:
> >
> >> Dynamic allocation works with coarse-grained mode only; we weren't aware
> >> of a need for fine-grained mode after we enabled dynamic allocation
> >> support in coarse-grained mode.
> >>
> >> What's the reason you're running fine-grained mode instead of
> >> coarse-grained mode + dynamic allocation?
> >>
> >> Tim
> >>
> >> On Mon, Dec 19, 2016 at 2:45 PM, Mehdi Meziane
> >> <mehdi.mezi...@ldmobile.net> wrote:
> >> > We would be interested in the results if you give dynamic allocation
> >> > with Mesos a try!
> >> >
> >> >
> >> > ----- Original Message -----
> >> > From: "Michael Gummelt" <mgumm...@mesosphere.io>
> >> > To: "Sumit Chawla" <sumitkcha...@gmail.com>
> >> > Cc: u...@mesos.apache.org, d...@mesos.apache.org, "User"
> >> > <u...@spark.apache.org>, dev@spark.apache.org
> >> > Sent: Monday, December 19, 2016, 22:42:55 GMT +01:00 Amsterdam / Berlin /
> >> > Berne / Rome / Stockholm / Vienne
> >> > Subject: Re: Mesos Spark Fine Grained Execution - CPU count
> >> >
> >> >
> >> >> Is this problem of idle executors sticking around solved in Dynamic
> >> >> Resource Allocation?  Is there some timeout after which idle executors
> >> >> can just shut down and clean up their resources?
> >> >
> >> > Yes, that's exactly what dynamic allocation does.  But again, I have no
> >> > idea what the state of dynamic allocation + Mesos is.
> >> >
> >> > On Mon, Dec 19, 2016 at 1:32 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Great.  Makes much better sense now.  What would be the reason to set
> >> >> spark.mesos.mesosExecutor.cores to more than 1, since this number
> >> >> doesn't include the number of cores for tasks?
> >> >>
> >> >> So in my case it seems like 30 CPUs are allocated to executors.  And
> >> >> there are 48 tasks, so 48 + 30 = 78 CPUs.  And I am noticing that this
> >> >> gap of 30 is maintained until the last task exits.  This explains the
> >> >> gap.  Thanks, everyone.  I am still not sure how this number 30 is
> >> >> calculated.  (Is it dynamic based on current resources, or is it some
> >> >> configuration?  I have 32 nodes in my cluster.)
> >> >>
> >> >> Is this problem of idle executors sticking around solved in Dynamic
> >> >> Resource Allocation?  Is there some timeout after which idle executors
> >> >> can just shut down and clean up their resources?
> >> >>
> >> >>
> >> >> Regards
> >> >> Sumit Chawla
> >> >>
> >> >>
> >> >> On Mon, Dec 19, 2016 at 12:45 PM, Michael Gummelt <mgumm...@mesosphere.io>
> >> >> wrote:
> >> >>>
> >> >>> > I should presume that the number of executors should be less than
> >> >>> > the number of tasks.
> >> >>>
> >> >>> No.  Each executor runs 0 or more tasks.
> >> >>>
> >> >>> Each executor consumes 1 CPU, and each task running on that executor
> >> >>> consumes another CPU.  You can customize this via
> >> >>> spark.mesos.mesosExecutor.cores
> >> >>> (https://github.com/apache/spark/blob/v1.6.3/docs/running-on-mesos.md)
> >> >>> and spark.task.cpus
> >> >>> (https://github.com/apache/spark/blob/v1.6.3/docs/configuration.md).
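> >> >>>
> >> >>> As a rough sketch of the arithmetic (a hypothetical breakdown, assuming
> >> >>> the defaults spark.mesos.mesosExecutor.cores=1 and spark.task.cpus=1,
> >> >>> plus the ~30 executors in your run):
> >> >>>
> >> >>>   val executorCores = 30 * 1  // one core held by each fine-grained executor
> >> >>>   val taskCores     = 48 * 1  // one core per concurrently running task
> >> >>>   val totalCores    = executorCores + taskCores  // = 78, matching the Mesos UI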
> >> >>>
> >> >>> On Mon, Dec 19, 2016 at 12:09 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> >> >>> wrote:
> >> >>>>
> >> >>>> Ah, thanks.  Looks like I skipped reading this: "Neither will
> >> >>>> executors terminate when they're idle."
> >> >>>>
> >> >>>> So in my job scenario, I should presume that the number of executors
> >> >>>> should be less than the number of tasks.  Ideally, one executor should
> >> >>>> execute 1 or more tasks.  But I am observing something strange instead.
> >> >>>> I start my Spark job with 48 partitions.  In the Mesos UI I see that
> >> >>>> the number of tasks is 48, but the number of CPUs is 78, which is way
> >> >>>> more than 48.  Here I am assuming that 1 CPU is 1 executor.  I am not
> >> >>>> specifying any configuration to set the number of cores per executor.
> >> >>>>
> >> >>>> Regards
> >> >>>> Sumit Chawla
> >> >>>>
> >> >>>>
> >> >>>> On Mon, Dec 19, 2016 at 11:35 AM, Joris Van Remoortere
> >> >>>> <jo...@mesosphere.io> wrote:
> >> >>>>>
> >> >>>>> That makes sense.  From the documentation it looks like the executors
> >> >>>>> are not supposed to terminate:
> >> >>>>>
> >> >>>>> http://spark.apache.org/docs/latest/running-on-mesos.html#fine-grained-deprecated
> >> >>>>>>
> >> >>>>>> Note that while Spark tasks in fine-grained will relinquish cores as
> >> >>>>>> they terminate, they will not relinquish memory, as the JVM does not
> >> >>>>>> give memory back to the Operating System. Neither will executors
> >> >>>>>> terminate when they’re idle.
> >> >>>>>
> >> >>>>>
> >> >>>>> I suppose your task-to-executor CPU ratio is low enough that it looks
> >> >>>>> like most of the resources are not being reclaimed.  If your tasks were
> >> >>>>> using significantly more CPU, the amortized cost of the idle executors
> >> >>>>> would not be such a big deal.
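> >> >>>>>
> >> >>>>> As a rough illustration (hypothetical numbers, assuming
> >> >>>>> mesosExecutor.cores=1): with 30 executors and 48 single-core tasks,
> >> >>>>> the executor cores are 30 / (48 + 30) ≈ 38% of the allocation; if each
> >> >>>>> task instead used 4 cores (spark.task.cpus=4), the same 30 executor
> >> >>>>> cores would be 30 / (192 + 30) ≈ 14%.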
> >> >>>>>
> >> >>>>>
> >> >>>>> —
> >> >>>>> Joris Van Remoortere
> >> >>>>> Mesosphere
> >> >>>>>
> >> >>>>> On Mon, Dec 19, 2016 at 11:26 AM, Timothy Chen <tnac...@gmail.com>
> >> >>>>> wrote:
> >> >>>>>>
> >> >>>>>> Hi Chawla,
> >> >>>>>>
> >> >>>>>> One possible reason is that Mesos fine-grained mode also takes up
> >> >>>>>> cores to run the executor on each host, so if you have 20 agents
> >> >>>>>> running fine-grained executors, they will take up 20 cores while
> >> >>>>>> they're still running.
> >> >>>>>>
> >> >>>>>> Tim
> >> >>>>>>
> >> >>>>>> On Fri, Dec 16, 2016 at 8:41 AM, Chawla,Sumit <sumitkcha...@gmail.com>
> >> >>>>>> wrote:
> >> >>>>>> > Hi
> >> >>>>>> >
> >> >>>>>> > I am using Spark 1.6.  I have one query about the fine-grained
> >> >>>>>> > model in Spark.  I have a simple Spark application which transforms
> >> >>>>>> > A -> B.  It's a single-stage application.  The program starts with
> >> >>>>>> > 48 partitions.  When the program starts running, the Mesos UI shows
> >> >>>>>> > 48 tasks and 48 CPUs allocated to the job.  Now, as the tasks get
> >> >>>>>> > done, the number of active tasks starts decreasing.  However, the
> >> >>>>>> > number of CPUs does not decrease proportionally.  When the job was
> >> >>>>>> > about to finish, there was a single remaining task, yet the CPU
> >> >>>>>> > count was still 20.
> >> >>>>>> >
> >> >>>>>> > My question is: why is there no one-to-one mapping between tasks
> >> >>>>>> > and CPUs in fine-grained mode?  How can these CPUs be released when
> >> >>>>>> > the job is done, so that other jobs can start?
> >> >>>>>> >
> >> >>>>>> >
> >> >>>>>> > Regards
> >> >>>>>> > Sumit Chawla
> >> >>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Michael Gummelt
> >> >>> Software Engineer
> >> >>> Mesosphere
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Michael Gummelt
> >> > Software Engineer
> >> > Mesosphere
> >>
>
>
>
> --
>  - Davies
>



-- 
Michael Gummelt
Software Engineer
Mesosphere
