Santosh, I mostly get only 2-3 containers per node.  I can only get 9-12
containers (topped out the CPU resources at 12 cores) if a task runs for
more than 30 secs (preferably 60 secs).  That's generally not an issue, but
I thought it was worth putting on the list for general knowledge.  It's
also less of a problem when more jobs are running.
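
For reference, the split-size tuning I mentioned in my earlier mail
(quoted below) looks roughly like this with the mapreduce API.  This is a
minimal sketch; the class name and the 256 MB / 1 GB values are just
illustrative, not recommendations:

    // Minimal sketch: raise the input split sizes so each map task runs
    // longer and consumes offers in bigger chunks. The 256 MB / 1 GB
    // values below are illustrative only.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class SplitTuningExample {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "split-tuning");
            // Same effect as setting the split.minsize/split.maxsize
            // properties in the job configuration.
            FileInputFormat.setMinInputSplitSize(job, 256L * 1024 * 1024);
            FileInputFormat.setMaxInputSplitSize(job, 1024L * 1024 * 1024);
            // ... set mapper, reducer and input/output paths as usual.
        }
    }

Bigger splits mean longer map tasks, which keeps containers alive across
more offer cycles.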

I have a few ideas on how to improve this and data locality, but I think it
will likely involve refactoring YarnNodeCapacityManager and
OfferLifeCycleManager into interfaces that can be extended to support
different strategies, configurable at startup.  I'd love to start that
discussion once we finish getting the basic mechanics working.  Maybe for a
0.3.0 or 0.4.0 release?
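
To make that concrete, here's a rough sketch of the shape I'm imagining.
Everything below is hypothetical discussion code, not existing Myriad
classes:

    // Hypothetical sketch for discussion only; none of these names exist
    // in Myriad today. YarnNodeCapacityManager/OfferLifeCycleManager would
    // program against the interface, and the concrete strategy would be
    // picked from configuration once at startup.
    interface OfferPlacementStrategy {
        /** Fraction of an offer's cpus to surface to YARN for this node. */
        double cpuShare(double offeredCpus, int runningContainers);
    }

    /** Take the whole offer (roughly today's behavior). */
    class GreedyStrategy implements OfferPlacementStrategy {
        public double cpuShare(double offeredCpus, int runningContainers) {
            return 1.0;
        }
    }

    /** Hold some capacity back so later, more local tasks can still land. */
    class LocalityAwareStrategy implements OfferPlacementStrategy {
        public double cpuShare(double offeredCpus, int runningContainers) {
            return runningContainers == 0 ? 1.0 : 0.5; // illustrative heuristic
        }
    }

    public class StrategySelectionExample {
        /** e.g. read from a key in myriad-config-default.yml at startup. */
        static OfferPlacementStrategy fromConfig(String name) {
            return "locality-aware".equals(name)
                    ? new LocalityAwareStrategy()
                    : new GreedyStrategy();
        }

        public static void main(String[] args) {
            OfferPlacementStrategy s = fromConfig("locality-aware");
            System.out.println(s.cpuShare(12.0, 3)); // prints 0.5
        }
    }

The point is that the placement policy becomes swappable without touching
the offer bookkeeping itself.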

Darin


On Wed, Apr 13, 2016 at 2:14 PM, Santosh Marella <smare...@maprtech.com>
wrote:

> > After the patches it seems stable, I'm able to run multiple terasort/pi
> > jobs and a few scalding jobs without difficulty.
> Great work, Darin. Glad to see FGS is now stable.
>
> > Noticed that with jobs with short map tasks (8-12 secs), I rarely got
> > more than two containers per node; I'm curious if I'm not consuming
> > resources fast enough.
> Yes. Perhaps we need to tune the rate at which Mesos sends out resource
> offers to frameworks. The default that we observe in Myriad is 5 seconds.
> However, if your job has many map tasks and the Mesos offer is big enough
> to accommodate several of them, then you should ideally see a lot more
> than 2-3 containers per node.
>
> Isn't that happening? How many map tasks does your job have?
>
> Thanks,
> Santosh
>
> On Wed, Apr 13, 2016 at 8:34 AM, Darin Johnson <dbjohnson1...@gmail.com>
> wrote:
>
> > I've been running a number of tests on the Fine-Grained Scaling aspect
> > of Myriad.  Here are a few notes:
> >
> > 1. After the patches it seems stable, I'm able to run multiple
> terasort/pi
> > jobs and a few scalding jobs without difficulty.
> > 2. Noticed that with jobs with short map tasks (8-12 secs), I rarely
> > got more than two containers per node; I'm curious if I'm not consuming
> > resources fast enough.  The issue goes away on the reduce side (I'm
> > able to get far better utilization of offers).  It can be lessened by
> > increasing mapred.splits.min.size and mapred.splits.max.size.  This may
> > be an issue for things like Hive.
> >
> > Darin
> >
>