One other case complicates the logic a bit: we have some jobs that need
to be stopped and then started again, usually with either code changes
or capacity increases. In this case the resources the job is already
consuming would need to be factored back in when determining whether
there is enough room to run it. I think a simple yes/no based on
outstanding offers would be good for a first pass, but for our use case
we would need to supply an existing job as an argument, telling the
offers check to add that job's resources back when considering whether
there is enough room.
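
To make the "add those resources back" idea concrete, here is a rough
sketch in Python. It is not Aurora code: the Resources type, the
has_room function and the per-host dicts are all made up for
illustration, and it ignores constraints, quotas and anything else the
real scheduler considers. It assumes every instance of the job has the
same size, and that we know how much of each host the existing job
currently holds.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource vector; not an Aurora or Mesos type."""
    cpus: float
    ram_mb: int
    disk_mb: int

    def plus(self, other: "Resources") -> "Resources":
        return Resources(self.cpus + other.cpus,
                         self.ram_mb + other.ram_mb,
                         self.disk_mb + other.disk_mb)

    def copies_that_fit(self, task: "Resources") -> int:
        """How many identical tasks of size `task` fit in these resources."""
        pairs = ((self.cpus, task.cpus),
                 (self.ram_mb, task.ram_mb),
                 (self.disk_mb, task.disk_mb))
        # Dimensions the task does not need place no limit on the count.
        # Note: fractional cpus use float floor division here.
        counts = [int(free // need) for free, need in pairs if need > 0]
        return min(counts) if counts else 0


def has_room(task: Resources,
             instances: int,
             offers: Dict[str, Resources],
             existing_job: Optional[Dict[str, Resources]] = None) -> bool:
    """Simple yes/no: do `instances` copies of `task` fit in the offers?

    `offers` maps host -> currently unused resources.
    `existing_job` maps host -> resources held by the job being restarted;
    those are added back because stopping the job would free them.
    """
    free = dict(offers)
    for host, used in (existing_job or {}).items():
        free[host] = free[host].plus(used) if host in free else used
    slots = sum(r.copies_that_fit(task) for r in free.values())
    return slots >= instances
```

With one free slot in the offers, a restart that grows a job from one
instance to two is rejected unless the running instance is passed as
existing_job, in which case its slot counts as available.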

This can get a bit racy if multiple people are starting and stopping
jobs in the cluster. It may also be interesting to add an option to the
deploy task that says something like "if you can deploy this, do it; if
not, don't do anything and exit with an error". I'm not sure what
guarantees you can make between the check and the actual deploy, given
everything else going on in the cluster, but that would definitely be an
awesome improvement for that use case.
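
As a toy illustration of why the check and the deploy would have to be
a single step to avoid that race, here is a Python sketch. Everything
in it (ToyCluster, deploy_if_fits, InsufficientCapacity, counting
capacity in "slots") is invented for the example and says nothing about
how Aurora works internally. The point is only that the check and the
reservation happen under one lock, so two concurrent deploys cannot
both pass the check and then overcommit.

```python
import threading


class InsufficientCapacity(Exception):
    """Raised when a deploy is refused because it would not fit."""


class ToyCluster:
    """In-memory stand-in for a scheduler; capacity is a count of slots."""

    def __init__(self, free_slots: int) -> None:
        self.free_slots = free_slots
        self.jobs = {}
        self._lock = threading.Lock()

    def deploy_if_fits(self, job_name: str, instances: int) -> None:
        """Deploy all instances or none; raise if they do not all fit."""
        with self._lock:
            # Check and reservation happen under the same lock, so no
            # other deploy can consume the slots in between.
            if instances > self.free_slots:
                raise InsufficientCapacity(
                    f"{job_name}: need {instances} slots, "
                    f"only {self.free_slots} free")
            self.free_slots -= instances
            self.jobs[job_name] = instances
```

A command-line wrapper around something like this would catch
InsufficientCapacity and exit non-zero without having changed anything
in the cluster.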

-- 
Andrew Jorgensen
@ajorgensen

On Tue, Jan 12, 2016, at 06:14 PM, John Sirois wrote:
> On Tue, Jan 12, 2016 at 3:56 PM, Brian Hatfield <[email protected]>
> wrote:
> 
> > Hi,
> >
> > We currently run a (relatively) small Mesos/Aurora cluster, and don't
> > always have significant resource overhead available.
> >
> > Sometimes, we go to schedule a job and we're just short of what we
> > estimated-by-hand we'd need in the cluster for it. Most of the tasks
> > schedule - but a few stay "PENDING" because of the resource constraint.
> > This often confuses users, or in some cases, causes the command to block
> > for a while until it eventually times out.
> >
> > We're currently working on automating somewhat-more-precise basic
> > estimation with information sourced from /offers to get a sense of "nope,
> > your task won't schedule" to provide fast feedback that doesn't manipulate
> > the state of the cluster.
> >
> > A friend recommended that I suggest to this mailing list something
> > integrated into Aurora to accomplish this instead - since our basic
> > estimation doesn't include co-scheduling constraints, quotas, etc.
> >
> > So: We believe that this feature doesn't exist in Aurora today, and wanted
> > to suggest it as a future feature for the project.
> >
> 
> I think this would be a great feature, from a simple yes/no to more
> sophisticated likelihood estimates, even based on time of day (cron job
> scheduling taken into account). Possible next steps:
> 1. A ticket [1] describing the minimum viable feature.
> 2. Work towards implementation [2].
> 
> Would you be willing to do any of these? I'd be willing to review designs
> and reviews.
> 
> [1] https://issues.apache.org/jira/secure/CreateIssue!default.jspa
> [2] http://aurora.apache.org/documentation/latest/contributing/
> 
> 
> > Thanks :-)
> > Brian
> >
