[ https://issues.apache.org/jira/browse/AURORA-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Hatfield updated AURORA-1579:
-----------------------------------
Description:
The goal of this feature is to allow users to check if their job (as
configured) would likely be schedulable given Aurora's current offers. An
extended form of this feature would be able to perform this test while assuming
any current instance of the job in question would be stopped.
Here is the suggestion I sent to the mailing list describing my use-case for
such a feature:
{quote}
We currently run a (relatively) small Mesos/Aurora cluster, and don't always
have significant resource overhead available.
Sometimes, we go to schedule a job and we're just short of what we
estimated-by-hand we'd need in the cluster for it. Most of the tasks schedule -
but a few stay "PENDING" because of the resource constraint. This often
confuses users, or in some cases, causes the command to block for a while until
it eventually times out.
We're currently working in-house on automating somewhat-more-precise basic
estimation with information sourced from /offers to get a sense of "nope, your
task won't schedule" to provide fast feedback that doesn't manipulate the state
of the cluster.
However, our basic estimation doesn't include co-scheduling constraints,
quotas, etc., which seem like something Aurora would be able to determine.
{quote}
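To make the in-house estimation described above concrete, here is a minimal
sketch of a resource-only check. It is not our actual tooling and not an Aurora
API: the Resources type, the unplaceable_instances function, and the sample
numbers are all hypothetical, and fetching or parsing /offers is deliberately
left out because no response format is assumed here. It ignores co-scheduling
constraints and quotas, which is exactly the gap this ticket is about.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource vector for one offer or one task instance."""
    cpus: float
    ram_mb: int
    disk_mb: int

    def fits(self, need: "Resources") -> bool:
        # An offer can host a task only if every dimension is large enough.
        return (self.cpus >= need.cpus
                and self.ram_mb >= need.ram_mb
                and self.disk_mb >= need.disk_mb)

    def minus(self, need: "Resources") -> "Resources":
        # What is left of the offer after placing one task on it.
        return Resources(self.cpus - need.cpus,
                         self.ram_mb - need.ram_mb,
                         self.disk_mb - need.disk_mb)


def unplaceable_instances(offers: List[Resources],
                          task: Resources,
                          instances: int) -> int:
    """Count how many of `instances` identical tasks find no offer (first-fit).

    Considers raw resources only: no constraints, no quota, no preemption.
    """
    remaining = list(offers)  # work on a copy so the caller's list is untouched
    unplaced = 0
    for _ in range(instances):
        for i, offer in enumerate(remaining):
            if offer.fits(task):
                remaining[i] = offer.minus(task)
                break
        else:
            # No offer had room for this instance; it would stay PENDING.
            unplaced += 1
    return unplaced


if __name__ == "__main__":
    # Made-up snapshot of two offers and a five-instance job.
    offers = [Resources(4.0, 8192, 10000), Resources(1.0, 1024, 5000)]
    task = Resources(1.0, 2048, 1000)
    print(unplaceable_instances(offers, task, 5))
```

A non-zero result is the "nope, your task won't schedule" signal; a zero result
only says the job fit the offers in that snapshot.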
Note that this kind of check is inherently subject to race conditions and
future restrictions: a passing result only reflects the offers available at the
time of the check. Somewhat paradoxically, the feature is more useful the
smaller your quota or cluster is, as many actions in a constrained environment
will require adding capacity (or quota) first. The documentation for this
feature should mention that tasks can still end up PENDING after a successful
check, for example by losing a race, through host failure, or when "oddly
shaped tasks" fail to reschedule.
was:
The goal of this feature is to allow users to check if their job (as
configured) would likely be schedulable given Aurora's current offers. An
extended form of this feature would be able to perform this test while assuming
any current instance of the job in question would be stopped.
Here is the suggestion I sent to the mailing list describing my use-case for
such a feature:
{quote}
We currently run a (relatively) small Mesos/Aurora cluster, and don't always
have significant resource overhead available.
Sometimes, we go to schedule a job and we're just short of what we
estimated-by-hand we'd need in the cluster for it. Most of the tasks schedule -
but a few stay "PENDING" because of the resource constraint. This often
confuses users, or in some cases, causes the command to block for a while until
it eventually times out.
We're currently working in-house on automating somewhat-more-precise basic
estimation with information sourced from /offers to get a sense of "nope, your
task won't schedule" to provide fast feedback that doesn't manipulate the state
of the cluster.
However, our basic estimation doesn't include co-scheduling constraints,
quotas, etc., which seem like something Aurora would be able to determine.
{quote}
> Allow preflight-check of Job schedulability.
> --------------------------------------------
>
> Key: AURORA-1579
> URL: https://issues.apache.org/jira/browse/AURORA-1579
> Project: Aurora
> Issue Type: Task
> Components: Client, Scheduler
> Reporter: Brian Hatfield
> Priority: Minor