[ https://issues.apache.org/jira/browse/AURORA-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maxim Khutornenko reassigned AURORA-686:
----------------------------------------
Assignee: Maxim Khutornenko
> Job updates may fail due to exceeding role quota
> ------------------------------------------------
>
> Key: AURORA-686
> URL: https://issues.apache.org/jira/browse/AURORA-686
> Project: Aurora
> Issue Type: Story
> Components: Scheduler
> Reporter: Maxim Khutornenko
> Assignee: Maxim Khutornenko
>
> The current way of checking job quota during in-flight updates (i.e. within
> the addInstance transaction) may lead to failed updates and a poor user
> experience. Since quota is tracked at the role level but the update lock is
> applied at the job level, a long-running update can always end up exceeding
> the allowed quota.
> This is especially a problem with the server-side-driven update process, where
> a resumed update may restart in a quite different quota environment (e.g.
> because other jobs were created while the update was paused).
> Possible solutions:
> - per-job quota tracking - requires significant refactoring;
> - hierarchical locking (e.g. add a role lock in addition to the job lock) -
> limits update concurrency per role;
> - front-loaded consumption (e.g. add the additional job consumption during
> startJobUpdate and re-evaluate on update completion/termination) - will
> require persisting the front-loaded value within the job update schema but
> may be the way to go given the current quota implementation.
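
The front-loaded option above can be illustrated with a minimal sketch. It is not Aurora's implementation: the RoleQuota class, its method names, and the use of plain instance counts in place of real CPU/RAM/disk quota are all assumptions made for illustration. The idea shown is that the extra consumption is reserved once when the update starts, so later addInstance steps draw from the reservation and cannot fail because other jobs in the role were created in the meantime.

```python
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """Raised when a request would push a role past its quota."""


@dataclass
class RoleQuota:
    """Hypothetical role-level ledger; resources are simplified to instance counts."""
    limit: int
    running: dict = field(default_factory=dict)   # job -> instances currently consumed
    reserved: dict = field(default_factory=dict)  # job -> instances front-loaded by an update

    def used(self) -> int:
        # Reservations count against the role just like running instances.
        return sum(self.running.values()) + sum(self.reserved.values())

    def _check(self, extra: int) -> None:
        if self.used() + extra > self.limit:
            raise QuotaExceeded(f"need {extra}, only {self.limit - self.used()} left")

    def create_job(self, job: str, instances: int) -> None:
        self._check(instances)
        self.running[job] = instances

    def start_job_update(self, job: str, target_instances: int) -> None:
        # Front-load: reserve the whole increase up front, or fail before any work is done.
        extra = max(0, target_instances - self.running.get(job, 0))
        self._check(extra)
        self.reserved[job] = extra

    def add_instance(self, job: str) -> None:
        # No role-level check here: the instance is paid for out of the reservation.
        if self.reserved.get(job, 0) <= 0:
            raise QuotaExceeded(f"no reservation left for {job}")
        self.reserved[job] -= 1
        self.running[job] = self.running.get(job, 0) + 1

    def finish_job_update(self, job: str) -> None:
        # Re-evaluate on completion/termination: release whatever was not consumed.
        self.reserved.pop(job, None)


# Usage: a paused update keeps its reservation while another job is created.
role = RoleQuota(limit=10)
role.create_job("web", 4)
role.start_job_update("web", 8)   # reserves 4 extra instances
role.create_job("cron", 2)        # fits in the remaining 2
for _ in range(4):
    role.add_instance("web")      # draws from the reservation
role.finish_job_update("web")
```

In a real scheduler the reserved value would have to be persisted with the job update, as the description notes, so that it survives a scheduler restart or a paused update.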
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)