[
https://issues.apache.org/jira/browse/AURORA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081416#comment-15081416
]
Stephan Erb commented on AURORA-1258:
-------------------------------------
[~tonydong3] we have implemented a very rudimentary version of a scaling
command in a thin wrapper around the Python client (which we install via the
Aurora Python sdist). Maybe this helps to get the discussion going. The entire
feature described by [~yasumoto] would be more involved.
The stripped-down, relevant code follows. The field {{self._api}} is of type
{{apache.aurora.client.api.AuroraClientAPI}}.
{code}
def scale_to(self, jobkey, num_instances):
    """
    Scale instance count.

    Be aware:
    * implicit assumptions that all tasks are running with the same task
      config
    * subject to race conditions when jobs are modified concurrently
      (e.g., kill_job between task config fetch and update)
    """
    query = TaskQuery(jobKeys=[jobkey.to_thrift()], limit=1,
                      statuses=ACTIVE_STATES)
    resp = self._api.query(query)
    self._validate_response(resp)
    if not resp.result.scheduleStatusResult.tasks:
        raise LookupError("Unable to scale job %s. No jobconfig found." %
                          jobkey)
    task_config = resp.result.scheduleStatusResult.tasks[0].assignedTask.task
    self._start_update(jobkey, task_config, num_instances)

def _start_update(self, jobkey, task_config, num_instances):
    update_settings = UpdaterConfig(
        **self._update_config).to_thrift_update_settings()
    request = JobUpdateRequest(instanceCount=num_instances,
                               settings=update_settings,
                               taskConfig=task_config)
    resp = self._api.scheduler_proxy.startJobUpdate(
        request, "Scale to %s instances" % num_instances)
    self._validate_response(resp)
{code}
We would happily drop our custom implementation in favor of something more
sane. Feel free to give it a shot :-)
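To make the effect of a call like {{scale_to(jobkey, 20)}} concrete, here is a
minimal sketch, independent of the Aurora client, of which instance IDs a
target count would add or remove. The function name {{plan_scale}} is
hypothetical and not part of Aurora. The sketch assumes that instance IDs are
meant to form the contiguous range 0..N-1, as in the 10-to-20 example in the
issue description; it does not reflect how the scheduler computes its own diff.

```python
def plan_scale(current_ids, num_instances):
    """Return (to_add, to_remove) instance IDs for a target instance count.

    Hypothetical helper for illustration only. Assumes the desired instance
    IDs are exactly 0..num_instances-1.
    """
    if num_instances < 0:
        raise ValueError("num_instances must be non-negative")
    target = set(range(num_instances))
    current = set(current_ids)
    # IDs in the target but not yet present must be added;
    # IDs present but outside the target must be removed.
    return sorted(target - current), sorted(current - target)


# Scaling the example job from the issue: 10 instances up to 20.
to_add, to_remove = plan_scale(range(10), 20)
print(to_add)     # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
print(to_remove)  # []
```

With this view, scaling up only touches the new IDs (10 to 19 in the example),
provided the task config of the existing instances is unchanged, which is the
same assumption the docstring of {{scale_to}} calls out.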
> Improve procedure for adding instances to a job
> -----------------------------------------------
>
> Key: AURORA-1258
> URL: https://issues.apache.org/jira/browse/AURORA-1258
> Project: Aurora
> Issue Type: Story
> Components: Reliability, Usability
> Reporter: Joe Smith
>
> The current process for adding instances to a job is highly manual, and
> potentially dangerous.
> 1. Take a config for a job with 10 instances, update it to 20 instances.
> 2. The batch size will be increased, and users will need to specify shards 10
> to 19.
> 3. After this update is complete, users will need to manually update shards
> 0-9 again.
> There may be other changes pulled in as part of this update other than just
> increasing the number of instances, which could further complicate things.
> One possible improvement would be to change the updater from
> 'under-provision' where it kills instances first, then schedules new
> instances, to an 'over-provision' where it adds on new instances, then
> backpedals and kills the old instances.
> Overall, a single command or process for a user to take an already-existing
> job and increase the number of instances would reduce overhead and
> fat-fingering.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)