[ 
https://issues.apache.org/jira/browse/AURORA-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081416#comment-15081416
 ] 

Stephan Erb commented on AURORA-1258:
-------------------------------------

[~tonydong3] We have implemented a very rudimentary version of a scaling 
command in a thin wrapper around the Python client (which we install via the 
Aurora Python sdist). Maybe this helps to get the discussion going. The full 
feature described by [~yasumoto] would be more involved.

The stripped-down, relevant code follows. The field {{self._api}} is of type 
{{apache.aurora.client.api.AuroraClientAPI}}.
{code}
    def scale_to(self, jobkey, num_instances):
        """
        Scale instance count.

        Be aware:
          * implicit assumption that all tasks are running with the same
            task config
          * subject to race conditions when jobs are modified concurrently
            (e.g., kill_job between task config fetch and update)
        """
        # Fetch a single active task; its config is treated as the job's config.
        query = TaskQuery(jobKeys=[jobkey.to_thrift()], limit=1,
                          statuses=ACTIVE_STATES)
        resp = self._api.query(query)
        self._validate_response(resp)

        if not resp.result.scheduleStatusResult.tasks:
            raise LookupError(
                "Unable to scale job %s. No jobconfig found." % jobkey)

        task_config = resp.result.scheduleStatusResult.tasks[0].assignedTask.task
        self._start_update(jobkey, task_config, num_instances)

    def _start_update(self, jobkey, task_config, num_instances):
        # Reuse the fetched task config and change only the instance count.
        update_settings = UpdaterConfig(
            **self._update_config).to_thrift_update_settings()
        request = JobUpdateRequest(instanceCount=num_instances,
                                   settings=update_settings,
                                   taskConfig=task_config)
        resp = self._api.scheduler_proxy.startJobUpdate(
            request, "Scale to %s instances" % num_instances)
        self._validate_response(resp)
{code}

We would happily drop our custom implementation in favor of something more 
sane. Feel free to give it a shot :-)

> Improve procedure for adding instances to a job
> -----------------------------------------------
>
>                 Key: AURORA-1258
>                 URL: https://issues.apache.org/jira/browse/AURORA-1258
>             Project: Aurora
>          Issue Type: Story
>          Components: Reliability, Usability
>            Reporter: Joe Smith
>
> The current process for adding instances to a job is highly manual, and 
> potentially dangerous.
> 1. Take a config for a job with 10 instances, update it to 20 instances.
> 2. The batch size will be increased, and users will need to specify shards 10 
> to 19.
> 3. After this update is complete, users will need to manually update shards 
> 0-9 again.
> There may be other changes pulled in as part of this update other than just 
> increasing the number of instances, which could further complicate things.
> One possible improvement would be to change the updater from 
> 'under-provision' where it kills instances first, then schedules new 
> instances, to an 'over-provision' where it adds on new instances, then 
> backpedals and kills the old instances.
> Overall, a single command or process for a user to take an already-existing 
> job and increase the number of instances would reduce overhead and 
> fat-fingering.
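
The difference between the 'under-provision' and 'over-provision' strategies 
mentioned in the description can be illustrated with a small simulation. This 
is only a sketch under simplifying assumptions: {{rolling_update}} and its 
batch model are hypothetical names made up for illustration and are not part 
of Aurora's updater; the instance count stays constant and every batch 
succeeds immediately.

```python
def rolling_update(count, batch, over_provision):
    """Simulate replacing `count` instances in batches of size `batch`.

    Returns the number of live instances after every kill or start step.
    Hypothetical model for illustration only, not Aurora's updater.
    """
    old, new, trace = count, 0, []
    while old > 0:
        step = min(batch, old)
        if over_provision:
            new += step              # start replacements first ...
            trace.append(old + new)
            old -= step              # ... then kill the old instances
        else:
            old -= step              # kill old instances first ...
            trace.append(old + new)
            new += step              # ... then start replacements
        trace.append(old + new)
    return trace


under = rolling_update(count=10, batch=5, over_provision=False)
over = rolling_update(count=10, batch=5, over_provision=True)
print("under-provision live counts:", under)
print("over-provision live counts: ", over)
```

In this model, under-provisioning temporarily drops below the desired 
instance count, while over-provisioning never does but briefly needs extra 
capacity for one batch.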



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
