Refactoring killTasks to get rid of TaskQuery sounds good to me. Filed
AURORA-1583.

On Thu, Jan 14, 2016 at 12:33 PM, Bill Farner <wfar...@apache.org> wrote:
> Fwiw I think an APi call being "too powerful" is not a good reason to
> introduce another.
>
> As for the specific behavior here, I know there has been a desire to
> eliminate the use of TaskQuery in killTasks because it is rarely used, and
> kills spanning jobs can be implemented on top of the APi.
>
> On Thursday, January 14, 2016, Maxim Khutornenko <ma...@apache.org> wrote:
>
>> "How is scaling down different from killing instances?"
>>
>> I found 'killTasks' syntax too different and way much more powerful to
>> be used for scaling in. The TaskQuery allows killing instances across
>> jobs/roles, whereas 'scaleIn' is narrowed down to just a single job.
>> Additional benefit: it can be ACLed independently by allowing external
>> process kill tasks only within a given job. We may also add rate
>> limiting or backoff to it later.
>>
>> As for Joshua's question, I feel it should be an operator's
>> responsibility to diff a job with its aurora config before applying an
>> update. That said, if there is enough demand we can definitely
>> consider adding something similar to what George suggested or
>> resurrecting a 'large change' warning message we used to have in
>> client updater.
>>
>> On Thu, Jan 14, 2016 at 12:06 PM, George Sirois <geo...@tellapart.com
>> <javascript:;>> wrote:
>> > As a point of reference, we solved this problem by adding a binding
>> helper
>> > that queries the scheduler for the current number of instances and uses
>> > that number instead of a hardcoded config:
>> >
>> >    instances='{{scaling_instances[60]}}'
>> >
>> > In this example, instances will be set to the currently running number
>> > (unless there are none, in which case 60 instances will be created).
>> >
>> > On Thu, Jan 14, 2016 at 2:44 PM, Joshua Cohen <jco...@apache.org
>> <javascript:;>> wrote:
>> >
>> >> What happens if a job has been scaled out, but the underlying config is
>> not
>> >> updated to take that scaling into account? Would the next update on that
>> >> job revert the number of instances (presumably, because what else could
>> we
>> >> do)? Is there anything we can do, tooling-wise, to improve upon this?
>> >>
>> >> On Thu, Jan 14, 2016 at 1:40 PM, Maxim Khutornenko <ma...@apache.org
>> <javascript:;>>
>> >> wrote:
>> >>
>> >> > Our rolling update APIs can be quite inconvenient to work with when it
>> >> > comes to instance scaling [1]. It's especially frustrating when
>> >> > adding/removing instances has to be done in an automated fashion
>> (e.g.:
>> >> by
>> >> > an external autoscaling process) as it requires holding on to the
>> >> original
>> >> > aurora config at all times.
>> >> >
>> >> > I propose we add simple instance scaling APIs to address the above.
>> Since
>> >> > Aurora job may have instances at different configs at any moment, I
>> >> propose
>> >> > we accept an InstanceKey as a reference point when scaling out. For
>> >> > example:
>> >> >
>> >> >     /** Scales out a given job by adding more instances with the task
>> >> > config of the templateKey. */
>> >> >     Response scaleOut(1: InstanceKey templateKey, 2: i32
>> incrementCount)
>> >> >
>> >> >     /** Scales in a given job by removing existing instances. */
>> >> >     Response scaleIn(1: JobKey job, 2: i32 decrementCount)
>> >> >
>> >> > A correspondent client command could then look like:
>> >> >
>> >> >     aurora job scale-out devcluster/vagrant/test/hello/1 10
>> >> >
>> >> > For the above command, a scheduler would take task config of instance
>> 1
>> >> of
>> >> > the 'hello' job and replicate it 10 more times thus adding 10
>> additional
>> >> > instances to the job.
>> >> >
>> >> > There are, of course, some details to work out like making sure no
>> active
>> >> > update is in flight, scale out does not violate quota and etc. I
>> intend
>> >> to
>> >> > address those during the implementation as things progress.
>> >> >
>> >> > Does the above make sense? Any concerns/suggestions?
>> >> >
>> >> > Thanks,
>> >> > Maxim
>> >> >
>> >> > [1] - https://issues.apache.org/jira/browse/AURORA-1258
>> >> >
>> >>
>>

Reply via email to