> Or without any persistence at all. The client could refuse to adjust the
> instance count on a job unless there's additional command line argument.
> The same arguments of responsibility could be said here of users of old
> clients or custom clients.

Bill, are you suggesting that the 'aurora update start' client command
call the scheduler to acquire an update diff first and block the
startJobUpdate RPC call unless a special command line flag is present?

> When updating a job, the scheduler would fill in the current instance count.
> However, when I want to change the number of instances, I could simply
> bind another value locally when triggering the update.

Stephan, this sounds like increasing instances would also require a
binding helper, which makes the update process less deterministic (i.e.
the .aurora config file is no longer self-contained).

On Sun, Feb 7, 2016 at 3:02 PM, Erb, Stephan <stephan....@blue-yonder.com> wrote:
> A related idea that recently crossed my mind was some kind of pystachio
> variable / binding helper: {{aurora.instances}}.
>
> When updating a job, the scheduler would fill in the current instance count.
> However, when I want to change the number of instances, I could simply bind
> another value locally when triggering the update.
> ________________________________________
> From: Maxim Khutornenko <ma...@apache.org>
> Sent: Saturday, February 6, 2016 00:07
> To: dev@aurora.apache.org
> Subject: Re: [PROPOSAL] Disallow instance removal in job update
>
> We have had attempts to safeguard client updater command with a
> "dangerous change" warning before but it did not get good feedback.
> Besides, automated tools/scripts just ignored it.
>
> An alternative could be what George suggest on the scaling API thread
> mentioned earlier: automatically bump up instance count to the job
> active task count. I'd say this could be an implementation to the
> proposal above rather than a safeguard as it accomplishes the exact
> same goal.
>
> Bill, do you have any ideas of what that safeguard could be?
>
> On Fri, Feb 5, 2016 at 2:56 PM, Bill Farner <wfar...@apache.org> wrote:
>>>
>>> the outdated instance count problem will only get worse as automated
>>> scaling tools will quickly render existing .aurora config value obsolete
>>
>>
>> This is not a compelling reason to remove functionality. Sounds like a
>> safeguard is needed instead.
>>
>> On Fri, Feb 5, 2016 at 2:43 PM, Maxim Khutornenko <ma...@apache.org> wrote:
>>
>>> This is mostly a survey rather than a proposal. How would people think
>>> about limiting updater to only adding/updating instances and let
>>> killTasks take care of instance removals?
>>>
>>> We have all heard stories (or happen to create some ourselves) when an
>>> outdated instance count value in .aurora config caused unexpected
>>> instance removals. Granted, there are plenty of other values in the
>>> config that can cause service-wide outage but instance count seems to
>>> be the worst in that sense.
>>>
>>> After the recent refactoring of addInstances and killTasks to act as
>>> scaleOut/scaleIn APIs [1], the outdated instance count problem will
>>> only get worse as automated scaling tools will quickly render existing
>>> .aurora config value obsolete. With that in mind, should we block
>>> instance removal in the updater and let an explicit killTasks call be
>>> the only acceptable action to reduce instance count? Is there any
>>> value (aside from arguable convenience factor) in having
>>> startJobUpdate ever killing instances?
>>>
>>> Thanks,
>>> Maxim
>>>
>>> [1] - http://markmail.org/message/2smaej5n5e54li3g
>>>