Or without any persistence at all. The client could refuse to adjust the instance count on a job unless there's an additional command-line argument. The same argument about responsibility applies here to users of old or custom clients.
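
A rough sketch of what that client-side check might look like, in Python. The flag name `--confirm-instance-reduction` and both helper functions are made up for illustration; they are not existing Aurora client options. The current instance count is assumed to come from a scheduler query that is not shown here.

```python
import argparse


def check_instance_change(current_count, requested_count, allow_reduction):
    """Return None if the change may proceed, otherwise an error message.

    current_count is assumed to be the job's active instance count as
    reported by the scheduler (the lookup itself is not shown).
    """
    if requested_count < current_count and not allow_reduction:
        return (
            "refusing to reduce instance count from %d to %d; "
            "pass --confirm-instance-reduction to proceed"
            % (current_count, requested_count)
        )
    return None


def build_parser():
    # Hypothetical flag; nothing is persisted, the user must opt in per call.
    parser = argparse.ArgumentParser(description="hypothetical update guard")
    parser.add_argument(
        "--confirm-instance-reduction",
        action="store_true",
        help="explicitly allow this update to remove instances",
    )
    return parser
```

Scaling out or keeping the count unchanged passes through untouched; only a reduction needs the explicit opt-in, and anyone who aliases the flag on by default has made that choice themselves.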

On Fri, Feb 5, 2016 at 3:17 PM, John Sirois <j...@conductant.com> wrote:

> On Fri, Feb 5, 2016 at 4:07 PM, Maxim Khutornenko <ma...@apache.org> wrote:
>
> > We have had attempts to safeguard client updater command with a
> > "dangerous change" warning before but it did not get good feedback.
> > Besides, automated tools/scripts just ignored it.
> >
> > An alternative could be what George suggested on the scaling API thread
> > mentioned earlier: automatically bump up instance count to the job
> > active task count. I'd say this could be an implementation of the
> > proposal above rather than a safeguard as it accomplishes the exact
> > same goal.
> >
> > Bill, do you have any ideas of what that safeguard could be?
> >
>
> I'd recommend that an API call that reduced instance count require a
> `confirm_instance_reduction=true` parameter - this could be plumbed back
> to a flag in the official Aurora client.
> That said, since Aurora immediately forgets jobs and splits things into
> tasks, I'm not sure this is even sanely possible today.
>
> Assuming it is possible, any human that turns that flag on by default with
> a shell alias or an rc file can take responsibility for their own problem.
> If a tool passes the boolean, again - that's the tool's problem. Hopefully
> it's a carefully developed and vetted auto-scaling tool.
>
>
> > On Fri, Feb 5, 2016 at 2:56 PM, Bill Farner <wfar...@apache.org> wrote:
> > >>
> > >> the outdated instance count problem will only get worse as automated
> > >> scaling tools will quickly render existing .aurora config value obsolete
> > >
> > >
> > > This is not a compelling reason to remove functionality. Sounds like a
> > > safeguard is needed instead.
> > >
> > > On Fri, Feb 5, 2016 at 2:43 PM, Maxim Khutornenko <ma...@apache.org> wrote:
> > >
> > >> This is mostly a survey rather than a proposal. How would people think
> > >> about limiting updater to only adding/updating instances and let
> > >> killTasks take care of instance removals?
> > >>
> > >> We have all heard stories (or happened to create some ourselves) when an
> > >> outdated instance count value in .aurora config caused unexpected
> > >> instance removals. Granted, there are plenty of other values in the
> > >> config that can cause a service-wide outage but instance count seems to
> > >> be the worst in that sense.
> > >>
> > >> After the recent refactoring of addInstances and killTasks to act as
> > >> scaleOut/scaleIn APIs [1], the outdated instance count problem will
> > >> only get worse as automated scaling tools will quickly render existing
> > >> .aurora config value obsolete. With that in mind, should we block
> > >> instance removal in the updater and let an explicit killTasks call be
> > >> the only acceptable action to reduce instance count? Is there any
> > >> value (aside from arguable convenience factor) in having
> > >> startJobUpdate ever killing instances?
> > >>
> > >> Thanks,
> > >> Maxim
> > >>
> > >> [1] - http://markmail.org/message/2smaej5n5e54li3g
> > >>
> >
>
>
> --
> John Sirois
> 303-512-3301
>