Yup, I can take the lead on this. I'll work on a design doc first to get the story straight before I dive into coding.
On Thu, Feb 1, 2018 at 10:28 PM, Meghdoot bhattacharya <[email protected] > wrote:

> Thx David. Renan can you take a lead on this?
>
> Yeah we do have separate orchestration for multi aurora cluster geo deploy
> and blue-green deploys within each cluster but we have also a requirement
> to support the flexible controlled rolling deploy inside each cluster like
> in old platform. And we really want to avoid targeting shard ids for the
> scenario below.
>
> Aurora pause is there today but thats user initiated and not deterministic
> auto pause but resume already there. So, feeling hopeful below can be done
> relatively easily.
>
> Thx
>
> On Feb 1, 2018, at 10:08 PM, David McLaughlin <[email protected]>
> wrote:
>
> Yeah that is definitely not possible right now in Aurora. We've built
> something external to Aurora (like Spinnaker) that our users use to do this
> type of thing (also do it across multiple Schedulers).
>
> Adding this type of "acknowledge before continuing" functionality to the
> updater is certainly possible though. I'm happy to shepherd any proposals.
>
>
> On Thu, Feb 1, 2018 at 9:58 PM, Meghdoot bhattacharya <
> [email protected]> wrote:
>
>> David here is the scenario that we are looking. Its most likely not
>> supported but let us know if the changes can be made without much hassle.
>> We can then research and do a PR.
>>
>> The use case arises from our current non mesos platform.
>>
>> Lets say the app has 100 instances. Here is how we will roll it as an
>> example
>>
>> 1. First to 10 instances (rolled in parallel) and then pause.
>>
>> 2. App metrics are monitored.
>>
>> 3. If fine we resume deploy, but increasing the batch size say now 40 in
>> parallel and again pause.
>>
>> 4. Then resume and do the rest 50.
>>
>> 5. Rollback by default follows 50, 40, 10 but can be changed.
>>
>> To put this in aurora
>>
>> 1. We want to do auto pause after every batch. We will use the wait for
>> batch completion flag (set it to true) to make sure batch is not sliding.
>> If all is good in terms of success threshold met, we would want aurora to
>> enter the pause state. Hope existing pause state can be reused.
>>
>> 2. Then we would want to change the batch size
>>
>> 3. And then use the existing resume api to start the update.
>>
>> I am not sure changing the batch size outside of pause window ( with
>> batch completion flag) is useful or not. In theory I guess batch size can
>> be updated for any scenario.
>>
>> Pulse update does not provide the same as above. If we poll and see how
>> many instances done then we can probably withhold sending a pulse to pause
>> the deploy and resume with pulse later but really not deterministic. Also
>> dynamic batch size not solved.
>>
>> Let us know your thoughts. We will like to contribute.
>>
>> Thx
>>
>> On Jan 31, 2018, at 6:56 PM, David McLaughlin <[email protected]>
>> wrote:
>>
>> It's not currently possible with a single API call, you'd have to submit
>> multiple calls to startJobUpdate, changing the JobUpdateSettings each time.
>> When you say the size of the batch could change each step - could it change
>> dynamically (e.g. after you've submitted the call to the Scheduler), or is
>> all the information known upfront?
>>
>> On Wed, Jan 31, 2018 at 6:15 PM, Renan DelValle <[email protected]
>> > wrote:
>>
>>> Hi all,
>>>
>>> We have a use case where we want to deploy X amount of instances in N
>>> amount of steps. The size of the batch could potentially change every step.
>>> For example, we might start a with deploying 1 instances, followed by batch
>>> size of 50, followed by a batch of 49 to deploy 100 instances.
>>>
>>> Is there any way of achieving this behavior through existing Aurora
>>> thrift calls/primitives?
>>>
>>> Any insights would be greatly appreciated.
>>>
>>> -Renan
>>>
>>
>>
>
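For reference, the variable-batch rollout discussed in the thread (10, then 40, then 50 instances, pausing after each batch, with rollback defaulting to the reverse order) can be sketched as a small Python simulation. This is only an illustration of the desired behavior, not Aurora code: the names `rollout_steps`, `rollback_sizes`, `simulate_rollout`, and the `metrics_ok` callback are all hypothetical, and nothing here calls the Aurora thrift API. It tracks instance counts only, so no shard ids are targeted.

```python
from typing import Callable, List, Tuple


def rollout_steps(total_instances: int, batch_sizes: List[int]) -> List[Tuple[int, int]]:
    """Return one (batch_size, instances_updated_so_far) pair per batch.

    Hypothetical helper for illustration; not part of the Aurora API.
    """
    if any(size <= 0 for size in batch_sizes):
        raise ValueError("batch sizes must be positive")
    if sum(batch_sizes) != total_instances:
        raise ValueError("batch sizes must add up to the instance count")
    steps = []
    updated = 0
    for size in batch_sizes:
        updated += size
        steps.append((size, updated))
    return steps


def rollback_sizes(batch_sizes: List[int]) -> List[int]:
    """Default rollback order from the thread: the same batches, reversed."""
    return list(reversed(batch_sizes))


def simulate_rollout(
    total_instances: int,
    batch_sizes: List[int],
    metrics_ok: Callable[[int, int], bool],
) -> List[Tuple[int, int]]:
    """Apply batches one at a time, pausing after each for an acknowledgement.

    metrics_ok(batch_size, updated_so_far) is a caller-supplied stand-in
    (an assumption) for the manual metrics check done while paused.
    Returning False leaves the rollout paused at that batch.
    Returns the batches applied so far.
    """
    applied = []
    for size, updated in rollout_steps(total_instances, batch_sizes):
        applied.append((size, updated))   # whole batch rolled in parallel
        if not metrics_ok(size, updated):  # pause point after every batch
            break
    return applied


if __name__ == "__main__":
    print(rollout_steps(100, [10, 40, 50]))
    print(rollback_sizes([10, 40, 50]))
```

The same helper covers the original 1/50/49 example by passing `[1, 50, 49]` as the batch sizes. Whether the batch sizes are supplied upfront or changed while paused is exactly the open question for the design doc; the sketch assumes they are known upfront.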
