On Thu, Jun 16, 2016 at 7:26 PM, David McLaughlin <dmclaugh...@apache.org> wrote:
> On Thu, Jun 16, 2016 at 1:28 PM, Igor Morozov <igm...@gmail.com> wrote:
> >
> > Hi aurora people,
> >
> > I would like to start a discussion around a few things we would like
> > to see supported in the aurora scheduler. It is based on our
> > experience of integrating aurora into Uber's infrastructure, and I
> > believe all the items I'm going to talk about will benefit the
> > community and people running aurora clusters.
> >
> > 1. We support multiple aurora clusters in different failure domains
> > and we run services in those domains. The upgrade workflow for those
> > services includes rolling out the same version of a service's
> > software to all aurora clusters concurrently while monitoring health
> > status and other service vitals, which includes checking error logs,
> > service stats, and downstream/upstream service health. That means we
> > occasionally need to manually trigger a rollback if things go south
> > and roll back all the update jobs in all aurora clusters for that
> > particular service. Here are the problems we have discovered so far
> > with this approach:
> >
> >    - We don't have an easy way to assign a common unique identifier
> >    to all the JobUpdates in the different aurora clusters in order to
> >    reconcile them later into a single meta update job, so to speak.
> >    Instead we need to generate that ID ourselves and keep it in every
> >    aurora JobUpdate's metadata (JobUpdateRequest.taskConfig). Then,
> >    in order to get the status of the upgrade workflow running in the
> >    different data centers, we have to query all recent job updates
> >    and, based on their metadata content, try to filter in the ones
> >    that we think belong to the currently running upgrade for the
> >    service.
> >
> > We propose to change:
> >
> > struct JobUpdateRequest {
> >   /** Desired TaskConfig to apply. */
> >   1: TaskConfig taskConfig
> >
> >   /** Desired number of instances of the task config. */
> >   2: i32 instanceCount
> >
> >   /** Update settings and limits. */
> >   3: JobUpdateSettings settings
> >
> >   /** Optional JobUpdate key id; if not specified, aurora will generate one. */
> >   *4: optional string id*
> > }
> >
> > There is potentially another, much more involved solution: supporting
> > user-defined metadata, as mentioned in this ticket:
> > https://issues.apache.org/jira/browse/AURORA-1711
>
> I actually think the linked ticket is less involved? It has no impact
> on logic, etc. So the work involved is just updating the Thrift object
> and then writing in the metadata to the storage layer. But I'm fine
> with either (or both!) approaches.

I agree that both approaches would work. I guess the more important
part for us is the ability to query JobUpdates by their metadata, or by
a combination of metadata fields.
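To make that concrete, here is roughly what I have in mind, reusing the
existing key/value Metadata struct that TaskConfig.metadata already
uses; the field ids below are only placeholders, not a worked-out
proposal:

  struct JobUpdateRequest {
    ...
    /** Optional, opaque user-specified metadata to attach to the
        update (e.g. our cross-cluster upgrade id). */
    5: optional set<Metadata> metadata
  }

  struct JobUpdateQuery {
    ...
    /** Only select updates whose metadata contains all of these entries. */
    10: optional set<Metadata> metadata
  }

With that we could tag every per-cluster JobUpdate with the same
generated upgrade id and fetch exactly the updates belonging to one
cross-cluster upgrade with a single query per cluster, instead of
listing recent updates and filtering them client side.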
> >    - All that brings us to a second problem we had to deal with
> >    during the upgrade: we don't have a good way to manually trigger
> >    a job update rollback in aurora. The use case is again the same:
> >    while running multiple update jobs in different aurora clusters,
> >    we have a real production requirement to start rolling back the
> >    update jobs if things are misbehaving, and the nature of this
> >    misbehavior can be very complex. Currently we abort the job
> >    update and start a new one that essentially rolls the cluster
> >    forward to a previously run version of the software.
> >
> > We propose a new convenience API to roll back a running or completed
> > JobUpdate:
> >
> >   /** Rollback job update. */
> >   Response rollbackJobUpdate(
> >       /** The update to roll back. */
> >       1: JobUpdateKey key,
> >       /** A user-specified message to include with the induced job
> >           update state change. */
> >       3: string message)
>
> +1 to the idea, but there is ambiguity in what rollback means when you
> pass a JobUpdateKey. Example:
>
> *undoJobUpdate* (it would replay the previousState instructions from
> the given job update)
> *rollbackToJobUpdate* (you'd pass the JobUpdateKey and it would replay
> the instructions from that job update)

Correct me if I'm missing something here, but if we assume the
semantics of startJobUpdate are defined as moving job instances from an
initial state A to a desired state B, then rollbackJobUpdate would be
the complete inverse of that operation.

> > 2. The next problem is related to the way we collect service cluster
> > status. I couldn't find a way to quickly get the latest statuses of
> > all instances/shards of a job in one query. Instead we query all
> > task statuses for a job, then manually iterate through all the
> > statuses and keep only the latest one per instance, grouped by
> > instance id. For services with lots of churn in task statuses, that
> > means huge blobs of thrift transferred every time we issue a query.
> > I was thinking of adding something along these lines:
> >
> > struct TaskQuery {
> >   // TODO(maxim): Remove in 0.7.0. (AURORA-749)
> >   8: Identity owner
> >   14: string role
> >   9: string environment
> >   2: string jobName
> >   4: set<string> taskIds
> >   5: set<ScheduleStatus> statuses
> >   7: set<i32> instanceIds
> >   10: set<string> slaveHosts
> >   11: set<JobKey> jobKeys
> >   12: i32 offset
> >   13: i32 limit
> >   *15: i32 limit_per_instance*
> > }
> >
> > but I'm less certain about the API here, so any help would be
> > welcome.
> >
> > All the changes we propose would be backward compatible.
> >
> > --
> > -Igor

--
-Igor
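P.S. On the rollback ambiguity: to make sure I'm reading the two
options right, in thrift terms the choice would be roughly between the
following (only a sketch, mirroring the key/message shape of the
rollbackJobUpdate proposal above):

  /** Replays the previousState instructions of the given update, i.e.
      takes instances back to where that update found them. */
  Response undoJobUpdate(1: JobUpdateKey key, 3: string message)

  /** Replays the instructions of the given (possibly older) update,
      i.e. converges the job onto that update's desired state. */
  Response rollbackToJobUpdate(1: JobUpdateKey key, 3: string message)

If that reading is correct, the "inverse of startJobUpdate" semantics I
described above correspond to the first variant; the second would
require us to keep track of the key of the previous update we want to
land on.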