I have updated the summary <https://docs.google.com/document/d/1myYX3yuofGr8JIzud98xXd5mqgpZ8q_RqKBpSff4-WE> with a minor but important change. Instead of relying on TaskHistoryPruner to remove JobConfigurations from storage, the cleanup will now happen inside a TaskStateChange event listener once all instances of a job reach a terminal status. As before, feedback is highly appreciated!
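To make the listener-based cleanup concrete, here is a rough sketch of the idea only, not the actual scheduler code. The names `InMemoryStore` and `on_task_state_change` are hypothetical, the terminal status set is a simplified assumption, and the sketch assumes every instance of the job has already been observed at least once before any of them terminates.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Status(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"
    KILLED = "KILLED"
    LOST = "LOST"


# Assumption: these four statuses are treated as terminal in this sketch.
TERMINAL = frozenset({Status.FINISHED, Status.FAILED, Status.KILLED, Status.LOST})


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the scheduler storage."""

    # job key -> job-level configuration (stand-in for JobConfiguration)
    job_configs: Dict[str, dict] = field(default_factory=dict)
    # job key -> {instance id -> last known status}
    task_status: Dict[str, Dict[int, Status]] = field(default_factory=dict)

    def on_task_state_change(self, job_key: str, instance_id: int, new_status: Status) -> None:
        """Record the new status, then drop the job config if no instance is still live."""
        instances = self.task_status.setdefault(job_key, {})
        instances[instance_id] = new_status
        # Cleanup happens here, in the event handler, rather than in a
        # separate history-pruning pass.
        if all(status in TERMINAL for status in instances.values()):
            self.job_configs.pop(job_key, None)


store = InMemoryStore()
store.job_configs["role/env/job"] = {"instanceCount": 2}
store.on_task_state_change("role/env/job", 0, Status.RUNNING)
store.on_task_state_change("role/env/job", 1, Status.RUNNING)
store.on_task_state_change("role/env/job", 0, Status.FINISHED)
still_present = "role/env/job" in store.job_configs  # one instance is still running
store.on_task_state_change("role/env/job", 1, Status.KILLED)
removed = "role/env/job" not in store.job_configs  # all instances are terminal
```

The point of the sketch is that the job-level record is removed by the same event that makes it obsolete, so it does not linger until a pruner runs.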
On Tue, Jul 26, 2016 at 4:55 PM, Maxim Khutornenko <ma...@apache.org> wrote:

> I felt this change is large enough to warrant a brief design summary.
> Please, take a look at this document
> <https://docs.google.com/document/d/1myYX3yuofGr8JIzud98xXd5mqgpZ8q_RqKBpSff4-WE> and
> leave your feedback as applicable.
>
> On Fri, Jul 1, 2016 at 9:15 AM, Maxim Khutornenko <ma...@apache.org>
> wrote:
>
>> Thanks for the feedback! I will follow up with an itemized epic to
>> track this refactoring work.
>>
>> On Wed, Jun 29, 2016 at 2:29 PM, Jake Farrell <jfarr...@apache.org>
>> wrote:
>> > huge +1, socket activation is our exact use case for this type of action
>> > also
>> >
>> > -Jake
>> >
>> > On Wed, Jun 29, 2016 at 5:18 PM, Erb, Stephan <stephan....@blue-yonder.com>
>> > wrote:
>> >
>> >> I recently thought about the same idea. Use case for us would be to scale
>> >> a job 0 instances. While this sounds useless at first, it can be quite
>> >> powerful when trying to implement a feature like socket activation.
>> >>
>> >> ________________________________________
>> >> From: Maxim Khutornenko <ma...@apache.org>
>> >> Sent: Wednesday, June 29, 2016 22:43
>> >> To: dev@aurora.apache.org
>> >> Subject: [PROPOSAL] Job as a first-class citizen
>> >>
>> >> TL;DR - I am proposing we store and maintain job-level data
>> >> (JobConfiguration [1]) instead of relying on storing everything in a
>> >> TaskConfig [2].
>> >>
>> >>
>> >> Aurora storage currently does not have a concept of a "job" when it
>> >> comes to services and adhoc jobs. Instead, it relies on a collection
>> >> of TaskConfigs that represent a view of what the job state is. This is
>> >> in stark contrast to cron jobs, which are already represented by the
>> >> JobConfiguration struct.
>> >>
>> >> This lack of representation limits our ability to deliver richer
>> >> features and may result in suboptimal design and storage utilization.
>> >> Specifically, the following is currently impossible:
>> >>
>> >> - storing normalized job-level data without repeating it in every task
>> >> (e.g. contactEmail, isService);
>> >>
>> >> - maintaining job-level data that may be different for every instance
>> >> (SLA requirements, topology specs for stateful services and etc.);
>> >>
>> >> - knowing what the job instance count is without pulling all ACTIVE
>> >> tasks and iterating over them.
>> >>
>> >> To address the above, I propose we start treating Aurora job as a
>> >> tangible entity in the storage and specifically use JobConfiguration
>> >> wherever applicable. As a welcome side effect, this will let us:
>> >>
>> >> - allow instantaneous job updates when job-level fields are updated
>> >> (e.g. those that don't require instance restarts);
>> >> - finally get rid of the deprecated Identity struct [3];
>> >> - reduce or completely eliminate DB garbage collection of abandoned job
>> >> keys [4]
>> >>
>> >> Any thoughts, suggestions, objections?
>> >>
>> >> Thanks,
>> >> Maxim
>> >>
>> >>
>> >> [1] -
>> >> https://github.com/apache/aurora/blob/4e28b9c8b29b66f2f10b0a6cafdec1f8e2c1bd7b/api/src/main/thrift/org/apache/aurora/gen/api.thrift#L316-L338
>> >>
>> >> [2] -
>> >> https://github.com/apache/aurora/blob/4e28b9c8b29b66f2f10b0a6cafdec1f8e2c1bd7b/api/src/main/thrift/org/apache/aurora/gen/api.thrift#L240-L284
>> >>
>> >> [3] - https://issues.apache.org/jira/browse/AURORA-84
>> >>
>> >> [4] - RowGarbageCollector:
>> >> https://github.com/apache/aurora/blob/b24619b28c4dbb35188871bacd0091a9e01218e3/src/main/java/org/apache/aurora/scheduler/storage/db/RowGarbageCollector.java