On March 13, 2016, 6:04 a.m., John Sirois wrote: > > While your patch is rather easy, I am not sure it is the best way to move > > forward. It feels like it is crossing streams with > > https://issues.apache.org/jira/browse/AURORA-1288. Putting some thought > > into this might be helpful in the long run. > > Bill Farner wrote: > FWIW i don't think it complicates or even diverges from that ticket. In > my opinion it's yet to be seen whether it's feasible to use the same client > for a custom executor (at least, without a decent amount of modularization > work). At the very least, that effort has lost momentum and we shouldn't > block progress for it. > > Stephan Erb wrote: > I mostly brought it up because the ticket also repeatedly mentions the > default Mesos command executor. Supporting this one does not sound to > different from supporting Docker without Thermos. It would also need similar > logic at the UI layer to allow use the Mesos sanbox UI instead of the Thermos > one. > > I agree that we should not block progress here. I justed wanted to make > sure we are not rushing things (i.e., there isn't even a jira ticket right > now). > > Maxim Khutornenko wrote: > +1 to Stephan's concerns. The schema changes in this patch don't > necessarily convey enough meaning to paint a clear picture of where this > effort leads us. FWICT, nothing in the Task aside from resources is > applicable to the Docker case and it feels quite hacky to onboard a new > executor case this way. > > > In my opinion it's yet to be seen whether it's feasible to use the same > client for a custom executor (at least, without a decent amount of > modularization work). > > Bill, would you mind clarifying what this means? Are you expecting this > to be a purely experimental (POC) effort or is there a solid production > quality future here? If it's the former, would it be more appropriate to have > this effort baked in a private branch to avoid possibly unnecessary code > churn and review cycles? > > Bill Farner wrote: > | it feels quite hacky to onboard a new executor case this way > > Suggestions solicited! Just please don't forget that the intent is to > offer real, immediate value - the docker support in Aurora today is quite > crippled, and this will address the biggest shortcomings (entrypoints, and > zero required deps in images). > > | FWICT, nothing in the Task aside from resources is applicable to the > Docker case > > This is a good point. Perhaps we should create a separate struct and > field in `Job` for this case? > > | Bill, would you mind clarifying what this means? > > What i mean is that the client, DSL, and executor all have relatively > high coupling. Adding custom executor support in the client will require > non-trivial effort to break that coupling. I would like to avoid blocking > this feature on that goal. > > | Are you expecting this to be a purely experimental (POC) effort or is > there a solid production quality future here? > > That is very much dependent on the underlying support in mesos. Today, i > see it as the best support for docker containers in mesos. It's been > available for some time, and the work here is entirely plumbing to enable it > in Aurora. > > | would it be more appropriate to have this effort baked in a private > branch to avoid possibly unnecessary code churn and review cycles? > > I don't foresee enough churn to warrant that. > > John Sirois wrote: > Noting that I'm backing off this change until sentiment settles one way > or the other. If it settles in-favor I'll address both Stephan & Joshua's > feedback at that point. > > Stephan Erb wrote: > I have to backoff out of the discussion here, as I don't have the > necessary cycles to participate. A couple of closing notes from my side: > > * I agree with Maxim that giving an empty process list a special meaning > feels kind of like a hack. > * I probably wouldn't have complained about this if it was that way from > the beginning... > * Docker support is still considered experimental, so no decision is cast > in stone. We can change stuff without to much hassle. > * It is great that you are improving the current docker support (even > though I am a fanboy of the upcoming unified container support :-) > > Maxim Khutornenko wrote: > Thanks for explaning Bill. I am fine continuing this effort given the > above. > > > This is a good point. Perhaps we should create a separate struct and > field in Job for this case? > > I don't have bandwidth to think about the alternatives at the moment but > your suggestion about plugging it higher in the chain (e.g. Job struct) > sounds logical. > > Joshua Cohen wrote: > Could we just add name and resources to the Container struct (if we even > need name? I think it's only used by thermos today, but I haven't double > checked this)? I was going to suggest something similar, but was trying to > avoid exploding the change. > > John Sirois wrote: > I've gone down this path trying to do things right and given pystachio is > FAPP frozen code atm: > ```python > class BaseJob(Struct): > name = Required(String) > role = Required(String) > contact = String > cluster = Required(String) > environment = Required(String) > instances = Default(Integer, 1) > announce = Announcer > tier = String > > cron_schedule = String > cron_collision_policy = Default(String, "KILL_EXISTING") > > update_config = Default(UpdateConfig, UpdateConfig()) > > constraints = Map(String, String) > service = Default(Boolean, False) > max_task_failures = Default(Integer, 1) > production = Default(Boolean, False) > priority = Default(Integer, 0) > > enable_hooks = Default(Boolean, False) # enable client API hooks; from > env python-list 'hooks' > > > def inherit(sup): > def assert_struct_type(item): > if not isinstance(item, TypeMetaclass) or item.type_factory() != > 'Struct': > raise TypeError('Cannot decorate {} with @inherit, it is not a > Struct subtype'.format(item)) > assert_struct_type(sup) > def wrap(cls): > assert_struct_type(cls) > merged_typemap = sup.TYPEMAP.copy() > merged_typemap.update(cls.TYPEMAP) > cls.TYPEMAP = merged_typemap > return cls > return wrap > > > @inherit(BaseJob) > class DockerDaemonJob(Struct): > container = Required(Docker) > resources = Resources > > > @inherit(BaseJob) > class MesosJob(Struct): > name = Default(String, '{{task.name}}') > task = Required(Task) > > health_check_config = Default(HealthCheckConfig, > HealthCheckConfig()) > # TODO(wickman) Make Default(Any, LifecycleConfig()) once pystachio #17 > is addressed. > lifecycle = Default(LifecycleConfig, > DefaultLifecycleConfig) > task_links = Map(String, String) # Unsupported. See > AURORA-739 > > container = Container > ``` > > Smirk, Smile, Gak or other? > > Joshua Cohen wrote: > I’m neutral-ish on the inherits decorator (If we could get wickman to > merge it into pystachio itself rather than defining it in our schema, I’d be > more ok with it). > > I’m not sure how I feel about resources being defined on the Job rather > than the Container though, what's the reasoning behind that? > > Also, a little bit iffy on the fact that this would, in essence, require > users to do... > > > jobs = [ > DockerDaemonJob(..., service=True) > ] > > John Sirois wrote: > > ... If we could get wickman to merge it into pystachio itself rather > than defining it in our schema, I’d be more ok with it > > FWICT we can't. I owe a dev@ thread today to propose adopting the code > like we did for java commons so we can get changes in. > > > I’m not sure how I feel about resources being defined on the Job rather > than the Container though, what's the reasoning behind that? > > I was trying to avoid stuffing yet more contextually relevant fields in > Structs. If resources were moved to Container or Docker then we'd have the > possibility of conflicting resources for existing MesosJobs (ThermosJob is a > better name)that use a container (docker images with embedded thermos > executors). This would require logic of the very sort folks were not happy > with above - ie using a subset of fields for certain cases. > > > Also, a little bit iffy on the fact that this would, in essence, > require users to do... > > I assume you're focused on the `service=True` inconvenience, in which > case a DockerDaemonService could be introduced that follows the pattern of > `Job`, `Service`. > If that's not what you were worried about, but instead the verbose name > (`DockerDaemonJob`), I'm open to suggestions, but thought it would be good to > clearly point out the job type will execute under docker and not thermos. > > Joshua Cohen wrote: > > I was trying to avoid stuffing yet more contextually relevant fields in > Structs. If resources were moved to Container or Docker then we'd have the > possibility of conflicting resources for existing MesosJobs (ThermosJob is a > better name)that use a container (docker images with embedded thermos > executors). This would require logic of the very sort folks were not happy > with above - ie using a subset of fields for certain cases. > > Could we take a similar approach to what you're doing w/ MesosJob vs > DockerDaemonJob? E.g. > > class BaseJob(Struct): > # as per above > > @inherit(BaseJob): > class MesosJob(Struct): > task = Required(task) > # etc. > > @inherit(Docker) > class DockerDaemon(Struct): > resources = Required(Resources) > > Which would, in theory, allow job config like: > > jobs=[ > BaseJob( > cluster = 'devcluster', > role = getpass.getuser(), > environment = 'test', > contact = '{{role}}@localhost', > instances = 1, > container = Container( > docker = DockerDaemon( > resources = Resources(cpu=0.4, ram=32*MB, disk=64*MB), > image = 'http_example', > parameters = [ > Parameter(name = 'env', value = 'HTTP_PORT=8888'), > Parameter(name = 'expose', value = '8888'), > Parameter(name = 'publish', value = '8888:8888/tcp'), > ], > ), > ), > ) > ] > > Is that super gross? I think it's definitely making our complex > configuration language even moreso, and things will only get more complicated > when I shortly introduce `Image` to the `Job` struct as a peer of > `Container`. Maybe it's solvable with documentation though? > > (On a side note, I agree that as part of these changes, it would be nice > to call it `ThermosJob` rather than `MesosJob`, but I think that's probably > too much to ask of users who may already be using `MesosJob` for what's > supposed to be a small tweak to allow an experimental feature.) > > > I assume you're focused on the service=True inconvenience, in which > case a DockerDaemonService could be introduced that follows the pattern of > Job, Service. > > Yes, that's what I mean, and adding `DockerDaemonService` would work. > That said, this is starting to feel like it's snowballing. I don't want to > send us down the path of refactoring our core configuration constructs for > the sake of this change. At the same time, these are things that are easier > to get right from the start than they are to change later, so I'm ok with > exploring our options. > > John Sirois wrote: > I honestly have no reliable feel for super gross for APIs (we went down > this path with the thrift/REST API). I think I need you all to assert the > API you want. I'm happy to bring it home from there. > > John Sirois wrote: > I did speak with Brian offline and he did ack not having time to maintain > pystachio. He generously added me as a github comitter and pypi maintainer > though with dispensation to shepherd pull requests and do releases, so I > think it makes sense to consider things like this @extends (or a python > native superclass mechanism), Any types and Union types when coming up with > the API we desire most. > > John Sirois wrote: > And I'll point out we have a nascent implementation of union types from > an Aurora alumnus here: https://github.com/wickman/pystachio/pull/19
Union types (`Choice` fields), are in: https://github.com/wickman/pystachio/pull/19 I'm currently working support for `Any` types https://github.com/wickman/pystachio/issues/17 I'll do a pystachio release by eow that includes these new features. This should make any conceivably desirable form of API evolution / addition possible fwict. At that point I'll either circle back here or let Joshua drive things home in his Universal + legacy support changes to the client schema. - John ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/44745/#review123314 ----------------------------------------------------------- On March 12, 2016, 7:48 p.m., John Sirois wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/44745/ > ----------------------------------------------------------- > > (Updated March 12, 2016, 7:48 p.m.) > > > Review request for Aurora, Joshua Cohen and Bill Farner. > > > Repository: aurora > > > Description > ------- > > This allows for a job config with no processes defined IFF there is also > a Docker container defined. In this case it is assumed that the process > to run is the Docker container's ENTRYPOINT via the Mesos Docker > containerizer. > > src/main/python/apache/aurora/config/__init__.py | 26 > +++++++++++++++++++++++--- > src/main/python/apache/aurora/config/thrift.py | 9 +++++---- > src/test/python/apache/aurora/client/test_config.py | 41 > +++++++++++++++++++++++++++++++++++------ > src/test/python/apache/aurora/config/test_thrift.py | 5 +++++ > 4 files changed, 68 insertions(+), 13 deletions(-) > > > Diffs > ----- > > src/main/python/apache/aurora/config/__init__.py > 65923be1cb8b88139b8eab0ac5b75428972d3cb1 > src/main/python/apache/aurora/config/thrift.py > be0cd68674a71bd4baadf276f40a4bc0223ce4be > src/main/python/apache/thermos/config/schema_base.py > a6768e67189b0560afef844d6b269bed8ada5f2f > src/test/python/apache/aurora/client/test_config.py > b1a3c1865819899ef19173be0f861783a2631d0a > src/test/python/apache/aurora/config/test_thrift.py > 88292d3c4423c0555088a0adaee3c0e62ed0567e > > Diff: https://reviews.apache.org/r/44745/diff/ > > > Testing > ------- > > Locally green `./build-support/jenkins/build.sh` > > I also patched in https://reviews.apache.org/r/44685/ which this change > depends on and was able to run scheduler with > `-allow_docker_parameters -require_docker_use_executor` and successfully > run this job: > ``` > import getpass > > jobs=[ > Service( > cluster = 'devcluster', > role = getpass.getuser(), > environment = 'test', > name = 'http_example_docker_executor', > contact = '{{role}}@localhost', > instances = 1, > task = Task( > name = 'http_docker_example', > resources = Resources(cpu=0.4, ram=32*MB, disk=64*MB), > processes = [] > ), > container = Container( > docker = Docker( > image = 'http_example', > parameters = [ > Parameter(name = 'env', value = 'HTTP_PORT=8888'), > Parameter(name = 'expose', value = '8888'), > Parameter(name = 'publish', value = '8888:8888/tcp'), > ], > ), > ), > ) > ] > ``` > > Using the image created with > `docker build -t http_example src/test/sh/org/apache/aurora/e2e` from: > ``` > FROM python:2.7 > > # mesos.native requires libcurl-nss to initialize MesosExecutorDriver > RUN apt-get update && apt-get -y install libcurl4-nss-dev > > COPY http_example.py /tmp/ > ENTRYPOINT python /tmp/http_example.py $HTTP_PORT > ``` > > I could connect to http://aurora.local:8888 and get `Hello!` back. > > > Thanks, > > John Sirois > >