> On May 25, 2018, 8:50 a.m., Stephan Erb wrote:
> > LGTM! Especially thanks for the thorough documentation.
> > 
> > I have just a bunch of questions below. Feel free to address those as you 
> > see fit. I am away for the next few days I don't want to block the patch 
> > any further.

Thanks for the review Stephan! Much appreciated.


> On May 25, 2018, 8:50 a.m., Stephan Erb wrote:
> > docs/features/sla-requirements.md
> > Lines 115 (patched)
> > <https://reviews.apache.org/r/66716/diff/10/?file=2027918#file2027918line115>
> >
> >     If we are unsure about some details of the interface, we could consider 
> > calling it experimental for now.

Good point. I will call it out as experimental.


> On May 25, 2018, 8:50 a.m., Stephan Erb wrote:
> > docs/features/sla-requirements.md
> > Lines 130 (patched)
> > <https://reviews.apache.org/r/66716/diff/11/?file=2028985#file2028985line130>
> >
> >     Having a GET request with a body is somewhat strange (see 
> > https://stackoverflow.com/questions/978061/http-get-with-request-body). I 
> > am wondering if we should make this a POST instead?

I have been back-and-forth on this. I am taking your suggestion and changing 
this to POST with a body and no query params similar to `WebHooks`.


> On May 25, 2018, 8:50 a.m., Stephan Erb wrote:
> > docs/features/sla-requirements.md
> > Lines 131 (patched)
> > <https://reviews.apache.org/r/66716/diff/11/?file=2028985#file2028985line131>
> >
> >     Do we really need the task parameter if everything is encoded in the 
> > json anyway?

I was going to get rid of it, but then realized that the `cluster` is not 
really available via the `IScheduledTask` and is hidden inside the 
`ExecutorConfig` json string, which made it hard for me to parse it out for my 
vagrant example. This critical info is needed if a `Coordinator` needs to 
handle maintenance across multiple clusters.

I will keep the `task` param in, since it can be a short-hand most customers. 
Only those who require additional information (such as the `hostname` and 
`ports` info as you asked earlier) need to dive into the body to figure those 
out.


> On May 25, 2018, 8:50 a.m., Stephan Erb wrote:
> > examples/vagrant/systemd/aurora-scheduler.service
> > Lines 63 (patched)
> > <https://reviews.apache.org/r/66716/diff/11/?file=2028989#file2028989line63>
> >
> >     Is this change related to the patch?

Apparently, we have a situation where we have to wait for the next offer-cycle 
after an `aurora_admin host_activate` to get tasks to schedule on it. Something 
to do with `HostOffers`. I am not a 100% certain on the issue, but the effect 
is consistent. 

This is added to ensure that the `sla-aware maintenance` `end-to-end test` does 
not have to wait a full 5-min to get tasks to run again on the only agent in 
the vagrant cluster. 

I will add comment to explain this. I will open a JIRA to track this as well.


> On May 25, 2018, 8:50 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java
> > Lines 237 (patched)
> > <https://reviews.apache.org/r/66716/diff/11/?file=2028992#file2028992line238>
> >
> >     The `preferred` tier exists in the default `tiers.json` but might not 
> > be available in general. 
> >     
> >     Maybe just say that "Tier X does not support SlaPolicy".

Done.


> On May 25, 2018, 8:50 a.m., Stephan Erb wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
> > Lines 395-406 (patched)
> > <https://reviews.apache.org/r/66716/diff/11/?file=2028995#file2028995line395>
> >
> >     From a perspective of "single level of abstraction" it looks weird that 
> > `askCoordinatorThenAct` handles many exceptions but not the 
> > `InterruptedException`.  
> >     
> >     I guess you had a reason for that. What am I missing?

One of the previous diffs used a timeout on acquiring the coordinator locks. 
This is irrelevant now dropping it. I will go over the changes once again and 
resolve any intellj suggested improvements. I was flying with the `check-style` 
in `gradle`, apparently it is not enough.


> On May 25, 2018, 8:50 a.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/admin/host_maintenance.py
> > Line 173 (original), 173 (patched)
> > <https://reviews.apache.org/r/66716/diff/11/?file=2029002#file2029002line173>
> >
> >     Should we already call out the old maintenance mechanism as deprecated 
> > in the RELEASE-NOTES?

My thinking was to replace the `host_drain` logic with the one under 
`sla_host_drain` instead. So we will deprecate `sla_host_drain` in the next 
release and replace the logic for `host_drain`?

Putting `host_drain` under deprecation, nevertheless. We can discuss this 
further when you are back and happy to address any concerns.


- Santhosh Kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review203807
-----------------------------------------------------------


On May 25, 2018, 2:14 p.m., Santhosh Kumar Shanmugham wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> -----------------------------------------------------------
> 
> (Updated May 25, 2018, 2:14 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, Renan DelValle, and 
> Stephan Erb.
> 
> 
> Bugs: AURORA-1978
>     https://issues.apache.org/jira/browse/AURORA-1978
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Diffs
> -----
> 
>   RELEASE-NOTES.md 5e1f9940a7974e212140b7e5304695afa7f96e78 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ff48000d613ceef3e03586b94944d13275fb127c 
>   docs/README.md 166bf1ce240474f0a181e023439cfbfbe7363822 
>   docs/features/sla-requirements.md PRE-CREATION 
>   docs/operations/configuration.md 85a6fab54e03d52e42ba7d4ff47ab93f5b8293ee 
>   docs/reference/configuration.md d4b869b938105ba301fc88d41019af2f1707f6f4 
>   docs/reference/scheduler-configuration.md 
> a659cfac974059b04ef5593286011decbb7f9110 
>   examples/vagrant/systemd/aurora-scheduler.service 
> 57e4bba858672f8da94eaa0499f8e5f3347ab982 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java
>  4073229b74d0e0e7fd31552bd96894ceb8a0971a 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> e88cad6cf12312512e6840329db7ca7134ceaae6 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/admin/admin_util.py 
> 8240e8093160623b4c30dd212a88b8e122fd9856 
>   src/main/python/apache/aurora/admin/host_maintenance.py 
> 83fc2b6ece40d3436cc7de7a034f95224235fcfd 
>   src/main/python/apache/aurora/admin/maintenance.py 
> 942a237f47a6e0416bbaf244278685477e0f407d 
>   src/main/python/apache/aurora/client/api/__init__.py 
> f6fd1dd6d7c2bdd5bca3037f501b36badab78c75 
>   src/main/python/apache/aurora/client/cli/context.py 
> 06b194114a7f44a61943e0932973e71b53f239b4 
>   src/main/python/apache/aurora/client/cli/jobs.py 
> 536d04a21d32c4e586dc943a6f9b0ad0143354a3 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
> 63c338e5bbdf60de0fba8d68c6613904abb93fa8 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 8c1f5ce6d7eb94ec4e0302bfd41318bd0797a1a5 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> e66ec116112df164106598d9ff0bc9e8f465e44f 
>   
> src/test/java/org/apache/aurora/scheduler/configuration/ConfigurationManagerTest.java
>  749ffeac6cb851f32bba7606390203d7a046a0e6 
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 
> 0fabb3370713e57d417adbd2af9e24a4d515c60a 
>   src/test/java/org/apache/aurora/scheduler/cron/quartz/QuartzTestUtil.java 
> b7dcf3af366c9def63165dc9bef998ab5e95ed49 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
> c6163bbabc7e7748f167b679893a93f58e4ef1ac 
>   src/test/java/org/apache/aurora/scheduler/sla/SlaManagerTest.java 
> PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/sla/SlaModuleTest.java 
> d37e7a07e9258bc8c0758bf50aece5b79025126b 
>   
> src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
>  770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  2cf66d8154ad3795989ee9026e45af1be509f244 
>   src/test/java/org/apache/aurora/scheduler/thrift/ThriftIT.java 
> 40851c419e4d62e6545959eebc0ce144fdecc697 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/aop/MockDecoratedThrift.java 
> d412090c292691305f01bccd1596fb0f6bb003ad 
>   src/test/python/apache/aurora/admin/test_maintenance.py 
> ca0239b157f9f9053821af0328b9448703386cd4 
>   src/test/python/apache/aurora/api_util.py 
> 3fc9b478cc9aada0503e8ed8698a37b4ed926cdd 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> f2a2eae1539f7f6dff6855e4122cc41c6cbb0f7b 
>   src/test/python/apache/aurora/client/cli/test_add.py 
> b22b9f72fbddb553bfc33b1bd8e10636a8d887a6 
>   src/test/python/apache/aurora/client/cli/test_kill.py 
> 0e859dc8618a044b2a4a6f73f45cab4a7ffcce4e 
>   src/test/sh/org/apache/aurora/e2e/http_example.py 
> ba7d11429b5f3945a1fdf1808105b11e6ef78420 
>   src/test/sh/org/apache/aurora/e2e/partition_aware.aurora 
> 7ea9fadefcb4846cfe4922e11febec74c75f15db 
>   src/test/sh/org/apache/aurora/e2e/sla_coordinator.py PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/sla_policy.aurora PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> 888efe4e990913d81335f1f3e2c9b6473de7bee8 
>   ui/src/main/js/components/TaskConfigSummary.js 
> 64880f4bd5c5358287ef481df455f6355fedd7d6 
> 
> 
> Diff: https://reviews.apache.org/r/66716/diff/12/
> 
> 
> Testing
> -------
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> 
> 
> Thanks,
> 
> Santhosh Kumar Shanmugham
> 
>

Reply via email to