----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66716/#review203701 -----------------------------------------------------------
Great documentation and tests! Looks pretty much ready to go from my point of view. I've been using the `SlaManager` to test my proposed SLA-aware updates for a while. src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java Lines 234 (patched) <https://reviews.apache.org/r/66716/#comment285978> I would throw an exception here in case the user is not aware of the `minRequiredInstances` and expects SLA to be respected but it will be silently ignored. src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java Lines 236 (patched) <https://reviews.apache.org/r/66716/#comment285979> Should we also warn users if they are setting an SLA Policy for a preemptible task? They may be unaware why SLA is not being respected. src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java Lines 256 (patched) <https://reviews.apache.org/r/66716/#comment285983> I would add a more descriptive message here (I found it confusing when I first saw it). Maybe something that expresses intent like: Current percentage will not allow any instances to be killed. Must be less than ... src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java Lines 265 (patched) <https://reviews.apache.org/r/66716/#comment285984> Same as above: duration_secs must be less than cluster-wide maximum of ... src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java Lines 277 (patched) <https://reviews.apache.org/r/66716/#comment285985> Same as above. src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java Lines 285 (patched) <https://reviews.apache.org/r/66716/#comment285986> Same as above. src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java Lines 236 (patched) <https://reviews.apache.org/r/66716/#comment285992> I think listing the unaffected tasks might be a bit gratuitious. I would have that as a debug statement if needed. When doing SLA-aware updates, a big job will produce huge `unaffected tasks` lists repeatedly. src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java Lines 273 (patched) <https://reviews.apache.org/r/66716/#comment285995> I believe that if the coordinator is locked for more than 10 seconds, then this method will return false. However, the code will continue since the output is ignored despite the lock never being gotten. I think waiting for 10 seconds is also a long time to block the thread. Is it feasible to try and and get the lock and if it is not immediately available to return? The caller can invoke the method again 10 seconds later for essentially the same effect if they desire. src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java Lines 314-318 (patched) <https://reviews.apache.org/r/66716/#comment285996> Is there a way to set a timeout or increment a metric when a coordinator takes too long to respond? Is there any downside to allowing long requests? src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java Lines 448-449 (patched) <https://reviews.apache.org/r/66716/#comment285989> Remove `.newBuilder()` src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java Lines 453 (patched) <https://reviews.apache.org/r/66716/#comment285990> Remove `ISlaPolicy.build` src/test/java/org/apache/aurora/scheduler/sla/SlaManagerTest.java Lines 1018 (patched) <https://reviews.apache.org/r/66716/#comment285997> `Thread.sleep(15000)` here breaks this test per my comment in `SlaManager` - Jordan Ly On May 23, 2018, 12:21 a.m., Santhosh Kumar Shanmugham wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66716/ > ----------------------------------------------------------- > > (Updated May 23, 2018, 12:21 a.m.) > > > Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb. > > > Repository: aurora > > > Description > ------- > > `Tasks` can specify custom SLA requirements as part of > their `TaskConfig`. One of the new features is the ability > to specify an external coordinator that can ACK/NACK > maintenance requests for tasks. This will be hugely > beneficial for onboarding services that cannot satisfactorily > specify SLA in terms of running instances. > > Maintenance requests are driven from the Scheduler to > improve management of nodes in the cluster. > > > Diffs > ----- > > RELEASE-NOTES.md 5e1f9940a7974e212140b7e5304695afa7f96e78 > api/src/main/thrift/org/apache/aurora/gen/api.thrift > ff48000d613ceef3e03586b94944d13275fb127c > docs/README.md 166bf1ce240474f0a181e023439cfbfbe7363822 > docs/features/sla-requirements.md PRE-CREATION > docs/reference/configuration.md d4b869b938105ba301fc88d41019af2f1707f6f4 > docs/reference/scheduler-configuration.md > a659cfac974059b04ef5593286011decbb7f9110 > examples/vagrant/systemd/aurora-scheduler.service > 57e4bba858672f8da94eaa0499f8e5f3347ab982 > src/main/java/org/apache/aurora/scheduler/app/AppModule.java > ffc07443fae9e5216a5333ae305f75aa9b452a0c > src/main/java/org/apache/aurora/scheduler/config/CliOptions.java > a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 > > src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java > 4073229b74d0e0e7fd31552bd96894ceb8a0971a > > src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java > PRE-CREATION > src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java > 3b4df55a05873e79aae206b117cbc753fa3abb94 > src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION > src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java > 25ed474289f369e74c24e999ad97ed6810c9fd5e > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java > f58c66aaebe8d31913d67a05add0f3d6054e88d1 > src/main/java/org/apache/aurora/scheduler/state/StateModule.java > 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 > src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java > e88cad6cf12312512e6840329db7ca7134ceaae6 > > src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java > 9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b > src/main/python/apache/aurora/admin/admin_util.py > 8240e8093160623b4c30dd212a88b8e122fd9856 > src/main/python/apache/aurora/admin/host_maintenance.py > 83fc2b6ece40d3436cc7de7a034f95224235fcfd > src/main/python/apache/aurora/admin/maintenance.py > 942a237f47a6e0416bbaf244278685477e0f407d > src/main/python/apache/aurora/client/api/__init__.py > f6fd1dd6d7c2bdd5bca3037f501b36badab78c75 > src/main/python/apache/aurora/client/cli/context.py > 06b194114a7f44a61943e0932973e71b53f239b4 > src/main/python/apache/aurora/client/cli/jobs.py > 536d04a21d32c4e586dc943a6f9b0ad0143354a3 > src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java > 63c338e5bbdf60de0fba8d68c6613904abb93fa8 > src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java > 8c1f5ce6d7eb94ec4e0302bfd41318bd0797a1a5 > src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java > e66ec116112df164106598d9ff0bc9e8f465e44f > > src/test/java/org/apache/aurora/scheduler/configuration/ConfigurationManagerTest.java > 749ffeac6cb851f32bba7606390203d7a046a0e6 > > src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java > c6163bbabc7e7748f167b679893a93f58e4ef1ac > src/test/java/org/apache/aurora/scheduler/sla/SlaManagerTest.java > PRE-CREATION > src/test/java/org/apache/aurora/scheduler/sla/SlaModuleTest.java > d37e7a07e9258bc8c0758bf50aece5b79025126b > > src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java > 770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 > > src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java > 2cf66d8154ad3795989ee9026e45af1be509f244 > src/test/java/org/apache/aurora/scheduler/thrift/ThriftIT.java > 40851c419e4d62e6545959eebc0ce144fdecc697 > > src/test/java/org/apache/aurora/scheduler/thrift/aop/MockDecoratedThrift.java > d412090c292691305f01bccd1596fb0f6bb003ad > src/test/python/apache/aurora/admin/test_maintenance.py > ca0239b157f9f9053821af0328b9448703386cd4 > src/test/python/apache/aurora/api_util.py > 3fc9b478cc9aada0503e8ed8698a37b4ed926cdd > src/test/python/apache/aurora/client/api/test_scheduler_client.py > f2a2eae1539f7f6dff6855e4122cc41c6cbb0f7b > src/test/python/apache/aurora/client/cli/test_add.py > b22b9f72fbddb553bfc33b1bd8e10636a8d887a6 > src/test/python/apache/aurora/client/cli/test_kill.py > 0e859dc8618a044b2a4a6f73f45cab4a7ffcce4e > src/test/sh/org/apache/aurora/e2e/sla_policy.aurora PRE-CREATION > ui/src/main/js/components/TaskConfigSummary.js > 64880f4bd5c5358287ef481df455f6355fedd7d6 > > > Diff: https://reviews.apache.org/r/66716/diff/9/ > > > Testing > ------- > > ./build-support/jenkins/build.sh > > > Thanks, > > Santhosh Kumar Shanmugham > >
