> On July 9, 2018, 2:04 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/java/org/apache/aurora/scheduler/updater/SlaKillController.java
> > Lines 140 (patched)
> > <https://reviews.apache.org/r/67696/diff/5/?file=2054728#file2054728line140>
> >
> >     Repeating my comment from internal review - 
> >     
> >     Technically this is can be any kind of work. Should we call this class 
> > `SlaAwareInstanceUpdater`? Since this has more to do with actually update 
> > logic than kill itself. And perhaps the method can become 
> > `startSlaAwareUpdate`?
> >     
> >     The job of this class is around checking safety and scheduling the 
> > supplied work to be evaulated. Naming the class based on it usage rather 
> > than its function seems weird to me.
> 
> Jordan Ly wrote:
>     This would be ideal but I think there are some small parts of the logic 
> that changing the name would be misleading towards.
>     
>     For example:
>     
>     ```
>             storeProvider
>                 .getTaskStore()
>                 .fetchTask(taskId)
>                 .filter(task -> isKillable(task.getStatus()))
>                 .ifPresent(task -> {
>                   incrementJobStatCounter(killAttemptsByJob, 
> SLA_KILL_ATTEMPT, instance.getJobKey());
>                   slaManager.checkSlaThenAct(
>                       task,
>                       instructions.getDesiredState().getTask().getSlaPolicy(),
>                       slaStoreProvider -> performKill(
>                           slaStoreProvider,
>                           instance,
>                           key,
>                           status,
>                           killCommand),
>                       ImmutableMap.of(),
>                       // If the task is not assigned, force the update since 
> it does not affect the
>                       // SLA. For example, if a task is THROTTLED or PENDING, 
> we probably don't care
>                       // if the update replaces it with a new instance.
>                       !SLAVE_ASSIGNED_STATES.contains(task.getStatus()));
>     ```
>     
>     Within the SLA aware code, we have kill-specific checks in order to 
> determine if we should continue with the action (or if it should be a NOOP), 
> or if we should force the kill regardless (the task would not affect SLA). 
> Changing the name would require a refactoring for a more complex interface 
> but I am not sure if we would ever use it for anything else.
>     
>     I am leaning towards keeping the current name and functionality right now 
> until another sla-aware update use case comes in and we have a better idea on 
> how we can refactor the current abstraction (from kill to generic, which I 
> don't think should be too difficult but I'm sure I'll regret those words...)

These checks can be moved into the `killCommand`. I think the abstraction is 
not perfect as it is today.


> On July 9, 2018, 2:04 p.m., Santhosh Kumar Shanmugham wrote:
> > src/test/java/org/apache/aurora/scheduler/updater/SlaKillControllerTest.java
> > Lines 180 (patched)
> > <https://reviews.apache.org/r/67696/diff/5/?file=2054739#file2054739line180>
> >
> >     Drop `_` here and everywhere. `SLA_CHECKING` sounds like a valid event 
> > that is definied in code.
> 
> Jordan Ly wrote:
>     Can you elaborate on this? Did you mean drop `_MESSAGE`? I added 
> `_MESSAGE` because it is not a true "event" in the sense that it piggybacks 
> on `INSTANCE_UPDATING` or `INSTANCE_ROLLING_BACK` with a message showing 
> progress.

Can use `SLA_CHECKING_MESSAGE` or simply `sla checking event is added`. I 
started searching for an event type of `SLA_CHECKING` after reading this 
comment, which is a little confusing.


- Santhosh Kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67696/#review205866
-----------------------------------------------------------


On July 9, 2018, 7:22 p.m., Jordan Ly wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67696/
> -----------------------------------------------------------
> 
> (Updated July 9, 2018, 7:22 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Daniel Knightly, Renan DelValle, 
> Santhosh Kumar Shanmugham, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This patch enables SLA-aware updates.
> 
> Following https://reviews.apache.org/r/66716/, tasks may now specify custom 
> SLA policies that will be respected by the scheduler during maintenance. This 
> patch integrates into the same system to allow users to specify if they want 
> their updates to also respect SLA. Please see 
> https://docs.google.com/document/d/1lCoDyoX26qrGrptrgO7vJHqYR_L2CBRGFIywsAd8uQo/edit?usp=sharing
>  for a more detailed description.
> 
> This patch adds two optional Thrift fields, `slaAware` to `JobUpdateSettings` 
> and `message` to `JobInstanceUpdateEvent`. These should be forward and 
> backwards compatible.
> 
> 
> Diffs
> -----
> 
>   RELEASE-NOTES.md edc081f502370190597ad028f3275cdfd572f5ca 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> 7265b11103aa12743c42355163ae64e98e965d7f 
>   docs/features/job-updates.md b52eb35a1de9da40f8a00ef0b905df30069029d3 
>   docs/reference/configuration.md acab4c58d9ab3c04d156fed3636e77aed6d1faf4 
>   docs/reference/scheduler-configuration.md 
> 805e516689be019101f7c220c89fd9c391bb93b3 
>   src/main/java/org/apache/aurora/scheduler/base/Tasks.java 
> 2e13aacf576e648d9fffe989e4fc05c8954e72d8 
>   src/main/java/org/apache/aurora/scheduler/pruning/TaskHistoryPruner.java 
> 9aa51c3637df74cca088bd65c5539e1ebb8e5f0d 
>   
> src/main/java/org/apache/aurora/scheduler/updater/InstanceActionHandler.java 
> 9fa68b2dd55b4e4f5436356c1b94af1393967679 
>   src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java 
> a002d955c3bc7b7c39da5e130e8c10c536bdcebd 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  ec577cccb86914ebd679ca235103f79dd7e7b79d 
>   src/main/java/org/apache/aurora/scheduler/updater/OneWayJobUpdater.java 
> f2d33fb9ab6bd2c3ff199ab03dc75b1d6d618f3a 
>   src/main/java/org/apache/aurora/scheduler/updater/SlaKillController.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdaterModule.java 
> 74ee1745e1fd7c308e2bdfa46aeb18a7ecfe14d2 
>   src/main/java/org/apache/aurora/scheduler/updater/Updates.java 
> f949fd54f524780672167e12fcadf268da08e679 
>   src/main/python/apache/aurora/client/api/updater_util.py 
> 4e3986220aaa4c9b138394b962120b176185af12 
>   src/main/python/apache/aurora/config/schema/base.py 
> 7baded79acdf863670afc183d740dcad602490c2 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> dcf58896f1e866c0369ba1b78060236e98d9d46b 
>   
> src/test/java/org/apache/aurora/scheduler/storage/AbstractJobUpdateStoreTest.java
>  3a06a451da4ef3acccb33b5495b9fae141557148 
>   
> src/test/java/org/apache/aurora/scheduler/testing/FakeScheduledExecutor.java 
> 0aea369d8a8f75291de9691b6d61f3d48895507c 
>   src/test/java/org/apache/aurora/scheduler/updater/AddTaskTest.java 
> 43f857d893a54e19e71b36f2f06fef3a3ef6e874 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 5667a1b59681a6de87149d7161d760bff5da3818 
>   src/test/java/org/apache/aurora/scheduler/updater/KillTaskTest.java 
> 2c27ec7136ff22a3570a4ec278c73f7ee310f628 
>   
> src/test/java/org/apache/aurora/scheduler/updater/SlaKillControllerTest.java 
> PRE-CREATION 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> 2baba2aa55865ec298a4c9e5af3952b56cb9a910 
>   
> src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/saveJobInstanceUpdateEvent
>  48902d39b9d2cbeae1a52180669aba8349e4dd65 
>   
> src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/saveJobUpdate
>  08dfa5b9a67989083a0d405ce8100698a4d096ae 
>   ui/src/main/js/components/UpdateInstanceEvents.js 
> 8351f2c4c256e27625d94b70842be0e91065a551 
>   ui/src/main/js/components/UpdateSettings.js 
> d756f5916fd8c39dcbb5578ee5eda198f807f458 
> 
> 
> Diff: https://reviews.apache.org/r/67696/diff/7/
> 
> 
> Testing
> -------
> 
> Added unit tests, `./gradlew test`.
> 
> Tested at scale with over 10,000 SLA-aware instance update events occuring 
> concurrently. Scheduler stability did not seem to be affected.
> 
> 
> Thanks,
> 
> Jordan Ly
> 
>

Reply via email to