-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review201642
-----------------------------------------------------------



Thanks for starting the work on this feature! Very exciting.

Most of my concerns lie around future usecases (selfishly like SLA-aware 
updates) :)


api/src/main/thrift/org/apache/aurora/gen/api.thrift
Lines 263 (patched)
<https://reviews.apache.org/r/66716/#comment283130>

    Is this per request? Will end users be able to do partial ACKs?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Line 131 (original), 148-149 (patched)
<https://reviews.apache.org/r/66716/#comment283132>

    nit: make a single line (looks like it could fit, but ignore if it actually 
can't)



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 213-217 (original), 221-240 (patched)
<https://reviews.apache.org/r/66716/#comment283135>

    Does it make sense to/is it possible to combine these? E.g. force drains 
would call with a defaultSlaPolicy of none or something



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 214-216 (original), 222-224 (patched)
<https://reviews.apache.org/r/66716/#comment283150>

    Should this also create `HostMaintenanceRequest` for consistency?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 231 (patched)
<https://reviews.apache.org/r/66716/#comment283149>

    What is the behavior if there is an existing host maintenance request? Do 
we deny the new one or overwrite the old one?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 318-326 (patched)
<https://reviews.apache.org/r/66716/#comment283174>

    Can we make this `@Timed` as well?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 321 (patched)
<https://reviews.apache.org/r/66716/#comment283177>

    Do we need the write lock for each iteration? Is it possible to minimize 
the scope to only when we go to change state for a task to DRAINING?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 325 (patched)
<https://reviews.apache.org/r/66716/#comment283170>

    Why is `checkSla` always `true`?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 328-331 (patched)
<https://reviews.apache.org/r/66716/#comment283171>

    Should this be customizable? I am pretty sure it is fine if it is not, but 
wanted to call it out.



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 333-334 (patched)
<https://reviews.apache.org/r/66716/#comment283172>

    nit: style
    
    ```
    private void maybeDrain(
            boolean checkSla,
            IScheduledTask task,
            String host,
            Storage.MutableStoreProvider store) {
            [NEWLINE]
            ...
    ```



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 337-340 (patched)
<https://reviews.apache.org/r/66716/#comment283178>

    Seems odd to have `if not checkSla, call slaManager to drain the task`
    
    maybe slaManager should behave like a consumer, executing the statechange 
only if SLA check passes?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 342-345 (patched)
<https://reviews.apache.org/r/66716/#comment283173>

    This should never happen if `checkSla` is `true`, correct?
    
    Could we add a metric here as well?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 348 (patched)
<https://reviews.apache.org/r/66716/#comment283175>

    Use `TimeAmount` to do comparisons instead of `expireMs / 1000`



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 349 (patched)
<https://reviews.apache.org/r/66716/#comment283176>

    Add metric here as well, maybe tracking the tasks that cause this.



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 60-61 (patched)
<https://reviews.apache.org/r/66716/#comment283179>

    I think that this makes more sense in `MaintenanceController`



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 90 (patched)
<https://reviews.apache.org/r/66716/#comment283180>

    nit: newline after



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 100 (patched)
<https://reviews.apache.org/r/66716/#comment283186>

    mostly for my curiosity but why 100?



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 110 (patched)
<https://reviews.apache.org/r/66716/#comment283183>

    would `IllegalArgumentException` here be better?



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 121 (patched)
<https://reviews.apache.org/r/66716/#comment283187>

    Maybe use `IllegalArgumentException`



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 147 (patched)
<https://reviews.apache.org/r/66716/#comment283188>

    Add `@Timed` annotation to public methods



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 148-174 (patched)
<https://reviews.apache.org/r/66716/#comment283185>

    Mentioned above, but I think that the concept of `drain` should remain in 
`MaintenanceController` and not leak into `SlaManager`.
    
    Instead of `drain`, this could be something like `checkSlaAndExecute` which 
would be a consumer that checks the SLA then executes an action provided by the 
caller. I am thinking about this in terms of my proposed SLA-aware updates -- I 
would be able to utilize this interface as well.



src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java
Lines 22-28 (patched)
<https://reviews.apache.org/r/66716/#comment283181>

    nit: newlines between function definitions, after interface definition


- Jordan Ly


On April 19, 2018, 5:30 p.m., Santhosh Kumar Shanmugham wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> -----------------------------------------------------------
> 
> (Updated April 19, 2018, 5:30 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> - I have received some last minute comments from customers, which might add 
> some extra features to the `CoordinatorSlaPolicy`.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/SlaManager.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 778148a7c033cba9004954cabc33a2b1d003dccf 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> e66ec116112df164106598d9ff0bc9e8f465e44f 
>   
> src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
>  770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
>   src/test/java/org/apache/aurora/scheduler/state/SlaManagerTest.java 
> PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
> ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
>   
> src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
>  3dd9ce4039b223cb6156462d089f7062a1cde772 
>   
> src/test/java/org/apache/aurora/scheduler/storage/durability/WriteRecorderTest.java
>  27c8c829cd1e417dd5e60a8e9415331ca4a7c918 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotterImplIT.java 
> be07361a27afefa21cc2ba76ce82531a418d9814 
>   
> src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java
>  d59118be13342da9003b0bcb97e12e477d9edf8f 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/aop/MockDecoratedThrift.java 
> d412090c292691305f01bccd1596fb0f6bb003ad 
>   src/test/python/apache/aurora/api_util.py 
> 3fc9b478cc9aada0503e8ed8698a37b4ed926cdd 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> f2a2eae1539f7f6dff6855e4122cc41c6cbb0f7b 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> e4f43d0573c7862adc9bc679f4cea40cc76eac38 
>   src/test/python/apache/aurora/config/test_thrift.py 
> 8e1d0e177959af12b97bdd1cd47845b72bc12fe1 
>   
> src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/removeHostMaintenanceRequest
>  PRE-CREATION 
>   
> src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/saveCronJob
>  88e1c36a1aa2d192b95963f7aa36e243a447e4af 
>   
> src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/saveHostMaintenanceRequest
>  PRE-CREATION 
>   
> src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/saveJobUpdate
>  32fdcdacde58345cdd6c4b449b82c0c90c2b2aae 
>   
> src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/saveTasks
>  4323031ec6bd128576c2a43ebc11f04a9f046e2f 
>   
> src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/read-compatible/16-saveHostMaintenanceRequest
>  PRE-CREATION 
>   
> src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/read-compatible/17-removeHostMaintenanceRequest
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/66716/diff/1/
> 
> 
> Testing
> -------
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Santhosh Kumar Shanmugham
> 
>

Reply via email to