-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63536/#review191256
-----------------------------------------------------------


Fix it, then Ship it!




Overall LGTM! You may want to add some entry to the release notes explaining 
how Mesos should fix some bugs on their end before enabling this.


src/main/java/org/apache/aurora/scheduler/state/PartitionManager.java
Lines 90-91 (patched)
<https://reviews.apache.org/r/63536/#comment268977>

    Can this just be:
    ```
    if (stateChange.getNewState().equals(ScheduleStatus.PARTITIONED))
    ```



src/main/python/apache/aurora/config/schema/base.py
Lines 157 (patched)
<https://reviews.apache.org/r/63536/#comment268978>

    Can you explain what `Disable` does? If it is `RescheduleImmediately`, then 
the default 0 on `PartitionPolicy` is sufficient.


- Jordan Ly


On Nov. 16, 2017, 1:54 a.m., David McLaughlin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63536/
> -----------------------------------------------------------
> 
> (Updated Nov. 16, 2017, 1:54 a.m.)
> 
> 
> Review request for Aurora, Jordan Ly, Santhosh Kumar Shanmugham, and Bill 
> Farner.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> This is my prototype code for adding partition-awareness to Aurora. There is 
> a proposal document to accompany this here: 
> https://docs.google.com/document/d/1E3GlsVTJLEMAkDWk2_PTxzkRZcapb8nF_5q5AADQI7g/edit#
> 
> I'd like feedback on the high-level approach before adding unit tests, 
> metrics, logging, etc.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> 1d369263d779b549b9304018437f535ddc855966 
>   examples/vagrant/upstart/aurora-scheduler.conf 
> 5ca3caef03b6632cd4dbf47711b1ef183f6a6449 
>   src/main/java/org/apache/aurora/scheduler/base/Conversions.java 
> 33cc012a2cad929b0dd1ce236597b870cfc5aba0 
>   src/main/java/org/apache/aurora/scheduler/base/Jobs.java 
> 8d5f4e57c6b4f847cb74471f246fd0b7dd0cbc36 
>   src/main/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 7c223eae69503fe1d5bf34c430438637abcbcb9b 
>   
> src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java
>  5e83b2acdde7198d16427d4031e9772f78612554 
>   src/main/java/org/apache/aurora/scheduler/state/PartitionManager.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/state/SideEffect.java 
> b91a0852e968b4aa9d74801601671cb61af3648a 
>   src/main/java/org/apache/aurora/scheduler/state/StateManagerImpl.java 
> 9989ed441cd9bc442e6472768880ce7924c3bdd9 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> c03fff11ea3a4086f9daaa8b07315006c1b481e4 
>   src/main/java/org/apache/aurora/scheduler/state/TaskStateMachine.java 
> eb4fe7d78ad1e6ec430c428df527bd0cf3a053c1 
>   src/main/python/apache/aurora/client/cli/jobs.py 
> b79ae56bee0e5692cacf1e66f4a4126b06aaffdc 
>   src/main/python/apache/aurora/config/schema/base.py 
> 18ce826363009e1cc0beac5cce4edf42610d487e 
>   src/main/python/apache/aurora/config/thrift.py 
> bedf8cd6641e1b1a930602791b758d584af4891c 
>   src/test/java/org/apache/aurora/scheduler/base/JobsTest.java 
> 13f656f241a8a9a3d339f4053f165070c2669ef3 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> c2d875bb5c393dd95d75251fe86dc649ceba7bd9 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModuleTest.java
>  7b0429109e9a7795e559db264e7737fc55ff0169 
>   src/test/java/org/apache/aurora/scheduler/state/PartitionManagerTest.java 
> PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/state/StateManagerImplTest.java 
> 0366cd6e9ddba0c3b9c88ffb50738767a352a17a 
>   src/test/java/org/apache/aurora/scheduler/state/TaskStateMachineTest.java 
> 8d6c3fff0af2df39bb929f760b862a2edf5d6fca 
>   src/test/python/apache/aurora/client/cli/test_task.py 
> 186cb2737ba8e169819b7d54f86a7344a669b6cb 
>   src/test/python/apache/aurora/config/test_thrift.py 
> 7a1567a9b67917072bb0ba3eea5857e968375f4d 
>   src/test/sh/org/apache/aurora/e2e/partition_aware.aurora PRE-CREATION 
>   src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh 
> f0819fb7ac758ad1229a76fd9794b393400e9f63 
> 
> 
> Diff: https://reviews.apache.org/r/63536/diff/6/
> 
> 
> Testing
> -------
> 
> Manual testing in Vagrant by stopping and starting the Mesos agent. With 
> three jobs:
> 
> 1) No PartitionPolicy (verified existing behavior of moving from PARTITIONED 
> directly to LOST)
> 2) PartitionPolicy with custom delay_secs (verified sat in PARTITIONED for a 
> while before moving to LOST)
> 3) PartitionPolicy with reschedule=False (verified sat in PARTITIONED 
> indefinitely)
> 
> I also verified tasks are able to transition to various states (back to 
> RUNNING or moving to FAILED, etc.) when you turn the agent back on.
> 
> 
> File Attachments
> ----------------
> 
> Task in PARTITIONED state
>   
> https://reviews.apache.org/media/uploaded/files/2017/11/07/02c7fc72-b11d-4ef9-a86b-914e748cad99__Screen_Shot_2017-11-07_at_11.23.41_AM.png
> Task back into running when partition resolved
>   
> https://reviews.apache.org/media/uploaded/files/2017/11/07/a0413f54-1572-4410-a386-0a22e78fab13__Screen_Shot_2017-11-07_at_11.26.02_AM.png
> Compaction of PARTITIONED cycles (note timestamps)
>   
> https://reviews.apache.org/media/uploaded/files/2017/11/07/edec32e5-b3ec-4fdc-b93f-5449519805ae__Screen_Shot_2017-11-07_at_11.27.47_AM.png
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>

Reply via email to