----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/58259/#review172122 -----------------------------------------------------------
The code change looks decent to me. However, I am unsure about two things: * For us it is common to have jobs with #instance in the ballpark of #agents. The proposed code change could easily block a significant number of agents for scheduling, even if there would be enough capacity for other job instances. So while improving the MTTA for job udpates, this could easily lead to increased MTTA for regularly launched jobs (cron, adhoc, etc). * The alternative dynamic reservation proposal has the advantage that it works when multiple frameworks are used. Would it be plausible to just reserve any used resources in a generic fashion, so that we ensure reservations always come back to Aurora and cannot be intercepted by another framework? Please run `./gradlew jmh -Pbenchmarks='SchedulingBenchmarks.*'` to help ensure the scheduling changes don't come with an unexpected performance regression. src/main/java/org/apache/aurora/scheduler/updater/UpdaterModule.java Lines 52-55 (patched) <https://reviews.apache.org/r/58259/#comment245215> I am trying to understand if this is a good default for this best-effort feature. What is your cluster-wide MTTA? It should give us a decent hint for a suitable default. - Stephan Erb On April 12, 2017, 9:51 a.m., David McLaughlin wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/58259/ > ----------------------------------------------------------- > > (Updated April 12, 2017, 9:51 a.m.) > > > Review request for Aurora, Santhosh Kumar Shanmugham, Stephan Erb, and Zameer > Manji. > > > Repository: aurora > > > Description > ------- > > In the Dynamic Reservations review (and on the mailing list), I mentioned > that we could implement update affinity with less complexity using the same > technique as preemption. Here is how that would work. > > This just adds a simple wrapper around the preemptor's BiCache structure and > then optimistically tries to keep an agent free for a task during the update > process. > > > Note: I don't bother even checking the resources before reserving the agent. > I figure there is a chance the agent has enough room, and if not we'll catch > it when we attempt to veto the offer. We need to always check the offer like > this anyway in case constraints change. In the worst case it adds some delay > in the rare cases you increase resources. > > We also don't persist the reservations, so if the Scheduler fails over during > an update, the worst case is that any instances between the KILLED and > ASSIGNED in-flight batch need to fall back to the current first-fit > scheduling algorithm. > > > Diffs > ----- > > src/main/java/org/apache/aurora/scheduler/base/TaskTestUtil.java > f0b148cd158d61cd89cc51dca9f3fa4c6feb1b49 > src/main/java/org/apache/aurora/scheduler/scheduling/TaskScheduler.java > 203f62bacc47470545d095e4d25f7e0f25990ed9 > src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java > a177b301203143539b052524d14043ec8a85a46d > src/main/java/org/apache/aurora/scheduler/updater/InstanceAction.java > b4cd01b3e03029157d5ca5d1d8e79f01296b57c2 > > src/main/java/org/apache/aurora/scheduler/updater/InstanceActionHandler.java > f25dc0c6d9c05833b9938b023669c9c36a489f68 > src/main/java/org/apache/aurora/scheduler/updater/InstanceUpdater.java > c129896d8cd54abd2634e2a339c27921042b0162 > > src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java > e14112479807b4477b82554caf84fe733f62cf58 > src/main/java/org/apache/aurora/scheduler/updater/StateEvaluator.java > c95943d242dc2f539778bdc9e071f342005e8de3 > src/main/java/org/apache/aurora/scheduler/updater/UpdateAgentReserver.java > PRE-CREATION > src/main/java/org/apache/aurora/scheduler/updater/UpdaterModule.java > 13cbdadad606d9acaadc541320b22b0ae538cc5e > > src/test/java/org/apache/aurora/scheduler/scheduling/TaskSchedulerImplTest.java > fa1a81785802b82542030e1aae786fe9570d9827 > src/test/java/org/apache/aurora/scheduler/state/TaskAssignerImplTest.java > cf2d25ec2e407df7159e0021ddb44adf937e1777 > src/test/java/org/apache/aurora/scheduler/updater/AddTaskTest.java > b2c4c66850dd8f35e06a631809530faa3b776252 > src/test/java/org/apache/aurora/scheduler/updater/InstanceUpdaterTest.java > c78c7fbd7d600586136863c99ce3d7387895efee > src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java > 30b44f88a5b8477e917da21d92361aea1a39ceeb > src/test/java/org/apache/aurora/scheduler/updater/KillTaskTest.java > 833fd62c870f96b96343ee5e0eed0d439536381f > > src/test/java/org/apache/aurora/scheduler/updater/NullAgentReserverTest.java > PRE-CREATION > > src/test/java/org/apache/aurora/scheduler/updater/UpdateAgentReserverImplTest.java > PRE-CREATION > > > Diff: https://reviews.apache.org/r/58259/diff/2/ > > > Testing > ------- > > ./gradlew build > ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh > > > Thanks, > > David McLaughlin > >
