Re: Review Request 31376: Introduce JobUpdateSummary.key field, dual write that field when it is read/received.
On Feb. 24, 2015, 8:54 p.m., Zameer Manji wrote: src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml, line 149 https://reviews.apache.org/r/31376/diff/1/?file=874379#file874379line149 Have you filed an upstream bug about this behaviour? I ventured a search at whether this is by design, i welcome your contribution. Tracing the mybatis code for why exactly this happens is not how i would like to spend my afternoon, and i do not want to file an ignorant bug report. - Bill --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/#review73906 --- On Feb. 24, 2015, 8:12 p.m., Bill Farner wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/ --- (Updated Feb. 24, 2015, 8:12 p.m.) Review request for Aurora, Maxim Khutornenko and Zameer Manji. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description --- There are 3 ways a JobUpdateSummary lands in storage: - starting a job update via thrift - replaying a log record to save a job update - replaying a snapshot log record This change focuses on ensuring that `JobUpdateSummary.key` is always available for code that is currently relying on the legacy fields (`updateId` and `jobKey`) by cloning them. A follow-up change will alter all code inside the scheduler to only consume the new field (removing `Updates.getKey`). Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6f6124a4c844a1abcf07401d80c3e50eb8b4 src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 384e9b536c3ee3edf7d90960aa51ef98948af088 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 27981393d700c84f6aaa791f12b24d0cec28b5df src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 3e491d999dc3f9e116d3502e4c08c04bd6c6bab9 src/main/java/org/apache/aurora/scheduler/updater/Updates.java 6493d687d00a1d285777ce3c3b5dd45cd08ceb71 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml 7685597bb6acafe1412b23234227adb0eddad429 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java f4c92e64c36fb0c000ab8e9199dfed2fc35dcb36 src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 7e5d0edefbed3f67116361da15dd4c969c67cfe8 src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 152e70489b351b5bcf06f85baddbe31259d0aef4 src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java 945144dcb5240c3713d909344c82a9312cd3ba5c src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 2088a8f1be04218f2a5abc7fb4b04170b37ba2d1 Diff: https://reviews.apache.org/r/31376/diff/ Testing --- Thanks, Bill Farner
Re: Review Request 31376: Introduce JobUpdateSummary.key field, dual write that field when it is read/received.
On Feb. 24, 2015, 8:54 p.m., Zameer Manji wrote: src/main/java/org/apache/aurora/scheduler/updater/Updates.java, line 57 https://reviews.apache.org/r/31376/diff/1/?file=874378#file874378line57 I don't think we should pass around any mutable state. I think data objects like JobUpdateSummary should be immutable and the extra complextity at call sites is worth the safety we get. I agree wrt permanent changes outside of backfill logic. This, however, is a short-term migration hack that is not supposed to be used anywhere outside of backfilling. Anyway, I am ok with the change. Just feels weird to go through all these hoops in this particular case. - Maxim --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/#review73906 --- On Feb. 24, 2015, 8:12 p.m., Bill Farner wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/ --- (Updated Feb. 24, 2015, 8:12 p.m.) Review request for Aurora, Maxim Khutornenko and Zameer Manji. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description --- There are 3 ways a JobUpdateSummary lands in storage: - starting a job update via thrift - replaying a log record to save a job update - replaying a snapshot log record This change focuses on ensuring that `JobUpdateSummary.key` is always available for code that is currently relying on the legacy fields (`updateId` and `jobKey`) by cloning them. A follow-up change will alter all code inside the scheduler to only consume the new field (removing `Updates.getKey`). Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6f6124a4c844a1abcf07401d80c3e50eb8b4 src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 384e9b536c3ee3edf7d90960aa51ef98948af088 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 27981393d700c84f6aaa791f12b24d0cec28b5df src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 3e491d999dc3f9e116d3502e4c08c04bd6c6bab9 src/main/java/org/apache/aurora/scheduler/updater/Updates.java 6493d687d00a1d285777ce3c3b5dd45cd08ceb71 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml 7685597bb6acafe1412b23234227adb0eddad429 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java f4c92e64c36fb0c000ab8e9199dfed2fc35dcb36 src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 7e5d0edefbed3f67116361da15dd4c969c67cfe8 src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 152e70489b351b5bcf06f85baddbe31259d0aef4 src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java 945144dcb5240c3713d909344c82a9312cd3ba5c src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 2088a8f1be04218f2a5abc7fb4b04170b37ba2d1 Diff: https://reviews.apache.org/r/31376/diff/ Testing --- Thanks, Bill Farner
Re: Review Request 30891: Offer filtering for static vetoes. Part 3 of 4: Offer filtering.
On Feb. 23, 2015, 10:52 p.m., Bill Farner wrote: src/main/java/org/apache/aurora/scheduler/async/OfferManager.java, line 255 https://reviews.apache.org/r/30891/diff/7/?file=873086#file873086line255 s/may/will/? Changed. - Maxim --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30891/#review73703 --- On Feb. 23, 2015, 8:36 p.m., Maxim Khutornenko wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30891/ --- (Updated Feb. 23, 2015, 8:36 p.m.) Review request for Aurora, Kevin Sweeney and Bill Farner. Bugs: AURORA-909 https://issues.apache.org/jira/browse/AURORA-909 Repository: aurora Description --- Offer filtering for static vetoes. Part 3 of 4: Filtering out statically banned offers. Will not apply cleanly: diffed with https://reviews.apache.org/r/30890 as a parent. Original RB: https://reviews.apache.org/r/28617/ Diffs - src/main/java/org/apache/aurora/scheduler/async/OfferManager.java b241d7b22c3d1ceca127b2671eb608ae41283bf3 src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java 21ea7d2b9d2f8c76367d7ae985270402bb51ea26 src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 5a0f7ddb7e8fa6869cbb0fdfd07c6881780c6917 src/test/java/org/apache/aurora/scheduler/async/OfferManagerImplTest.java 7ee2bb9bec6c59ba67b65d5b1229c64aca1277ff src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 9eef52a333e09454e8fd0026371c7e64472a883d src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java b6d4d8e771c7d16a46e43c7d5c427b911f8b661d Diff: https://reviews.apache.org/r/30891/diff/ Testing --- ./gradlew -Pq build Thanks, Maxim Khutornenko
Review Request 31376: Introduce JobUpdateSummary.key field, dual write that field when it is read/received.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/ --- Review request for Aurora, Maxim Khutornenko and Zameer Manji. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description --- There are 3 ways a JobUpdateSummary lands in storage: - starting a job update via thrift - replaying a log record to save a job update - replaying a snapshot log record This change focuses on ensuring that `JobUpdateSummary.key` is always available for code that is currently relying on the legacy fields (`updateId` and `jobKey`) by cloning them. A follow-up change will alter all code inside the scheduler to only consume the new field (removing `Updates.getKey`). Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6f6124a4c844a1abcf07401d80c3e50eb8b4 src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 384e9b536c3ee3edf7d90960aa51ef98948af088 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 27981393d700c84f6aaa791f12b24d0cec28b5df src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 3e491d999dc3f9e116d3502e4c08c04bd6c6bab9 src/main/java/org/apache/aurora/scheduler/updater/Updates.java 6493d687d00a1d285777ce3c3b5dd45cd08ceb71 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml 7685597bb6acafe1412b23234227adb0eddad429 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java f4c92e64c36fb0c000ab8e9199dfed2fc35dcb36 src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 7e5d0edefbed3f67116361da15dd4c969c67cfe8 src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 152e70489b351b5bcf06f85baddbe31259d0aef4 src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java 945144dcb5240c3713d909344c82a9312cd3ba5c src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 2088a8f1be04218f2a5abc7fb4b04170b37ba2d1 Diff: https://reviews.apache.org/r/31376/diff/ Testing --- Thanks, Bill Farner
Re: Review Request 30891: Offer filtering for static vetoes. Part 3 of 4: Offer filtering.
On Feb. 24, 2015, 7:35 p.m., Kevin Sweeney wrote: src/main/java/org/apache/aurora/scheduler/async/OfferManager.java, line 301 https://reviews.apache.org/r/30891/diff/7/?file=873086#file873086line301 No need to hold the intrinsic lock while logging here Having an explicit lock will be inconsistent with the rest of this class and will add up to clutter. Given the on-demand logging here, I don't see much value in over-optimizing this logic. On Feb. 24, 2015, 7:35 p.m., Kevin Sweeney wrote: src/main/java/org/apache/aurora/scheduler/async/OfferManager.java, line 317 https://reviews.apache.org/r/30891/diff/7/?file=873086#file873086line317 Same as above - no need to hold the intrinsic lock while logging here. Same here. On Feb. 24, 2015, 7:35 p.m., Kevin Sweeney wrote: src/test/java/org/apache/aurora/scheduler/async/OfferManagerImplTest.java, line 78 https://reviews.apache.org/r/30891/diff/7/?file=873089#file873089line78 It would be good form to reset the level in the tearDown here. Sure, done. On Feb. 24, 2015, 7:35 p.m., Kevin Sweeney wrote: src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java, line 119 https://reviews.apache.org/r/30891/diff/3/?file=862800#file862800line119 as written, a caller needs to know FAILURE and FAILURE_STATIC_MISMATCH both mean failure, but it would be easy to just check for `== FAILURE` and miss that in code reviews. Would it make sense to add `isFailure()` and `isStaticMismatch()` here? Also, consider fetching the VetoGroup in the constructor so that precondition checks will fail faster as written, a caller needs to know FAILURE and FAILURE_STATIC_MISMATCH both mean failure, but it would be easy to just check for == FAILURE and miss that in code reviews. Would it make sense to add isFailure() and isStaticMismatch() here? This would shift failure detection responsibility upstream and will open up for invalid combinations like isFailure=False isStaticMismatch=True. I believe enum-based approach is the more accurate and easier to reason solution in this case. Also, consider fetching the VetoGroup in the constructor so that precondition checks will fail faster `Veto.identifyGroup` does not provide any additional error handling. It's already failing fast as the only way to create an instance with vetoes is through this: ``` public static Assignment failure(SetVeto vetoes) { return new Assignment(NO_TASK_INFO, MorePreconditions.checkNotBlank(vetoes)); } ``` - Maxim --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30891/#review72448 --- On Feb. 23, 2015, 8:36 p.m., Maxim Khutornenko wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30891/ --- (Updated Feb. 23, 2015, 8:36 p.m.) Review request for Aurora, Kevin Sweeney and Bill Farner. Bugs: AURORA-909 https://issues.apache.org/jira/browse/AURORA-909 Repository: aurora Description --- Offer filtering for static vetoes. Part 3 of 4: Filtering out statically banned offers. Will not apply cleanly: diffed with https://reviews.apache.org/r/30890 as a parent. Original RB: https://reviews.apache.org/r/28617/ Diffs - src/main/java/org/apache/aurora/scheduler/async/OfferManager.java b241d7b22c3d1ceca127b2671eb608ae41283bf3 src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java 21ea7d2b9d2f8c76367d7ae985270402bb51ea26 src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 5a0f7ddb7e8fa6869cbb0fdfd07c6881780c6917 src/test/java/org/apache/aurora/scheduler/async/OfferManagerImplTest.java 7ee2bb9bec6c59ba67b65d5b1229c64aca1277ff src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 9eef52a333e09454e8fd0026371c7e64472a883d src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java b6d4d8e771c7d16a46e43c7d5c427b911f8b661d Diff: https://reviews.apache.org/r/30891/diff/ Testing --- ./gradlew -Pq build Thanks, Maxim Khutornenko
Re: Review Request 29943: Uptime-driven scheduler job updates
On Feb. 24, 2015, 7:30 p.m., Kevin Sweeney wrote: Is this ready for review now? It is. However, since AURORA-1041 is still in Open I am going to discard it and repost when the ticket moves into Accepted. - Maxim --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29943/#review73877 --- On Jan. 20, 2015, 9:12 p.m., Maxim Khutornenko wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29943/ --- (Updated Jan. 20, 2015, 9:12 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-1041 https://issues.apache.org/jira/browse/AURORA-1041 Repository: aurora Description --- This is the first take on implementing job uptime driven updates. In addition to the olde good batch_size, instances can now be dispatched in arbitrary sequence depending on the overall uptime (health) of the job. The uptime is specified by a tuple of **waitForUptimeMs** and **waitForUptimePercentInstances** values. An excerpt from api.thrift explaining the feature: ``` /** * The uptime-driven update throttles the number of instances being updated at any given moment * according to the job uptime calculations. The X% of instances up over Y interval invariant * is preserved over the entire job update lifetime. No new instances are dispatched for update * unless that invariant is satisfied. Instances are dispatched in their natural uptime order, * shortest uptime first. * * For example, when set as below the update will block until at least 90% of job instances are in * RUNNING state for at least 1 minute: *waitForUptimeMs = 6 *waitForUptimePercentInstances = 90 * * When using uptime-driven update, it's expected that updateGroupSize is left unset to allow job * uptime settings drive the update progress. However, if updateGroupSize is set it will be * pre-applied before SLA uptime calculations to determine the update working set. As a side * effect, the updateGroupSize results in a natural ordering of instances taken for each group * (instances within a group are still updated in a shortest uptime first order). * * For example, if set as below the number of instances being updated at any given moment will * never exceed 5 even though the uptime calculations may allow more than 5: *updateGroupSize = 5 *waitForUptimeMs = 6 *waitForUptimePercentInstances = 90 * * NOTE on update rollback: with the uptime-driven update, there is no reliable way to ensure a * graceful throttled rollback as unstable/flapping instances may never yield an acceptable uptime * to perform an uptime-coordinated rollback. As such, when rollbackOnFailure=True AND the * updateGroupSize=0 the updater will dispatch all affected instances at once. * Use rollbackOnFailure=True with caution for uptime-driven updates. */ ``` For reviewers: recommend starting with api.thrift and then proceeding to the InstanceUptimeStrategy.java that implements the core algo. TODO: - vagrant e2e test - more corner case unit test coverage in JobUpdaterIT - client warning message in case uptime specs are used with client updater - docs Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 08ba1cdf88b712de22c26c04443079282db59ef9 src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java eae79d59b445ea58f46dc9e3107c03fbd83b6a95 src/main/java/org/apache/aurora/scheduler/sla/SlaUtil.java 156b9c0a2fa0c0ec4b7220d5ec2cc40c3e59d1d6 src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java ac92959f34a3b0962d6aa018dc82a5ac72ea1b34 src/main/java/org/apache/aurora/scheduler/updater/InstanceUptimeProviderImpl.java PRE-CREATION src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java a992938d4e12b20f81608be6bbdc24c0a211c3fd src/main/java/org/apache/aurora/scheduler/updater/OneWayJobUpdater.java 27a5b9026f5ac3b3bdeb32813b10435bc3dab173 src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java b53086169aa53d27a39a01cadf8d3c4a8ecb68de src/main/java/org/apache/aurora/scheduler/updater/UpdaterModule.java 5733da3daeacd8cb726310e5d9933635e3993687 src/main/java/org/apache/aurora/scheduler/updater/strategy/FilteringStrategy.java PRE-CREATION src/main/java/org/apache/aurora/scheduler/updater/strategy/InstanceUptimeProvider.java PRE-CREATION src/main/java/org/apache/aurora/scheduler/updater/strategy/InstanceUptimeStrategy.java PRE-CREATION
Re: Review Request 31376: Introduce JobUpdateSummary.key field, dual write that field when it is read/received.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/#review73899 --- Ship it! src/main/java/org/apache/aurora/scheduler/updater/Updates.java https://reviews.apache.org/r/31376/#comment120333 Would it make sense to accept/return `JobUpdateSummary`? The mutable/immutable dance is rather confusing at call sites. - Maxim Khutornenko On Feb. 24, 2015, 8:12 p.m., Bill Farner wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/ --- (Updated Feb. 24, 2015, 8:12 p.m.) Review request for Aurora, Maxim Khutornenko and Zameer Manji. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description --- There are 3 ways a JobUpdateSummary lands in storage: - starting a job update via thrift - replaying a log record to save a job update - replaying a snapshot log record This change focuses on ensuring that `JobUpdateSummary.key` is always available for code that is currently relying on the legacy fields (`updateId` and `jobKey`) by cloning them. A follow-up change will alter all code inside the scheduler to only consume the new field (removing `Updates.getKey`). Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6f6124a4c844a1abcf07401d80c3e50eb8b4 src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 384e9b536c3ee3edf7d90960aa51ef98948af088 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 27981393d700c84f6aaa791f12b24d0cec28b5df src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 3e491d999dc3f9e116d3502e4c08c04bd6c6bab9 src/main/java/org/apache/aurora/scheduler/updater/Updates.java 6493d687d00a1d285777ce3c3b5dd45cd08ceb71 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml 7685597bb6acafe1412b23234227adb0eddad429 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java f4c92e64c36fb0c000ab8e9199dfed2fc35dcb36 src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 7e5d0edefbed3f67116361da15dd4c969c67cfe8 src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 152e70489b351b5bcf06f85baddbe31259d0aef4 src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java 945144dcb5240c3713d909344c82a9312cd3ba5c src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 2088a8f1be04218f2a5abc7fb4b04170b37ba2d1 Diff: https://reviews.apache.org/r/31376/diff/ Testing --- Thanks, Bill Farner
Re: Review Request 31376: Introduce JobUpdateSummary.key field, dual write that field when it is read/received.
On Feb. 24, 2015, 8:28 p.m., Maxim Khutornenko wrote: src/main/java/org/apache/aurora/scheduler/updater/Updates.java, line 57 https://reviews.apache.org/r/31376/diff/1/?file=874378#file874378line57 Would it make sense to accept/return `JobUpdateSummary`? The mutable/immutable dance is rather confusing at call sites. I'm generally weary of passing mutable state around. I feel that this signature more clearly communicates the behavior than accepting a mutable value would. Interested in what Zameer thinks for a third opinion, though. - Bill --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/#review73899 --- On Feb. 24, 2015, 8:12 p.m., Bill Farner wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/ --- (Updated Feb. 24, 2015, 8:12 p.m.) Review request for Aurora, Maxim Khutornenko and Zameer Manji. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description --- There are 3 ways a JobUpdateSummary lands in storage: - starting a job update via thrift - replaying a log record to save a job update - replaying a snapshot log record This change focuses on ensuring that `JobUpdateSummary.key` is always available for code that is currently relying on the legacy fields (`updateId` and `jobKey`) by cloning them. A follow-up change will alter all code inside the scheduler to only consume the new field (removing `Updates.getKey`). Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6f6124a4c844a1abcf07401d80c3e50eb8b4 src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 384e9b536c3ee3edf7d90960aa51ef98948af088 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 27981393d700c84f6aaa791f12b24d0cec28b5df src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 3e491d999dc3f9e116d3502e4c08c04bd6c6bab9 src/main/java/org/apache/aurora/scheduler/updater/Updates.java 6493d687d00a1d285777ce3c3b5dd45cd08ceb71 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml 7685597bb6acafe1412b23234227adb0eddad429 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java f4c92e64c36fb0c000ab8e9199dfed2fc35dcb36 src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 7e5d0edefbed3f67116361da15dd4c969c67cfe8 src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 152e70489b351b5bcf06f85baddbe31259d0aef4 src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java 945144dcb5240c3713d909344c82a9312cd3ba5c src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 2088a8f1be04218f2a5abc7fb4b04170b37ba2d1 Diff: https://reviews.apache.org/r/31376/diff/ Testing --- Thanks, Bill Farner
Re: Review Request 31338: Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run.
On Feb. 24, 2015, 6:01 a.m., Joshua Cohen wrote: Steve Niemitz wrote: I'm not a big fan of how the parsing works here either. I was thinking about this last night, I think I have a better plan here. Lemme know what you think. I already want to add volume support per-job at some point, so I propose adding the needed thrift objects to api.thrift (Volume, Mode enum), and converting the command line into those objects in ExecutorSettings. That would then let me reuse the volume-adding code in MesosTaskFactory for normal volumes in the future. Re: command line, I chose the format there because it's the same as the docker command line. What about making a new parser to parse that into the above mentioned thrift objects? Stephan Erb wrote: How about using the same format and argument name as Mesos' `--default_container_info`: { type: MESOS, volumes: [ { host_path: ./.private/tmp, container_path: /tmp, mode: RW } ] } See https://mesos.apache.org/documentation/latest/configuration/ Steve Niemitz wrote: JSON on the command line is an escaping nightmare IMO, and very verbose to specify 2 (or 3) strings for this use case. Bill Farner wrote: JSON on the command line also makes me uneasy, but i don't think we have precedent for command line entities with this many knobs (3 are obvious so far). How about 2 approaches as straw men: `-volume_mounts_read_only` and `-volume_mounts_read_write`, both key-value mappings `-volume_mounts_config`, path to a JSON file with entities similar to what Stephan sketched above The first approach has the upside that it fits with the status quo, but it cannot be easily extended if we ever want more attributes. The second would be the first config file we use, but is more future-proof. Joshua Cohen wrote: I think a single arg is easiest to reason about. I think I'm ok with Steve's follow up suggestion to create the thrift types that will be needed for per-job mounts now and use them to aid in arg parsing? That said, I'm not sure we should enforce the docker mount format since in theory these mounts can/will apply to other containers where they may not make as much sense. Steve Niemitz wrote: I think it's simple enough it'd make sense for any container system. host:container:mode is pretty generic. I'm also not as big of a fan about multiple args, and a default of RO if not specified seems reasonable to me. Works for me, but -1 to defaulting. I much prefer explicit over implicit, and this way it's easy to see how to turn a read-only mount into read-write. - Bill --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/#review73755 --- On Feb. 24, 2015, 2:07 a.m., Steve Niemitz wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/ --- (Updated Feb. 24, 2015, 2:07 a.m.) Review request for Aurora, Jay Buffington and Bill Farner. Bugs: AURORA-1107 https://issues.apache.org/jira/browse/AURORA-1107 Repository: aurora Description --- Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run. This is the first portion of allowing per-job mounts, however, I wanted to get this out first since more people want it. I'll implement per-job mounts in a future review. Diffs - docs/deploying-aurora-scheduler.md d1123359961fd59ddb8c1a07f80f293bdd46019f src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java bacfbfeb237ecddf82f58679e05be012c5214e61 src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 24b61c1e4f615295acf28d904588e1512972d3f4 src/main/java/org/apache/aurora/scheduler/mesos/ExecutorSettings.java baacb71403d55c5b90fc11cb2a23f552a32e8ba5 src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 5340d651b298ec8aa079e73d6d2f652fdf876293 src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 6575b7d420f17ec68d6e2a83e7b380f684577d4f src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 444d6d3fdaf86eb84612f846eaa326eb75c49898 src/test/java/org/apache/aurora/scheduler/mesos/TaskExecutors.java efe62ceb502ead88a2f0cd6d09a76664e465d9bc Diff: https://reviews.apache.org/r/31338/diff/ Testing --- Thanks, Steve Niemitz
Re: Review Request 31376: Introduce JobUpdateSummary.key field, dual write that field when it is read/received.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/#review73906 --- src/main/java/org/apache/aurora/scheduler/updater/Updates.java https://reviews.apache.org/r/31376/#comment120338 I don't think we should pass around any mutable state. I think data objects like JobUpdateSummary should be immutable and the extra complextity at call sites is worth the safety we get. src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml https://reviews.apache.org/r/31376/#comment120339 Have you filed an upstream bug about this behaviour? - Zameer Manji On Feb. 24, 2015, 12:12 p.m., Bill Farner wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/ --- (Updated Feb. 24, 2015, 12:12 p.m.) Review request for Aurora, Maxim Khutornenko and Zameer Manji. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description --- There are 3 ways a JobUpdateSummary lands in storage: - starting a job update via thrift - replaying a log record to save a job update - replaying a snapshot log record This change focuses on ensuring that `JobUpdateSummary.key` is always available for code that is currently relying on the legacy fields (`updateId` and `jobKey`) by cloning them. A follow-up change will alter all code inside the scheduler to only consume the new field (removing `Updates.getKey`). Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6f6124a4c844a1abcf07401d80c3e50eb8b4 src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 384e9b536c3ee3edf7d90960aa51ef98948af088 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 27981393d700c84f6aaa791f12b24d0cec28b5df src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 3e491d999dc3f9e116d3502e4c08c04bd6c6bab9 src/main/java/org/apache/aurora/scheduler/updater/Updates.java 6493d687d00a1d285777ce3c3b5dd45cd08ceb71 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml 7685597bb6acafe1412b23234227adb0eddad429 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java f4c92e64c36fb0c000ab8e9199dfed2fc35dcb36 src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 7e5d0edefbed3f67116361da15dd4c969c67cfe8 src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 152e70489b351b5bcf06f85baddbe31259d0aef4 src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java 945144dcb5240c3713d909344c82a9312cd3ba5c src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 2088a8f1be04218f2a5abc7fb4b04170b37ba2d1 Diff: https://reviews.apache.org/r/31376/diff/ Testing --- Thanks, Bill Farner
Re: Review Request 30891: Offer filtering for static vetoes. Part 3 of 4: Offer filtering.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30891/#review72448 --- Ship it! src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java https://reviews.apache.org/r/30891/#comment118503 as written, a caller needs to know FAILURE and FAILURE_STATIC_MISMATCH both mean failure, but it would be easy to just check for `== FAILURE` and miss that in code reviews. Would it make sense to add `isFailure()` and `isStaticMismatch()` here? Also, consider fetching the VetoGroup in the constructor so that precondition checks will fail faster src/main/java/org/apache/aurora/scheduler/async/OfferManager.java https://reviews.apache.org/r/30891/#comment120285 No need to hold the intrinsic lock while logging here src/main/java/org/apache/aurora/scheduler/async/OfferManager.java https://reviews.apache.org/r/30891/#comment120286 Same as above - no need to hold the intrinsic lock while logging here. src/test/java/org/apache/aurora/scheduler/async/OfferManagerImplTest.java https://reviews.apache.org/r/30891/#comment120290 It would be good form to reset the level in the tearDown here. - Kevin Sweeney On Feb. 23, 2015, 12:36 p.m., Maxim Khutornenko wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30891/ --- (Updated Feb. 23, 2015, 12:36 p.m.) Review request for Aurora, Kevin Sweeney and Bill Farner. Bugs: AURORA-909 https://issues.apache.org/jira/browse/AURORA-909 Repository: aurora Description --- Offer filtering for static vetoes. Part 3 of 4: Filtering out statically banned offers. Will not apply cleanly: diffed with https://reviews.apache.org/r/30890 as a parent. Original RB: https://reviews.apache.org/r/28617/ Diffs - src/main/java/org/apache/aurora/scheduler/async/OfferManager.java b241d7b22c3d1ceca127b2671eb608ae41283bf3 src/main/java/org/apache/aurora/scheduler/async/TaskScheduler.java 21ea7d2b9d2f8c76367d7ae985270402bb51ea26 src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java 5a0f7ddb7e8fa6869cbb0fdfd07c6881780c6917 src/test/java/org/apache/aurora/scheduler/async/OfferManagerImplTest.java 7ee2bb9bec6c59ba67b65d5b1229c64aca1277ff src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerImplTest.java 9eef52a333e09454e8fd0026371c7e64472a883d src/test/java/org/apache/aurora/scheduler/async/TaskSchedulerTest.java b6d4d8e771c7d16a46e43c7d5c427b911f8b661d Diff: https://reviews.apache.org/r/30891/diff/ Testing --- ./gradlew -Pq build Thanks, Maxim Khutornenko
Re: Review Request 30895: Offer filtering for static vetoes. Part 4 of 4: Benchmarks.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30895/#review73887 --- Ship it! Ship It! - Kevin Sweeney On Feb. 23, 2015, 2:47 p.m., Maxim Khutornenko wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30895/ --- (Updated Feb. 23, 2015, 2:47 p.m.) Review request for Aurora, Kevin Sweeney and Bill Farner. Repository: aurora Description --- Offer filtering for static vetoes. Part 4 of 4: Modifying benchmarks to support preemption toggling. Original RB: https://reviews.apache.org/r/28617/ Diffs - src/jmh/java/org/apache/aurora/benchmark/BenchmarkSettings.java PRE-CREATION src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java bacfbfeb237ecddf82f58679e05be012c5214e61 src/jmh/java/org/apache/aurora/benchmark/Tasks.java 1a35f9ee9e8e76def0f9bf5454cf8cbdf6a89c25 Diff: https://reviews.apache.org/r/30895/diff/ Testing --- ./gradlew -Pq build Thanks, Maxim Khutornenko
Re: Review Request 31376: Introduce JobUpdateSummary.key field, dual write that field when it is read/received.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/#review73900 --- Ship it! Master (19378c1) is green with this patch. ./build-support/jenkins/build.sh I will refresh this build result if you post a review containing @ReviewBot retry - Aurora ReviewBot On Feb. 24, 2015, 8:12 p.m., Bill Farner wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31376/ --- (Updated Feb. 24, 2015, 8:12 p.m.) Review request for Aurora, Maxim Khutornenko and Zameer Manji. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description --- There are 3 ways a JobUpdateSummary lands in storage: - starting a job update via thrift - replaying a log record to save a job update - replaying a snapshot log record This change focuses on ensuring that `JobUpdateSummary.key` is always available for code that is currently relying on the legacy fields (`updateId` and `jobKey`) by cloning them. A follow-up change will alter all code inside the scheduler to only consume the new field (removing `Updates.getKey`). Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6f6124a4c844a1abcf07401d80c3e50eb8b4 src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 384e9b536c3ee3edf7d90960aa51ef98948af088 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 27981393d700c84f6aaa791f12b24d0cec28b5df src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 3e491d999dc3f9e116d3502e4c08c04bd6c6bab9 src/main/java/org/apache/aurora/scheduler/updater/Updates.java 6493d687d00a1d285777ce3c3b5dd45cd08ceb71 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml 7685597bb6acafe1412b23234227adb0eddad429 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java f4c92e64c36fb0c000ab8e9199dfed2fc35dcb36 src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 7e5d0edefbed3f67116361da15dd4c969c67cfe8 src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 152e70489b351b5bcf06f85baddbe31259d0aef4 src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java 945144dcb5240c3713d909344c82a9312cd3ba5c src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 2088a8f1be04218f2a5abc7fb4b04170b37ba2d1 Diff: https://reviews.apache.org/r/31376/diff/ Testing --- Thanks, Bill Farner
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- (Updated Feb. 24, 2015, 10:52 p.m.) Review request for Aurora. Changes --- Review changes * use pytest.raises to ignore the exception * start with a non-empty cluster * assert clusters content instead of size (as far as possible, given what the class offers) Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs (updated) - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Review Request 31350: Fix clusters.patch contextmanager cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- Review request for Aurora. Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/#review73792 --- Master (19378c1) is red with this patch. ./build-support/jenkins/build.sh SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_quota.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_command_hooks.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_supdate.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_cancel_update.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_config_noun.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_create.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_kill.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_status.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_diff.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_inspect.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_update.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_plugins.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_restart.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_cron.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_client.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/__init__.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_api_from_cli.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/util.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_version.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_open.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/cli/test_context.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/hooks/test_hooked_api.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/hooks/test_non_hooked_api.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/api/test_sla.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/api/test_api.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/api/test_health_check.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/api/test_job_monitor.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/api/test_updater_util.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/api/test_scheduler_mux.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/api/test_scheduler_client.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/api/test_quota_check.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/api/test_instance_watcher.py Everything Looks Good! SUCCESS: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/test/python/apache/aurora/client/api/test_updater.py Everything Looks Good! SUCCESS:
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/#review73815 --- Ship it! Master (19378c1) is green with this patch. ./build-support/jenkins/build.sh I will refresh this build result if you post a review containing @ReviewBot retry - Aurora ReviewBot On Feb. 24, 2015, 2:18 p.m., Stephan Erb wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- (Updated Feb. 24, 2015, 2:18 p.m.) Review request for Aurora. Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Re: Review Request 31338: Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run.
On Feb. 24, 2015, 6:01 a.m., Joshua Cohen wrote: Steve Niemitz wrote: I'm not a big fan of how the parsing works here either. I was thinking about this last night, I think I have a better plan here. Lemme know what you think. I already want to add volume support per-job at some point, so I propose adding the needed thrift objects to api.thrift (Volume, Mode enum), and converting the command line into those objects in ExecutorSettings. That would then let me reuse the volume-adding code in MesosTaskFactory for normal volumes in the future. Re: command line, I chose the format there because it's the same as the docker command line. What about making a new parser to parse that into the above mentioned thrift objects? Stephan Erb wrote: How about using the same format and argument name as Mesos' `--default_container_info`: { type: MESOS, volumes: [ { host_path: ./.private/tmp, container_path: /tmp, mode: RW } ] } See https://mesos.apache.org/documentation/latest/configuration/ JSON on the command line is an escaping nightmare IMO, and very verbose to specify 2 (or 3) strings for this use case. - Steve --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/#review73755 --- On Feb. 24, 2015, 2:07 a.m., Steve Niemitz wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/ --- (Updated Feb. 24, 2015, 2:07 a.m.) Review request for Aurora, Jay Buffington and Bill Farner. Bugs: AURORA-1107 https://issues.apache.org/jira/browse/AURORA-1107 Repository: aurora Description --- Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run. This is the first portion of allowing per-job mounts, however, I wanted to get this out first since more people want it. I'll implement per-job mounts in a future review. Diffs - docs/deploying-aurora-scheduler.md d1123359961fd59ddb8c1a07f80f293bdd46019f src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java bacfbfeb237ecddf82f58679e05be012c5214e61 src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 24b61c1e4f615295acf28d904588e1512972d3f4 src/main/java/org/apache/aurora/scheduler/mesos/ExecutorSettings.java baacb71403d55c5b90fc11cb2a23f552a32e8ba5 src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 5340d651b298ec8aa079e73d6d2f652fdf876293 src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 6575b7d420f17ec68d6e2a83e7b380f684577d4f src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 444d6d3fdaf86eb84612f846eaa326eb75c49898 src/test/java/org/apache/aurora/scheduler/mesos/TaskExecutors.java efe62ceb502ead88a2f0cd6d09a76664e465d9bc Diff: https://reviews.apache.org/r/31338/diff/ Testing --- Thanks, Steve Niemitz
Re: Review Request 31338: Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/#review73822 --- src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java https://reviews.apache.org/r/31338/#comment120230 This code will also work for a default Mesos container on a slave with the `filesystem/shared` isolator. Is it worth to generalize here and make mounting work for docker and non-docker containers? - Stephan Erb On Feb. 24, 2015, 3:07 a.m., Steve Niemitz wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/ --- (Updated Feb. 24, 2015, 3:07 a.m.) Review request for Aurora, Jay Buffington and Bill Farner. Bugs: AURORA-1107 https://issues.apache.org/jira/browse/AURORA-1107 Repository: aurora Description --- Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run. This is the first portion of allowing per-job mounts, however, I wanted to get this out first since more people want it. I'll implement per-job mounts in a future review. Diffs - docs/deploying-aurora-scheduler.md d1123359961fd59ddb8c1a07f80f293bdd46019f src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java bacfbfeb237ecddf82f58679e05be012c5214e61 src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 24b61c1e4f615295acf28d904588e1512972d3f4 src/main/java/org/apache/aurora/scheduler/mesos/ExecutorSettings.java baacb71403d55c5b90fc11cb2a23f552a32e8ba5 src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 5340d651b298ec8aa079e73d6d2f652fdf876293 src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 6575b7d420f17ec68d6e2a83e7b380f684577d4f src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 444d6d3fdaf86eb84612f846eaa326eb75c49898 src/test/java/org/apache/aurora/scheduler/mesos/TaskExecutors.java efe62ceb502ead88a2f0cd6d09a76664e465d9bc Diff: https://reviews.apache.org/r/31338/diff/ Testing --- Thanks, Steve Niemitz
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/#review73806 --- @ReviewBot retry - Stephan Erb On Feb. 24, 2015, 11:57 a.m., Stephan Erb wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- (Updated Feb. 24, 2015, 11:57 a.m.) Review request for Aurora. Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- (Updated Feb. 24, 2015, 3:18 p.m.) Review request for Aurora. Changes --- Fix import order Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs (updated) - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Re: Review Request 31338: Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run.
On Feb. 24, 2015, 6:01 a.m., Joshua Cohen wrote: Steve Niemitz wrote: I'm not a big fan of how the parsing works here either. I was thinking about this last night, I think I have a better plan here. Lemme know what you think. I already want to add volume support per-job at some point, so I propose adding the needed thrift objects to api.thrift (Volume, Mode enum), and converting the command line into those objects in ExecutorSettings. That would then let me reuse the volume-adding code in MesosTaskFactory for normal volumes in the future. Re: command line, I chose the format there because it's the same as the docker command line. What about making a new parser to parse that into the above mentioned thrift objects? Stephan Erb wrote: How about using the same format and argument name as Mesos' `--default_container_info`: { type: MESOS, volumes: [ { host_path: ./.private/tmp, container_path: /tmp, mode: RW } ] } See https://mesos.apache.org/documentation/latest/configuration/ Steve Niemitz wrote: JSON on the command line is an escaping nightmare IMO, and very verbose to specify 2 (or 3) strings for this use case. Bill Farner wrote: JSON on the command line also makes me uneasy, but i don't think we have precedent for command line entities with this many knobs (3 are obvious so far). How about 2 approaches as straw men: `-volume_mounts_read_only` and `-volume_mounts_read_write`, both key-value mappings `-volume_mounts_config`, path to a JSON file with entities similar to what Stephan sketched above The first approach has the upside that it fits with the status quo, but it cannot be easily extended if we ever want more attributes. The second would be the first config file we use, but is more future-proof. I think a single arg is easiest to reason about. I think I'm ok with Steve's follow up suggestion to create the thrift types that will be needed for per-job mounts now and use them to aid in arg parsing? That said, I'm not sure we should enforce the docker mount format since in theory these mounts can/will apply to other containers where they may not make as much sense. - Joshua --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/#review73755 --- On Feb. 24, 2015, 2:07 a.m., Steve Niemitz wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/ --- (Updated Feb. 24, 2015, 2:07 a.m.) Review request for Aurora, Jay Buffington and Bill Farner. Bugs: AURORA-1107 https://issues.apache.org/jira/browse/AURORA-1107 Repository: aurora Description --- Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run. This is the first portion of allowing per-job mounts, however, I wanted to get this out first since more people want it. I'll implement per-job mounts in a future review. Diffs - docs/deploying-aurora-scheduler.md d1123359961fd59ddb8c1a07f80f293bdd46019f src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java bacfbfeb237ecddf82f58679e05be012c5214e61 src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 24b61c1e4f615295acf28d904588e1512972d3f4 src/main/java/org/apache/aurora/scheduler/mesos/ExecutorSettings.java baacb71403d55c5b90fc11cb2a23f552a32e8ba5 src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 5340d651b298ec8aa079e73d6d2f652fdf876293 src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 6575b7d420f17ec68d6e2a83e7b380f684577d4f src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 444d6d3fdaf86eb84612f846eaa326eb75c49898 src/test/java/org/apache/aurora/scheduler/mesos/TaskExecutors.java efe62ceb502ead88a2f0cd6d09a76664e465d9bc Diff: https://reviews.apache.org/r/31338/diff/ Testing --- Thanks, Steve Niemitz
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/#review73832 --- src/test/python/apache/aurora/common/test_clusters.py https://reviews.apache.org/r/31350/#comment120234 How about populating with a real cluster here and asserting it later? src/test/python/apache/aurora/common/test_clusters.py https://reviews.apache.org/r/31350/#comment120235 `with pytest.raises():` should be more compact. src/test/python/apache/aurora/common/test_clusters.py https://reviews.apache.org/r/31350/#comment120237 Please, assert the entire contents here, e.g. `assert clusters == [Cluster(...)]` - Maxim Khutornenko On Feb. 24, 2015, 2:18 p.m., Stephan Erb wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- (Updated Feb. 24, 2015, 2:18 p.m.) Review request for Aurora. Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Re: Review Request 31338: Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run.
On Feb. 24, 2015, 6:01 a.m., Joshua Cohen wrote: Steve Niemitz wrote: I'm not a big fan of how the parsing works here either. I was thinking about this last night, I think I have a better plan here. Lemme know what you think. I already want to add volume support per-job at some point, so I propose adding the needed thrift objects to api.thrift (Volume, Mode enum), and converting the command line into those objects in ExecutorSettings. That would then let me reuse the volume-adding code in MesosTaskFactory for normal volumes in the future. Re: command line, I chose the format there because it's the same as the docker command line. What about making a new parser to parse that into the above mentioned thrift objects? Stephan Erb wrote: How about using the same format and argument name as Mesos' `--default_container_info`: { type: MESOS, volumes: [ { host_path: ./.private/tmp, container_path: /tmp, mode: RW } ] } See https://mesos.apache.org/documentation/latest/configuration/ Steve Niemitz wrote: JSON on the command line is an escaping nightmare IMO, and very verbose to specify 2 (or 3) strings for this use case. JSON on the command line also makes me uneasy, but i don't think we have precedent for command line entities with this many knobs (3 are obvious so far). How about 2 approaches as straw men: `-volume_mounts_read_only` and `-volume_mounts_read_write`, both key-value mappings `-volume_mounts_config`, path to a JSON file with entities similar to what Stephan sketched above The first approach has the upside that it fits with the status quo, but it cannot be easily extended if we ever want more attributes. The second would be the first config file we use, but is more future-proof. - Bill --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/#review73755 --- On Feb. 24, 2015, 2:07 a.m., Steve Niemitz wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/ --- (Updated Feb. 24, 2015, 2:07 a.m.) Review request for Aurora, Jay Buffington and Bill Farner. Bugs: AURORA-1107 https://issues.apache.org/jira/browse/AURORA-1107 Repository: aurora Description --- Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run. This is the first portion of allowing per-job mounts, however, I wanted to get this out first since more people want it. I'll implement per-job mounts in a future review. Diffs - docs/deploying-aurora-scheduler.md d1123359961fd59ddb8c1a07f80f293bdd46019f src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java bacfbfeb237ecddf82f58679e05be012c5214e61 src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 24b61c1e4f615295acf28d904588e1512972d3f4 src/main/java/org/apache/aurora/scheduler/mesos/ExecutorSettings.java baacb71403d55c5b90fc11cb2a23f552a32e8ba5 src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 5340d651b298ec8aa079e73d6d2f652fdf876293 src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 6575b7d420f17ec68d6e2a83e7b380f684577d4f src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 444d6d3fdaf86eb84612f846eaa326eb75c49898 src/test/java/org/apache/aurora/scheduler/mesos/TaskExecutors.java efe62ceb502ead88a2f0cd6d09a76664e465d9bc Diff: https://reviews.apache.org/r/31338/diff/ Testing --- Thanks, Steve Niemitz
Re: Review Request 31235: Refactoring CronJobManager interface.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31235/#review73712 --- Ship it! src/main/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImpl.java https://reviews.apache.org/r/31235/#comment120097 rename to cronJobStore? src/main/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImpl.java https://reviews.apache.org/r/31235/#comment120098 rename to getCronJobStore? src/main/java/org/apache/aurora/scheduler/cron/quartz/CronLifecycle.java https://reviews.apache.org/r/31235/#comment120099 This TODO wasn't addressed - put it back? - Kevin Sweeney On Feb. 23, 2015, 5:23 p.m., Maxim Khutornenko wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31235/ --- (Updated Feb. 23, 2015, 5:23 p.m.) Review request for Aurora, Kevin Sweeney and Bill Farner. Repository: aurora Description --- Removing job store proxy methods from CronJobManager and dropping job manager concept from the storage. The job manager is long gone but we still carry around the job manager ID. Despite the diff's size the changes are mostly mechanical: replacing CronJobManager occurrences with direct job store access. Diffs - api/src/main/thrift/org/apache/aurora/gen/storage.thrift 31b698131aeb87ea0a633b0cee33b49ea1d501b4 src/main/java/org/apache/aurora/scheduler/SchedulerLifecycle.java b8830d187de27533307a8a2e9be6385f5d3e2289 src/main/java/org/apache/aurora/scheduler/cron/CronJobManager.java fc6e4432d64e625a583d8c8a130d99e066fd232c src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java efea7ad9f6efc99b600d071c3c20063b6bc4b211 src/main/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImpl.java c5cb8cae43c86ae2378a0bef7688d400aa188e57 src/main/java/org/apache/aurora/scheduler/cron/quartz/CronLifecycle.java 64d9486fade3b03b8db936fe60790ea0858212a9 src/main/java/org/apache/aurora/scheduler/http/StructDump.java 063170ac869dafc161c73f735b33f4cbe8e03ab6 src/main/java/org/apache/aurora/scheduler/quota/QuotaManager.java dab8c0535bf9b36cdf7ca785c34d315f0bb4204c src/main/java/org/apache/aurora/scheduler/storage/ForwardingStore.java 4d83c58c8f0527813ef26477f08424104e0c29d9 src/main/java/org/apache/aurora/scheduler/storage/JobStore.java ad0d67a27628f46dedda2ae4e0e61025dff1e1fd src/main/java/org/apache/aurora/scheduler/storage/Storage.java 83a59ce5d5b2754b1d31354280eca31922d73cdd src/main/java/org/apache/aurora/scheduler/storage/StorageBackfill.java 52377bca8060720dc4bec884c911182c3e77bc52 src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 384e9b536c3ee3edf7d90960aa51ef98948af088 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 27981393d700c84f6aaa791f12b24d0cec28b5df src/main/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorage.java 262d72c781759bac58bfa6c0fcec66c513d8b79b src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobStore.java c24a32e57b34a1fc41681fda9bdb4de38ed8896b src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java ee7618979ce94631af8aaf7ab3ecb2fbfb33fc38 src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 3e491d999dc3f9e116d3502e4c08c04bd6c6bab9 src/test/java/org/apache/aurora/scheduler/SchedulerLifecycleTest.java 97ecb742d6e0418890f875394ded8d9fdae2b1c2 src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java c99135ab9c55a42ab51f18cc5ea127b498f0721f src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java 915d7c8294b3d8262021da1c30324f55d8413ff9 src/test/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImplTest.java 701536fff8948ef233523e114e45043992175891 src/test/java/org/apache/aurora/scheduler/quota/QuotaManagerImplTest.java b0772f73f1a21da5828660bfd7d2b1f6b15cbf74 src/test/java/org/apache/aurora/scheduler/storage/StorageBackfillTest.java 493150b062eddf2581a048a9e13826205b8f2c15 src/test/java/org/apache/aurora/scheduler/storage/backup/StorageBackupTest.java 479c8bded75c61c36ac70cf1d8f945b96850087e src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 7e5d0edefbed3f67116361da15dd4c969c67cfe8 src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 152e70489b351b5bcf06f85baddbe31259d0aef4 src/test/java/org/apache/aurora/scheduler/storage/mem/MemJobStoreTest.java 2e0c4151f58a28b695c13a392bae126857d4c4b6 src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java a85cba13fe7ed0e47072f391683261e4bec48009
Re: Review Request 29943: Uptime-driven scheduler job updates
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29943/#review73877 --- Is this ready for review now? - Kevin Sweeney On Jan. 20, 2015, 1:12 p.m., Maxim Khutornenko wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29943/ --- (Updated Jan. 20, 2015, 1:12 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-1041 https://issues.apache.org/jira/browse/AURORA-1041 Repository: aurora Description --- This is the first take on implementing job uptime driven updates. In addition to the olde good batch_size, instances can now be dispatched in arbitrary sequence depending on the overall uptime (health) of the job. The uptime is specified by a tuple of **waitForUptimeMs** and **waitForUptimePercentInstances** values. An excerpt from api.thrift explaining the feature: ``` /** * The uptime-driven update throttles the number of instances being updated at any given moment * according to the job uptime calculations. The X% of instances up over Y interval invariant * is preserved over the entire job update lifetime. No new instances are dispatched for update * unless that invariant is satisfied. Instances are dispatched in their natural uptime order, * shortest uptime first. * * For example, when set as below the update will block until at least 90% of job instances are in * RUNNING state for at least 1 minute: *waitForUptimeMs = 6 *waitForUptimePercentInstances = 90 * * When using uptime-driven update, it's expected that updateGroupSize is left unset to allow job * uptime settings drive the update progress. However, if updateGroupSize is set it will be * pre-applied before SLA uptime calculations to determine the update working set. As a side * effect, the updateGroupSize results in a natural ordering of instances taken for each group * (instances within a group are still updated in a shortest uptime first order). * * For example, if set as below the number of instances being updated at any given moment will * never exceed 5 even though the uptime calculations may allow more than 5: *updateGroupSize = 5 *waitForUptimeMs = 6 *waitForUptimePercentInstances = 90 * * NOTE on update rollback: with the uptime-driven update, there is no reliable way to ensure a * graceful throttled rollback as unstable/flapping instances may never yield an acceptable uptime * to perform an uptime-coordinated rollback. As such, when rollbackOnFailure=True AND the * updateGroupSize=0 the updater will dispatch all affected instances at once. * Use rollbackOnFailure=True with caution for uptime-driven updates. */ ``` For reviewers: recommend starting with api.thrift and then proceeding to the InstanceUptimeStrategy.java that implements the core algo. TODO: - vagrant e2e test - more corner case unit test coverage in JobUpdaterIT - client warning message in case uptime specs are used with client updater - docs Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 08ba1cdf88b712de22c26c04443079282db59ef9 src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java eae79d59b445ea58f46dc9e3107c03fbd83b6a95 src/main/java/org/apache/aurora/scheduler/sla/SlaUtil.java 156b9c0a2fa0c0ec4b7220d5ec2cc40c3e59d1d6 src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java ac92959f34a3b0962d6aa018dc82a5ac72ea1b34 src/main/java/org/apache/aurora/scheduler/updater/InstanceUptimeProviderImpl.java PRE-CREATION src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java a992938d4e12b20f81608be6bbdc24c0a211c3fd src/main/java/org/apache/aurora/scheduler/updater/OneWayJobUpdater.java 27a5b9026f5ac3b3bdeb32813b10435bc3dab173 src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java b53086169aa53d27a39a01cadf8d3c4a8ecb68de src/main/java/org/apache/aurora/scheduler/updater/UpdaterModule.java 5733da3daeacd8cb726310e5d9933635e3993687 src/main/java/org/apache/aurora/scheduler/updater/strategy/FilteringStrategy.java PRE-CREATION src/main/java/org/apache/aurora/scheduler/updater/strategy/InstanceUptimeProvider.java PRE-CREATION src/main/java/org/apache/aurora/scheduler/updater/strategy/InstanceUptimeStrategy.java PRE-CREATION src/main/java/org/apache/aurora/scheduler/updater/strategy/UpdateStrategy.java c2a2ee8f3ad09d48918e4e62eb8fe7a71b428160 src/main/python/apache/aurora/client/api/updater_util.py
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
On Feb. 24, 2015, 11:12 a.m., Kevin Sweeney wrote: This is awesome! If this fixes the full suite, can you also remove `--no-fast` from `build-support/jenkins/build.sh`? Stephan Erb wrote: I doubt that this patch is sufficient to warrant the change of the build script. There may be many more error conditions requiring a `no-fast`. Background info: We are directly using the python client in our own backend and are also writing some integration tests against it. In some of these testcases we are using CLUSTERS.patch and recently discovered its faulty cleanup behaviour. That's unfortunate. Filed https://issues.apache.org/jira/browse/AURORA-1145 to follow-up. - Kevin --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/#review73867 --- On Feb. 24, 2015, 2:12 p.m., Stephan Erb wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- (Updated Feb. 24, 2015, 2:12 p.m.) Review request for Aurora, Kevin Sweeney and Maxim Khutornenko. Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/#review73992 --- Ship it! I can only wonder how many times this has caused an error for me... thank you!! :) - Joe Smith On Feb. 24, 2015, 2:12 p.m., Stephan Erb wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- (Updated Feb. 24, 2015, 2:12 p.m.) Review request for Aurora, Kevin Sweeney and Maxim Khutornenko. Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Re: Review Request 31394: Renaming storage.thrift job store structs.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31394/#review73955 --- Ship it! Master (cd681d9) is green with this patch. ./build-support/jenkins/build.sh I will refresh this build result if you post a review containing @ReviewBot retry - Aurora ReviewBot On Feb. 25, 2015, 1:06 a.m., Maxim Khutornenko wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31394/ --- (Updated Feb. 25, 2015, 1:06 a.m.) Review request for Aurora, Kevin Sweeney and Bill Farner. Repository: aurora Description --- Renaming thrift structs and the `JobStore` to better reflect its cron-only purpose. Purely IDE-driven renaming. Diffs - api/src/main/thrift/org/apache/aurora/gen/storage.thrift d3b50d7511f385797cbb539e66e83a851c6bccf7 src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java c8834215c6c621246e3436de734b95d9ced66537 src/main/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImpl.java 916cb6dc4b8907ffaf6736f1d615bbc72611a589 src/main/java/org/apache/aurora/scheduler/http/StructDump.java e7ee0cf0895c1c75c98c993e14716c7419a9eeec src/main/java/org/apache/aurora/scheduler/quota/QuotaManager.java 199dc1123cd1966b1c8075e7a486efdf627c45f3 src/main/java/org/apache/aurora/scheduler/storage/ForwardingStore.java fc8e2f4e75a33789b28588204cbbfd56b5d049a4 src/main/java/org/apache/aurora/scheduler/storage/JobStore.java 748b2cbce76258b3f4c454538b8d40a19e13e68e src/main/java/org/apache/aurora/scheduler/storage/Storage.java 41b7053b01f2105bbf9d5ee4616a0c3d0d91220a src/main/java/org/apache/aurora/scheduler/storage/StorageBackfill.java 7a99afa656a0978d05848a31c73c2658a98d6918 src/main/java/org/apache/aurora/scheduler/storage/db/DbStorage.java ea4afb49e52aed1134cf6ca44bf53693a4d999df src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 0229b920ebb36353b8bf9d1ea360d7ff6f4dbf97 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 72b9895214aa372c4fd75e5bb6dee8485b3594f6 src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java ee88d7c03cc1d1f23ed6b2ef1e5cf5c7929a57c9 src/main/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorage.java aa2226651ba6c5d1b77c4861f734b4194dcbecf3 src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobStore.java c9ffc3871e1824ea39961c4a4d416e50d7e1f078 src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java bc3a7c5061d97e13c7702329a168d416c3dd1bcc src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 31c25d04cd1b8f741649edea8424eb298738d895 src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 059626f742ce806fded754ef04a83e2eb5405ff2 src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 7bef09e8217750908eb56d9010df79bb5fd050de src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java c3f8d1b4c96ae35ebfb157f46ae545c8d39c3602 src/test/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImplTest.java d4c14afa7b7ea904f7a33a8747591ab8f1be5f9b src/test/java/org/apache/aurora/scheduler/storage/StorageBackfillTest.java b48994f40edea5e257e2816b1950d8a495d89aa6 src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 91789e7c203753054406296c41248a1fbf30d019 src/test/java/org/apache/aurora/scheduler/storage/backup/StorageBackupTest.java 0c73a9f78d3dfe4764c0e8513b85611313067368 src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 31b75f41ef3cf1d32824afabec6f7ea79285973d src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 343106c24395911162657538c99a05b81d5ee2dc src/test/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorageTest.java 36809ac2e7e8924187eb2d9cbd69da58e7866681 src/test/java/org/apache/aurora/scheduler/storage/mem/MemJobStoreTest.java b9ade45bae6f76520789821733a58075ea6d0bd0 src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java 6feddc3fc82da072d84bd69b16637144215415b7 Diff: https://reviews.apache.org/r/31394/diff/ Testing --- ./gradlew -Pq build Also, test bidirectional upgrade in vagrant. Thanks, Maxim Khutornenko
Re: Review Request 30710: add mesos role feature
On Feb. 17, 2015, 11:09 p.m., Maxim Khutornenko wrote: src/main/java/org/apache/aurora/scheduler/configuration/Resources.java, line 114 https://reviews.apache.org/r/30710/diff/3/?file=861559#file861559line114 This is breaking Resources/Offer encapsulation and is not addressing the long term possiblility of making role-biased scheduling decisions. A possible solution could be de-anonymize `Resources` by adding a `StringOptional role` to it. The `role` would be left `Optional.Stringabsent()` any time an anonymous `Resources` instance is required or populated to the respective framework role when a role-specific resources are initilized. This would obviously require refactoring `Resources` to support internal mapping of resource vectors by role. The above would also eliminate the need for wrapper classes and will keep the door open for future role-based scheduling optimizations. I am curious what others think here? May I have your skype No.? let's discuss it - lozh...@ebay.com --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30710/#review72793 --- On Feb. 13, 2015, 10 a.m., lozh...@ebay.com zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30710/ --- (Updated Feb. 13, 2015, 10 a.m.) Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Bill Farner. Bugs: AURORA-1109 https://issues.apache.org/jira/browse/AURORA-1109 Repository: aurora Description --- ## Problems We are from eBay platform team. Previously, we used marathon to generate Jenkins master instance in dedicated vms and recieve resource offer from same dedicated vms. For the details, please refer to http://www.ebaytechblog.com/2014/04/04/delivering-ebays-ci-solution-with-apache-mesos-part-i/#.VNQUuC6_SPU Now, we found Aurora is more stable and powerful. We are moving from Marathon to Aurora. During the move, we found there is no mesos role in Aurora now. But we need use mesos role way to solve the problem in section Frameworks stopped receiving offers after a while of the given url. Here is a snippet of the problem description: *We noticed occurred after we used Marathon to create the initial set of CI masters. As those CI masters started registering themselves as frameworks, Marathon stopped receiving any offers from Mesos; essentially, no new CI masters could be launched. Let’s start with Marathon. In the DRF model, it was unfair to treat Marathon in the same bucket/role alongside hundreds of connected Jenkins frameworks. After launching all these Jenkins frameworks, Marathon had a large resource share and Mesos would aggressively offer resources to frameworks that were using little or no resources. Marathon was placed last in priority and got starved out.* *We decided to define a dedicated Mesos role for Marathon and to have all of the Mesos slaves that were reserved for Jenkins master instances support that Mesos role. Jenkins frameworks were left with the default role “*”.*This solved the problem – Mesos offered resources per role and hence Marathon never got starved out. A framework with a special role will get resource offers from both slaves supporting that special role and also from the default role “*”.* However, since we were using placement constraints, Marathon accepted resource offers only from slaves that supported both the role and the placement constraints.* ## Solution So we add role feature is the source code to solve the problem in same way: When accept a resource offer, Aurora will send back the needed resources to Mesos with the mesos role in resource offer. How to configure the Mesos role: 1.Add cmd option --mesos_role=${Mesos role name} when start Aurora scheduler. We change the test cases according code change. Each changed test case is green Merge https://github.com/zhanglong2015/incubator-aurora Diffs - src/main/java/org/apache/aurora/scheduler/ResourceSlot.java 1a158b4e0be94762ad0480e8ce74b19bacc90c97 src/main/java/org/apache/aurora/scheduler/async/GcExecutorLauncher.java 31aa2bbaab3d97875493ad75c4d2c7c82ac7fa58 src/main/java/org/apache/aurora/scheduler/configuration/Resources.java b5a3140e3560f790d1db496dca3c2ee0dc96a195 src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java d0994203b5650f44ca2eb32e1e2aa61875163854 src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 5340d651b298ec8aa079e73d6d2f652fdf876293 src/main/java/org/apache/aurora/scheduler/mesos/OfferWrapper.java PRE-CREATION
Re: Review Request 31394: Renaming storage.thrift job store structs.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31394/#review73958 --- Ship it! Ship It! - Bill Farner On Feb. 25, 2015, 1:06 a.m., Maxim Khutornenko wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31394/ --- (Updated Feb. 25, 2015, 1:06 a.m.) Review request for Aurora, Kevin Sweeney and Bill Farner. Repository: aurora Description --- Renaming thrift structs and the `JobStore` to better reflect its cron-only purpose. Purely IDE-driven renaming. Diffs - api/src/main/thrift/org/apache/aurora/gen/storage.thrift d3b50d7511f385797cbb539e66e83a851c6bccf7 src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java c8834215c6c621246e3436de734b95d9ced66537 src/main/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImpl.java 916cb6dc4b8907ffaf6736f1d615bbc72611a589 src/main/java/org/apache/aurora/scheduler/http/StructDump.java e7ee0cf0895c1c75c98c993e14716c7419a9eeec src/main/java/org/apache/aurora/scheduler/quota/QuotaManager.java 199dc1123cd1966b1c8075e7a486efdf627c45f3 src/main/java/org/apache/aurora/scheduler/storage/ForwardingStore.java fc8e2f4e75a33789b28588204cbbfd56b5d049a4 src/main/java/org/apache/aurora/scheduler/storage/JobStore.java 748b2cbce76258b3f4c454538b8d40a19e13e68e src/main/java/org/apache/aurora/scheduler/storage/Storage.java 41b7053b01f2105bbf9d5ee4616a0c3d0d91220a src/main/java/org/apache/aurora/scheduler/storage/StorageBackfill.java 7a99afa656a0978d05848a31c73c2658a98d6918 src/main/java/org/apache/aurora/scheduler/storage/db/DbStorage.java ea4afb49e52aed1134cf6ca44bf53693a4d999df src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 0229b920ebb36353b8bf9d1ea360d7ff6f4dbf97 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 72b9895214aa372c4fd75e5bb6dee8485b3594f6 src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java ee88d7c03cc1d1f23ed6b2ef1e5cf5c7929a57c9 src/main/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorage.java aa2226651ba6c5d1b77c4861f734b4194dcbecf3 src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobStore.java c9ffc3871e1824ea39961c4a4d416e50d7e1f078 src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java bc3a7c5061d97e13c7702329a168d416c3dd1bcc src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 31c25d04cd1b8f741649edea8424eb298738d895 src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 059626f742ce806fded754ef04a83e2eb5405ff2 src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 7bef09e8217750908eb56d9010df79bb5fd050de src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java c3f8d1b4c96ae35ebfb157f46ae545c8d39c3602 src/test/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImplTest.java d4c14afa7b7ea904f7a33a8747591ab8f1be5f9b src/test/java/org/apache/aurora/scheduler/storage/StorageBackfillTest.java b48994f40edea5e257e2816b1950d8a495d89aa6 src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 91789e7c203753054406296c41248a1fbf30d019 src/test/java/org/apache/aurora/scheduler/storage/backup/StorageBackupTest.java 0c73a9f78d3dfe4764c0e8513b85611313067368 src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 31b75f41ef3cf1d32824afabec6f7ea79285973d src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 343106c24395911162657538c99a05b81d5ee2dc src/test/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorageTest.java 36809ac2e7e8924187eb2d9cbd69da58e7866681 src/test/java/org/apache/aurora/scheduler/storage/mem/MemJobStoreTest.java b9ade45bae6f76520789821733a58075ea6d0bd0 src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java 6feddc3fc82da072d84bd69b16637144215415b7 Diff: https://reviews.apache.org/r/31394/diff/ Testing --- ./gradlew -Pq build Also, test bidirectional upgrade in vagrant. Thanks, Maxim Khutornenko
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/#review73928 --- Ship it! Master (cd681d9) is green with this patch. ./build-support/jenkins/build.sh I will refresh this build result if you post a review containing @ReviewBot retry - Aurora ReviewBot On Feb. 24, 2015, 10:12 p.m., Stephan Erb wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- (Updated Feb. 24, 2015, 10:12 p.m.) Review request for Aurora, Kevin Sweeney and Maxim Khutornenko. Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Re: Review Request 31389: Remove LiveClusterState, make CachedClusterState (now ClusterStateImpl) default.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31389/#review73954 --- Ship it! Ship It! - Maxim Khutornenko On Feb. 25, 2015, 12:01 a.m., Bill Farner wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31389/ --- (Updated Feb. 25, 2015, 12:01 a.m.) Review request for Aurora, Joshua Cohen and Maxim Khutornenko. Bugs: AURORA-1055 https://issues.apache.org/jira/browse/AURORA-1055 Repository: aurora Description --- Remove LiveClusterState, make CachedClusterState (now ClusterStateImpl) default. Diffs - src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 0351b10cbfff76f07eec15c5ed1af7bb5f39a8ce src/main/java/org/apache/aurora/scheduler/async/preemptor/CachedClusterState.java 2831103457184d2049684c1c519b6bbb3b5b3e62 src/main/java/org/apache/aurora/scheduler/async/preemptor/LiveClusterState.java 38cb17371735d959c14fc7bf9fc25b8eb814fdfe src/main/java/org/apache/aurora/scheduler/async/preemptor/PreemptorModule.java 2030a9413035e1ad7db52b13957ed074b211d558 src/test/java/org/apache/aurora/scheduler/async/preemptor/CachedClusterStateTest.java 7cc04dd4ea214f7e9923be51b467a2564fec18bb src/test/java/org/apache/aurora/scheduler/async/preemptor/LiveClusterStateTest.java 763f8b5f31349f291c0ede7b5d3292f6ca5b src/test/java/org/apache/aurora/scheduler/async/preemptor/PreemptorImplTest.java 83cf118c9e1d32b0de511754b4aeadd8aed33e49 Diff: https://reviews.apache.org/r/31389/diff/ Testing --- Standard tests. I removed a test case `testIgnoresThrottledTasks`, which was actually testing LiveClusterState behavior. That test case was redundant to unit testing in `ClusterStateImpl`. Thanks, Bill Farner
Review Request 31388: Update thrift API and internal code to use JobUpdateSummary.key rather than job key and id.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31388/ --- Review request for Aurora, Maxim Khutornenko and Zameer Manji. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description --- Update thrift API and internal code to use JobUpdateSummary.key rather than job key and id. Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6a00ce2c30ce24851560c4d499c395194dbeed16 src/main/java/org/apache/aurora/scheduler/quota/QuotaManager.java bb727214b19ec6074d04eacdd997832e50b756f9 src/main/java/org/apache/aurora/scheduler/storage/ForwardingStore.java fc8e2f4e75a33789b28588204cbbfd56b5d049a4 src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 4b8b00cc74d81aa5cc772a7b4e077da6a4ed2a83 src/main/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStore.java d52c15e0a7083e0d02bbf6dbaa58d8956bc5dd6f src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 72b9895214aa372c4fd75e5bb6dee8485b3594f6 src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 34763851cd59996559f160c57b00d23de3dc49b4 src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java d24b31bf1034c69ac8a9891aaa9b8653674ce0ce src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java d1bd3c975b2057b46597f38555c2a73d14835600 src/main/java/org/apache/aurora/scheduler/updater/Updates.java b8c919585307e27f154782ee179153165458bed7 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml c08f09a511e1fc631abdae50630b8b7ab10a440b src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql e6184dd2de0e86849d811ec05c168bbb4db8e8cd src/test/java/org/apache/aurora/scheduler/quota/QuotaManagerImplTest.java eac2033cffc09f6a9e867a2499569b51d94f0705 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java 81c8be7685854c47ed4a7c6deeffb6a0b80023ab src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java daaa759ca5c2cda26802187698f0a33ceb0dd39f src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java 5b0565e6477644047a237b8d185027a6fdc5c6a5 src/test/java/org/apache/aurora/scheduler/thrift/aop/ForwardingThrift.java 2405bb8ad379e355a7987e9bb4163e1693f18777 src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 3ba6274b97393ac2228d3eac8ba1615cb57014f8 Diff: https://reviews.apache.org/r/31388/diff/ Testing --- Thanks, Bill Farner
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/#review73939 --- Ship it! src/test/python/apache/aurora/common/test_clusters.py https://reviews.apache.org/r/31350/#comment120356 I'm not sure this test case is valuable since it will run in a non-deterministic order. - Kevin Sweeney On Feb. 24, 2015, 2:12 p.m., Stephan Erb wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- (Updated Feb. 24, 2015, 2:12 p.m.) Review request for Aurora, Kevin Sweeney and Maxim Khutornenko. Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Re: Review Request 31380: Now that we've identified not calling .converge as a source of flakiness, remove epsilon from health checker tests.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31380/#review73938 --- Ship it! Ship It! - Joe Smith On Feb. 24, 2015, 1:27 p.m., Brian Wickman wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31380/ --- (Updated Feb. 24, 2015, 1:27 p.m.) Review request for Aurora and Joe Smith. Repository: aurora Description --- Now that we've identified not calling .converge as the source of the issue, remove epsilon from health checker tests. Diffs - src/test/python/apache/aurora/executor/common/test_health_checker.py 1b4423a1eb95cc950206a355fa657c8082c8d93f Diff: https://reviews.apache.org/r/31380/diff/ Testing --- ¯\_(?)_/¯ Ran this for 20 minutes on Linux and OS X: while true; do THERMOS_DEBUG=1 ./pants test.pytest --options='-v' src/test/python/apache/aurora/executor/common:health_checker; done; Thanks, Brian Wickman
Re: Review Request 31388: Update thrift API and internal code to use JobUpdateSummary.key rather than job key and id.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31388/ --- (Updated Feb. 25, 2015, midnight) Review request for Aurora, Maxim Khutornenko and Zameer Manji. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description (updated) --- Update thrift API and internal code to use JobUpdateSummary.key rather than job key and id. Note: This change will leave the client in a broken state w.r.t. the scheduler. I will hold commiting this change until i have another review ready to go that updates the client. Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6a00ce2c30ce24851560c4d499c395194dbeed16 src/main/java/org/apache/aurora/scheduler/quota/QuotaManager.java bb727214b19ec6074d04eacdd997832e50b756f9 src/main/java/org/apache/aurora/scheduler/storage/ForwardingStore.java fc8e2f4e75a33789b28588204cbbfd56b5d049a4 src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 4b8b00cc74d81aa5cc772a7b4e077da6a4ed2a83 src/main/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStore.java d52c15e0a7083e0d02bbf6dbaa58d8956bc5dd6f src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 72b9895214aa372c4fd75e5bb6dee8485b3594f6 src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 34763851cd59996559f160c57b00d23de3dc49b4 src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java d24b31bf1034c69ac8a9891aaa9b8653674ce0ce src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java d1bd3c975b2057b46597f38555c2a73d14835600 src/main/java/org/apache/aurora/scheduler/updater/Updates.java b8c919585307e27f154782ee179153165458bed7 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml c08f09a511e1fc631abdae50630b8b7ab10a440b src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql e6184dd2de0e86849d811ec05c168bbb4db8e8cd src/test/java/org/apache/aurora/scheduler/quota/QuotaManagerImplTest.java eac2033cffc09f6a9e867a2499569b51d94f0705 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java 81c8be7685854c47ed4a7c6deeffb6a0b80023ab src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java daaa759ca5c2cda26802187698f0a33ceb0dd39f src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java 5b0565e6477644047a237b8d185027a6fdc5c6a5 src/test/java/org/apache/aurora/scheduler/thrift/aop/ForwardingThrift.java 2405bb8ad379e355a7987e9bb4163e1693f18777 src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 3ba6274b97393ac2228d3eac8ba1615cb57014f8 Diff: https://reviews.apache.org/r/31388/diff/ Testing --- Thanks, Bill Farner
Review Request 31389: Remove LiveClusterState, make CachedClusterState (now ClusterStateImpl) default.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31389/ --- Review request for Aurora, Joshua Cohen and Maxim Khutornenko. Bugs: AURORA-1055 https://issues.apache.org/jira/browse/AURORA-1055 Repository: aurora Description --- Remove LiveClusterState, make CachedClusterState (now ClusterStateImpl) default. Diffs - src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java 0351b10cbfff76f07eec15c5ed1af7bb5f39a8ce src/main/java/org/apache/aurora/scheduler/async/preemptor/CachedClusterState.java 2831103457184d2049684c1c519b6bbb3b5b3e62 src/main/java/org/apache/aurora/scheduler/async/preemptor/LiveClusterState.java 38cb17371735d959c14fc7bf9fc25b8eb814fdfe src/main/java/org/apache/aurora/scheduler/async/preemptor/PreemptorModule.java 2030a9413035e1ad7db52b13957ed074b211d558 src/test/java/org/apache/aurora/scheduler/async/preemptor/CachedClusterStateTest.java 7cc04dd4ea214f7e9923be51b467a2564fec18bb src/test/java/org/apache/aurora/scheduler/async/preemptor/LiveClusterStateTest.java 763f8b5f31349f291c0ede7b5d3292f6ca5b src/test/java/org/apache/aurora/scheduler/async/preemptor/PreemptorImplTest.java 83cf118c9e1d32b0de511754b4aeadd8aed33e49 Diff: https://reviews.apache.org/r/31389/diff/ Testing --- Standard tests. I removed a test case `testIgnoresThrottledTasks`, which was actually testing LiveClusterState behavior. That test case was redundant to unit testing in `ClusterStateImpl`. Thanks, Bill Farner
Re: Review Request 31338: Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run.
On Feb. 24, 2015, 6:01 a.m., Joshua Cohen wrote: Steve Niemitz wrote: I'm not a big fan of how the parsing works here either. I was thinking about this last night, I think I have a better plan here. Lemme know what you think. I already want to add volume support per-job at some point, so I propose adding the needed thrift objects to api.thrift (Volume, Mode enum), and converting the command line into those objects in ExecutorSettings. That would then let me reuse the volume-adding code in MesosTaskFactory for normal volumes in the future. Re: command line, I chose the format there because it's the same as the docker command line. What about making a new parser to parse that into the above mentioned thrift objects? Stephan Erb wrote: How about using the same format and argument name as Mesos' `--default_container_info`: { type: MESOS, volumes: [ { host_path: ./.private/tmp, container_path: /tmp, mode: RW } ] } See https://mesos.apache.org/documentation/latest/configuration/ Steve Niemitz wrote: JSON on the command line is an escaping nightmare IMO, and very verbose to specify 2 (or 3) strings for this use case. Bill Farner wrote: JSON on the command line also makes me uneasy, but i don't think we have precedent for command line entities with this many knobs (3 are obvious so far). How about 2 approaches as straw men: `-volume_mounts_read_only` and `-volume_mounts_read_write`, both key-value mappings `-volume_mounts_config`, path to a JSON file with entities similar to what Stephan sketched above The first approach has the upside that it fits with the status quo, but it cannot be easily extended if we ever want more attributes. The second would be the first config file we use, but is more future-proof. Joshua Cohen wrote: I think a single arg is easiest to reason about. I think I'm ok with Steve's follow up suggestion to create the thrift types that will be needed for per-job mounts now and use them to aid in arg parsing? That said, I'm not sure we should enforce the docker mount format since in theory these mounts can/will apply to other containers where they may not make as much sense. Steve Niemitz wrote: I think it's simple enough it'd make sense for any container system. host:container:mode is pretty generic. I'm also not as big of a fan about multiple args, and a default of RO if not specified seems reasonable to me. Bill Farner wrote: Works for me, but -1 to defaulting. I much prefer explicit over implicit, and this way it's easy to see how to turn a read-only mount into read-write. Cool with me. Ok I think we're all on the same page here. I'll submit an updated patch in a little bit. - Steve --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/#review73755 --- On Feb. 24, 2015, 2:07 a.m., Steve Niemitz wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31338/ --- (Updated Feb. 24, 2015, 2:07 a.m.) Review request for Aurora, Jay Buffington and Bill Farner. Bugs: AURORA-1107 https://issues.apache.org/jira/browse/AURORA-1107 Repository: aurora Description --- Added a command line flag, -global_container_mounts, to allow mounting paths from the slaves into the (docker) containers they run. This is the first portion of allowing per-job mounts, however, I wanted to get this out first since more people want it. I'll implement per-job mounts in a future review. Diffs - docs/deploying-aurora-scheduler.md d1123359961fd59ddb8c1a07f80f293bdd46019f src/jmh/java/org/apache/aurora/benchmark/SchedulingBenchmarks.java bacfbfeb237ecddf82f58679e05be012c5214e61 src/main/java/org/apache/aurora/scheduler/app/SchedulerMain.java 24b61c1e4f615295acf28d904588e1512972d3f4 src/main/java/org/apache/aurora/scheduler/mesos/ExecutorSettings.java baacb71403d55c5b90fc11cb2a23f552a32e8ba5 src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 5340d651b298ec8aa079e73d6d2f652fdf876293 src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 6575b7d420f17ec68d6e2a83e7b380f684577d4f src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java 444d6d3fdaf86eb84612f846eaa326eb75c49898 src/test/java/org/apache/aurora/scheduler/mesos/TaskExecutors.java efe62ceb502ead88a2f0cd6d09a76664e465d9bc Diff: https://reviews.apache.org/r/31338/diff/
Re: Review Request 31394: Renaming storage.thrift job store structs.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31394/#review73953 --- Ship it! Ship It! - Kevin Sweeney On Feb. 24, 2015, 5:06 p.m., Maxim Khutornenko wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31394/ --- (Updated Feb. 24, 2015, 5:06 p.m.) Review request for Aurora, Kevin Sweeney and Bill Farner. Repository: aurora Description --- Renaming thrift structs and the `JobStore` to better reflect its cron-only purpose. Purely IDE-driven renaming. Diffs - api/src/main/thrift/org/apache/aurora/gen/storage.thrift d3b50d7511f385797cbb539e66e83a851c6bccf7 src/main/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJob.java c8834215c6c621246e3436de734b95d9ced66537 src/main/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImpl.java 916cb6dc4b8907ffaf6736f1d615bbc72611a589 src/main/java/org/apache/aurora/scheduler/http/StructDump.java e7ee0cf0895c1c75c98c993e14716c7419a9eeec src/main/java/org/apache/aurora/scheduler/quota/QuotaManager.java 199dc1123cd1966b1c8075e7a486efdf627c45f3 src/main/java/org/apache/aurora/scheduler/storage/ForwardingStore.java fc8e2f4e75a33789b28588204cbbfd56b5d049a4 src/main/java/org/apache/aurora/scheduler/storage/JobStore.java 748b2cbce76258b3f4c454538b8d40a19e13e68e src/main/java/org/apache/aurora/scheduler/storage/Storage.java 41b7053b01f2105bbf9d5ee4616a0c3d0d91220a src/main/java/org/apache/aurora/scheduler/storage/StorageBackfill.java 7a99afa656a0978d05848a31c73c2658a98d6918 src/main/java/org/apache/aurora/scheduler/storage/db/DbStorage.java ea4afb49e52aed1134cf6ca44bf53693a4d999df src/main/java/org/apache/aurora/scheduler/storage/log/LogStorage.java 0229b920ebb36353b8bf9d1ea360d7ff6f4dbf97 src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 72b9895214aa372c4fd75e5bb6dee8485b3594f6 src/main/java/org/apache/aurora/scheduler/storage/log/StreamManagerImpl.java ee88d7c03cc1d1f23ed6b2ef1e5cf5c7929a57c9 src/main/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorage.java aa2226651ba6c5d1b77c4861f734b4194dcbecf3 src/main/java/org/apache/aurora/scheduler/storage/mem/MemJobStore.java c9ffc3871e1824ea39961c4a4d416e50d7e1f078 src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java bc3a7c5061d97e13c7702329a168d416c3dd1bcc src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 31c25d04cd1b8f741649edea8424eb298738d895 src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 059626f742ce806fded754ef04a83e2eb5405ff2 src/test/java/org/apache/aurora/scheduler/cron/quartz/AuroraCronJobTest.java 7bef09e8217750908eb56d9010df79bb5fd050de src/test/java/org/apache/aurora/scheduler/cron/quartz/CronIT.java c3f8d1b4c96ae35ebfb157f46ae545c8d39c3602 src/test/java/org/apache/aurora/scheduler/cron/quartz/CronJobManagerImplTest.java d4c14afa7b7ea904f7a33a8747591ab8f1be5f9b src/test/java/org/apache/aurora/scheduler/storage/StorageBackfillTest.java b48994f40edea5e257e2816b1950d8a495d89aa6 src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 91789e7c203753054406296c41248a1fbf30d019 src/test/java/org/apache/aurora/scheduler/storage/backup/StorageBackupTest.java 0c73a9f78d3dfe4764c0e8513b85611313067368 src/test/java/org/apache/aurora/scheduler/storage/log/LogStorageTest.java 31b75f41ef3cf1d32824afabec6f7ea79285973d src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImplTest.java 343106c24395911162657538c99a05b81d5ee2dc src/test/java/org/apache/aurora/scheduler/storage/log/WriteAheadStorageTest.java 36809ac2e7e8924187eb2d9cbd69da58e7866681 src/test/java/org/apache/aurora/scheduler/storage/mem/MemJobStoreTest.java b9ade45bae6f76520789821733a58075ea6d0bd0 src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java 6feddc3fc82da072d84bd69b16637144215415b7 Diff: https://reviews.apache.org/r/31394/diff/ Testing --- ./gradlew -Pq build Also, test bidirectional upgrade in vagrant. Thanks, Maxim Khutornenko
Re: Review Request 31350: Fix clusters.patch contextmanager cleanup
On Feb. 24, 2015, 8:12 p.m., Kevin Sweeney wrote: This is awesome! If this fixes the full suite, can you also remove `--no-fast` from `build-support/jenkins/build.sh`? I doubt that this patch is sufficient to warrant the change of the build script. There may be many more error conditions requiring a `no-fast`. Background info: We are directly using the python client in our own backend and are also writing some integration tests against it. In some of these testcases we are using CLUSTERS.patch and recently discovered its faulty cleanup behaviour. - Stephan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/#review73867 --- On Feb. 24, 2015, 10:52 p.m., Stephan Erb wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31350/ --- (Updated Feb. 24, 2015, 10:52 p.m.) Review request for Aurora, Kevin Sweeney and Maxim Khutornenko. Repository: aurora Description --- Fix clusters.patch contextmanager cleanup Otherwise one failing testcase can affect the entire suite. Diffs - src/main/python/apache/aurora/common/clusters.py c4b7fefca30313b281808478bf23158a9b7fbf85 src/test/python/apache/aurora/common/test_clusters.py 1bd696e9cd28d87d0cac68b33ab043407d796b61 Diff: https://reviews.apache.org/r/31350/diff/ Testing --- ./pants test.pytest --no-fast src/test/python/apache/aurora/common:: Thanks, Stephan Erb
Re: Review Request 30710: add mesos role feature
On Feb. 17, 2015, 11:09 p.m., Maxim Khutornenko wrote: src/main/java/org/apache/aurora/scheduler/configuration/Resources.java, line 114 https://reviews.apache.org/r/30710/diff/3/?file=861559#file861559line114 This is breaking Resources/Offer encapsulation and is not addressing the long term possiblility of making role-biased scheduling decisions. A possible solution could be de-anonymize `Resources` by adding a `StringOptional role` to it. The `role` would be left `Optional.Stringabsent()` any time an anonymous `Resources` instance is required or populated to the respective framework role when a role-specific resources are initilized. This would obviously require refactoring `Resources` to support internal mapping of resource vectors by role. The above would also eliminate the need for wrapper classes and will keep the door open for future role-based scheduling optimizations. I am curious what others think here? lozh...@ebay.com zhang wrote: May I have your skype No.? let's discuss it The wrapper classes is used to track which resources have bean allocated. Do you mean tracking the resouce allocation in Resources instead? if so, there are some issues: 1.Noramally each task will has two Resources instance: one for task itself, one for executor. So is unable to track the resource allcation in Resources instance, unless we use thread local variable or something like this. I think another solution is: 1.keep Resources/Offer unchanged, so the Resources/Offer encapsulation is not broken. 2.add a new class ResourceAllocationTracker which is used to track the resource allocation, this class's constructor should accept a Offer as argument. 3.pass the instance of ResourceAllocationTracker to Resources.toResourceList as new parameter 4.in Resources.toResourceList, evertytime a resource is allocated, will be saved in ResourceAllocationTracker instance. - lozh...@ebay.com --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30710/#review72793 --- On Feb. 13, 2015, 10 a.m., lozh...@ebay.com zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30710/ --- (Updated Feb. 13, 2015, 10 a.m.) Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Bill Farner. Bugs: AURORA-1109 https://issues.apache.org/jira/browse/AURORA-1109 Repository: aurora Description --- ## Problems We are from eBay platform team. Previously, we used marathon to generate Jenkins master instance in dedicated vms and recieve resource offer from same dedicated vms. For the details, please refer to http://www.ebaytechblog.com/2014/04/04/delivering-ebays-ci-solution-with-apache-mesos-part-i/#.VNQUuC6_SPU Now, we found Aurora is more stable and powerful. We are moving from Marathon to Aurora. During the move, we found there is no mesos role in Aurora now. But we need use mesos role way to solve the problem in section Frameworks stopped receiving offers after a while of the given url. Here is a snippet of the problem description: *We noticed occurred after we used Marathon to create the initial set of CI masters. As those CI masters started registering themselves as frameworks, Marathon stopped receiving any offers from Mesos; essentially, no new CI masters could be launched. Let’s start with Marathon. In the DRF model, it was unfair to treat Marathon in the same bucket/role alongside hundreds of connected Jenkins frameworks. After launching all these Jenkins frameworks, Marathon had a large resource share and Mesos would aggressively offer resources to frameworks that were using little or no resources. Marathon was placed last in priority and got starved out.* *We decided to define a dedicated Mesos role for Marathon and to have all of the Mesos slaves that were reserved for Jenkins master instances support that Mesos role. Jenkins frameworks were left with the default role “*”.*This solved the problem – Mesos offered resources per role and hence Marathon never got starved out. A framework with a special role will get resource offers from both slaves supporting that special role and also from the default role “*”.* However, since we were using placement constraints, Marathon accepted resource offers only from slaves that supported both the role and the placement constraints.* ## Solution So we add role feature is the source code to solve the problem in same way: When accept a resource offer, Aurora will send back the needed resources to Mesos with the mesos role in resource offer. How to configure the Mesos role: 1.Add cmd
Re: Review Request 30710: add mesos role feature
On Feb. 17, 2015, 11:09 p.m., Maxim Khutornenko wrote: src/main/java/org/apache/aurora/scheduler/configuration/Resources.java, line 174 https://reviews.apache.org/r/30710/diff/3/?file=861559#file861559line174 Prefer using guava `Ordering` [1] instead. It will avoid excessive collection copies. [1] - https://code.google.com/p/guava-libraries/wiki/OrderingExplained Will change On Feb. 17, 2015, 11:09 p.m., Maxim Khutornenko wrote: src/main/java/org/apache/aurora/scheduler/configuration/Resources.java, line 139 https://reviews.apache.org/r/30710/diff/3/?file=861559#file861559line139 PORTS should be treated the same as other Mesos resources as far as framework role goes. We need to set the framework role on a matching port `Resource`. Will Change - lozh...@ebay.com --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30710/#review72793 --- On Feb. 13, 2015, 10 a.m., lozh...@ebay.com zhang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/30710/ --- (Updated Feb. 13, 2015, 10 a.m.) Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Bill Farner. Bugs: AURORA-1109 https://issues.apache.org/jira/browse/AURORA-1109 Repository: aurora Description --- ## Problems We are from eBay platform team. Previously, we used marathon to generate Jenkins master instance in dedicated vms and recieve resource offer from same dedicated vms. For the details, please refer to http://www.ebaytechblog.com/2014/04/04/delivering-ebays-ci-solution-with-apache-mesos-part-i/#.VNQUuC6_SPU Now, we found Aurora is more stable and powerful. We are moving from Marathon to Aurora. During the move, we found there is no mesos role in Aurora now. But we need use mesos role way to solve the problem in section Frameworks stopped receiving offers after a while of the given url. Here is a snippet of the problem description: *We noticed occurred after we used Marathon to create the initial set of CI masters. As those CI masters started registering themselves as frameworks, Marathon stopped receiving any offers from Mesos; essentially, no new CI masters could be launched. Let’s start with Marathon. In the DRF model, it was unfair to treat Marathon in the same bucket/role alongside hundreds of connected Jenkins frameworks. After launching all these Jenkins frameworks, Marathon had a large resource share and Mesos would aggressively offer resources to frameworks that were using little or no resources. Marathon was placed last in priority and got starved out.* *We decided to define a dedicated Mesos role for Marathon and to have all of the Mesos slaves that were reserved for Jenkins master instances support that Mesos role. Jenkins frameworks were left with the default role “*”.*This solved the problem – Mesos offered resources per role and hence Marathon never got starved out. A framework with a special role will get resource offers from both slaves supporting that special role and also from the default role “*”.* However, since we were using placement constraints, Marathon accepted resource offers only from slaves that supported both the role and the placement constraints.* ## Solution So we add role feature is the source code to solve the problem in same way: When accept a resource offer, Aurora will send back the needed resources to Mesos with the mesos role in resource offer. How to configure the Mesos role: 1.Add cmd option --mesos_role=${Mesos role name} when start Aurora scheduler. We change the test cases according code change. Each changed test case is green Merge https://github.com/zhanglong2015/incubator-aurora Diffs - src/main/java/org/apache/aurora/scheduler/ResourceSlot.java 1a158b4e0be94762ad0480e8ce74b19bacc90c97 src/main/java/org/apache/aurora/scheduler/async/GcExecutorLauncher.java 31aa2bbaab3d97875493ad75c4d2c7c82ac7fa58 src/main/java/org/apache/aurora/scheduler/configuration/Resources.java b5a3140e3560f790d1db496dca3c2ee0dc96a195 src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java d0994203b5650f44ca2eb32e1e2aa61875163854 src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java 5340d651b298ec8aa079e73d6d2f652fdf876293 src/main/java/org/apache/aurora/scheduler/mesos/OfferWrapper.java PRE-CREATION src/main/java/org/apache/aurora/scheduler/mesos/ResourceWrapper.java PRE-CREATION src/main/java/org/apache/aurora/scheduler/state/TaskAssigner.java e1c29747c9854cf75bf63f6f085cf40ca68989af
Re: Review Request 31388: Update thrift API and internal code to use JobUpdateSummary.key rather than job key and id.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31388/#review73936 --- This patch does not apply cleanly on master (cd681d9), do you need to rebase? I will refresh this build result if you post a review containing @ReviewBot retry - Aurora ReviewBot On Feb. 24, 2015, 11:47 p.m., Bill Farner wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31388/ --- (Updated Feb. 24, 2015, 11:47 p.m.) Review request for Aurora, Maxim Khutornenko and Zameer Manji. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description --- Update thrift API and internal code to use JobUpdateSummary.key rather than job key and id. Diffs - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6a00ce2c30ce24851560c4d499c395194dbeed16 src/main/java/org/apache/aurora/scheduler/quota/QuotaManager.java bb727214b19ec6074d04eacdd997832e50b756f9 src/main/java/org/apache/aurora/scheduler/storage/ForwardingStore.java fc8e2f4e75a33789b28588204cbbfd56b5d049a4 src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 4b8b00cc74d81aa5cc772a7b4e077da6a4ed2a83 src/main/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStore.java d52c15e0a7083e0d02bbf6dbaa58d8956bc5dd6f src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 72b9895214aa372c4fd75e5bb6dee8485b3594f6 src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 34763851cd59996559f160c57b00d23de3dc49b4 src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java d24b31bf1034c69ac8a9891aaa9b8653674ce0ce src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java d1bd3c975b2057b46597f38555c2a73d14835600 src/main/java/org/apache/aurora/scheduler/updater/Updates.java b8c919585307e27f154782ee179153165458bed7 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml c08f09a511e1fc631abdae50630b8b7ab10a440b src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql e6184dd2de0e86849d811ec05c168bbb4db8e8cd src/test/java/org/apache/aurora/scheduler/quota/QuotaManagerImplTest.java eac2033cffc09f6a9e867a2499569b51d94f0705 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java 81c8be7685854c47ed4a7c6deeffb6a0b80023ab src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java daaa759ca5c2cda26802187698f0a33ceb0dd39f src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java 5b0565e6477644047a237b8d185027a6fdc5c6a5 src/test/java/org/apache/aurora/scheduler/thrift/aop/ForwardingThrift.java 2405bb8ad379e355a7987e9bb4163e1693f18777 src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 3ba6274b97393ac2228d3eac8ba1615cb57014f8 Diff: https://reviews.apache.org/r/31388/diff/ Testing --- Thanks, Bill Farner
Re: Review Request 31388: Update thrift API and internal code to use JobUpdateSummary.key rather than job key and id.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/31388/ --- (Updated Feb. 25, 2015, 12:07 a.m.) Review request for Aurora, Maxim Khutornenko and Zameer Manji. Changes --- Rebase on master. Bugs: AURORA-1093 https://issues.apache.org/jira/browse/AURORA-1093 Repository: aurora Description --- Update thrift API and internal code to use JobUpdateSummary.key rather than job key and id. Note: This change will leave the client in a broken state w.r.t. the scheduler. I will hold commiting this change until i have another review ready to go that updates the client. Diffs (updated) - api/src/main/thrift/org/apache/aurora/gen/api.thrift 6a00ce2c30ce24851560c4d499c395194dbeed16 src/main/java/org/apache/aurora/scheduler/quota/QuotaManager.java 199dc1123cd1966b1c8075e7a486efdf627c45f3 src/main/java/org/apache/aurora/scheduler/storage/ForwardingStore.java fc8e2f4e75a33789b28588204cbbfd56b5d049a4 src/main/java/org/apache/aurora/scheduler/storage/JobUpdateStore.java 4b8b00cc74d81aa5cc772a7b4e077da6a4ed2a83 src/main/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStore.java d52c15e0a7083e0d02bbf6dbaa58d8956bc5dd6f src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotStoreImpl.java 72b9895214aa372c4fd75e5bb6dee8485b3594f6 src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java b9dce16fa705fc14b5af394f8edb6301c789aff4 src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 059626f742ce806fded754ef04a83e2eb5405ff2 src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java d1bd3c975b2057b46597f38555c2a73d14835600 src/main/java/org/apache/aurora/scheduler/updater/Updates.java b8c919585307e27f154782ee179153165458bed7 src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml c08f09a511e1fc631abdae50630b8b7ab10a440b src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql e6184dd2de0e86849d811ec05c168bbb4db8e8cd src/test/java/org/apache/aurora/scheduler/quota/QuotaManagerImplTest.java b79b07c23e00fab524aef21481070c5f09c9fa18 src/test/java/org/apache/aurora/scheduler/storage/db/DBJobUpdateStoreTest.java 81c8be7685854c47ed4a7c6deeffb6a0b80023ab src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java e969b972806bef9aa67729c8a35283df6145e687 src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java d36418807ace4f61e8709a2827cea0e7bb8ac6b7 src/test/java/org/apache/aurora/scheduler/thrift/aop/ForwardingThrift.java 2405bb8ad379e355a7987e9bb4163e1693f18777 src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 3ba6274b97393ac2228d3eac8ba1615cb57014f8 Diff: https://reviews.apache.org/r/31388/diff/ Testing --- Thanks, Bill Farner