Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review55402 --- Pinging Kevin for final ship it and commit. - Zameer Manji On Oct. 2, 2014, 6:44 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Oct. 2, 2014, 6:44 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
On Oct. 1, 2014, 4:34 p.m., Kevin Sweeney wrote: src/main/python/apache/aurora/executor/common/announcer.py, line 228 https://reviews.apache.org/r/25974/diff/6/?file=709964#file709964line228 push this into `__start`, out of the constructor? At least on the Java side we try to avoid doing any I/O in constructors (as they're more for wiring) but instead delegate to explicit lifecycle methods like `start`. I'd rather not. Keeping it in the constructor means it stays in the from_assigned_task method code path. Moving it to the start method moves it out of that code path and really changes when we do this sort of I/O. - Zameer --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review55161 --- On Sept. 30, 2014, 5:17 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 30, 2014, 5:17 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
On Oct. 1, 2014, 4:14 p.m., Brian Wickman wrote: src/test/python/apache/aurora/executor/common/test_announcer.py, line 239 https://reviews.apache.org/r/25974/diff/6/?file=709965#file709965line239 out of scope for this review, but there's a new-ish pypi project called 'zake' that allows the kazoo client to be stubbed out with a mock zk tree (while preserving certain operations like create/delete.) i'd love to explore using it here in the future. It looks really nice for this sort of testing. It is definately out of the scope of this review. If we ever need to do more work on the Announcer/Service Discovery we should first migrate to zake. - Zameer --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review55158 --- On Sept. 30, 2014, 5:17 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 30, 2014, 5:17 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
On Oct. 1, 2014, 4:34 p.m., Kevin Sweeney wrote: src/main/python/apache/aurora/executor/common/announcer.py, line 228 https://reviews.apache.org/r/25974/diff/6/?file=709964#file709964line228 push this into `__start`, out of the constructor? At least on the Java side we try to avoid doing any I/O in constructors (as they're more for wiring) but instead delegate to explicit lifecycle methods like `start`. Zameer Manji wrote: I'd rather not. Keeping it in the constructor means it stays in the from_assigned_task method code path. Moving it to the start method moves it out of that code path and really changes when we do this sort of I/O. is it common practice to do I/O in constructors elsewhere in this codebase? I'll defer to wickman here - Kevin --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review55161 --- On Sept. 30, 2014, 5:17 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 30, 2014, 5:17 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Oct. 2, 2014, 4:20 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Changes --- Brian's feedback. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs (updated) - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review55309 --- src/test/python/apache/aurora/executor/common/test_announcer.py https://reviews.apache.org/r/25974/#comment95688 I'm confused -- nobody calls client.connected anymore, right? In theory, this test will pass, except it will take 30-60 seconds to run since client_connect_event.wait(timeout=...) will be called with the health check timeout. - Brian Wickman On Oct. 2, 2014, 11:20 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Oct. 2, 2014, 11:20 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Oct. 2, 2014, 6:44 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Changes --- Brian's feedback. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs (updated) - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
On Oct. 1, 2014, 11:14 p.m., Brian Wickman wrote: btw overall this review looks great and is much cleaner than what i was envisioning. - Brian --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review55158 --- On Oct. 1, 2014, 12:17 a.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Oct. 1, 2014, 12:17 a.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review55158 --- src/main/python/apache/aurora/executor/common/announcer.py https://reviews.apache.org/r/25974/#comment95541 i think if self.__connect_event.is_set() is more accurate -- it's possible for it to be connected and then race with disconnection. we just want to validate that it managed to connect initially. this complicates your test a little bit since you do connected.set() right off the bat. but you might be able to get around this by making a HealthCheckConfig that produces a tiny timeout. src/test/python/apache/aurora/executor/common/test_announcer.py https://reviews.apache.org/r/25974/#comment95542 out of scope for this review, but there's a new-ish pypi project called 'zake' that allows the kazoo client to be stubbed out with a mock zk tree (while preserving certain operations like create/delete.) i'd love to explore using it here in the future. - Brian Wickman On Oct. 1, 2014, 12:17 a.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Oct. 1, 2014, 12:17 a.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 30, 2014, 5:17 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Changes --- Brian's Feedback. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs (updated) - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing (updated) --- ./pants src/test/python/apache/aurora/executor:executor-small ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review54885 --- Ship it! src/main/python/apache/aurora/executor/aurora_executor.py https://reviews.apache.org/r/25974/#comment95208 TASK_FAILED in the message is redundant. - Bill Farner On Sept. 26, 2014, 9:24 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 26, 2014, 9:24 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review54890 --- Pinging Brian and Kevin to this review. - Zameer Manji On Sept. 29, 2014, 2:46 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 29, 2014, 2:46 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 29, 2014, 2:46 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Changes --- Bill's Feedback Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs (updated) - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review54892 --- src/main/python/apache/aurora/executor/common/announcer.py https://reviews.apache.org/r/25974/#comment95226 Some context for the error would be helpful here - e.g. log.error(Initial connection to ZooKeeper timed out: %s % e). - Kevin Sweeney On Sept. 29, 2014, 2:46 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 29, 2014, 2:46 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review54895 --- src/main/python/apache/aurora/executor/common/announcer.py https://reviews.apache.org/r/25974/#comment95228 Do you need input sanitization here? Could this logic be hidden behind an abstraction? It's unclear to me why the serverset connection code needs live with the executor data parsing code. - Kevin Sweeney On Sept. 29, 2014, 2:46 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 29, 2014, 2:46 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review54916 --- src/main/python/apache/aurora/executor/common/announcer.py https://reviews.apache.org/r/25974/#comment95251 this data is available within the mesos_task struct created in from_assigned_task. you should be getting the information from there as it may contain default values not required to be populated in the executorConfig json (in which case you'd be getting KeyErrors from make_serverset.) src/main/python/apache/aurora/executor/common/announcer.py https://reviews.apache.org/r/25974/#comment95257 i feel like we should be calling start_async instead, though this complicates implementation slightly. we want the creation of all the status managers to be non-blocking since the executor will be in a transient (TASK_STARTING) state, which can possibly timeout and go to TASK_LOST, which is less preferable than TASK_FAILED. client.create_async returns a threading.Event. we'd want (in a separate thread) to event.wait(timeout=timeout) and if not event.is_set(), set the .healthy property on the Announcer to StatusResult(..., mesos_pb2.TASK_FAILED) alternately, you could ignore the threading.Event from create_async and bake that logic into the contract between Announcer-ServerSetJoinThread. i'm not sure which is simpler or cleaner. - Brian Wickman On Sept. 29, 2014, 9:46 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 29, 2014, 9:46 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 29, 2014, 5:54 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Changes --- Brian's Feedback except for calling start_async() Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs (updated) - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
On Sept. 29, 2014, 4:43 p.m., Brian Wickman wrote: src/main/python/apache/aurora/executor/common/announcer.py, lines 112-116 https://reviews.apache.org/r/25974/diff/4/?file=708538#file708538line112 this data is available within the mesos_task struct created in from_assigned_task. you should be getting the information from there as it may contain default values not required to be populated in the executorConfig json (in which case you'd be getting KeyErrors from make_serverset.) done. - Zameer --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review54916 --- On Sept. 29, 2014, 5:54 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 29, 2014, 5:54 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 26, 2014, 2:24 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Changes --- Kevin's Feedback. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs (updated) - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
On Sept. 25, 2014, 4:29 p.m., Kevin Sweeney wrote: src/test/python/apache/aurora/executor/common/test_announcer.py, line 264 https://reviews.apache.org/r/25974/diff/2/?file=705691#file705691line264 where is this magic number coming from? extract a constant and import that here? This is the default initial health check + (max failures * health check). I'm unsure on the best way to fix this without duplicating the code it is trying to test. - Zameer --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review54623 --- On Sept. 25, 2014, 2:39 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 25, 2014, 2:39 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 25, 2014, 2:39 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Changes --- Brian's Feedback Summary (updated) - Prevent initial ZK timeouts from killing the executor. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description (updated) --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs (updated) - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji
Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/#review54623 --- src/main/python/apache/aurora/executor/common/announcer.py https://reviews.apache.org/r/25974/#comment94848 add a trailing comma here for easy import-sorting src/main/python/apache/aurora/executor/common/announcer.py https://reviews.apache.org/r/25974/#comment94854 log this? src/test/python/apache/aurora/executor/common/test_announcer.py https://reviews.apache.org/r/25974/#comment94851 where is this magic number coming from? extract a constant and import that here? - Kevin Sweeney On Sept. 25, 2014, 2:39 p.m., Zameer Manji wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/25974/ --- (Updated Sept. 25, 2014, 2:39 p.m.) Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman. Bugs: AURORA-728 https://issues.apache.org/jira/browse/AURORA-728 Repository: aurora Description --- Prevent initial ZK timeouts from killing the executor. In addition, prevent uncaught exceptions from killing the executor. Diffs - src/main/python/apache/aurora/executor/aurora_executor.py 79a24855b2a68271b7478395dfdadab8755c3af2 src/main/python/apache/aurora/executor/common/announcer.py c466da8d48bbc2aa227c2db157cab84665ad6602 src/test/python/apache/aurora/executor/common/test_announcer.py 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 Diff: https://reviews.apache.org/r/25974/diff/ Testing --- ./pants src/test/python/apache/aurora/executor:executor-small Thanks, Zameer Manji