Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-10-03 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review55402
---


Pinging Kevin for final ship it and commit.

- Zameer Manji


On Oct. 2, 2014, 6:44 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Oct. 2, 2014, 6:44 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-10-02 Thread Zameer Manji


 On Oct. 1, 2014, 4:34 p.m., Kevin Sweeney wrote:
  src/main/python/apache/aurora/executor/common/announcer.py, line 228
  https://reviews.apache.org/r/25974/diff/6/?file=709964#file709964line228
 
  push this into `__start`, out of the constructor?
  
  At least on the Java side we try to avoid doing any I/O in constructors 
  (as they're more for wiring) but instead delegate to explicit lifecycle 
  methods like `start`.

I'd rather not. Keeping it in the constructor means it stays in the 
from_assigned_task method code path. Moving it to the start method moves it out 
of that code path and really changes when we do this sort of I/O.


- Zameer


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review55161
---


On Sept. 30, 2014, 5:17 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 30, 2014, 5:17 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-10-02 Thread Zameer Manji


 On Oct. 1, 2014, 4:14 p.m., Brian Wickman wrote:
  src/test/python/apache/aurora/executor/common/test_announcer.py, line 239
  https://reviews.apache.org/r/25974/diff/6/?file=709965#file709965line239
 
  out of scope for this review, but there's a new-ish pypi project called 
  'zake' that allows the kazoo client to be stubbed out with a mock zk tree 
  (while preserving certain operations like create/delete.)  i'd love to 
  explore using it here in the future.

It looks really nice for this sort of testing. It is definately out of the 
scope of this review. If we ever need to do more work on the Announcer/Service 
Discovery we should first migrate to zake.


- Zameer


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review55158
---


On Sept. 30, 2014, 5:17 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 30, 2014, 5:17 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-10-02 Thread Kevin Sweeney


 On Oct. 1, 2014, 4:34 p.m., Kevin Sweeney wrote:
  src/main/python/apache/aurora/executor/common/announcer.py, line 228
  https://reviews.apache.org/r/25974/diff/6/?file=709964#file709964line228
 
  push this into `__start`, out of the constructor?
  
  At least on the Java side we try to avoid doing any I/O in constructors 
  (as they're more for wiring) but instead delegate to explicit lifecycle 
  methods like `start`.
 
 Zameer Manji wrote:
 I'd rather not. Keeping it in the constructor means it stays in the 
 from_assigned_task method code path. Moving it to the start method moves it 
 out of that code path and really changes when we do this sort of I/O.

is it common practice to do I/O in constructors elsewhere in this codebase? 
I'll defer to wickman here


- Kevin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review55161
---


On Sept. 30, 2014, 5:17 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 30, 2014, 5:17 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-10-02 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/
---

(Updated Oct. 2, 2014, 4:20 p.m.)


Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.


Changes
---

Brian's feedback.


Bugs: AURORA-728
https://issues.apache.org/jira/browse/AURORA-728


Repository: aurora


Description
---

Prevent initial ZK timeouts from killing the executor. In addition, prevent 
uncaught exceptions from killing the executor.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
79a24855b2a68271b7478395dfdadab8755c3af2 
  src/main/python/apache/aurora/executor/common/announcer.py 
c466da8d48bbc2aa227c2db157cab84665ad6602 
  src/test/python/apache/aurora/executor/common/test_announcer.py 
4f6e200ecb1a4ea7cb45acd466a57f19d5815326 

Diff: https://reviews.apache.org/r/25974/diff/


Testing
---

./pants src/test/python/apache/aurora/executor:executor-small
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh


Thanks,

Zameer Manji



Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-10-02 Thread Brian Wickman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review55309
---



src/test/python/apache/aurora/executor/common/test_announcer.py
https://reviews.apache.org/r/25974/#comment95688

I'm confused -- nobody calls client.connected anymore, right?  In theory, 
this test will pass, except it will take 30-60 seconds to run since 
client_connect_event.wait(timeout=...) will be called with the health check 
timeout.


- Brian Wickman


On Oct. 2, 2014, 11:20 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Oct. 2, 2014, 11:20 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-10-02 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/
---

(Updated Oct. 2, 2014, 6:44 p.m.)


Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.


Changes
---

Brian's feedback.


Bugs: AURORA-728
https://issues.apache.org/jira/browse/AURORA-728


Repository: aurora


Description
---

Prevent initial ZK timeouts from killing the executor. In addition, prevent 
uncaught exceptions from killing the executor.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
79a24855b2a68271b7478395dfdadab8755c3af2 
  src/main/python/apache/aurora/executor/common/announcer.py 
c466da8d48bbc2aa227c2db157cab84665ad6602 
  src/test/python/apache/aurora/executor/common/test_announcer.py 
4f6e200ecb1a4ea7cb45acd466a57f19d5815326 

Diff: https://reviews.apache.org/r/25974/diff/


Testing
---

./pants src/test/python/apache/aurora/executor:executor-small
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh


Thanks,

Zameer Manji



Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-10-01 Thread Brian Wickman


 On Oct. 1, 2014, 11:14 p.m., Brian Wickman wrote:
 

btw overall this review looks great and is much cleaner than what i was 
envisioning.


- Brian


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review55158
---


On Oct. 1, 2014, 12:17 a.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Oct. 1, 2014, 12:17 a.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-10-01 Thread Brian Wickman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review55158
---



src/main/python/apache/aurora/executor/common/announcer.py
https://reviews.apache.org/r/25974/#comment95541

i think if self.__connect_event.is_set() is more accurate -- it's 
possible for it to be connected and then race with disconnection.  we just want 
to validate that it managed to connect initially.

this complicates your test a little bit since you do connected.set() right 
off the bat.  but you might be able to get around this by making a 
HealthCheckConfig that produces a tiny timeout.



src/test/python/apache/aurora/executor/common/test_announcer.py
https://reviews.apache.org/r/25974/#comment95542

out of scope for this review, but there's a new-ish pypi project called 
'zake' that allows the kazoo client to be stubbed out with a mock zk tree 
(while preserving certain operations like create/delete.)  i'd love to explore 
using it here in the future.


- Brian Wickman


On Oct. 1, 2014, 12:17 a.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Oct. 1, 2014, 12:17 a.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-30 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/
---

(Updated Sept. 30, 2014, 5:17 p.m.)


Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.


Changes
---

Brian's Feedback.


Bugs: AURORA-728
https://issues.apache.org/jira/browse/AURORA-728


Repository: aurora


Description
---

Prevent initial ZK timeouts from killing the executor. In addition, prevent 
uncaught exceptions from killing the executor.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
79a24855b2a68271b7478395dfdadab8755c3af2 
  src/main/python/apache/aurora/executor/common/announcer.py 
c466da8d48bbc2aa227c2db157cab84665ad6602 
  src/test/python/apache/aurora/executor/common/test_announcer.py 
4f6e200ecb1a4ea7cb45acd466a57f19d5815326 

Diff: https://reviews.apache.org/r/25974/diff/


Testing (updated)
---

./pants src/test/python/apache/aurora/executor:executor-small
./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh


Thanks,

Zameer Manji



Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-29 Thread Bill Farner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review54885
---

Ship it!



src/main/python/apache/aurora/executor/aurora_executor.py
https://reviews.apache.org/r/25974/#comment95208

TASK_FAILED in the message is redundant.


- Bill Farner


On Sept. 26, 2014, 9:24 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 26, 2014, 9:24 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-29 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review54890
---


Pinging Brian and Kevin to this review.

- Zameer Manji


On Sept. 29, 2014, 2:46 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 29, 2014, 2:46 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-29 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/
---

(Updated Sept. 29, 2014, 2:46 p.m.)


Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.


Changes
---

Bill's Feedback


Bugs: AURORA-728
https://issues.apache.org/jira/browse/AURORA-728


Repository: aurora


Description
---

Prevent initial ZK timeouts from killing the executor. In addition, prevent 
uncaught exceptions from killing the executor.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
79a24855b2a68271b7478395dfdadab8755c3af2 
  src/main/python/apache/aurora/executor/common/announcer.py 
c466da8d48bbc2aa227c2db157cab84665ad6602 
  src/test/python/apache/aurora/executor/common/test_announcer.py 
4f6e200ecb1a4ea7cb45acd466a57f19d5815326 

Diff: https://reviews.apache.org/r/25974/diff/


Testing
---

./pants src/test/python/apache/aurora/executor:executor-small


Thanks,

Zameer Manji



Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-29 Thread Kevin Sweeney

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review54892
---



src/main/python/apache/aurora/executor/common/announcer.py
https://reviews.apache.org/r/25974/#comment95226

Some context for the error would be helpful here - e.g. log.error(Initial 
connection to ZooKeeper timed out: %s % e).


- Kevin Sweeney


On Sept. 29, 2014, 2:46 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 29, 2014, 2:46 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-29 Thread Kevin Sweeney

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review54895
---



src/main/python/apache/aurora/executor/common/announcer.py
https://reviews.apache.org/r/25974/#comment95228

Do you need input sanitization here? Could this logic be hidden behind an 
abstraction? It's unclear to me why the serverset connection code needs live 
with the executor data parsing code.


- Kevin Sweeney


On Sept. 29, 2014, 2:46 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 29, 2014, 2:46 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-29 Thread Brian Wickman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review54916
---



src/main/python/apache/aurora/executor/common/announcer.py
https://reviews.apache.org/r/25974/#comment95251

this data is available within the mesos_task struct created in 
from_assigned_task.  you should be getting the information from there as it may 
contain default values not required to be populated in the executorConfig json 
(in which case you'd be getting KeyErrors from make_serverset.)



src/main/python/apache/aurora/executor/common/announcer.py
https://reviews.apache.org/r/25974/#comment95257

i feel like we should be calling start_async instead, though this 
complicates implementation slightly.

we want the creation of all the status managers to be non-blocking since 
the executor will be in a transient (TASK_STARTING) state, which can possibly 
timeout and go to TASK_LOST, which is less preferable than TASK_FAILED.

client.create_async returns a threading.Event.  we'd want (in a separate 
thread) to event.wait(timeout=timeout) and if not event.is_set(), set the 
.healthy property on the Announcer to StatusResult(..., mesos_pb2.TASK_FAILED)

alternately, you could ignore the threading.Event from create_async and 
bake that logic into the contract between Announcer-ServerSetJoinThread.  i'm 
not sure which is simpler or cleaner.


- Brian Wickman


On Sept. 29, 2014, 9:46 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 29, 2014, 9:46 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-29 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/
---

(Updated Sept. 29, 2014, 5:54 p.m.)


Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.


Changes
---

Brian's Feedback except for calling start_async()


Bugs: AURORA-728
https://issues.apache.org/jira/browse/AURORA-728


Repository: aurora


Description
---

Prevent initial ZK timeouts from killing the executor. In addition, prevent 
uncaught exceptions from killing the executor.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
79a24855b2a68271b7478395dfdadab8755c3af2 
  src/main/python/apache/aurora/executor/common/announcer.py 
c466da8d48bbc2aa227c2db157cab84665ad6602 
  src/test/python/apache/aurora/executor/common/test_announcer.py 
4f6e200ecb1a4ea7cb45acd466a57f19d5815326 

Diff: https://reviews.apache.org/r/25974/diff/


Testing
---

./pants src/test/python/apache/aurora/executor:executor-small


Thanks,

Zameer Manji



Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-29 Thread Zameer Manji


 On Sept. 29, 2014, 4:43 p.m., Brian Wickman wrote:
  src/main/python/apache/aurora/executor/common/announcer.py, lines 112-116
  https://reviews.apache.org/r/25974/diff/4/?file=708538#file708538line112
 
  this data is available within the mesos_task struct created in 
  from_assigned_task.  you should be getting the information from there as it 
  may contain default values not required to be populated in the 
  executorConfig json (in which case you'd be getting KeyErrors from 
  make_serverset.)

done.


- Zameer


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review54916
---


On Sept. 29, 2014, 5:54 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 29, 2014, 5:54 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-26 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/
---

(Updated Sept. 26, 2014, 2:24 p.m.)


Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.


Changes
---

Kevin's Feedback.


Bugs: AURORA-728
https://issues.apache.org/jira/browse/AURORA-728


Repository: aurora


Description
---

Prevent initial ZK timeouts from killing the executor. In addition, prevent 
uncaught exceptions from killing the executor.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
79a24855b2a68271b7478395dfdadab8755c3af2 
  src/main/python/apache/aurora/executor/common/announcer.py 
c466da8d48bbc2aa227c2db157cab84665ad6602 
  src/test/python/apache/aurora/executor/common/test_announcer.py 
4f6e200ecb1a4ea7cb45acd466a57f19d5815326 

Diff: https://reviews.apache.org/r/25974/diff/


Testing
---

./pants src/test/python/apache/aurora/executor:executor-small


Thanks,

Zameer Manji



Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-26 Thread Zameer Manji


 On Sept. 25, 2014, 4:29 p.m., Kevin Sweeney wrote:
  src/test/python/apache/aurora/executor/common/test_announcer.py, line 264
  https://reviews.apache.org/r/25974/diff/2/?file=705691#file705691line264
 
  where is this magic number coming from? extract a constant and import 
  that here?

This is the default initial health check + (max failures * health check). I'm 
unsure on the best way to fix this without duplicating the code it is trying to 
test.


- Zameer


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review54623
---


On Sept. 25, 2014, 2:39 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 25, 2014, 2:39 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 
 
 Thanks,
 
 Zameer Manji
 




Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-25 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/
---

(Updated Sept. 25, 2014, 2:39 p.m.)


Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.


Changes
---

Brian's Feedback


Summary (updated)
-

Prevent initial ZK timeouts from killing the executor.


Bugs: AURORA-728
https://issues.apache.org/jira/browse/AURORA-728


Repository: aurora


Description (updated)
---

Prevent initial ZK timeouts from killing the executor. In addition, prevent 
uncaught exceptions from killing the executor.


Diffs (updated)
-

  src/main/python/apache/aurora/executor/aurora_executor.py 
79a24855b2a68271b7478395dfdadab8755c3af2 
  src/main/python/apache/aurora/executor/common/announcer.py 
c466da8d48bbc2aa227c2db157cab84665ad6602 
  src/test/python/apache/aurora/executor/common/test_announcer.py 
4f6e200ecb1a4ea7cb45acd466a57f19d5815326 

Diff: https://reviews.apache.org/r/25974/diff/


Testing
---

./pants src/test/python/apache/aurora/executor:executor-small


Thanks,

Zameer Manji



Re: Review Request 25974: Prevent initial ZK timeouts from killing the executor.

2014-09-25 Thread Kevin Sweeney

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25974/#review54623
---



src/main/python/apache/aurora/executor/common/announcer.py
https://reviews.apache.org/r/25974/#comment94848

add a trailing comma here for easy import-sorting



src/main/python/apache/aurora/executor/common/announcer.py
https://reviews.apache.org/r/25974/#comment94854

log this?



src/test/python/apache/aurora/executor/common/test_announcer.py
https://reviews.apache.org/r/25974/#comment94851

where is this magic number coming from? extract a constant and import that 
here?


- Kevin Sweeney


On Sept. 25, 2014, 2:39 p.m., Zameer Manji wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25974/
 ---
 
 (Updated Sept. 25, 2014, 2:39 p.m.)
 
 
 Review request for Aurora, Kevin Sweeney, Bill Farner, and Brian Wickman.
 
 
 Bugs: AURORA-728
 https://issues.apache.org/jira/browse/AURORA-728
 
 
 Repository: aurora
 
 
 Description
 ---
 
 Prevent initial ZK timeouts from killing the executor. In addition, prevent 
 uncaught exceptions from killing the executor.
 
 
 Diffs
 -
 
   src/main/python/apache/aurora/executor/aurora_executor.py 
 79a24855b2a68271b7478395dfdadab8755c3af2 
   src/main/python/apache/aurora/executor/common/announcer.py 
 c466da8d48bbc2aa227c2db157cab84665ad6602 
   src/test/python/apache/aurora/executor/common/test_announcer.py 
 4f6e200ecb1a4ea7cb45acd466a57f19d5815326 
 
 Diff: https://reviews.apache.org/r/25974/diff/
 
 
 Testing
 ---
 
 ./pants src/test/python/apache/aurora/executor:executor-small
 
 
 Thanks,
 
 Zameer Manji