Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Joshua Cohen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148128
---



Please add a line to RELEASE_NOTES.md about this change.


src/main/python/apache/aurora/admin/admin.py (line 351)


`settings.explicitBatchSize` lacks context in the help output of an admin 
client command. This should probably say it defaults to the value of the 
scheduler's `-reconciliation_explicit_batch_size` flag?



src/main/python/apache/aurora/admin/admin.py (lines 357 - 358)


What if type is `foo`, etc? We should ensure the value is valid rather than 
defaulting to explicit reconciliation.



src/main/python/apache/aurora/client/api/__init__.py (line 357)


move to previous line.


- Joshua Cohen


On Sept. 8, 2016, 12:30 a.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 8, 2016, 12:30 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148123
---




api/src/main/thrift/org/apache/aurora/gen/api.thrift (line 1214)


s/with a batch size/with the given settings



api/src/main/thrift/org/apache/aurora/gen/api.thrift (line 1215)


s/settingsS/settings



src/main/python/apache/aurora/admin/admin.py (lines 340 - 341)


Consider using choices=['explicit', 'implicit'] to make it more robust.



src/main/python/apache/aurora/admin/admin.py (line 342)


type=int ?


- Santhosh Kumar Shanmugham


On Sept. 7, 2016, 5:30 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 5:30 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51712: Extend getJobUpdateDetails to accept JobUpdateQuery

2016-09-07 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51712/#review148125
---



Master (8fca745) is red with this patch.
  ./build-support/jenkins/build.sh

 
src/test/python/apache/thermos/common/test_planner.py::test_planner_ordered 
PASSED
 
src/test/python/apache/thermos/common/test_planner.py::test_planner_mixed 
PASSED
 
src/test/python/apache/thermos/common/test_planner.py::test_planner_unsatisfiables
 PASSED
 
  FAILURES 
 _ TestJobUpdateApis.test_get_job_update_details __
 
 self = 
 
 def test_get_job_update_details(self):
   """Test getting job update details."""
   api, mock_proxy = self.mock_api()
   key = JobUpdateKey(job=JobKey(role="role", 
environment="env", name="name"), id="id")
 > api.get_job_update_details(key)
 
 src/test/python/apache/aurora/client/api/test_api.py:192: 
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 
.pants.d/python-setup/chroots/8f4bc6d10c5c0a60369568c5d2b7748b7e500f99/apache/aurora/client/api/__init__.py:264:
 in get_job_update_details
 return 
self._scheduler_proxy.getJobUpdateDetails(key)
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 
 _mock_self = 
 args = (JobUpdateKey(job=JobKey(environment='env', 
role='role', name='name'), id='id'),)
 kwargs = {}
 
 def __call__(_mock_self, *args, **kwargs):
 # can't use self in-case a function / method 
we are mocking uses self
 # in the signature
 >   _mock_self._mock_check_sig(*args, **kwargs)
 E   TypeError: () takes exactly 3 
arguments (2 given)
 
 
.pants.d/python-setup/chroots/8f4bc6d10c5c0a60369568c5d2b7748b7e500f99/.deps/mock-1.0.1-py2-none-any.whl/mock.py:954:
 TypeError
  generated xml file: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/dist/test-results/415337499eb72578eab327a6487c1f5c9452b3d6.xml
 
  1 failed, 700 passed, 6 skipped, 1 warnings in 
285.55 seconds 
 
FAILURE


00:52:55 05:06   [complete]
   FAILURE


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 8, 2016, 12:03 a.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51712/
> ---
> 
> (Updated Sept. 8, 2016, 12:03 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1764
> https://issues.apache.org/jira/browse/AURORA-1764
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This extends getJobUpdateDetails to return a list of details instead of being 
> scoped to a single update.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> 4202f0eec59ef16668fca6b6ebb925335ad5dea1 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
> 
> Diff: https://reviews.apache.org/r/51712/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148122
---


Ship it!




Master (8fca745) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 8, 2016, 12:30 a.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 8, 2016, 12:30 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51712: Extend getJobUpdateDetails to accept JobUpdateQuery

2016-09-07 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51712/#review148120
---



@ReviewBot retry

- Zameer Manji


On Sept. 7, 2016, 5:03 p.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51712/
> ---
> 
> (Updated Sept. 7, 2016, 5:03 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1764
> https://issues.apache.org/jira/browse/AURORA-1764
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This extends getJobUpdateDetails to return a list of details instead of being 
> scoped to a single update.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> 4202f0eec59ef16668fca6b6ebb925335ad5dea1 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
> 
> Diff: https://reviews.apache.org/r/51712/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Karthik Anantha Padmanabhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/
---

(Updated Sept. 8, 2016, 12:30 a.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Changes
---

* Introduce the struct for Explicit Reconcile settings. 
* Update the client batch size defaults to None.


Repository: aurora


Description
---

AURORA-1602 Aurora admin commands for reconcilation


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
c5765b70501c101f0535b4eed94e9948c36808f9 
  src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
3275d72a0a74909e635bce615e2036ec72aa38ee 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
929d021a336c6a3438613c9340c84a1096dc9069 
  src/main/python/apache/aurora/admin/admin.py 
76009b9c1c7a130c25abad544a176dc590dafb12 
  src/main/python/apache/aurora/client/api/__init__.py 
9149c3018ae58d405f284fcbd4076d251ccc8192 
  
src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
 b9317dc20456f90057ec2bf4d10619a5ae986187 
  
src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
 779dc302602ae8842084807ca868a491ea99b676 
  src/test/python/apache/aurora/admin/test_admin.py 
eb193c4a33f5275f05a995338446bec0e19bc1cf 
  src/test/python/apache/aurora/client/api/test_scheduler_client.py 
afac2500551af2fce406bb906aa4e33f353e90a1 

Diff: https://reviews.apache.org/r/51662/diff/


Testing
---

* Manually tested on my local vagrant installation
* ./build-support/jenkins/build.sh


Thanks,

Karthik Anantha Padmanabhan



Re: Review Request 51712: Extend getJobUpdateDetails to accept JobUpdateQuery

2016-09-07 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51712/#review148119
---



Master (8fca745) is red with this patch.
  ./build-support/jenkins/build.sh

 
src/test/python/apache/thermos/common/test_planner.py::test_planner_ordered 
PASSED
 
src/test/python/apache/thermos/common/test_planner.py::test_planner_mixed 
PASSED
 
src/test/python/apache/thermos/common/test_planner.py::test_planner_unsatisfiables
 PASSED
 
  FAILURES 
 _ TestJobUpdateApis.test_get_job_update_details __
 
 self = 
 
 def test_get_job_update_details(self):
   """Test getting job update details."""
   api, mock_proxy = self.mock_api()
   key = JobUpdateKey(job=JobKey(role="role", 
environment="env", name="name"), id="id")
 > api.get_job_update_details(key)
 
 src/test/python/apache/aurora/client/api/test_api.py:192: 
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 
.pants.d/python-setup/chroots/8f4bc6d10c5c0a60369568c5d2b7748b7e500f99/apache/aurora/client/api/__init__.py:264:
 in get_job_update_details
 return 
self._scheduler_proxy.getJobUpdateDetails(key)
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 
 _mock_self = 
 args = (JobUpdateKey(job=JobKey(environment='env', 
role='role', name='name'), id='id'),)
 kwargs = {}
 
 def __call__(_mock_self, *args, **kwargs):
 # can't use self in-case a function / method 
we are mocking uses self
 # in the signature
 >   _mock_self._mock_check_sig(*args, **kwargs)
 E   TypeError: () takes exactly 3 
arguments (2 given)
 
 
.pants.d/python-setup/chroots/8f4bc6d10c5c0a60369568c5d2b7748b7e500f99/.deps/mock-1.0.1-py2-none-any.whl/mock.py:954:
 TypeError
  generated xml file: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/dist/test-results/415337499eb72578eab327a6487c1f5c9452b3d6.xml
 
  1 failed, 700 passed, 6 skipped, 1 warnings in 
226.75 seconds 
 
FAILURE


00:24:19 04:19   [complete]
   FAILURE


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 8, 2016, 12:03 a.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51712/
> ---
> 
> (Updated Sept. 8, 2016, 12:03 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1764
> https://issues.apache.org/jira/browse/AURORA-1764
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This extends getJobUpdateDetails to return a list of details instead of being 
> scoped to a single update.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> 4202f0eec59ef16668fca6b6ebb925335ad5dea1 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
> 
> Diff: https://reviews.apache.org/r/51712/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Re: Review Request 51536: Modify the watch_secs assertion on scheduler

2016-09-07 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review148118
---


Ship it!




Ship It!

- Zameer Manji


On Sept. 7, 2016, 4:10 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 7, 2016, 4:10 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Modify the watch_secs assertion on scheduler.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51712: Extend getJobUpdateDetails to accept JobUpdateQuery

2016-09-07 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51712/#review148117
---



Reviewers: This is my first time causing a thrift level deprecation, please pay 
careful attention to ensure nothing is broken for 0.16.0.

- Zameer Manji


On Sept. 7, 2016, 5:03 p.m., Zameer Manji wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51712/
> ---
> 
> (Updated Sept. 7, 2016, 5:03 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen and Maxim Khutornenko.
> 
> 
> Bugs: AURORA-1764
> https://issues.apache.org/jira/browse/AURORA-1764
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> This extends getJobUpdateDetails to return a list of details instead of being 
> scoped to a single update.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> 4202f0eec59ef16668fca6b6ebb925335ad5dea1 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
> 
> Diff: https://reviews.apache.org/r/51712/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>



Review Request 51712: Extend getJobUpdateDetails to accept JobUpdateQuery

2016-09-07 Thread Zameer Manji

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51712/
---

Review request for Aurora, Joshua Cohen and Maxim Khutornenko.


Bugs: AURORA-1764
https://issues.apache.org/jira/browse/AURORA-1764


Repository: aurora


Description
---

This extends getJobUpdateDetails to return a list of details instead of being 
scoped to a single update.


Diffs
-

  RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
c5765b70501c101f0535b4eed94e9948c36808f9 
  src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
4202f0eec59ef16668fca6b6ebb925335ad5dea1 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
929d021a336c6a3438613c9340c84a1096dc9069 
  
src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java 
a7d1f74acdfe5f58eb822a4d826e920cad53dced 

Diff: https://reviews.apache.org/r/51712/diff/


Testing
---


Thanks,

Zameer Manji



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Santhosh Kumar Shanmugham


> On Sept. 7, 2016, 2:45 p.m., Santhosh Kumar Shanmugham wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, lines 1209-1214
> > 
> >
> > Did we consider combining the "explicit" and "implicit" reconciliation 
> > RPCs into a single "reconcile" RPC with an optional batchSize argument, so 
> > that it maps to the mesos master API.
> > 
> > http://mesos.apache.org/documentation/latest/reconciliation/
> 
> Karthik Anantha Padmanabhan wrote:
> We have more than one option on the scheduler for the explicit reconcile. 
> If at all we want to expose those to the admin CLI in the future it may be 
> better to have them as two separate API's.
> 
> Santhosh Kumar Shanmugham wrote:
> "We have more than one option on the scheduler for the explicit 
> reconcile." - can you given an example of what you mean by this.
> 
> Karthik Anantha Padmanabhan wrote:
> There is another releavant parameter called `explicitBatchDelaySeconds` 
> which dictates how long we wait in between batches before we call reconcile 
> on mesos. 
> 
> But it is possible that we may add another option in the future ? In that 
> case wouldn't it be better to have two separate API's ?
> 
> Karthik Anantha Padmanabhan wrote:
> Hmm actually if we are passing in a struct of `ExplicitReconcileSettings` 
> , then adding options will be easier. So I can check `isNull` on the settings 
> object on the scheduler to identify an implicit reconcile. 
> 
> I can go either way. I just don't know if there is a strong reason to 
> make it a single API as opposed to two.
> 
> Maxim Khutornenko wrote:
> Please, don't save a few lines by merging the RPCs or forking off of the 
> presence of the `batchSize`. Explicit and implicit reconciliations have very 
> different scopes and cover different failure scenarios. I'd advocate in favor 
> of Karthik's original proposal with separate call paths (RPCs and client 
> commands) that clearly state their purpose in the admin API.
> 
> Santhosh Kumar Shanmugham wrote:
> Since the Scheduler's reconcile is basically a reflection of the master's 
> reconcile API, keeping them alike will make it more easier to reason about.
> 
> +1 for introducing a struct as the parameter incase the operator wishes 
> to override the settings.

I am okay with that too.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148089
---


On Sept. 7, 2016, 2:08 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 2:08 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51536: Modify the watch_secs assertion on scheduler

2016-09-07 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review148111
---


Ship it!




Master (19866b5) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 7, 2016, 11:10 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 7, 2016, 11:10 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Modify the watch_secs assertion on scheduler.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Santhosh Kumar Shanmugham


> On Sept. 7, 2016, 2:45 p.m., Santhosh Kumar Shanmugham wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, lines 1209-1214
> > 
> >
> > Did we consider combining the "explicit" and "implicit" reconciliation 
> > RPCs into a single "reconcile" RPC with an optional batchSize argument, so 
> > that it maps to the mesos master API.
> > 
> > http://mesos.apache.org/documentation/latest/reconciliation/
> 
> Karthik Anantha Padmanabhan wrote:
> We have more than one option on the scheduler for the explicit reconcile. 
> If at all we want to expose those to the admin CLI in the future it may be 
> better to have them as two separate API's.
> 
> Santhosh Kumar Shanmugham wrote:
> "We have more than one option on the scheduler for the explicit 
> reconcile." - can you given an example of what you mean by this.
> 
> Karthik Anantha Padmanabhan wrote:
> There is another releavant parameter called `explicitBatchDelaySeconds` 
> which dictates how long we wait in between batches before we call reconcile 
> on mesos. 
> 
> But it is possible that we may add another option in the future ? In that 
> case wouldn't it be better to have two separate API's ?
> 
> Karthik Anantha Padmanabhan wrote:
> Hmm actually if we are passing in a struct of `ExplicitReconcileSettings` 
> , then adding options will be easier. So I can check `isNull` on the settings 
> object on the scheduler to identify an implicit reconcile. 
> 
> I can go either way. I just don't know if there is a strong reason to 
> make it a single API as opposed to two.
> 
> Maxim Khutornenko wrote:
> Please, don't save a few lines by merging the RPCs or forking off of the 
> presence of the `batchSize`. Explicit and implicit reconciliations have very 
> different scopes and cover different failure scenarios. I'd advocate in favor 
> of Karthik's original proposal with separate call paths (RPCs and client 
> commands) that clearly state their purpose in the admin API.

Since the Scheduler's reconcile is basically a reflection of the master's 
reconcile API, keeping them alike will make it more easier to reason about.

+1 for introducing a struct as the parameter incase the operator wishes to 
override the settings.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148089
---


On Sept. 7, 2016, 2:08 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 2:08 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Maxim Khutornenko


> On Sept. 7, 2016, 9:45 p.m., Santhosh Kumar Shanmugham wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, lines 1209-1214
> > 
> >
> > Did we consider combining the "explicit" and "implicit" reconciliation 
> > RPCs into a single "reconcile" RPC with an optional batchSize argument, so 
> > that it maps to the mesos master API.
> > 
> > http://mesos.apache.org/documentation/latest/reconciliation/
> 
> Karthik Anantha Padmanabhan wrote:
> We have more than one option on the scheduler for the explicit reconcile. 
> If at all we want to expose those to the admin CLI in the future it may be 
> better to have them as two separate API's.
> 
> Santhosh Kumar Shanmugham wrote:
> "We have more than one option on the scheduler for the explicit 
> reconcile." - can you given an example of what you mean by this.
> 
> Karthik Anantha Padmanabhan wrote:
> There is another releavant parameter called `explicitBatchDelaySeconds` 
> which dictates how long we wait in between batches before we call reconcile 
> on mesos. 
> 
> But it is possible that we may add another option in the future ? In that 
> case wouldn't it be better to have two separate API's ?
> 
> Karthik Anantha Padmanabhan wrote:
> Hmm actually if we are passing in a struct of `ExplicitReconcileSettings` 
> , then adding options will be easier. So I can check `isNull` on the settings 
> object on the scheduler to identify an implicit reconcile. 
> 
> I can go either way. I just don't know if there is a strong reason to 
> make it a single API as opposed to two.

Please, don't save a few lines by merging the RPCs or forking off of the 
presence of the `batchSize`. Explicit and implicit reconciliations have very 
different scopes and cover different failure scenarios. I'd advocate in favor 
of Karthik's original proposal with separate call paths (RPCs and client 
commands) that clearly state their purpose in the admin API.


> On Sept. 7, 2016, 9:45 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/python/apache/aurora/admin/admin.py, line 342
> > 
> >
> > 'default=0' here and the usage doc reads default as 1000.
> 
> Karthik Anantha Padmanabhan wrote:
> We want the default value to match the one from 
> settings.explicitBatchSize on the scheduler. So on the scheduler we use the 
> default value if the value here is 0. Although I agree that the 0 here is a 
> bit misleading. 
> 
> I like the optional batchSize argument for the explicit reconcile RPC. 
> 
> I am thinking there can be a 
> 
> struct ExplicitReconcileSettings {
>  1. optional i32 batchSize;
> }
> 
> This can let us add more explicit reconcile settings in the future. 
> 
> Also does thrift lets us declare optional parameters in a function 
> (unless it is wrapped in a struct) ?
> 
> Santhosh Kumar Shanmugham wrote:
> AFAIK, Thrift does not support optional function parameters, due to the 
> limitations of the languages that it gets translated to. You are on the 
> correct path, in making this a 'struct'.

Don't really see much value in overcomplicating this scenario. You can leave 
the `int` on the API side by have your command_option as `default=None` and 
traslate `None` to `0` in the client/api call. Your call.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148089
---


On Sept. 7, 2016, 9:08 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 9:08 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> 

Re: Review Request 51536: Modify the watch_secs assertion on scheduler

2016-09-07 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/
---

(Updated Sept. 7, 2016, 11:10 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Changes
---

Remove one line about watch_secs that might cause confusions to users.


Summary (updated)
-

Modify the watch_secs assertion on scheduler


Bugs: AURORA-894
https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description (updated)
---

- Modify the watch_secs assertion on scheduler.

This feature intends to improve reliability and performance of the Aurora 
scheduler job updater by relying on health check status rather than watch_secs 
timeout when deciding an individual instance update state. 

See this epic: https://issues.apache.org/jira/browse/AURORA-894 
and the design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
 for more details and background.


Diffs (updated)
-

  RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
  src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
04551f17999d742c53dfb1b36286b119b448550e 

Diff: https://reviews.apache.org/r/51536/diff/


Testing
---

./gradlew build

./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"

./build-support/jenkins/build.sh


Thanks,

Kai Huang



Re: Review Request 51536: Allow watch_secs to be set to 0

2016-09-07 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review148107
---




RELEASE-NOTES.md (line 37)


IMO, this statement is useless to our users and could add to confusion. 
Client still does not allow `watch_secs` for health check enabled updates and 
scheduler always allowed zeros in the `SchedulerThriftInterface.startJobUpdate` 
for this value. It's only this assertion that was not in sync.

Suggest dropping this line and adding it into the client RB that will 
actually allow zero `watch_secs`.


- Maxim Khutornenko


On Sept. 7, 2016, 10:36 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 7, 2016, 10:36 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Allow watch_secs to be set to 0.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51536: Allow watch_secs to be set to 0

2016-09-07 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review148106
---


Ship it!




Master (19866b5) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 7, 2016, 10:36 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 7, 2016, 10:36 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Allow watch_secs to be set to 0.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Karthik Anantha Padmanabhan


> On Sept. 7, 2016, 9:45 p.m., Santhosh Kumar Shanmugham wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, lines 1209-1214
> > 
> >
> > Did we consider combining the "explicit" and "implicit" reconciliation 
> > RPCs into a single "reconcile" RPC with an optional batchSize argument, so 
> > that it maps to the mesos master API.
> > 
> > http://mesos.apache.org/documentation/latest/reconciliation/
> 
> Karthik Anantha Padmanabhan wrote:
> We have more than one option on the scheduler for the explicit reconcile. 
> If at all we want to expose those to the admin CLI in the future it may be 
> better to have them as two separate API's.
> 
> Santhosh Kumar Shanmugham wrote:
> "We have more than one option on the scheduler for the explicit 
> reconcile." - can you given an example of what you mean by this.
> 
> Karthik Anantha Padmanabhan wrote:
> There is another releavant parameter called `explicitBatchDelaySeconds` 
> which dictates how long we wait in between batches before we call reconcile 
> on mesos. 
> 
> But it is possible that we may add another option in the future ? In that 
> case wouldn't it be better to have two separate API's ?

Hmm actually if we are passing in a struct of `ExplicitReconcileSettings` , 
then adding options will be easier. So I can check `isNull` on the settings 
object on the scheduler to identify an implicit reconcile. 

I can go either way. I just don't know if there is a strong reason to make it a 
single API as opposed to two.


- Karthik


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148089
---


On Sept. 7, 2016, 9:08 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 9:08 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Karthik Anantha Padmanabhan


> On Sept. 7, 2016, 9:45 p.m., Santhosh Kumar Shanmugham wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, lines 1209-1214
> > 
> >
> > Did we consider combining the "explicit" and "implicit" reconciliation 
> > RPCs into a single "reconcile" RPC with an optional batchSize argument, so 
> > that it maps to the mesos master API.
> > 
> > http://mesos.apache.org/documentation/latest/reconciliation/
> 
> Karthik Anantha Padmanabhan wrote:
> We have more than one option on the scheduler for the explicit reconcile. 
> If at all we want to expose those to the admin CLI in the future it may be 
> better to have them as two separate API's.
> 
> Santhosh Kumar Shanmugham wrote:
> "We have more than one option on the scheduler for the explicit 
> reconcile." - can you given an example of what you mean by this.

There is another releavant parameter called `explicitBatchDelaySeconds` which 
dictates how long we wait in between batches before we call reconcile on mesos. 

But it is possible that we may add another option in the future ? In that case 
wouldn't it be better to have two separate API's ?


- Karthik


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148089
---


On Sept. 7, 2016, 9:08 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 9:08 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Santhosh Kumar Shanmugham


> On Sept. 7, 2016, 2:45 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/python/apache/aurora/admin/admin.py, line 342
> > 
> >
> > 'default=0' here and the usage doc reads default as 1000.
> 
> Karthik Anantha Padmanabhan wrote:
> We want the default value to match the one from 
> settings.explicitBatchSize on the scheduler. So on the scheduler we use the 
> default value if the value here is 0. Although I agree that the 0 here is a 
> bit misleading. 
> 
> I like the optional batchSize argument for the explicit reconcile RPC. 
> 
> I am thinking there can be a 
> 
> struct ExplicitReconcileSettings {
>  1. optional i32 batchSize;
> }
> 
> This can let us add more explicit reconcile settings in the future. 
> 
> Also does thrift lets us declare optional parameters in a function 
> (unless it is wrapped in a struct) ?

AFAIK, Thrift does not support optional function parameters, due to the 
limitations of the languages that it gets translated to. You are on the correct 
path, in making this a 'struct'.


> On Sept. 7, 2016, 2:45 p.m., Santhosh Kumar Shanmugham wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, lines 1209-1214
> > 
> >
> > Did we consider combining the "explicit" and "implicit" reconciliation 
> > RPCs into a single "reconcile" RPC with an optional batchSize argument, so 
> > that it maps to the mesos master API.
> > 
> > http://mesos.apache.org/documentation/latest/reconciliation/
> 
> Karthik Anantha Padmanabhan wrote:
> We have more than one option on the scheduler for the explicit reconcile. 
> If at all we want to expose those to the admin CLI in the future it may be 
> better to have them as two separate API's.

"We have more than one option on the scheduler for the explicit reconcile." - 
can you given an example of what you mean by this.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148089
---


On Sept. 7, 2016, 2:08 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 2:08 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51536: @ReviewBot retry Scheduler updater will not use watch_sec if health check is enabled

2016-09-07 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review148100
---


Ship it!




Master (19866b5) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 7, 2016, 10:06 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 7, 2016, 10:06 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Scheduler updater will not use watch_sec if health check is enabled.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> After discussion on Aurora dev list, we decided to keep the watch_secs 
> infrastructure intact on scheduler side. Our final conclusion is that we 
> adopt the following implementation:
> 
> 1. If the users want purely health checking driven updates they can set 
> watch_secs to 0 and enable health checks.
> 
> 2. If they want to have both health checking and time driven updates they can 
> set watch_secs to the time that they care about, and doing health checks at 
> STARTING state as well.
> 
> 3. If they just want time driven updates they can disable health checking and 
> set watch_secs to the time that they care about.
> 
> In this review, there will be only one scheduler change: 
> Currently scheduler does not accept zero value for watch_secs, we need to 
> relax this constraint.
> 
> Executor change to do (in a separate review):
> The executor starts health check at STARTING, if a successful health check is 
> performed before initial_interval_sec expires, the executor will sends a 
> status message for RUNNING.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Karthik Anantha Padmanabhan


> On Sept. 7, 2016, 9:45 p.m., Santhosh Kumar Shanmugham wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, lines 1209-1214
> > 
> >
> > Did we consider combining the "explicit" and "implicit" reconciliation 
> > RPCs into a single "reconcile" RPC with an optional batchSize argument, so 
> > that it maps to the mesos master API.
> > 
> > http://mesos.apache.org/documentation/latest/reconciliation/

We have more than one option on the scheduler for the explicit reconcile. If at 
all we want to expose those to the admin CLI in the future it may be better to 
have them as two separate API's.


> On Sept. 7, 2016, 9:45 p.m., Santhosh Kumar Shanmugham wrote:
> > src/main/python/apache/aurora/admin/admin.py, line 342
> > 
> >
> > 'default=0' here and the usage doc reads default as 1000.

We want the default value to match the one from settings.explicitBatchSize on 
the scheduler. So on the scheduler we use the default value if the value here 
is 0. Although I agree that the 0 here is a bit misleading. 

I like the optional batchSize argument for the explicit reconcile RPC. 

I am thinking there can be a 

struct ExplicitReconcileSettings {
 1. optional i32 batchSize;
}

This can let us add more explicit reconcile settings in the future. 

Also does thrift lets us declare optional parameters in a function (unless it 
is wrapped in a struct) ?


- Karthik


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148089
---


On Sept. 7, 2016, 9:08 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 9:08 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51536: @ReviewBot retry Scheduler updater will not use watch_sec if health check is enabled

2016-09-07 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/
---

(Updated Sept. 7, 2016, 10:06 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Summary (updated)
-

@ReviewBot retry Scheduler updater will not use watch_sec if health check is 
enabled


Bugs: AURORA-894
https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description
---

- Scheduler updater will not use watch_sec if health check is enabled.

This feature intends to improve reliability and performance of the Aurora 
scheduler job updater by relying on health check status rather than watch_secs 
timeout when deciding an individual instance update state. 

See this epic: https://issues.apache.org/jira/browse/AURORA-894 
and the design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
 for more details and background.

After discussion on Aurora dev list, we decided to keep the watch_secs 
infrastructure intact on scheduler side. Our final conclusion is that we adopt 
the following implementation:

1. If the users want purely health checking driven updates they can set 
watch_secs to 0 and enable health checks.

2. If they want to have both health checking and time driven updates they can 
set watch_secs to the time that they care about, and doing health checks at 
STARTING state as well.

3. If they just want time driven updates they can disable health checking and 
set watch_secs to the time that they care about.

In this review, there will be only one scheduler change: 
Currently scheduler does not accept zero value for watch_secs, we need to relax 
this constraint.

Executor change to do (in a separate review):
The executor starts health check at STARTING, if a successful health check is 
performed before initial_interval_sec expires, the executor will sends a status 
message for RUNNING.


Diffs
-

  RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
  src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
04551f17999d742c53dfb1b36286b119b448550e 

Diff: https://reviews.apache.org/r/51536/diff/


Testing
---

./gradlew build

./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"

./build-support/jenkins/build.sh


Thanks,

Kai Huang



Re: Review Request 51536: @ReviewBot Scheduler updater will not use watch_sec if health check is enabled

2016-09-07 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/
---

(Updated Sept. 7, 2016, 10:01 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Summary (updated)
-

@ReviewBot Scheduler updater will not use watch_sec if health check is enabled


Bugs: AURORA-894
https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description
---

- Scheduler updater will not use watch_sec if health check is enabled.

This feature intends to improve reliability and performance of the Aurora 
scheduler job updater by relying on health check status rather than watch_secs 
timeout when deciding an individual instance update state. 

See this epic: https://issues.apache.org/jira/browse/AURORA-894 
and the design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
 for more details and background.

After discussion on Aurora dev list, we decided to keep the watch_secs 
infrastructure intact on scheduler side. Our final conclusion is that we adopt 
the following implementation:

1. If the users want purely health checking driven updates they can set 
watch_secs to 0 and enable health checks.

2. If they want to have both health checking and time driven updates they can 
set watch_secs to the time that they care about, and doing health checks at 
STARTING state as well.

3. If they just want time driven updates they can disable health checking and 
set watch_secs to the time that they care about.

In this review, there will be only one scheduler change: 
Currently scheduler does not accept zero value for watch_secs, we need to relax 
this constraint.

Executor change to do (in a separate review):
The executor starts health check at STARTING, if a successful health check is 
performed before initial_interval_sec expires, the executor will sends a status 
message for RUNNING.


Diffs
-

  RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
  src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
04551f17999d742c53dfb1b36286b119b448550e 

Diff: https://reviews.apache.org/r/51536/diff/


Testing
---

./gradlew build

./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"

./build-support/jenkins/build.sh


Thanks,

Kai Huang



Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled

2016-09-07 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/
---

(Updated Sept. 7, 2016, 10 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Changes
---

@ReviewBot retry update the release notes


Bugs: AURORA-894
https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description
---

- Scheduler updater will not use watch_sec if health check is enabled.

This feature intends to improve reliability and performance of the Aurora 
scheduler job updater by relying on health check status rather than watch_secs 
timeout when deciding an individual instance update state. 

See this epic: https://issues.apache.org/jira/browse/AURORA-894 
and the design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
 for more details and background.

After discussion on Aurora dev list, we decided to keep the watch_secs 
infrastructure intact on scheduler side. Our final conclusion is that we adopt 
the following implementation:

1. If the users want purely health checking driven updates they can set 
watch_secs to 0 and enable health checks.

2. If they want to have both health checking and time driven updates they can 
set watch_secs to the time that they care about, and doing health checks at 
STARTING state as well.

3. If they just want time driven updates they can disable health checking and 
set watch_secs to the time that they care about.

In this review, there will be only one scheduler change: 
Currently scheduler does not accept zero value for watch_secs, we need to relax 
this constraint.

Executor change to do (in a separate review):
The executor starts health check at STARTING, if a successful health check is 
performed before initial_interval_sec expires, the executor will sends a status 
message for RUNNING.


Diffs (updated)
-

  RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
  src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
04551f17999d742c53dfb1b36286b119b448550e 

Diff: https://reviews.apache.org/r/51536/diff/


Testing
---

./gradlew build

./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"

./build-support/jenkins/build.sh


Thanks,

Kai Huang



Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled

2016-09-07 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review148095
---



Master (19866b5) is red with this patch.
  ./build-support/jenkins/build.sh

 # Create file stdout for capturing output. We 
can't use StringIO mock
 # because TestProcess is running fork.
 with open(os.path.join(td, 'sys_stdout'), 
'w+') as stdout:
   with open(os.path.join(td, 'sys_stderr'), 
'w+') as stderr:
 with mutable_sys():
   sys.stdout, sys.stderr = stdout, 
stderr
 
   p = TestProcess('process', 'echo hello 
world; echo >&2 hello stderr', 0,
   taskpath, sandbox, 
logger_destination=LoggerDestination.BOTH)
   p.start()
   rc = 
wait_for_rc(taskpath.getpath('process_checkpoint'))
 
   assert rc == 0
   # Check log files were created in std 
path with correct content
 > assert_log_content(taskpath, 'stdout', 
'hello world\n')
 
 src/test/python/apache/thermos/core/test_process.py:487: 
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
 
 taskpath = 
 log_name = 'stdout'
 expected_content = 'hello world\n'
 
 def assert_log_content(taskpath, log_name, 
expected_content):
   log = 
taskpath.with_filename(log_name).getpath('process_logdir')
   assert os.path.exists(log)
   with open(log, 'r') as fp:
 >   assert fp.read() == expected_content
 E   assert '' == 'hello world\n'
 E + hello world
 
 src/test/python/apache/thermos/core/test_process.py:313: 
AssertionError
  generated xml file: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/dist/test-results/415337499eb72578eab327a6487c1f5c9452b3d6.xml
 
  1 failed, 700 passed, 6 skipped, 1 warnings in 
396.57 seconds 
 
FAILURE


21:44:37 07:22   [complete]
   FAILURE


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 7, 2016, 9:28 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 7, 2016, 9:28 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Scheduler updater will not use watch_sec if health check is enabled.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> After discussion on Aurora dev list, we decided to keep the watch_secs 
> infrastructure intact on scheduler side. Our final conclusion is that we 
> adopt the following implementation:
> 
> 1. If the users want purely health checking driven updates they can set 
> watch_secs to 0 and enable health checks.
> 
> 2. If they want to have both health checking and time driven updates they can 
> set watch_secs to the time that they care about, and doing health checks at 
> STARTING state as well.
> 
> 3. If they just want time driven updates they can disable health checking and 
> set watch_secs to the time that they care about.
> 
> In this review, there will be only one scheduler change: 
> Currently scheduler does not accept zero value for watch_secs, we need to 
> relax this constraint.
> 
> Executor change to do (in a separate review):
> The executor starts health check at STARTING, if a successful health check is 
> 

Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148089
---




api/src/main/thrift/org/apache/aurora/gen/api.thrift (lines 1209 - 1214)


Did we consider combining the "explicit" and "implicit" reconciliation RPCs 
into a single "reconcile" RPC with an optional batchSize argument, so that it 
maps to the mesos master API.

http://mesos.apache.org/documentation/latest/reconciliation/



src/main/python/apache/aurora/admin/admin.py (lines 339 - 344)


How about we set the default value for --batch_size to None. This way 
reconciliation will be implicit when the batch_size is None and explicit if the 
batch_size is not None.



src/main/python/apache/aurora/admin/admin.py (line 342)


'default=0' here and the usage doc reads default as 1000.



src/main/python/apache/aurora/admin/admin.py (line 346)


Add "--batch_size" and "--type" options to usage.


- Santhosh Kumar Shanmugham


On Sept. 7, 2016, 2:08 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 2:08 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled

2016-09-07 Thread Joshua Cohen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review148092
---


Ship it!




Ship It!

- Joshua Cohen


On Sept. 7, 2016, 9:28 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 7, 2016, 9:28 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Scheduler updater will not use watch_sec if health check is enabled.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> After discussion on Aurora dev list, we decided to keep the watch_secs 
> infrastructure intact on scheduler side. Our final conclusion is that we 
> adopt the following implementation:
> 
> 1. If the users want purely health checking driven updates they can set 
> watch_secs to 0 and enable health checks.
> 
> 2. If they want to have both health checking and time driven updates they can 
> set watch_secs to the time that they care about, and doing health checks at 
> STARTING state as well.
> 
> 3. If they just want time driven updates they can disable health checking and 
> set watch_secs to the time that they care about.
> 
> In this review, there will be only one scheduler change: 
> Currently scheduler does not accept zero value for watch_secs, we need to 
> relax this constraint.
> 
> Executor change to do (in a separate review):
> The executor starts health check at STARTING, if a successful health check is 
> performed before initial_interval_sec expires, the executor will sends a 
> status message for RUNNING.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148090
---


Ship it!




Master (19866b5) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On Sept. 7, 2016, 9:08 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 7, 2016, 9:08 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled

2016-09-07 Thread Kai Huang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/
---

(Updated Sept. 7, 2016, 9:28 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Changes
---

Update release notes.


Bugs: AURORA-894
https://issues.apache.org/jira/browse/AURORA-894


Repository: aurora


Description
---

- Scheduler updater will not use watch_sec if health check is enabled.

This feature intends to improve reliability and performance of the Aurora 
scheduler job updater by relying on health check status rather than watch_secs 
timeout when deciding an individual instance update state. 

See this epic: https://issues.apache.org/jira/browse/AURORA-894 
and the design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
 for more details and background.

After discussion on Aurora dev list, we decided to keep the watch_secs 
infrastructure intact on scheduler side. Our final conclusion is that we adopt 
the following implementation:

1. If the users want purely health checking driven updates they can set 
watch_secs to 0 and enable health checks.

2. If they want to have both health checking and time driven updates they can 
set watch_secs to the time that they care about, and doing health checks at 
STARTING state as well.

3. If they just want time driven updates they can disable health checking and 
set watch_secs to the time that they care about.

In this review, there will be only one scheduler change: 
Currently scheduler does not accept zero value for watch_secs, we need to relax 
this constraint.

Executor change to do (in a separate review):
The executor starts health check at STARTING, if a successful health check is 
performed before initial_interval_sec expires, the executor will sends a status 
message for RUNNING.


Diffs (updated)
-

  RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
  src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
04551f17999d742c53dfb1b36286b119b448550e 

Diff: https://reviews.apache.org/r/51536/diff/


Testing
---

./gradlew build

./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"

./build-support/jenkins/build.sh


Thanks,

Kai Huang



Re: Review Request 51536: Scheduler updater will not use watch_sec if health check is enabled

2016-09-07 Thread Kai Huang


> On Sept. 7, 2016, 8:18 p.m., Joshua Cohen wrote:
> > This change should probably be called out in RELEASE_NOTES.md?

This feature will not be available to users until we relax the client-side 
constraint. We can add it to RELEASE_NOTES when the executor/client feature is 
ready.


- Kai


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51536/#review148075
---


On Sept. 6, 2016, 6:46 p.m., Kai Huang wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51536/
> ---
> 
> (Updated Sept. 6, 2016, 6:46 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-894
> https://issues.apache.org/jira/browse/AURORA-894
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> - Scheduler updater will not use watch_sec if health check is enabled.
> 
> This feature intends to improve reliability and performance of the Aurora 
> scheduler job updater by relying on health check status rather than 
> watch_secs timeout when deciding an individual instance update state. 
> 
> See this epic: https://issues.apache.org/jira/browse/AURORA-894 
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> After discussion on Aurora dev list, we decided to keep the watch_secs 
> infrastructure intact on scheduler side. Our final conclusion is that we 
> adopt the following implementation:
> 
> 1. If the users want purely health checking driven updates they can set 
> watch_secs to 0 and enable health checks.
> 
> 2. If they want to have both health checking and time driven updates they can 
> set watch_secs to the time that they care about, and doing health checks at 
> STARTING state as well.
> 
> 3. If they just want time driven updates they can disable health checking and 
> set watch_secs to the time that they care about.
> 
> In this review, there will be only one scheduler change: 
> Currently scheduler does not accept zero value for watch_secs, we need to 
> relax this constraint.
> 
> Executor change to do (in a separate review):
> The executor starts health check at STARTING, if a successful health check is 
> performed before initial_interval_sec expires, the executor will sends a 
> status message for RUNNING.
> 
> 
> Diffs
> -
> 
>   src/main/java/org/apache/aurora/scheduler/updater/UpdateFactory.java 
> ac8df3e5a2da8cf22e1ba8a90944546e19ccdcaa 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
> 
> Diff: https://reviews.apache.org/r/51536/diff/
> 
> 
> Testing
> ---
> 
> ./gradlew build
> 
> ./gradlew :test --tests "org.apache.aurora.scheduler.updater.JobUpdaterIT"
> 
> ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Kai Huang
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Karthik Anantha Padmanabhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/
---

(Updated Sept. 7, 2016, 9:08 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Repository: aurora


Description
---

AURORA-1602 Aurora admin commands for reconcilation


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
c5765b70501c101f0535b4eed94e9948c36808f9 
  src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
3275d72a0a74909e635bce615e2036ec72aa38ee 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
929d021a336c6a3438613c9340c84a1096dc9069 
  src/main/python/apache/aurora/admin/admin.py 
76009b9c1c7a130c25abad544a176dc590dafb12 
  src/main/python/apache/aurora/client/api/__init__.py 
9149c3018ae58d405f284fcbd4076d251ccc8192 
  
src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
 b9317dc20456f90057ec2bf4d10619a5ae986187 
  
src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
 779dc302602ae8842084807ca868a491ea99b676 
  src/test/python/apache/aurora/admin/test_admin.py 
eb193c4a33f5275f05a995338446bec0e19bc1cf 
  src/test/python/apache/aurora/client/api/test_scheduler_client.py 
afac2500551af2fce406bb906aa4e33f353e90a1 

Diff: https://reviews.apache.org/r/51662/diff/


Testing
---

* Manually tested on my local vagrant installation
* ./build-support/jenkins/build.sh


Thanks,

Karthik Anantha Padmanabhan



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Karthik Anantha Padmanabhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/
---

(Updated Sept. 7, 2016, 9:01 p.m.)


Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Changes
---

Address code review comments


Repository: aurora


Description
---

AURORA-1602 Aurora admin commands for reconcilation


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
c5765b70501c101f0535b4eed94e9948c36808f9 
  src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
3275d72a0a74909e635bce615e2036ec72aa38ee 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
929d021a336c6a3438613c9340c84a1096dc9069 
  src/main/python/apache/aurora/admin/admin.py 
76009b9c1c7a130c25abad544a176dc590dafb12 
  src/main/python/apache/aurora/client/api/__init__.py 
9149c3018ae58d405f284fcbd4076d251ccc8192 
  
src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
 b9317dc20456f90057ec2bf4d10619a5ae986187 
  
src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
 779dc302602ae8842084807ca868a491ea99b676 
  src/test/python/apache/aurora/admin/test_admin.py 
eb193c4a33f5275f05a995338446bec0e19bc1cf 
  src/test/python/apache/aurora/client/api/test_scheduler_client.py 
afac2500551af2fce406bb906aa4e33f353e90a1 

Diff: https://reviews.apache.org/r/51662/diff/


Testing
---

* Manually tested on my local vagrant installation
* ./build-support/jenkins/build.sh


Thanks,

Karthik Anantha Padmanabhan



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Karthik Anantha Padmanabhan


> On Sept. 7, 2016, 6:14 p.m., Maxim Khutornenko wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, lines 944-950
> > 
> >
> > I don't think there is much value in having this struct and the related 
> > `reconcileStatus` RPC. Cluster operators are normally relying on scheduler 
> > stats to monitor activities like this.

Cool makes sense


> On Sept. 7, 2016, 6:14 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java,
> >  lines 135-143
> > 
> >
> > This only prevents overlapping explicit or implicit runs but does not 
> > prevent explicit and implicit overlapping runs. Also, this does not cancel 
> > any calls already waiting for a lock.
> > 
> > Given that we deal with admin-triggered events (who are expected to 
> > know what they do), I suggest we simplify the logic here by simply 
> > scheduling an out of band run and not try to synchronize anything around 
> > reconciliation runs. This would let you cut the bulk of the changes below.

okay agree.


> On Sept. 7, 2016, 6:14 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/admin/admin.py, line 355
> > 
> >
> > The default here would be 0 if you accept my suggestion above.

Okay I am assuming that an operator will never need to explicitly run 
reconcilation with a batch size of 0. Else I could choose -1 ?


- Karthik


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148046
---


On Sept. 6, 2016, 11:34 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 6, 2016, 11:34 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/admin/admin_util.py 
> 394deb57af9ad8832a02ceab15f33b3c1e5c902b 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51662: AURORA-1602 Aurora admin commands for reconcilation

2016-09-07 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51662/#review148046
---




api/src/main/thrift/org/apache/aurora/gen/api.thrift (lines 944 - 950)


I don't think there is much value in having this struct and the related 
`reconcileStatus` RPC. Cluster operators are normally relying on scheduler 
stats to monitor activities like this.



api/src/main/thrift/org/apache/aurora/gen/api.thrift (line 1223)


How about `triggerExplicitTaskReconciliation` and 
`triggerImplicitTaskReconciliation`? This would be self-explanatory to any API 
consumer.

nit: please end all comments with '.'



src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
(line 135)


Suggest defaulting to the configured `settings.explicitBatchSize` in case 
the provided one is 0. This will allow the correspondent admin command to make 
batch size completely optional.



src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
(lines 135 - 143)


This only prevents overlapping explicit or implicit runs but does not 
prevent explicit and implicit overlapping runs. Also, this does not cancel any 
calls already waiting for a lock.

Given that we deal with admin-triggered events (who are expected to know 
what they do), I suggest we simplify the logic here by simply scheduling an out 
of band run and not try to synchronize anything around reconciliation runs. 
This would let you cut the bulk of the changes below.



src/main/python/apache/aurora/admin/admin.py (line 355)


The default here would be 0 if you accept my suggestion above.


- Maxim Khutornenko


On Sept. 6, 2016, 11:34 p.m., Karthik Anantha Padmanabhan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51662/
> ---
> 
> (Updated Sept. 6, 2016, 11:34 p.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> AURORA-1602 Aurora admin commands for reconcilation
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   
> src/main/java/org/apache/aurora/scheduler/reconciliation/TaskReconciler.java 
> 3275d72a0a74909e635bce615e2036ec72aa38ee 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   src/main/python/apache/aurora/admin/admin.py 
> 76009b9c1c7a130c25abad544a176dc590dafb12 
>   src/main/python/apache/aurora/admin/admin_util.py 
> 394deb57af9ad8832a02ceab15f33b3c1e5c902b 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   
> src/test/java/org/apache/aurora/scheduler/reconciliation/TaskReconcilerTest.java
>  b9317dc20456f90057ec2bf4d10619a5ae986187 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/python/apache/aurora/admin/test_admin.py 
> eb193c4a33f5275f05a995338446bec0e19bc1cf 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> afac2500551af2fce406bb906aa4e33f353e90a1 
> 
> Diff: https://reviews.apache.org/r/51662/diff/
> 
> 
> Testing
> ---
> 
> * Manually tested on my local vagrant installation
> * ./build-support/jenkins/build.sh
> 
> 
> Thanks,
> 
> Karthik Anantha Padmanabhan
> 
>



Re: Review Request 51384: Introduce UpdateMetadata fields in JobUpdateRequest.

2016-09-07 Thread Santhosh Kumar Shanmugham


> On Sept. 7, 2016, 10:36 a.m., Maxim Khutornenko wrote:
> > Ship It!

Thank you.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51384/#review148035
---


On Sept. 6, 2016, 8:29 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51384/
> ---
> 
> (Updated Sept. 6, 2016, 8:29 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Maxim 
> Khutornenko.
> 
> 
> Bugs: AURORA-1711
> https://issues.apache.org/jira/browse/AURORA-1711
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Allow custom metadata to be stored with the job updates. This feature
> is to help case the reconciliation on job update request on the clients.
> 
> Today when a client's requests are proxied via a middle-man,
> if the middle-man times out before the scheduler responds with success,
> the client is left the only option of retry. In the case of updates,
> since multiple updates are not allowed in parallel, the retries fail,
> with INVALID_REQUEST. Although the original update is already accepted
> by the scheduler and is in-progress.
> 
> With this feature, clients can reconcile the job updates,
> - by marking updates them with a unique id in the metadata field
> - scheduler returns the in-progress job update in its response
> - client can match the client-generated ids to reconcile its state.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/jmh/java/org/apache/aurora/benchmark/JobUpdates.java 
> f4f8d0037751c9c2096747264c19f6292461b308 
>   src/jmh/java/org/apache/aurora/benchmark/UpdateStoreBenchmarks.java 
> e5228ae9baadb8cad1e5ce143df09426d6107583 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java 
> d2673e6b328cb1e249fbe91d18e0d9e935636eaa 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.java
>  a3b04949f1726f110638e4f93ef45947cdb9e7f8 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/migration/V008_CreateUpdateMetadataTable.java
>  PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  594bb6219294dcc77d48dcad14e2a6f9caa0c534 
>   
> src/main/java/org/apache/aurora/scheduler/updater/UpdateInProgressException.java
>  PRE-CREATION 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   src/main/python/apache/aurora/client/cli/update.py 
> d23243dcf93dd82f66b4c8cc4342bd2c8d2adc79 
>   src/main/python/apache/aurora/client/hooks/hooked_api.py 
> 542f165add0f1d01a782fe6d6089bff3e736fb82 
>   
> src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml
>  8563630a179cb6d1e3b0e22c730ccbfe1c9291e2 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql 
> a40830fd668aa650c22a48cbc757b45aef27e961 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStoreTest.java
>  08530397ff75081bde6f07f9d53317b5486e0da4 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
>   src/test/python/apache/aurora/client/cli/test_supdate.py 
> 92fb039fb7d3e368d7c0dfa5ebdb465c7f112416 
>   src/test/python/apache/aurora/client/cli/util.py 
> aac9f9c802e4ad1f06cee8cf3f56749760b33af5 
> 
> Diff: https://reviews.apache.org/r/51384/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> ./pants test src/test/python/apache/aurora/client/cli:cli
> 
> 
> Thanks,
> 
> Santhosh Kumar Shanmugham
> 
>



Re: Review Request 51384: Introduce UpdateMetadata fields in JobUpdateRequest.

2016-09-07 Thread Maxim Khutornenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51384/#review148035
---


Ship it!




Ship It!

- Maxim Khutornenko


On Sept. 7, 2016, 3:29 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51384/
> ---
> 
> (Updated Sept. 7, 2016, 3:29 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Maxim 
> Khutornenko.
> 
> 
> Bugs: AURORA-1711
> https://issues.apache.org/jira/browse/AURORA-1711
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Allow custom metadata to be stored with the job updates. This feature
> is to help case the reconciliation on job update request on the clients.
> 
> Today when a client's requests are proxied via a middle-man,
> if the middle-man times out before the scheduler responds with success,
> the client is left the only option of retry. In the case of updates,
> since multiple updates are not allowed in parallel, the retries fail,
> with INVALID_REQUEST. Although the original update is already accepted
> by the scheduler and is in-progress.
> 
> With this feature, clients can reconcile the job updates,
> - by marking updates them with a unique id in the metadata field
> - scheduler returns the in-progress job update in its response
> - client can match the client-generated ids to reconcile its state.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/jmh/java/org/apache/aurora/benchmark/JobUpdates.java 
> f4f8d0037751c9c2096747264c19f6292461b308 
>   src/jmh/java/org/apache/aurora/benchmark/UpdateStoreBenchmarks.java 
> e5228ae9baadb8cad1e5ce143df09426d6107583 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java 
> d2673e6b328cb1e249fbe91d18e0d9e935636eaa 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.java
>  a3b04949f1726f110638e4f93ef45947cdb9e7f8 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/migration/V008_CreateUpdateMetadataTable.java
>  PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  594bb6219294dcc77d48dcad14e2a6f9caa0c534 
>   
> src/main/java/org/apache/aurora/scheduler/updater/UpdateInProgressException.java
>  PRE-CREATION 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   src/main/python/apache/aurora/client/cli/update.py 
> d23243dcf93dd82f66b4c8cc4342bd2c8d2adc79 
>   src/main/python/apache/aurora/client/hooks/hooked_api.py 
> 542f165add0f1d01a782fe6d6089bff3e736fb82 
>   
> src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml
>  8563630a179cb6d1e3b0e22c730ccbfe1c9291e2 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql 
> a40830fd668aa650c22a48cbc757b45aef27e961 
>   
> src/test/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStoreTest.java
>  08530397ff75081bde6f07f9d53317b5486e0da4 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImplTest.java
>  a7d1f74acdfe5f58eb822a4d826e920cad53dced 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
>  779dc302602ae8842084807ca868a491ea99b676 
>   src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java 
> 04551f17999d742c53dfb1b36286b119b448550e 
>   src/test/python/apache/aurora/client/cli/test_supdate.py 
> 92fb039fb7d3e368d7c0dfa5ebdb465c7f112416 
>   src/test/python/apache/aurora/client/cli/util.py 
> aac9f9c802e4ad1f06cee8cf3f56749760b33af5 
> 
> Diff: https://reviews.apache.org/r/51384/diff/
> 
> 
> Testing
> ---
> 
> ./build-support/jenkins/build.sh
> ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
> ./pants test src/test/python/apache/aurora/client/cli:cli
> 
> 
> Thanks,
> 
> Santhosh Kumar Shanmugham
> 
>



Re: Review Request 51384: Introduce UpdateMetadata fields in JobUpdateRequest.

2016-09-07 Thread Maxim Khutornenko


> On Sept. 2, 2016, 9:37 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java,
> >  line 913
> > 
> >
> > This will be a null ref if scheduler is called from a v-1 client that 
> > does not know anything about update metadata. You need to do something like 
> > this:
> > ```
> > .setMetadata(request.isSetMetadata() 
> > ? toBuildersSet(request.getMetadata()) 
> > : ImmutableSet.of())
> > ```
> 
> Santhosh Kumar Shanmugham wrote:
> Thrift does not expose isSetMetadata() method for optional fileds of 
> primitive type in the IJobUpdateRequest. It is available however in 
> JobUpdateRequest.java and the earlier code that converts from 
> JobUpdateRequest to IJobUpdateRequest does the check, via 
> IJobUpdateRequest.build().
> 
> // SchedulerThrift.java
>   request = IJobUpdateRequest.build(new 
> JobUpdateRequest(mutableRequest).setTaskConfig(
>   configurationManager.validateAndPopulate(
>   
> ITaskConfig.build(mutableRequest.getTaskConfig())).newBuilder()));
>   
> // IJobUpdateRequest.java
>   private IJobUpdateRequest(JobUpdateRequest wrapped) {
> ...
> this.metadata = wrapped.isSetMetadata()
> ? FluentIterable.from(wrapped.getMetadata())
>   .transform(IMetadata::build)
>   .toSet()
> : ImmutableSet.of();
>   }
>   
>   public static IJobUpdateRequest build(JobUpdateRequest wrapped) {
> return new IJobUpdateRequest(wrapped);
>   }

That's right, I was mistakenly assuming that was a `JobUpdateRequest` but that 
is tracked under `mutableRequest`.


- Maxim


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51384/#review147694
---


On Sept. 7, 2016, 3:29 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51384/
> ---
> 
> (Updated Sept. 7, 2016, 3:29 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Maxim 
> Khutornenko.
> 
> 
> Bugs: AURORA-1711
> https://issues.apache.org/jira/browse/AURORA-1711
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Allow custom metadata to be stored with the job updates. This feature
> is to help case the reconciliation on job update request on the clients.
> 
> Today when a client's requests are proxied via a middle-man,
> if the middle-man times out before the scheduler responds with success,
> the client is left the only option of retry. In the case of updates,
> since multiple updates are not allowed in parallel, the retries fail,
> with INVALID_REQUEST. Although the original update is already accepted
> by the scheduler and is in-progress.
> 
> With this feature, clients can reconcile the job updates,
> - by marking updates them with a unique id in the metadata field
> - scheduler returns the in-progress job update in its response
> - client can match the client-generated ids to reconcile its state.
> 
> 
> Diffs
> -
> 
>   RELEASE-NOTES.md d79aaadc197697d09a71c83494a01765d6a983d4 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/jmh/java/org/apache/aurora/benchmark/JobUpdates.java 
> f4f8d0037751c9c2096747264c19f6292461b308 
>   src/jmh/java/org/apache/aurora/benchmark/UpdateStoreBenchmarks.java 
> e5228ae9baadb8cad1e5ce143df09426d6107583 
>   src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java 
> d2673e6b328cb1e249fbe91d18e0d9e935636eaa 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.java
>  a3b04949f1726f110638e4f93ef45947cdb9e7f8 
>   
> src/main/java/org/apache/aurora/scheduler/storage/db/migration/V008_CreateUpdateMetadataTable.java
>  PRE-CREATION 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  929d021a336c6a3438613c9340c84a1096dc9069 
>   
> src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java
>  594bb6219294dcc77d48dcad14e2a6f9caa0c534 
>   
> src/main/java/org/apache/aurora/scheduler/updater/UpdateInProgressException.java
>  PRE-CREATION 
>   src/main/python/apache/aurora/client/api/__init__.py 
> 9149c3018ae58d405f284fcbd4076d251ccc8192 
>   src/main/python/apache/aurora/client/cli/update.py 
> d23243dcf93dd82f66b4c8cc4342bd2c8d2adc79 
>   src/main/python/apache/aurora/client/hooks/hooked_api.py 
> 542f165add0f1d01a782fe6d6089bff3e736fb82 
>   
> src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml
>  

Re: Review Request 51384: Introduce UpdateMetadata fields in JobUpdateRequest.

2016-09-07 Thread Santhosh Kumar Shanmugham


> On Sept. 2, 2016, 2:37 p.m., Maxim Khutornenko wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift, line 774
> > 
> >
> > nit: all comments should be ended with '.'. Here and everywhere.

Fixed.


> On Sept. 2, 2016, 2:37 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/storage/db/DbJobUpdateStore.java, 
> > line 102
> > 
> >
> > `checkNotBlank` is redundant here as you enter this block only if it's 
> > not empty.

Fixed.


> On Sept. 2, 2016, 2:37 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java,
> >  lines 932-934
> > 
> >
> > tabbing is off here

Fixed.


> On Sept. 2, 2016, 2:37 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/client/cli/context.py, line 79
> > 
> >
> > This is only used in update.py, please move it there.

Fixed.


> On Sept. 2, 2016, 2:37 p.m., Maxim Khutornenko wrote:
> > src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java,
> >  line 913
> > 
> >
> > This will be a null ref if scheduler is called from a v-1 client that 
> > does not know anything about update metadata. You need to do something like 
> > this:
> > ```
> > .setMetadata(request.isSetMetadata() 
> > ? toBuildersSet(request.getMetadata()) 
> > : ImmutableSet.of())
> > ```

Thrift does not expose isSetMetadata() method for optional fileds of primitive 
type in the IJobUpdateRequest. It is available however in JobUpdateRequest.java 
and the earlier code that converts from JobUpdateRequest to IJobUpdateRequest 
does the check, via IJobUpdateRequest.build().

// SchedulerThrift.java
  request = IJobUpdateRequest.build(new 
JobUpdateRequest(mutableRequest).setTaskConfig(
  configurationManager.validateAndPopulate(
  ITaskConfig.build(mutableRequest.getTaskConfig())).newBuilder()));
  
// IJobUpdateRequest.java
  private IJobUpdateRequest(JobUpdateRequest wrapped) {
...
this.metadata = wrapped.isSetMetadata()
? FluentIterable.from(wrapped.getMetadata())
  .transform(IMetadata::build)
  .toSet()
: ImmutableSet.of();
  }
  
  public static IJobUpdateRequest build(JobUpdateRequest wrapped) {
return new IJobUpdateRequest(wrapped);
  }


> On Sept. 2, 2016, 2:37 p.m., Maxim Khutornenko wrote:
> > src/main/resources/org/apache/aurora/scheduler/storage/db/JobUpdateDetailsMapper.xml,
> >  lines 357-358
> > 
> >
> > Instead of sub-selecting from a left join, have you considered a 
> > collection with a subselect instead? See `details.updateEvents` for 
> > example. 
> > 
> > It should be more compact and most importantly, perform better on large 
> > metadata counts. You can validate the last assumption via `./gradlew jmh 
> > -Pbenchmarks='UpdateStoreBenchmarks.JobDetailsBenchmark'` or create a new 
> > benchmark to test fetching job summaries only.

Updated to use a sub-select instead of the join.

Added a benchmark test. Avoiding joins and using sub-selects appear to help out 
in the throughput. Beyond 5000 metadata pairs the throughput flattens out to 1 
op/sec.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51384/#review147694
---


On Sept. 6, 2016, 8:29 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51384/
> ---
> 
> (Updated Sept. 6, 2016, 8:29 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Maxim 
> Khutornenko.
> 
> 
> Bugs: AURORA-1711
> https://issues.apache.org/jira/browse/AURORA-1711
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> Allow custom metadata to be stored with the job updates. This feature
> is to help case the reconciliation on job update request on the clients.
> 
> Today when a client's requests are proxied via a middle-man,
> if the middle-man times out before the scheduler responds with success,
> the client is left the only option of retry. In the case of updates,
> since multiple updates are not allowed in parallel, the retries fail,
> with INVALID_REQUEST. Although the original update is already accepted
> by the scheduler and is