> On Sept. 21, 2016, 5:34 p.m., Maxim Khutornenko wrote:
> > src/main/python/apache/aurora/executor/common/health_checker.py, lines 
> > 279-283
> > <https://reviews.apache.org/r/51876/diff/3/?file=1505370#file1505370line279>
> >
> >     This logic needs to be refactored to reduce duplication and reuse 
> > what's now in `is_health_check_enabled` as much as possible. Ideally, we 
> > should have the only place we extract `health_checker` and the like.
> 
> Kai Huang wrote:
>     The complexity is that the health checker did some computation to 
> calculate the port map while going through the if condition. It would be 
> helpful to introduce a utitlity class to store all the side-effect values 
> like port map.

I had two concerns here:
1. If we decide to extract the health_check_enabled() from health_checker.py, 
and put it into task_info.py, this will result in a circular dependency, 
because is_health_check_enabled() function requires some string constants and 
variables defined in health_checker.py, which in turn requires task_info.py. 
This explains why I move the string constants to api.thrift as global constants.

2. It seems that is_health_check_enabled function will always be an overkill. 
We cannot just check if health check is enabled for a task, without computing 
the port maps and extracting the health_checker and health_check_config from 
it. It's coupled with the setting up of a health checker. To eliminate 
duplication, Joshua and I discussed the idea to check if health check is 
enabled while we set up the necessary health checkers. However this approach 
results in flaky tests, probably because it is too heavy-weighted.


- Kai


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/#review149846
-----------------------------------------------------------


On Sept. 20, 2016, 12:25 a.m., Kai Huang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51876/
> -----------------------------------------------------------
> 
> (Updated Sept. 20, 2016, 12:25 a.m.)
> 
> 
> Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.
> 
> 
> Bugs: AURORA-1225
>     https://issues.apache.org/jira/browse/AURORA-1225
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Modify executor state transition logic to rely on health checks (if enabled).
> 
> [Summary]
> Executor needs to start executing user content in STARTING and transition to 
> RUNNING when a successful required number of health checks is reached.
> 
> This review contains a series of executor changes that implement the health 
> check driven updates. It gives more context of the design of this feature.
> 
> [Background]
> Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
> and the design doc: 
> https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
>  for more details and background.
> 
> [Description]
> If health check is enabled on vCurrent executor, the health checker will send 
> a "TASK_RUNNING" message when a successful required number of health checks 
> is reached within the initial_interval_secs. On the other hand, a 
> "TASK_FAILED" message was sent if the health checker fails to reach the 
> required number of health checks within that period, or a maximum number of 
> failed health check limit is reached after the initital_interval_secs.
> 
> If health check is disabled on the vCurrent executor, it will sends 
> "TASK_RUNNING" message to scheduler after the thermos runner was started. In 
> this scenario, the behavior of vCurrent executor will be the same as the 
> vPrev executor.
> 
> [Change List]
> The current change set includes:
> 1. Removed the status memoization in ChainedStatusChecker.
> 2. Modified the StatusManager to be edge triggered.
> 3. Changed the Aurora Executor callback function.
> 4. Modified the Health Checker and redefined the meaning 
> initial_interval_secs.
> 
> 
> Diffs
> -----
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> c5765b70501c101f0535b4eed94e9948c36808f9 
>   src/main/python/apache/aurora/executor/aurora_executor.py 
> ce5ef680f01831cd89fced8969ae3246c7f60cfd 
>   src/main/python/apache/aurora/executor/common/health_checker.py 
> 5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
>   src/main/python/apache/aurora/executor/common/status_checker.py 
> 795dae2d6b661fc528d952c2315196d94127961f 
>   src/main/python/apache/aurora/executor/common/task_info.py 
> 4ef49e30eeb942886d02c1fb448055264f9aa874 
>   src/main/python/apache/aurora/executor/status_manager.py 
> 228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
>   src/test/python/apache/aurora/executor/common/test_health_checker.py 
> bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
>   src/test/python/apache/aurora/executor/common/test_status_checker.py 
> 5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
>   src/test/python/apache/aurora/executor/common/test_task_info.py 
> 0d9cc5cb340d697a887d8e001f23c948f4fa70c7 
>   src/test/python/apache/aurora/executor/test_status_manager.py 
> ce4679ba1aa7b42cf0115c943d84663030182d23 
>   src/test/python/apache/aurora/executor/test_thermos_executor.py 
> 0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 
> 
> Diff: https://reviews.apache.org/r/51876/diff/
> 
> 
> Testing
> -------
> 
> ./build-support/jenkins/build.sh
> 
> ./pants test.pytest src/test/python/apache/aurora/executor::
> 
> 
> Thanks,
> 
> Kai Huang
> 
>

Reply via email to