ephraimbuddy edited a comment on pull request #17654:
URL: https://github.com/apache/airflow/pull/17654#issuecomment-900387355


   > I can’t say I understand the full rationale of the original 
implementation, but one reason `return_code` should be called here is to ensure 
it’s called at least once during the task runner’s lifetime. Otherwise, if we 
terminate a runner without it ever heartbeating, we would incorrectly set 
`self._rc = -9` even if the underlying process ended cleanly.
   > 
   > Maybe the correct approach to test this is to say 
`StandardTaskRunner.return_code()` is guaranteed to be called one additional 
time when it terminates. Otherwise, we’d want to do some refactoring to ensure 
we have the underlying’s return code without calling `return_code()`.
   
   
   For me, it's better to set `self._rc = -9` than call the `return_code` here. 
Calling the `return_code` looks like pretending that the LocalTaskJob called 
it. If LocalTaskJob fails to call the return_code, then `self._rc` should be 
set to -9 so that we don't wait for the process twice as mentioned in 
`return_code` method docstring. 
   
   >  if we terminate a runner without it ever heartbeating, we would 
incorrectly set `self._rc = -9` even if the underlying process ended cleanly.
   
   I think that if we end a runner without it heartbeating, we should 
explicitly use SIGKILL...that’s the -9 used in the terminate method. Because I 
can’t think of scenario where it will not heartbeat and end cleanly?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to