[
https://issues.apache.org/jira/browse/AURORA-669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118663#comment-14118663
]
brian wickman commented on AURORA-669:
--------------------------------------
https://reviews.apache.org/r/25175/ is out for review (it really is just
changing 'if timeout' to 'if timeout is not None'), but it contains no
regression test. Writing that test is my top priority right now, but it
might not get out today.
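
For anyone following along, here is a minimal sketch of why the one-line
change matters. It is not the actual runner.py code: count_iterations, step
and cap are hypothetical names, and the cap exists only so the buggy variant
terminates in a demo.

```python
def count_iterations(timeout, fixed, step=0.1, cap=1000):
    """Hypothetical stand-in for the polling loop; returns iterations run."""
    total_time = 0.0
    for iterations in range(1, cap + 1):
        total_time += step  # pretend we waited `step` seconds
        if fixed:
            # Explicit None check: a timeout of 0 is honored.
            expired = timeout is not None and total_time >= timeout
        else:
            # Truthiness check: a timeout of 0 is falsy, so this never fires.
            expired = bool(timeout) and total_time >= timeout
        if expired:
            return iterations
    return cap  # safety cap hit: without it the loop would spin forever


print(count_iterations(0, fixed=False))  # runs until the cap
print(count_iterations(0, fixed=True))   # stops on the first pass
```

With timeout=None both variants keep looping, which is presumably the
intended "no limit" behavior; only timeout=0 behaves differently.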
> Thermos runner collect_updates() gets stuck in a while loop when timeout is 0
> -----------------------------------------------------------------------------
>
> Key: AURORA-669
> URL: https://issues.apache.org/jira/browse/AURORA-669
> Project: Aurora
> Issue Type: Bug
> Components: Thermos
> Reporter: Maxim Khutornenko
> Assignee: brian wickman
>
> The following code in runner.py:collect_updates() may result in an
> infinite while loop when the timeout is passed as 0:
> {noformat}
> while True:
>   ...
>   if timeout and total_time >= timeout:
>     break
>   ...
> {noformat}
> We have observed a case where the thermos runner gets stuck in a
> "deadlocked" state, not reacting to SIGTERM, with the last message in
> __main__.log being:
> {noformat}
> D0827 15:35:26.022495 30886 runner.py:856] Run loop: Work to be done within 0.0s
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)