brian wickman created AURORA-698:
------------------------------------

             Summary: aurora executor _shutdown deadline calls should be daemonized
                 Key: AURORA-698
                 URL: https://issues.apache.org/jira/browse/AURORA-698
             Project: Aurora
          Issue Type: Bug
          Components: Executor
            Reporter: brian wickman


In the aurora executor shutdown method, we have deadline() calls:

{noformat}
  def _shutdown(self, status_result):
    runner_status = self._runner.status

    try:
      deadline(self._runner.stop, timeout=self.STOP_TIMEOUT)
    except Timeout:
      log.error('Failed to stop runner within deadline.')

    try:
      deadline(self._chained_checker.stop, timeout=self.STOP_TIMEOUT)
    except Timeout:
      log.error('Failed to stop all checkers within deadline.')

    # If the runner was alive when _shutdown was called, defer to the status_result,
    # otherwise the runner's terminal state is the preferred state.
    exit_status = runner_status or status_result

    self.send_update(
        self._driver,
        self._task_id,
        exit_status.status,
        status_result.reason)

    self.terminated.set()
    defer(self._driver.stop, delay=self.PERSISTENCE_WAIT)
{noformat}

However, if runner.stop does not return before the deadline and deadline()
raises Timeout, the AnonymousThread spawned to run it is not daemonized, so
it keeps the executor from exiting. This means that the cgroup will not be
torn down, and if runner.stop actually failed, the process can stay alive
even though TASK_KILLED was delivered.
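
One way to picture the fix is a deadline helper that marks its worker thread
as a daemon. The sketch below is a stdlib-only stand-in, not the executor's
actual deadline() implementation: the daemon parameter, the Timeout class
and the helper body are all assumptions made for illustration, and the
timeout here is a plain number of seconds.

```python
import threading


class Timeout(Exception):
  """Raised when the closure does not finish within the deadline."""


def deadline(closure, timeout, daemon=False):
  """Run closure in a helper thread and wait at most `timeout` seconds.

  Hypothetical stand-in for the helper used in _shutdown. If daemon is
  True, a worker that overruns the deadline cannot keep the process alive.
  Exceptions raised inside the closure are not propagated in this sketch.
  """
  result = []

  def target():
    result.append(closure())

  worker = threading.Thread(target=target)
  # The daemon flag must be set before start(); non-daemon threads block
  # interpreter exit until they finish.
  worker.daemon = daemon
  worker.start()
  worker.join(timeout)
  if worker.is_alive():
    raise Timeout('closure did not finish within %s seconds' % timeout)
  return result[0] if result else None
```

With daemon=True, a closure that never returns still raises Timeout in the
caller, but the leftover thread no longer blocks process exit. The analogous
change in _shutdown would be to pass such a flag to both deadline() calls,
assuming the real helper accepts one.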



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
