[ https://issues.apache.org/jira/browse/AURORA-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956568#comment-15956568 ]
Stephan Erb commented on AURORA-1909:
-------------------------------------
This is now on master. Thanks for your contribution!
{code}
commit 7678d194f918143d5e8d771796e7dfbaabc931e7
Author: Charles Raimbert <[email protected]>
Date: Wed Apr 5 11:25:03 2017 +0200
Fix Thermos Health Check for MesosContainerizer with `--nosetuid-health-checks`

With MesosContainerizer, the health check is performed using a
"mesos-containerizer launch" process, but there is a bug in how the code
determines the user under which to run the health check process:
https://github.com/apache/aurora/blob/master/src/main/python/apache/aurora/executor/common/health_checker.py#L370
```
health_check_user = (os.getusername() if self._nosetuid_health_checks
                     else assigned_task.task.job.role)
```
If the scheduler is configured with `--nosetuid-health-checks`, then
"os.getusername()" is executed, but the Python "os" module has no
"getusername()" function, which causes the Thermos execution to abort as
follows:
```
D0323 01:08:15.453372 16 aurora_executor.py:159] Task started.
E0323 01:08:15.571124 16 aurora_executor.py:121] Traceback (most recent call last):
  File "apache/aurora/executor/aurora_executor.py", line 119, in _run
    self._start_status_manager(driver, assigned_task)
  File "apache/aurora/executor/aurora_executor.py", line 168, in _start_status_manager
    status_checker = status_provider.from_assigned_task(assigned_task, self._sandbox)
  File "apache/aurora/executor/common/health_checker.py", line 370, in from_assigned_task
    health_check_user = (os.getusername() if self._nosetuid_health_checks
AttributeError: 'module' object has no attribute 'getusername'
```
Following the existing unit testing pattern from test_health_checker.py, a
test case was added to cover the `--nosetuid-health-checks` case for
MesosContainerizer.
Bugs closed: AURORA-1909
Reviewed at https://reviews.apache.org/r/58167/
src/main/python/apache/aurora/executor/common/health_checker.py | 3 ++-
src/test/python/apache/aurora/executor/common/test_health_checker.py | 185 ++++++++++++++++++++++-------------
2 files changed, 120 insertions(+), 68 deletions(-)
{code}
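For readers wondering why this only surfaced with `--nosetuid-health-checks` set: a Python conditional expression evaluates only the branch it selects, so the missing attribute is never looked up on the default path. The following is a minimal sketch of that behavior outside Thermos. `pick_health_check_user` and the role name are hypothetical stand-ins for illustration, not Aurora code.

```python
import os


def pick_health_check_user(nosetuid_health_checks, role):
    """Hypothetical stand-in that mirrors the shape of the buggy expression."""
    # Only the selected branch of a conditional expression is evaluated, so
    # os.getusername is looked up only when the flag is set.
    return os.getusername() if nosetuid_health_checks else role


# Without the flag, the broken branch is never evaluated and the role is used.
default_user = pick_health_check_user(False, "www-data")

# With the flag, the attribute lookup on the os module raises AttributeError.
try:
    pick_health_check_user(True, "www-data")
    failed = False
except AttributeError:
    failed = True
```

This is also why a dedicated test case for the flag matters: tests that exercise only the default path never touch the broken branch.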
> Thermos Health Check fails for MesosContainerizer if `--nosetuid-health-checks` is set
> --------------------------------------------------------------------------------------
>
> Key: AURORA-1909
> URL: https://issues.apache.org/jira/browse/AURORA-1909
> Project: Aurora
> Issue Type: Bug
> Components: Executor
> Reporter: Charles Raimbert
> Assignee: Charles Raimbert
> Labels: easyfix
>
> With MesosContainerizer, the sandbox is of type FileSystemImageSandbox and
> the health check is performed using a "mesos-containerizer launch" process,
> but there is a bug in how the code determines the user under which to run
> the health check process:
> https://github.com/apache/aurora/blob/master/src/main/python/apache/aurora/executor/common/health_checker.py#L370
> {code}
> health_check_user = (os.getusername() if self._nosetuid_health_checks
>                      else assigned_task.task.job.role)
> {code}
> If the Aurora scheduler is configured with `--nosetuid-health-checks`, then
> "os.getusername()" is executed, but the Python "os" module has no
> "getusername()" function.
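As an illustration of one way to obtain the current user with the standard library, the sketch below uses `getpass.getuser()`. This is an assumption for illustration only: the message above does not show the committed change, and `resolve_health_check_user` is a hypothetical helper, not an Aurora function.

```python
import getpass


def resolve_health_check_user(nosetuid_health_checks, role):
    """Hypothetical helper: choose the user for the health check process."""
    # getpass.getuser() is a real standard-library call that returns the
    # login name of the current user. Using it here is an assumption, not
    # necessarily what the commit does.
    return getpass.getuser() if nosetuid_health_checks else role
```

Note that `getpass.getuser()` consults environment variables such as `LOGNAME` and `USER` before falling back to the password database, so `pwd.getpwuid(os.geteuid()).pw_name` may be preferable where the effective UID must be authoritative.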