[
https://issues.apache.org/jira/browse/MESOS-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16421574#comment-16421574
]
Qian Zhang commented on MESOS-8754:
-----------------------------------
The root cause of this issue is that the agent fails to change the owner of
the executor's sandbox directory to a nonexistent user in
{{createSandboxDirectory()}}, which causes [this
CHECK|https://github.com/apache/mesos/blob/1.5.0/src/slave/paths.cpp#L728:L729]
to fail and the agent process to abort.
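
To illustrate the failure mode, here is a minimal sketch (not the actual Mesos code) of how a chown to a possibly nonexistent user could report an error to the caller instead of aborting the process. It assumes a POSIX system and C++17; {{lookupUser}}, {{chownSandbox}} and {{UserIds}} are hypothetical names used only for illustration.

```cpp
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

#include <cerrno>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

// Hypothetical type (not a Mesos API): the ids needed for chown().
struct UserIds
{
  uid_t uid;
  gid_t gid;
};

// Hypothetical helper: returns the uid/gid of `user`, or std::nullopt
// if the user cannot be found or the lookup fails.
std::optional<UserIds> lookupUser(const std::string& user)
{
  long size = sysconf(_SC_GETPW_R_SIZE_MAX);
  std::vector<char> buffer(size > 0 ? static_cast<size_t>(size) : 16384);

  struct passwd pwd;
  struct passwd* result = nullptr;

  while (true) {
    int rc = getpwnam_r(
        user.c_str(), &pwd, buffer.data(), buffer.size(), &result);

    if (rc == ERANGE) {
      // Buffer too small for this entry; grow it and retry.
      buffer.resize(buffer.size() * 2);
      continue;
    }

    // A null `result` means no matching entry; a non-zero `rc` means
    // the lookup itself failed. Both are treated as "no such user".
    if (rc != 0 || result == nullptr) {
      return std::nullopt;
    }

    return UserIds{pwd.pw_uid, pwd.pw_gid};
  }
}

// Hypothetical helper: changes the owner of `directory` to `user`.
// Returns an error message on failure and std::nullopt on success,
// so the caller can fail the task rather than crash.
std::optional<std::string> chownSandbox(
    const std::string& directory,
    const std::string& user)
{
  std::optional<UserIds> ids = lookupUser(user);
  if (!ids) {
    return "No such user '" + user + "'";
  }

  if (::chown(directory.c_str(), ids->uid, ids->gid) != 0) {
    return "Failed to chown '" + directory + "': " + std::strerror(errno);
  }

  return std::nullopt;
}
```

With a helper along these lines, the caller could turn a returned error into a task launch failure and keep the agent running. Whether that is the right fix for the agent is an assumption of this sketch.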
> Agent will crash when launching a task with an inexistent user
> --------------------------------------------------------------
>
> Key: MESOS-8754
> URL: https://issues.apache.org/jira/browse/MESOS-8754
> Project: Mesos
> Issue Type: Bug
> Reporter: Qian Zhang
> Priority: Major
>
> I launched a task with a nonexistent user in its {{CommandInfo}} via
> {{mesos-execute}}. The JSON below is the task info (the user {{xxx}}
> does not exist on the agent host):
> {code:java}
> {
>   "name": "test",
>   "task_id": {"value" : "test"},
>   "agent_id": {"value" : ""},
>   "resources": [
>     {"name": "cpus", "type": "SCALAR", "scalar": {"value": 0.1}},
>     {"name": "mem", "type": "SCALAR", "scalar": {"value": 32}}
>   ],
>   "command": {
>     "value": "sleep 1000",
>     "user": "xxx"
>   },
>   "container": {
>     "type": "MESOS",
>     "mesos": {
>       "image": {
>         "type": "DOCKER",
>         "docker": {
>           "name": "library/busybox"
>         }
>       }
>     }
>   }
> }
> {code}
> The agent crashed immediately:
> {code:java}
> I0331 18:55:31.110792 15945 slave.cpp:2018] Got assigned task 'test' for
> framework af812642-320c-47f8-953a-200f7cf3d1dc-0000
> I0331 18:55:31.113301 15945 slave.cpp:2324] Authorizing task 'test' for
> framework af812642-320c-47f8-953a-200f7cf3d1dc-0000
> I0331 18:55:31.115114 15936 slave.cpp:2770] Launching task 'test' for
> framework af812642-320c-47f8-953a-200f7cf3d1dc-0000
> I0331 18:55:31.117607 15936 paths.cpp:727] Creating sandbox
> '/home/qzhang/opt/mesos/slaves/af812642-320c-47f8-953a-200f7cf3d1dc-S0/frameworks/af812642-320c-47f8-953a-200f7cf3d1dc-0000/executors/test/runs/8a014318-ed02-4fb6-96ea-c60c15e3ee7a'
> for user 'xxx'
> F0331 18:55:31.119047 15936 paths.cpp:735] CHECK_SOME(mkdir): Failed to chown
> directory to 'xxx': No such user 'xxx' Failed to create executor directory
> '/home/qzhang/opt/mesos/slaves/af812642-320c-47f8-953a-200f7cf3d1dc-S0/frameworks/af812642-320c-47f8-953a-200f7cf3d1dc-0000/executors/test/runs/8a014318-ed02-4fb6-96ea-c60c15e3ee7a'
> *** Check failure stack trace: ***
> @ 0x7fa10117419c (unknown)
> @ 0x7fa1011740fb (unknown)
> @ 0x7fa101173b0c (unknown)
> @ 0x7fa101176840 (unknown)
> @ 0x55c9fb46283f (unknown)
> @ 0x7fa0ffc1963f (unknown)
> @ 0x7fa0ffc681fc (unknown)
> @ 0x7fa0ffc33845 (unknown)
> @ 0x7fa0ffc371f3 (unknown)
> @ 0x7fa0ffcbf1de (unknown)
> @ 0x7fa0ffd6a0e5 (unknown)
> @ 0x7fa0ffd5fb67 (unknown)
> @ 0x7fa0ffd5a09a (unknown)
> @ 0x7fa0ffd56567 (unknown)
> @ 0x7fa0ffd549f1 (unknown)
> @ 0x7fa0ffd526ec (unknown)
> @ 0x7fa101087f8a (unknown)
> @ 0x7fa10106586d (unknown)
> @ 0x7fa101074188 (unknown)
> @ 0x7fa0fef7bc9c Try<>::error()::__PRETTY_FUNCTION__
> @ 0x7fa101062879 (unknown)
> @ 0x7fa10105ec80 (unknown)
> @ 0x7fa101071870 (unknown)
> @ 0x7fa101071360 (unknown)
> @ 0x7fa101070d8e (unknown)
> @ 0x7fa0f4ef02b0 (unknown)
> @ 0x7fa0f514ae25 start_thread
> @ 0x7fa0f495a34d __clone
> [1] 15883 abort sudo ./bin/mesos-slave.sh --master=10.0.49.2:36250
> --port=36251
> {code}