[ https://issues.apache.org/jira/browse/MESOS-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902552#comment-16902552 ]
James Peach commented on MESOS-9875:
------------------------------------

{noformat}
f9330006-d885-4ef0-b2c7-c9c6fcc239e5 is the persistence ID.
5fa5c810-2dd3-41cb-9633-a3ef404b08c4 is the operation UUID.
honvr62494cqk_ff4e953f-0eca-4b41-a08d-ddea27980b14 is the operation ID.

I0627 22:03:17.360236 3529210 slave.cpp:4282] Updated checkpointed operations from [ cfd6b624-996f-45d7-9aaf-9a13ab9714b4 (RESERVE for framework efd8f75d-25a9-4346-8c7b-d8c8c95ba328-22525, ID: honvr62494cqk_a5b92fff-5491-4616-8970-8c390265c009, latest state: OPERATION_FINISHED) ] to [ cfd6b624-996f-45d7-9aaf-9a13ab9714b4 (RESERVE for framework efd8f75d-25a9-4346-8c7b-d8c8c95ba328-22525, ID: honvr62494cqk_a5b92fff-5491-4616-8970-8c390265c009, latest state: OPERATION_FINISHED), 5fa5c810-2dd3-41cb-9633-a3ef404b08c4 (CREATE for framework efd8f75d-25a9-4346-8c7b-d8c8c95ba328-22525, ID: honvr62494cqk_ff4e953f-0eca-4b41-a08d-ddea27980b14, latest state: OPERATION_PENDING) ]
I0627 22:03:17.360723 3529210 slave.cpp:8670] Updating the state of operation 'honvr62494cqk_ff4e953f-0eca-4b41-a08d-ddea27980b14' (uuid: 5fa5c810-2dd3-41cb-9633-a3ef404b08c4) for framework efd8f75d-25a9-4346-8c7b-d8c8c95ba328-22525 (latest state: OPERATION_FINISHED, status update state: OPERATION_FINISHED)
E0627 22:03:17.365811 3529210 slave.cpp:4257] EXIT with status 1: Failed to sync checkpointed resources: Failed to create the persistent volume f9330006-d885-4ef0-b2c7-c9c6fcc239e5 at '/srv/mesos/work/volumes/roles/test-3/f9330006-d885-4ef0-b2c7-c9c6fcc239e5': Operation not permitted
{noformat}

> Mesos did not respond correctly when operations should fail
> -----------------------------------------------------------
>
>                 Key: MESOS-9875
>                 URL: https://issues.apache.org/jira/browse/MESOS-9875
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent
>            Reporter: Yifan Xing
>            Assignee: Greg Mann
>            Priority: Major
>              Labels: foundations, mesosphere
>         Attachments: Screen Shot 2019-06-27 at 15.07.20.png
>
>
> For testing persistent volumes
> with {{OPERATION_FAILED/ERROR}} feedback, we SSHed into the mesos-agent and
> made it unable to create subdirectories in {{/srv/mesos/work/volumes}}.
> However, Mesos did not return any operation-failed response. Instead, we
> received {{OPERATION_FINISHED}} feedback.
>
> Steps to recreate the issue:
> 1. SSH into a magent.
> 2. Make it impossible to create a persistent volume (we expect the agent to
> crash and reregister, and the master to realize that the operation is
> {{OPERATION_DROPPED}}):
> * cd /srv/mesos/work (if it doesn't exist, mkdir /srv/mesos/work/volumes)
> * chattr -RV +i volumes (then no subdirectories can be created)
> 3. Launch a service with persistent volumes with the constraint of only using
> the magent modified above.
>
>
> Logs for the scheduler for receiving `OPERATION_FINISHED`:
> (Also see screenshot)
>
> 2019-06-27 21:57:11.879 [12768651|rdar://12768651]
> [Jarvis-mesos-dispatcher-105] INFO c.a.j.s.ServicePodInstance - Stored
> operation=4g3k02s1gjb0q_5f912b59-a32d-462c-9c46-8401eba4d2c1 and
> feedback=OPERATION_FINISHED in podInstanceID=4g3k02s1gjb0q on
> serviceID=yifan-badagents-1
>
> * 2019-06-27 21:55:23: task reached state TASK_FAILED for mesos reason:
> REASON_CONTAINER_LAUNCH_FAILED with mesos message: Failed to launch
> container: Failed to change the ownership of the persistent volume at
> '/srv/mesos/work/volumes/roles/test-2/19b564e8-3a90-4f2f-981d-b3dd2a5d9f90'
> with uid 264 and gid 264: No such file or directory



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
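
For anyone triaging similar agent logs, the ordering in James Peach's comment (the operation is marked OPERATION_FINISHED a few milliseconds before the volume creation fails) can be extracted mechanically. The following is a minimal sketch, not part of Mesos: the names UPDATE_RE, FAILURE_RE, summarize and SAMPLE_LOG are hypothetical, and the regular expressions assume the exact log wording shown in the comment, which may differ between Mesos versions.

```python
import re

# Assumed wording of the agent's operation state update line (copied from the
# log in the comment; not guaranteed to be stable across Mesos versions).
UPDATE_RE = re.compile(
    r"Updating the state of operation '(?P<op_id>[^']+)' "
    r"\(uuid: (?P<uuid>[0-9a-f-]+)\) .*?"
    r"\(latest state: (?P<latest>\w+), status update state: (?P<update>\w+)\)"
)

# Assumed wording of the persistent volume creation failure.
FAILURE_RE = re.compile(
    r"Failed to create the persistent volume (?P<pid>[0-9a-f-]+) at"
)


def summarize(log_text):
    """Return (state_updates, failed_persistence_ids) found in the log text."""
    updates = [m.groupdict() for m in UPDATE_RE.finditer(log_text)]
    failures = [m.group("pid") for m in FAILURE_RE.finditer(log_text)]
    return updates, failures


# The two relevant lines from the comment, one log entry per line.
SAMPLE_LOG = (
    "I0627 22:03:17.360723 3529210 slave.cpp:8670] Updating the state of "
    "operation 'honvr62494cqk_ff4e953f-0eca-4b41-a08d-ddea27980b14' "
    "(uuid: 5fa5c810-2dd3-41cb-9633-a3ef404b08c4) for framework "
    "efd8f75d-25a9-4346-8c7b-d8c8c95ba328-22525 "
    "(latest state: OPERATION_FINISHED, status update state: OPERATION_FINISHED)\n"
    "E0627 22:03:17.365811 3529210 slave.cpp:4257] EXIT with status 1: "
    "Failed to sync checkpointed resources: Failed to create the persistent "
    "volume f9330006-d885-4ef0-b2c7-c9c6fcc239e5 at "
    "'/srv/mesos/work/volumes/roles/test-3/f9330006-d885-4ef0-b2c7-c9c6fcc239e5': "
    "Operation not permitted\n"
)

if __name__ == "__main__":
    updates, failures = summarize(SAMPLE_LOG)
    for u in updates:
        print(u["op_id"], u["uuid"], u["latest"])
    for pid in failures:
        print("volume creation failed for persistence ID", pid)
```

Run against a full agent log, a CREATE operation that appears in the state updates as OPERATION_FINISHED while its persistence ID also appears in the failure list would match the symptom reported in this issue.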