[ 
https://issues.apache.org/jira/browse/MESOS-9352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662684#comment-16662684
 ] 

Chun-Hung Hsiao commented on MESOS-9352:
----------------------------------------

Two problems here:
 # Persistent volumes were not unmounted from the killed task container 
correctly.
 # Agent GC accidentally cleaned up the persistent volumes.

I thought Problem 2 should have been fixed by MESOS-8830 and MESOS-9049 in 
DC/OS 1.11.6, which uses Mesos commit 5a7ad47e8fc1a14101e47a29eb8e7e2a20d959c5, 
but that does not seem to be the case. Agent logs in {{/var/log/mesos/}} and 
{{/var/log/mesos/archive/}} would be helpful for debugging this. Also, does 
this happen if UCR is used (i.e., with the container {{type}} set to {{MESOS}})?
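
For reference, a rough sketch of what the UCR variant of the {{container}} 
section might look like, assuming the rest of the app definition from the 
issue description stays unchanged. Dropping the Docker-only {{privileged}} and 
{{parameters}} fields is an assumption here, and whether {{portMappings}} with 
{{container/bridge}} networking works under UCR depends on the cluster setup, 
so treat this as an untested starting point rather than a verified config:
{code:javascript}
{
  "container": {
    "type": "MESOS",
    "portMappings": [
      {
        "containerPort": 3306,
        "hostPort": 0,
        "protocol": "tcp",
        "servicePort": 10000
      }
    ],
    "volumes": [
      {
        "persistent": {
          "type": "root",
          "size": 1000,
          "constraints": []
        },
        "mode": "RW",
        "containerPath": "mysqldata"
      },
      {
        "containerPath": "/var/lib/mysql",
        "hostPath": "mysqldata",
        "mode": "RW"
      }
    ],
    "docker": {
      "image": "mysql",
      "forcePullImage": false
    }
  }
}
{code}
If the data loss also occurs with this variant, the problem is probably not 
specific to the Docker containerizer.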

> Data in persistent volume deleted accidentally when using Docker container 
> and Persistent volume
> ------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-9352
>                 URL: https://issues.apache.org/jira/browse/MESOS-9352
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent, containerization, docker
>    Affects Versions: 1.5.1, 1.5.2
>         Environment: DCOS 1.11.6
> Mesos 1.5.2
>            Reporter: David Ko
>            Priority: Critical
>              Labels: mesosphere, persistent-volumes
>         Attachments: image-2018-10-24-22-20-51-059.png, 
> image-2018-10-24-22-21-13-399.png
>
>
> When a service is started from a Docker image with a persistent volume, the 
> data in the persistent volume is accidentally deleted when the task is killed 
> and restarted. The old mount points are also not unmounted, even after the 
> service has been deleted. 
> *Expected result: data in the persistent volume is kept until the task is 
> deleted completely, and dangling mount points are unmounted correctly.*
>  
> *Step 1:* Use the JSON config below to create a MySQL server using a Docker 
> image and a persistent volume.
> {code:javascript}
> {
>   "env": {
>     "MYSQL_USER": "wordpress",
>     "MYSQL_PASSWORD": "secret",
>     "MYSQL_ROOT_PASSWORD": "supersecret",
>     "MYSQL_DATABASE": "wordpress"
>   },
>   "id": "/mysqlgc",
>   "backoffFactor": 1.15,
>   "backoffSeconds": 1,
>   "constraints": [
>     [
>       "hostname",
>       "IS",
>       "172.27.12.216"
>     ]
>   ],
>   "container": {
>     "portMappings": [
>       {
>         "containerPort": 3306,
>         "hostPort": 0,
>         "protocol": "tcp",
>         "servicePort": 10000
>       }
>     ],
>     "type": "DOCKER",
>     "volumes": [
>       {
>         "persistent": {
>           "type": "root",
>           "size": 1000,
>           "constraints": []
>         },
>         "mode": "RW",
>         "containerPath": "mysqldata"
>       },
>       {
>         "containerPath": "/var/lib/mysql",
>         "hostPath": "mysqldata",
>         "mode": "RW"
>       }
>     ],
>     "docker": {
>       "image": "mysql",
>       "forcePullImage": false,
>       "privileged": false,
>       "parameters": []
>     }
>   },
>   "cpus": 1,
>   "disk": 0,
>   "instances": 1,
>   "maxLaunchDelaySeconds": 3600,
>   "mem": 512,
>   "gpus": 0,
>   "networks": [
>     {
>       "mode": "container/bridge"
>     }
>   ],
>   "residency": {
>     "relaunchEscalationTimeoutSeconds": 3600,
>     "taskLostBehavior": "WAIT_FOREVER"
>   },
>   "requirePorts": false,
>   "upgradeStrategy": {
>     "maximumOverCapacity": 0,
>     "minimumHealthCapacity": 0
>   },
>   "killSelection": "YOUNGEST_FIRST",
>   "unreachableStrategy": "disabled",
>   "healthChecks": [],
>   "fetch": []
> }
> {code}
> *Step 2:* Kill the mysqld process to force a new MySQL task to be scheduled. 
> Two mount points to the same persistent volume are then found, which means 
> the old mount point was not unmounted immediately.
> !image-2018-10-24-22-20-51-059.png!
> *Step 3:* After GC, the data in the persistent volume was accidentally 
> deleted, but mysqld (the Mesos task) was still running.
> !image-2018-10-24-22-21-13-399.png!
> *Step 4:* Delete the MySQL service from Marathon. The mount points are still 
> not unmounted, even though the service has already been deleted.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
