[ 
https://issues.apache.org/jira/browse/MESOS-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhitao Li updated MESOS-7366:
-----------------------------
    Description: 
When 1) a persistent volume is mounted, 2) the umount is stuck or otherwise
does not complete, and 3) executor directory gc is invoked, the agent seems
to emit a log like:

```
Failed to delete directory <executor_dir>/runs/<uuid>/volume: Device or resource busy
```

After this, the persistent volume directory is empty.

This could cause data loss for critical workloads, so we should fix this ASAP.

The triggering environment is a custom executor without a rootfs image.

Please let me know if you need more information.

  was:
When 1) a persistent volume is mounted, 2) the umount is stuck or otherwise
does not complete, and 3) executor directory gc is invoked, the agent seems
to emit a log like:

```
Failed to delete directory <executor_dir>/runs/<uuid>/volume: Device or resource busy
```

The triggering environment is a custom executor without a rootfs image.

Please let me know if you need more information.


> Incorrect agent gc could empty up entire persistent volume content
> ------------------------------------------------------------------
>
>                 Key: MESOS-7366
>                 URL: https://issues.apache.org/jira/browse/MESOS-7366
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Zhitao Li
>            Assignee: Jie Yu
>            Priority: Critical
>
> When 1) a persistent volume is mounted, 2) the umount is stuck or otherwise
> does not complete, and 3) executor directory gc is invoked, the agent seems
> to emit a log like:
> ```
> Failed to delete directory <executor_dir>/runs/<uuid>/volume: Device or resource busy
> ```
> After this, the persistent volume directory is empty.
> This could cause data loss for critical workloads, so we should fix this ASAP.
> The triggering environment is a custom executor without a rootfs image.
> Please let me know if you need more information.
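
To make the failure mode above concrete, here is a minimal sketch of one
possible guard: a recursive delete that refuses to descend into any
directory that is still a mount point. It is written in Python for brevity,
not in the agent's C++, and it is not the actual Mesos gc code or fix. The
function name `remove_tree_skipping_mounts` and its `is_mount` parameter are
hypothetical, introduced only for illustration.

```python
import os


def remove_tree_skipping_mounts(path, is_mount=os.path.ismount):
    """Delete everything under `path` without descending into mount points.

    Hypothetical helper for illustration only. Returns the list of mount
    points that were left untouched, so a caller could log or retry them.
    """
    skipped = []
    # Materialize the listing first, since entries are deleted while iterating.
    with os.scandir(path) as it:
        entries = list(it)
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            if is_mount(entry.path):
                # Still mounted (e.g. a persistent volume whose umount did
                # not complete): leave the directory and its contents alone.
                skipped.append(entry.path)
                continue
            nested = remove_tree_skipping_mounts(entry.path, is_mount)
            skipped.extend(nested)
            if not nested:
                # Only remove the directory if nothing below it was kept.
                os.rmdir(entry.path)
        else:
            # Regular files and symlinks are removed; symlinks are not followed.
            os.unlink(entry.path)
    return skipped
```

With a check of this kind, a sandbox whose `volume` directory is still
mounted would keep that directory and the volume's data, and the caller
would get the skipped path back instead of an emptied volume.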



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
