[
https://issues.apache.org/jira/browse/MESOS-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15855968#comment-15855968
]
Pierre Cheynier edited comment on MESOS-7007 at 2/7/17 1:23 PM:
----------------------------------------------------------------
Hi [~jieyu], [~gilbert],
I had a discussion with [~jieyu] on Friday about this issue.
Since then, I've run tests on 1.1.0:
* {{--launcher=linux}} doesn't change anything. As discussed with Jie Yu, I was
already using this launcher (it's the default, I believe).
* after removing the {{filesystem/shared}} isolator, the /tmp content is no
longer trashed on container creation/deletion, BUT the /tmp volume feature no
longer works:
** the tmp directory in the sandbox is {{root:root}} with mode {{0777}}, and it
is a pure bind mount rather than an isolated filesystem, meaning that anything
I erase there is erased on the host's /tmp as well;
** I ran into MESOS-6563 when looking at the mounts visible from root:
{noformat}
# There is only 1 task, so theoretically 1 mount
$ mesos-ps --master=127.0.0.1:5050
USER FRAMEWORK TASK SLAVE MEM TIME CPU (allocated)
mara... marathon visibi... mesos-cluster-c... 13.7 MB/42.0 MB 00:00:01.490000 0.2
# But in fact... no!
$ mount | grep "mesos/slaves" | wc -l
56
# 56 is probably the number of containers I launched for my CI tests
$ mount | grep "mesos/slaves" | head -5
/dev/sda3 on /var/opt/mesos/slaves/e02761a5-308e-4797-b43b-b56c3da66616-S0/frameworks/e02761a5-308e-4797-b43b-b56c3da66616-0000/executors/group_simplehttp.dcde69c5-ed32-11e6-b388-02427970a3a5/runs/45277613-6129-4eb3-b8d0-acc0c2fe8605/tmp type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sda3 on /var/opt/mesos/slaves/e02761a5-308e-4797-b43b-b56c3da66616-S0/frameworks/e02761a5-308e-4797-b43b-b56c3da66616-0000/executors/group_simplehttp.dcde69c5-ed32-11e6-b388-02427970a3a5/runs/45277613-6129-4eb3-b8d0-acc0c2fe8605/tmp type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sda3 on /var/opt/mesos/slaves/e02761a5-308e-4797-b43b-b56c3da66616-S0/frameworks/e02761a5-308e-4797-b43b-b56c3da66616-0000/executors/group_security.f6152faa-ed32-11e6-b388-02427970a3a5/runs/f74453b6-aa39-456f-a4a1-bd953b870d38/tmp type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sda3 on /var/opt/mesos/slaves/e02761a5-308e-4797-b43b-b56c3da66616-S0/frameworks/e02761a5-308e-4797-b43b-b56c3da66616-0000/executors/group_simplehttp.dcde69c5-ed32-11e6-b388-02427970a3a5/runs/45277613-6129-4eb3-b8d0-acc0c2fe8605/tmp type ext4 (rw,relatime,seclabel,data=ordered)
/dev/sda3 on /var/opt/mesos/slaves/e02761a5-308e-4797-b43b-b56c3da66616-S0/frameworks/e02761a5-308e-4797-b43b-b56c3da66616-0000/executors/group_security.f6152faa-ed32-11e6-b388-02427970a3a5/runs/f74453b6-aa39-456f-a4a1-bd953b870d38/tmp type ext4 (rw,relatime,seclabel,data=ordered)
{noformat}
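The duplication above can be quantified by grouping the {{mount}} output by target path; here is a minimal sketch, where the here-doc sample (with shortened, hypothetical path components such as S0/F0/R1) stands in for the real {{mount}} output on the agent:

```shell
# Group mount entries by their target path (field 3 of `mount` output) and
# count how many times each one appears. On a real agent, feed `mount` itself
# instead of the shortened sample below (S0/F0/R1/R2 are hypothetical names).
mount_output=$(cat <<'EOF'
/dev/sda3 on /var/opt/mesos/slaves/S0/frameworks/F0/executors/simplehttp/runs/R1/tmp type ext4 (rw,relatime)
/dev/sda3 on /var/opt/mesos/slaves/S0/frameworks/F0/executors/simplehttp/runs/R1/tmp type ext4 (rw,relatime)
/dev/sda3 on /var/opt/mesos/slaves/S0/frameworks/F0/executors/security/runs/R2/tmp type ext4 (rw,relatime)
EOF
)
# Any count greater than 1 is a mount that was set up more than once and
# never cleaned up.
echo "$mount_output" | awk '{print $3}' | sort | uniq -c | sort -rn
```

Each count above 1 would correspond to a container run whose /tmp bind mount leaked instead of being unmounted on destroy.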
> filesystem/shared and --default_container_info broken since 1.1
> ---------------------------------------------------------------
>
> Key: MESOS-7007
> URL: https://issues.apache.org/jira/browse/MESOS-7007
> Project: Mesos
> Issue Type: Bug
> Components: agent
> Affects Versions: 1.1.0
> Reporter: Pierre Cheynier
>
> I'm facing an issue that prevents me from upgrading to 1.1.0 (the change that
> causes it was introduced in this version):
> I'm using default_container_info to mount a /tmp volume in the container's
> mount namespace from its current sandbox, meaning that each container has a
> dedicated /tmp, thanks to the {{filesystem/shared}} isolator.
> I noticed through our automation pipeline that integration tests were failing,
> and found that this is because the contents of /tmp (the one from the host!)
> are trashed each time a container is created.
> Here is my setup:
> * {{--isolation='cgroups/cpu,cgroups/mem,namespaces/pid,*disk/du,filesystem/shared,filesystem/linux*,docker/runtime'}}
> * {{--default_container_info='\{"type":"MESOS","volumes":\[\{"host_path":"tmp","container_path":"/tmp","mode":"RW"\}\]\}'}}
> I discovered this issue in the early days of 1.1 (end of November; I spoke
> with someone on Slack), but unfortunately had no time to dig further into the
> symptoms.
> I found nothing interesting even when using GLOG_v=3.
> Maybe it's a bad usage of isolators that triggers this issue? If that's the
> case, then at least a documentation update should be done.
> Let me know if more information is needed.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)