[
https://issues.apache.org/jira/browse/MESOS-9300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642305#comment-16642305
]
James Peach commented on MESOS-9300:
------------------------------------
MacOS has
[ATTR_DIR_MOUNTSTATUS|https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/getattrlist.2.html#//apple_ref/doc/man/2/getattrlist],
but AFAIK there's no straightforward equivalent on Linux.
However, it looks like we can detect this on Linux with the [EXDEV rename
trick|http://blog.schmorp.de/2016-03-03-detecting-a-mount-point.html].
> XFS isolator can mislabel project IDs on persistence volumes.
> -------------------------------------------------------------
>
> Key: MESOS-9300
> URL: https://issues.apache.org/jira/browse/MESOS-9300
> Project: Mesos
> Issue Type: Bug
> Components: agent
> Reporter: James Peach
> Assignee: James Peach
> Priority: Major
>
> What happens here is that we are erroneously applying the sandbox's project
> ID to the persistent volume.
> First, the filesystem/linux isolator bind mounts the persistent volume into
> the sandbox:
> {noformat}
> I1003 06:49:21.907644 2812466 linux.cpp:593] Mounting
> '/srv/mesos/work/volumes/roles/pie.mobius/21cb2eb6-b3e5-46f2-944e-8f6e5db9f07f'
> to
> '/srv/mesos/work/slaves/909cff92-8e17-41bf-a251-9b5eb6186c35-S0/frameworks/363e6d80-8c38-46cf-815f-2fbf60a62628-0309/executors/mobius-mloop-1538549013_438156792-v2-shared-volume.pod1.writer-job.0.e93hs3uips2i9_1/runs/9e5770a7-9f78-46dc-9264-3e80be0e40cc/shared'
> for persistent volume disk(allocated: pie.mobius)(reservations:
> [(DYNAMIC,pie.mobius,jarvis-principal,\{podInstance: e93hs3uips2i9, pod:
> pod1, service:
> mobius-mloop-1538549013_438156792-v2-shared-volume})])[21cb2eb6-b3e5-46f2-944e-8f6e5db9f07f:shared]<SHARED>:1
> of container 9e5770a7-9f78-46dc-9264-3e80be0e40cc
> {noformat}
> Next, the `disk/xfs` isolator assigns a project ID to the sandbox:
> {noformat}
> I1003 06:49:21.920197 2812452 disk.cpp:402] Assigned project 6806 to
> '/srv/mesos/work/slaves/909cff92-8e17-41bf-a251-9b5eb6186c35-S0/frameworks/363e6d80-8c38-46cf-815f-2fbf60a62628-0309/executors/mobius-mloop-1538549013_438156792-v2-shared-volume.pod1.writer-job.0.e93hs3uips2i9_1/runs/9e5770a7-9f78-46dc-9264-3e80be0e40cc'
> {noformat}
> Note that when this happens, the isolator recursively applies the project ID
> to the contents of the sandbox. It doesn't follow symlinks or cross devices
> when it does this, but on Linux, a bind mount does not trigger either of
> these conditions.
> Finally, the `disk/xfs` isolator tries to assign a project ID to the
> persistent volume as it is used by the task:
> {noformat}
> F1003 06:49:21.920577 2812452 disk.cpp:532] Check failed:
> scheduledProjects.contains(projectId.get()) untracked project ID 6806 for
> volume ID 21cb2eb6-b3e5-46f2-944e-8f6e5db9f07f on
> /srv/mesos/work/volumes/roles/pie.mobius/21cb2eb6-b3e5-46f2-944e-8f6e5db9f07f
> {noformat}
> This check fails because, if the persistent volume has a project ID, we
> expect that it has already been scheduled for reclamation. However, its
> project ID is the one we assigned to the sandbox. We don't schedule the
> sandbox for reclamation until cleanup, so (fortunately) the invariant check
> triggers.
> So, apart from triggering the CHECK, the root cause of this is that we are
> altering the project ID of the persistent volume, which permanently
> misattributes the corresponding quota.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)