[
https://issues.apache.org/jira/browse/MESOS-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382646#comment-16382646
]
Harold Dost III commented on MESOS-6575:
----------------------------------------
One other thing I noticed while reading the source for how {{disk/du}} and
{{disk/xfs}} handle disk resources: when resources are updated in the
{{disk/xfs}} isolator, the disk resources are not tracked individually but are
added up instead. As a result, there is no way to set a limit on an individual
disk resource [because of this
function|https://github.com/apache/mesos/blob/32f6d4eec2724414e217875f4f7d3b2538db5381/src/slave/containerizer/mesos/isolators/xfs/disk.cpp#L70].
There may have been a good reason for doing it this way, but the rationale is
not apparent from the code. My suggestion would be to track disk resources the
same way {{disk/du}} does.
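
To make the difference concrete, here is a minimal sketch of summing versus tracking. It is not the actual Mesos code: {{DiskResource}}, {{sumQuota}} and {{trackQuota}} are hypothetical names used only for illustration, and the real isolators operate on Mesos {{Resources}} objects rather than this simplified struct.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a Mesos disk resource; not the real protobuf.
struct DiskResource {
  std::string path;     // Sandbox or volume path the resource applies to.
  std::uint64_t bytes;  // Requested size in bytes.
};

// Summing: every disk resource collapses into a single number, so the
// per-path limits can no longer be told apart.
std::uint64_t sumQuota(const std::vector<DiskResource>& resources) {
  std::uint64_t total = 0;
  for (const DiskResource& r : resources) {
    total += r.bytes;
  }
  return total;
}

// Tracking: keep one limit per path so each can be checked on its own.
std::map<std::string, std::uint64_t> trackQuota(
    const std::vector<DiskResource>& resources) {
  std::map<std::string, std::uint64_t> quotas;
  for (const DiskResource& r : resources) {
    quotas[r.path] += r.bytes;
  }
  return quotas;
}
```

With the summed form only one aggregate limit survives, while the tracked form still knows which path each limit belongs to.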
> Change `disk/xfs` isolator to terminate executor when it exceeds quota
> ----------------------------------------------------------------------
>
> Key: MESOS-6575
> URL: https://issues.apache.org/jira/browse/MESOS-6575
> Project: Mesos
> Issue Type: Task
> Components: agent, containerization
> Reporter: Santhosh Kumar Shanmugham
> Assignee: James Peach
> Priority: Major
>
> Unlike the {{disk/du}} isolator, which sends a {{ContainerLimitation}}
> protobuf when the executor exceeds its quota, the {{disk/xfs}} isolator
> relies on XFS's internal quota enforcement and silently fails the {{write}}
> operation that would exceed the quota, without surfacing the quota breach.
> This task is to change the {{disk/xfs}} isolator so that a
> {{ContainerLimitation}} message is triggered when the quota is exceeded.
> This feature will rely on the underlying filesystem being mounted with
> {{pqnoenforce}} (accounting-only mode), so that XFS does not silently fail
> writes that exceed the quota with an {{EDQUOT}} error. The isolator can then
> track disk usage via {{xfs_quota}}, much as {{disk/du}} does using {{du}},
> every {{container_disk_watch_interval}}, and surface a quota-exceeded event
> via a {{ContainerLimitation}} protobuf, causing the executor to be
> terminated. The feature can then be turned on or off via the existing
> {{enforce_container_disk_quota}} option.
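
The periodic check described in the issue above could look roughly like the sketch below. This is an illustration under stated assumptions, not the isolator's implementation: {{QuotaCheck}} and {{checkDiskQuota}} are hypothetical names, the usage figure is assumed to come from XFS quota accounting, and a real isolator would emit a {{ContainerLimitation}} protobuf rather than this struct.

```cpp
#include <cstdint>
#include <string>

// Hypothetical result type; a real isolator would emit a
// ContainerLimitation protobuf instead.
struct QuotaCheck {
  bool exceeded;
  std::string message;
};

// Compare the usage reported for a container against its limit. Intended
// to be called once per watch interval. The `enforce` flag mirrors the
// idea of the enforce_container_disk_quota option.
QuotaCheck checkDiskQuota(
    std::uint64_t usedBytes, std::uint64_t limitBytes, bool enforce) {
  if (!enforce || usedBytes <= limitBytes) {
    return {false, ""};
  }
  return {true,
          "Disk usage (" + std::to_string(usedBytes) +
              " bytes) exceeds quota (" + std::to_string(limitBytes) +
              " bytes)"};
}
```

Because the filesystem is only accounting and not enforcing, usage can overshoot the limit between two checks; the check reports the breach after the fact instead of failing the write.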
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)