[
https://issues.apache.org/jira/browse/MESOS-6575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383144#comment-16383144
]
Harold Dost III commented on MESOS-6575:
----------------------------------------
{quote}This is because the XFS isolator doesn't support path volumes so there's
no need to track any paths. {quote}
That's a good point, but the part that is missing is how we would add the
container limitation if we don't have a resource to bind it to.
{quote}Thinking about this some more, I'm not sure that we need to do anything
with soft limits at all. Let's assume that we implement this for task sandboxes
by applying a hard limit that is "disk_resource + some_constant_slop". {quote}
{{xfs_use_disk_reservation_as_soft_limit}} becomes useful because, when you set
a soft limit, the isolator doesn't need to worry about raising the limit. The
actual problem with hard limits arises not when the capacity is actually met but
when usage falls short of it by an amount that varies from task to task. The
advantage of a soft limit is that, when it is violated, the project has the
duration of the XFS project timer to come back into range; otherwise the
container gets the container limitation and is therefore killed.
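
To make the soft-limit behaviour above concrete, here is a minimal sketch in Python. It is only a model of the semantics I'm describing, not Mesos or XFS code: the class name {{SoftLimitTracker}}, its fields, and the "ok" / "grace" / "kill" states are all made up for illustration, and the grace period stands in for the XFS project timer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SoftLimitTracker:
    """Hypothetical model of a soft limit with a grace timer (not a Mesos API)."""

    soft_limit: int                      # bytes; stands in for the disk reservation
    grace_seconds: float                 # stands in for the XFS project timer
    over_since: Optional[float] = None   # when usage first exceeded the soft limit

    def check(self, used: int, now: float) -> str:
        """Return "ok", "grace", or "kill" for the current usage sample."""
        if used <= self.soft_limit:
            # Back in range: the timer is cleared.
            self.over_since = None
            return "ok"
        if self.over_since is None:
            # First sample over the soft limit: start the timer.
            self.over_since = now
        if now - self.over_since >= self.grace_seconds:
            # Timer expired while still over: raise the container limitation.
            return "kill"
        return "grace"
```

With a 100-byte soft limit and a 60-second timer, a container that goes over at t=10 and drops back under before t=70 is never killed; one that stays over until t=70 is.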
{quote}We still need to have the isolator periodically check the usage in order
to raise the limitation, so it doesn't really matter whether we have a soft
limit. All we really need to do is check the current usage against the resource
limit.{quote}
The concern with having the isolator raise the limit itself is the potential for
a runaway effect. To make it useful, it seems like you're also going to need
additional tuning parameters: a backoff, a percentage or number of blocks raised
per increase, a cap on the number of increases, and possibly a mechanism to
reduce the limit.
To be honest, though, I'm not sure how much I am even behind the idea of
diff_bytes as a concept, and I would much rather have apps be explicit. The flag
{{xfs_use_disk_reservation_as_soft_limit}} plus the ability to set per-task soft
limits should be enough without adding too much complexity.
> Change `disk/xfs` isolator to terminate executor when it exceeds quota
> ----------------------------------------------------------------------
>
> Key: MESOS-6575
> URL: https://issues.apache.org/jira/browse/MESOS-6575
> Project: Mesos
> Issue Type: Task
> Components: agent, containerization
> Reporter: Santhosh Kumar Shanmugham
> Assignee: James Peach
> Priority: Major
>
> Unlike the {{disk/du}} isolator, which sends a {{ContainerLimitation}} protobuf
> when the executor exceeds the quota, the {{disk/xfs}} isolator, which relies on
> XFS's internal quota enforcement, silently fails the {{write}} operation
> that causes the quota limit to be exceeded, without surfacing the quota
> breach information.
> This task is to change the `disk/xfs` isolator so that a
> {{ContainerLimitation}} message is triggered when the quota is exceeded.
> This feature will rely on the underlying filesystem being mounted with
> {{pqnoenforce}} (accounting-only mode), so that XFS does not silently cause
> an {{EDQUOT}} error on writes that cause the quota to be exceeded. Now the
> isolator can track the disk quota via {{xfs_quota}}, very much like
> {{disk/du}} using {{du}}, every {{container_disk_watch_interval}} and surface
> the disk quota limit exceed event via a {{ContainerLimitation}} protobuf,
> causing the executor to be terminated. This feature can then be turned on/off
> via the existing {{enforce_container_disk_quota}} option.
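
The periodic check described in the issue could look roughly like the Python sketch below. This is an assumption-laden illustration, not the isolator's implementation: {{Limitation}}, {{check_quotas}}, and the {{read_usage}} callback are hypothetical names, and how usage is actually read (for example via {{xfs_quota}}) is deliberately left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Limitation:
    """Hypothetical stand-in for a ContainerLimitation message."""

    container_id: str
    message: str


def check_quotas(
    quotas: Dict[str, int],
    read_usage: Callable[[str], int],
) -> List[Limitation]:
    """Run one watch-interval pass: compare each container's usage to its quota.

    `quotas` maps container id to quota in bytes; `read_usage` is a
    caller-supplied function returning current usage in bytes.
    """
    limitations = []
    for container_id, limit in quotas.items():
        used = read_usage(container_id)
        if used > limit:
            # With accounting-only quotas the write has already succeeded,
            # so the breach is visible here and can be surfaced.
            limitations.append(
                Limitation(
                    container_id,
                    f"Disk usage ({used} bytes) exceeds quota ({limit} bytes)",
                )
            )
    return limitations
```

A caller would invoke {{check_quotas}} once per {{container_disk_watch_interval}} and terminate the executor for each returned limitation.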