[
https://issues.apache.org/jira/browse/MESOS-2903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594152#comment-14594152
]
Jie Yu commented on MESOS-2903:
-------------------------------
I think this is related to the recent change to the slave recovery semantics
(MESOS-2367). Previously, the slave would not finish recovery if some orphan
containers could not be destroyed. Therefore, the port mapping isolator simply
assumed that it knew about all the filters on the host. However, this is no
longer true now that MESOS-2367 is committed, so the isolator code needs to
adapt to the new semantics.
> Network isolator should not fail when target state already exists
> -----------------------------------------------------------------
>
> Key: MESOS-2903
> URL: https://issues.apache.org/jira/browse/MESOS-2903
> Project: Mesos
> Issue Type: Bug
> Components: isolation
> Affects Versions: 0.23.0
> Reporter: Paul Brett
> Priority: Critical
>
> Network isolator has multiple instances of the following pattern:
> {noformat}
> Try<bool> something = ....::create();
> if (something.isError()) {
>   ++metrics.something_errors;
>   return Failure("Failed to create something ...");
> } else if (!something.get()) {
>   // The target already exists, e.g. the veth ICMP filter case:
>   ++metrics.adding_veth_icmp_filters_already_exist;
>   return Failure("Something already exists");
> }
> {noformat}
> These failures have occurred in operation when an orphan could not be
> recovered or deleted, leaving the slave online but unable to create
> new resources. We should convert the second failure in this pattern
> to an informational message, since the final state of the system is the
> state that we requested.
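
The proposed change can be sketched roughly as follows. This is a minimal, standalone illustration, not the isolator's actual code: `TryBool`, `Metrics`, `Outcome`, `ensureSomething` and the `something_already_exist` counter are hypothetical stand-ins invented for the example, and the real code uses `Try<bool>` and `Failure` as in the pattern above. The point is only that a "false" result (target already exists) is logged and counted, but no longer treated as an error.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical stand-in for a Try<bool>-style result: either an error
// message, or a bool saying whether the object was newly created.
class TryBool {
public:
  static TryBool some(bool created) { return TryBool(false, created, ""); }
  static TryBool failure(std::string message) {
    return TryBool(true, false, std::move(message));
  }

  bool isError() const { return isError_; }
  bool get() const { assert(!isError_); return created_; }
  const std::string& message() const { return message_; }

private:
  TryBool(bool isError, bool created, std::string message)
    : isError_(isError), created_(created), message_(std::move(message)) {}

  bool isError_;
  bool created_;
  std::string message_;
};

// Hypothetical counters mirroring the metrics in the pattern above.
struct Metrics {
  int something_errors = 0;
  int something_already_exist = 0;
};

enum class Outcome { Created, AlreadyExists, Failed };

// Only a genuine creation error is a failure. "Already exists" is
// counted and logged at info level, because the requested end state
// has been reached either way.
Outcome ensureSomething(const TryBool& something,
                        Metrics& metrics,
                        std::ostream& log) {
  if (something.isError()) {
    ++metrics.something_errors;
    log << "ERROR: Failed to create something: "
        << something.message() << '\n';
    return Outcome::Failed;
  }

  if (!something.get()) {
    ++metrics.something_already_exist;
    log << "INFO: Something already exists\n";
    return Outcome::AlreadyExists;
  }

  return Outcome::Created;
}
```

A caller would pass the result of the create call, for example `ensureSomething(TryBool::some(false), metrics, std::cerr)`, and abort only when the outcome is `Outcome::Failed`. Keeping the counter preserves the visibility that the existing metric provides while letting the slave continue.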