[ 
https://issues.apache.org/jira/browse/MESOS-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-1649:
--------------------------

    Story Points: 3

> Network isolator should tolerate slave crashes while doing isolate/cleanup.
> ---------------------------------------------------------------------------
>
>                 Key: MESOS-1649
>                 URL: https://issues.apache.org/jira/browse/MESOS-1649
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Jie Yu
>            Assignee: Jie Yu
>             Fix For: 0.20.0
>
>
> A slave may crash while filters are being installed or removed. Slave
> recovery for the network isolator should tolerate such partially installed
> filters. We also want to avoid leaking filters on host eth0 and host lo.
> The current code does not tolerate this and may fail with the following error:
> {noformat}
> Failed to perform recovery: Collect failed: Failed to recover container 
> d409a100-2afb-497c-864f-fe3002cf65d9 with pid 50405: No ephemeral ports found
> To remedy this do as follows:
> Step 1: rm -f /var/lib/mesos/meta/slaves/latest
>        This ensures slave doesn't recover old live executors.
> Step 2: Restart the slave.
> {noformat}
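
The tolerance asked for above can be illustrated with a small sketch. It is not the Mesos implementation; every name in it (EXPECTED_LINKS, RecoveryResult, recover, cleanup) is hypothetical, and filters are modeled as a plain dict rather than real kernel state. The idea shown is only that recovery classifies a container with an incomplete filter set as an orphan to clean up, instead of failing the whole recovery.

```python
from dataclasses import dataclass, field

# Hypothetical model: each container is assumed to need one filter per host link.
EXPECTED_LINKS = ("eth0", "lo")


@dataclass
class RecoveryResult:
    recovered: list = field(default_factory=list)  # complete filter set found
    orphaned: list = field(default_factory=list)   # partial or missing filters


def recover(containers, filters_by_container):
    """Classify containers instead of aborting on the first incomplete one.

    containers: container ids known from the checkpointed state.
    filters_by_container: dict of container id -> set of host links on
        which a filter for that container was found.
    """
    result = RecoveryResult()
    for cid in containers:
        found = filters_by_container.get(cid, set())
        if all(link in found for link in EXPECTED_LINKS):
            result.recovered.append(cid)
        else:
            # The slave likely crashed mid-isolate or mid-cleanup:
            # schedule cleanup rather than returning an error.
            result.orphaned.append(cid)
    return result


def cleanup(cid, filters_by_container):
    """Remove whatever filters remain; a filter already gone is not an error."""
    return sorted(filters_by_container.pop(cid, set()))
```

With this shape, a container that only has its eth0 filter installed ends up in `orphaned`, and calling `cleanup` on it removes the leftover filter so nothing leaks on the host links.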



--
This message was sent by Atlassian JIRA
(v6.2#6252)
