[
https://issues.apache.org/jira/browse/MESOS-7166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932872#comment-15932872
]
Pierre Cheynier commented on MESOS-7166:
----------------------------------------
I'm not 100% sure, but this may have been observed during the removal of the
port_mapping isolator.
In that case, as far as I can tell, there is no way to clean up the old tasks'
network configuration (_cleanup will never be called) unless you properly drain
your agent before restarting it, which was not done here.
I will probably close this ticket and reopen it if I get any further information.
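
For anyone who hits the same situation, here is a rough sketch of how leftover
veth interfaces could be spotted by hand on a host. It rests on an assumption
that should be checked on your own agents: that the isolator names host-side
veth devices mesos<pid>, where <pid> is the container's init PID. The helper
names (find_stale_veths, pid_is_alive, list_interfaces) are made up for this
example and are not part of Mesos.

```python
import os
import re

# Assumption: host-side veth devices created by the port_mapping isolator are
# named "mesos<pid>". Verify this naming on your own hosts before relying on it.
VETH_PATTERN = re.compile(r"^mesos(\d+)$")


def find_stale_veths(interfaces, is_alive):
    """Return veth names whose embedded PID no longer maps to a live process."""
    stale = []
    for name in interfaces:
        match = VETH_PATTERN.match(name)
        if match and not is_alive(int(match.group(1))):
            stale.append(name)
    return sorted(stale)


def pid_is_alive(pid):
    """Check for a live process by looking for /proc/<pid> (Linux only)."""
    return os.path.isdir("/proc/%d" % pid)


def list_interfaces(sysfs_net="/sys/class/net"):
    """List the network interface names known to the kernel."""
    return os.listdir(sysfs_net) if os.path.isdir(sysfs_net) else []


if __name__ == "__main__":
    for veth in find_stale_veths(list_interfaces(), pid_is_alive):
        # Report only; removing the interface is left to the operator.
        print(veth)
```

This only reports candidates. PIDs can be reused, so a leaked interface whose
PID now belongs to an unrelated process would be missed, and the netns and tc
rules would need a similar manual check.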
> port_mapping isolator: netns and veth are not GC-ed
> ---------------------------------------------------
>
> Key: MESOS-7166
> URL: https://issues.apache.org/jira/browse/MESOS-7166
> Project: Mesos
> Issue Type: Bug
> Components: isolation, network
> Reporter: Pierre Cheynier
>
> While testing the port_mapping isolator for a few days in a preproduction
> environment where many containers are continuously started, sometimes fail,
> and are destroyed, I hit this issue: some agent hosts still carry the
> containers' network configuration, meaning that the netns, the veth
> interfaces, and the tc rules are still there.
> Here is my setup:
> * CentOS 7.2
> * LTS kernel, 4.4.21 at that time
> * libnl 3.2.28
> * Mesos 1.0.2, compiled using:
> {noformat}
> ./configure \
> CFLAGS="%{optflags}" \
> CXXFLAGS="%{optflags}" \
> --disable-silent-rules \
> --prefix=%{_prefix} \
> --bindir=%{_bindir} \
> --libdir=%{_libdir} \
> --includedir=%{_includedir} \
> --disable-python \
> --disable-python-dependency-install \
> --enable-libevent \
> --enable-ssl \
> --enable-optimize \
> --with-network-isolator
> {noformat}
> I have logs that apparently show that at some point the container was
> considered an orphan (maybe due to an operation on the host, such as an agent
> configuration update).
> {noformat}
> Feb 24 13:21:30 mesos-slave049-par mesos-slave[48375]: I0224 13:21:30.421066 48395 containerizer.cpp:690] Removing orphan container a8e05a03-7499-4566-bcba-53d8bf204e5f
> Feb 24 13:21:30 mesos-slave049-par mesos-slave[48375]: I0224 13:21:30.421968 48395 linux_launcher.cpp:349] Using pid namespace to destroy container a8e05a03-7499-4566-bcba-53d8bf204e5f
> {noformat}
> It would be nice if the agent's isolator could clean up this leftover state.