[jira] [Assigned] (MESOS-9966) Agent crashes when trying to destroy orphaned nested container if root container is orphaned as well
[ https://issues.apache.org/jira/browse/MESOS-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gilbert Song reassigned MESOS-9966:
---
Assignee: Qian Zhang (was: Gilbert Song)

> Agent crashes when trying to destroy orphaned nested container if root container is orphaned as well
>
>
> Key: MESOS-9966
> URL: https://issues.apache.org/jira/browse/MESOS-9966
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 1.7.3
> Reporter: Jan Schlicht
> Assignee: Qian Zhang
> Priority: Major
>
> Noticed an agent crash-looping when trying to recover. It recognized a container and its nested container as orphaned. When trying to destroy the nested container, the agent crashes. Probably when trying to [get the sandbox path of the root container|https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/containerizer.cpp#L2966].
> {noformat}
> 2019-09-09 05:04:26: I0909 05:04:26.382326 89950 linux_launcher.cpp:286] Recovering Linux launcher
> 2019-09-09 05:04:26: I0909 05:04:26.383162 89950 linux_launcher.cpp:331] Not recovering cgroup mesos/a127917b-96fe-4100-b73d-5f876ce9ffc1/mesos
> 2019-09-09 05:04:26: I0909 05:04:26.383199 89950 linux_launcher.cpp:343] Recovered container a127917b-96fe-4100-b73d-5f876ce9ffc1.9783e2bb-7c2e-4930-9d39-4225bb6f1b97
> 2019-09-09 05:04:26: I0909 05:04:26.383216 89950 linux_launcher.cpp:331] Not recovering cgroup mesos/a127917b-96fe-4100-b73d-5f876ce9ffc1/mesos/9783e2bb-7c2e-4930-9d39-4225bb6f1b97/mesos
> 2019-09-09 05:04:26: I0909 05:04:26.383229 89950 linux_launcher.cpp:343] Recovered container 2ee154e2-3cc4-420a-99fb-065e740f3091
> 2019-09-09 05:04:26: I0909 05:04:26.383237 89950 linux_launcher.cpp:343] Recovered container a127917b-96fe-4100-b73d-5f876ce9ffc1
> 2019-09-09 05:04:26: I0909 05:04:26.383249 89950 linux_launcher.cpp:343] Recovered container 2ee154e2-3cc4-420a-99fb-065e740f3091.49fe2bf9-17af-415f-92b6-92a4db619436
> 2019-09-09 05:04:26: I0909 05:04:26.383260 89950 linux_launcher.cpp:331] Not recovering cgroup mesos/2ee154e2-3cc4-420a-99fb-065e740f3091/mesos
> 2019-09-09 05:04:26: I0909 05:04:26.383271 89950 linux_launcher.cpp:331] Not recovering cgroup mesos/2ee154e2-3cc4-420a-99fb-065e740f3091/mesos/49fe2bf9-17af-415f-92b6-92a4db619436/mesos
> 2019-09-09 05:04:26: I0909 05:04:26.383280 89950 linux_launcher.cpp:437] 2ee154e2-3cc4-420a-99fb-065e740f3091.49fe2bf9-17af-415f-92b6-92a4db619436 is a known orphaned container
> 2019-09-09 05:04:26: I0909 05:04:26.383289 89950 linux_launcher.cpp:437] a127917b-96fe-4100-b73d-5f876ce9ffc1 is a known orphaned container
> 2019-09-09 05:04:26: I0909 05:04:26.383296 89950 linux_launcher.cpp:437] 2ee154e2-3cc4-420a-99fb-065e740f3091 is a known orphaned container
> 2019-09-09 05:04:26: I0909 05:04:26.383304 89950 linux_launcher.cpp:437] a127917b-96fe-4100-b73d-5f876ce9ffc1.9783e2bb-7c2e-4930-9d39-4225bb6f1b97 is a known orphaned container
> 2019-09-09 05:04:26: I0909 05:04:26.383414 89950 containerizer.cpp:1092] Recovering isolators
> 2019-09-09 05:04:26: I0909 05:04:26.385931 89977 memory.cpp:478] Started listening for OOM events for container a127917b-96fe-4100-b73d-5f876ce9ffc1
> 2019-09-09 05:04:26: I0909 05:04:26.386118 89977 memory.cpp:590] Started listening on 'low' memory pressure events for container a127917b-96fe-4100-b73d-5f876ce9ffc1
> 2019-09-09 05:04:26: I0909 05:04:26.386152 89977 memory.cpp:590] Started listening on 'medium' memory pressure events for container a127917b-96fe-4100-b73d-5f876ce9ffc1
> 2019-09-09 05:04:26: I0909 05:04:26.386175 89977 memory.cpp:590] Started listening on 'critical' memory pressure events for container a127917b-96fe-4100-b73d-5f876ce9ffc1
> 2019-09-09 05:04:26: I0909 05:04:26.386227 89977 memory.cpp:478] Started listening for OOM events for container 2ee154e2-3cc4-420a-99fb-065e740f3091
> 2019-09-09 05:04:26: I0909 05:04:26.386248 89977 memory.cpp:590] Started listening on 'low' memory pressure events for container 2ee154e2-3cc4-420a-99fb-065e740f3091
> 2019-09-09 05:04:26: I0909 05:04:26.386270 89977 memory.cpp:590] Started listening on 'medium' memory pressure events for container 2ee154e2-3cc4-420a-99fb-065e740f3091
> 2019-09-09 05:04:26: I0909 05:04:26.386376 89977 memory.cpp:590] Started listening on 'critical' memory pressure events for container 2ee154e2-3cc4-420a-99fb-065e740f3091
> 2019-09-09 05:04:26: I0909 05:04:26.386694 89921 containerizer.cpp:1131] Recovering provisioner
> 2019-09-09 05:04:26: I0909 05:04:26.388226 90010 metadata_manager.cpp:286] Successfully loaded 64 Docker images
> 2019-09-09 05:04:26: I0909 05:04:26.388420 89932 provisioner.cpp:494]
[jira] [Assigned] (MESOS-9966) Agent crashes when trying to destroy orphaned nested container if root container is orphaned as well
[ https://issues.apache.org/jira/browse/MESOS-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Dellar reassigned MESOS-9966:
--
Assignee: Gilbert Song

> Agent crashes when trying to destroy orphaned nested container if root container is orphaned as well
>
>
> Key: MESOS-9966
> URL: https://issues.apache.org/jira/browse/MESOS-9966
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 1.7.3
> Reporter: Jan Schlicht
> Assignee: Gilbert Song
> Priority: Major
>
> Noticed an agent crash-looping when trying to recover. It recognized a container and its nested container as orphaned. When trying to destroy the nested container, the agent crashes. Probably when trying to [get the sandbox path of the root container|https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/containerizer.cpp#L2966].
> {noformat}
> 2019-09-09 05:04:26: I0909 05:04:26.382326 89950 linux_launcher.cpp:286] Recovering Linux launcher
> 2019-09-09 05:04:26: I0909 05:04:26.383162 89950 linux_launcher.cpp:331] Not recovering cgroup mesos/a127917b-96fe-4100-b73d-5f876ce9ffc1/mesos
> 2019-09-09 05:04:26: I0909 05:04:26.383199 89950 linux_launcher.cpp:343] Recovered container a127917b-96fe-4100-b73d-5f876ce9ffc1.9783e2bb-7c2e-4930-9d39-4225bb6f1b97
> 2019-09-09 05:04:26: I0909 05:04:26.383216 89950 linux_launcher.cpp:331] Not recovering cgroup mesos/a127917b-96fe-4100-b73d-5f876ce9ffc1/mesos/9783e2bb-7c2e-4930-9d39-4225bb6f1b97/mesos
> 2019-09-09 05:04:26: I0909 05:04:26.383229 89950 linux_launcher.cpp:343] Recovered container 2ee154e2-3cc4-420a-99fb-065e740f3091
> 2019-09-09 05:04:26: I0909 05:04:26.383237 89950 linux_launcher.cpp:343] Recovered container a127917b-96fe-4100-b73d-5f876ce9ffc1
> 2019-09-09 05:04:26: I0909 05:04:26.383249 89950 linux_launcher.cpp:343] Recovered container 2ee154e2-3cc4-420a-99fb-065e740f3091.49fe2bf9-17af-415f-92b6-92a4db619436
> 2019-09-09 05:04:26: I0909 05:04:26.383260 89950 linux_launcher.cpp:331] Not recovering cgroup mesos/2ee154e2-3cc4-420a-99fb-065e740f3091/mesos
> 2019-09-09 05:04:26: I0909 05:04:26.383271 89950 linux_launcher.cpp:331] Not recovering cgroup mesos/2ee154e2-3cc4-420a-99fb-065e740f3091/mesos/49fe2bf9-17af-415f-92b6-92a4db619436/mesos
> 2019-09-09 05:04:26: I0909 05:04:26.383280 89950 linux_launcher.cpp:437] 2ee154e2-3cc4-420a-99fb-065e740f3091.49fe2bf9-17af-415f-92b6-92a4db619436 is a known orphaned container
> 2019-09-09 05:04:26: I0909 05:04:26.383289 89950 linux_launcher.cpp:437] a127917b-96fe-4100-b73d-5f876ce9ffc1 is a known orphaned container
> 2019-09-09 05:04:26: I0909 05:04:26.383296 89950 linux_launcher.cpp:437] 2ee154e2-3cc4-420a-99fb-065e740f3091 is a known orphaned container
> 2019-09-09 05:04:26: I0909 05:04:26.383304 89950 linux_launcher.cpp:437] a127917b-96fe-4100-b73d-5f876ce9ffc1.9783e2bb-7c2e-4930-9d39-4225bb6f1b97 is a known orphaned container
> 2019-09-09 05:04:26: I0909 05:04:26.383414 89950 containerizer.cpp:1092] Recovering isolators
> 2019-09-09 05:04:26: I0909 05:04:26.385931 89977 memory.cpp:478] Started listening for OOM events for container a127917b-96fe-4100-b73d-5f876ce9ffc1
> 2019-09-09 05:04:26: I0909 05:04:26.386118 89977 memory.cpp:590] Started listening on 'low' memory pressure events for container a127917b-96fe-4100-b73d-5f876ce9ffc1
> 2019-09-09 05:04:26: I0909 05:04:26.386152 89977 memory.cpp:590] Started listening on 'medium' memory pressure events for container a127917b-96fe-4100-b73d-5f876ce9ffc1
> 2019-09-09 05:04:26: I0909 05:04:26.386175 89977 memory.cpp:590] Started listening on 'critical' memory pressure events for container a127917b-96fe-4100-b73d-5f876ce9ffc1
> 2019-09-09 05:04:26: I0909 05:04:26.386227 89977 memory.cpp:478] Started listening for OOM events for container 2ee154e2-3cc4-420a-99fb-065e740f3091
> 2019-09-09 05:04:26: I0909 05:04:26.386248 89977 memory.cpp:590] Started listening on 'low' memory pressure events for container 2ee154e2-3cc4-420a-99fb-065e740f3091
> 2019-09-09 05:04:26: I0909 05:04:26.386270 89977 memory.cpp:590] Started listening on 'medium' memory pressure events for container 2ee154e2-3cc4-420a-99fb-065e740f3091
> 2019-09-09 05:04:26: I0909 05:04:26.386376 89977 memory.cpp:590] Started listening on 'critical' memory pressure events for container 2ee154e2-3cc4-420a-99fb-065e740f3091
> 2019-09-09 05:04:26: I0909 05:04:26.386694 89921 containerizer.cpp:1131] Recovering provisioner
> 2019-09-09 05:04:26: I0909 05:04:26.388226 90010 metadata_manager.cpp:286] Successfully loaded 64 Docker images
> 2019-09-09 05:04:26: I0909 05:04:26.388420 89932 provisioner.cpp:494] Provisioner reco