[ https://issues.apache.org/jira/browse/MESOS-8416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440897#comment-16440897 ]
Alexander Rukletsov commented on MESOS-8416:
--------------------------------------------
[~gilbert] I promoted this to a blocker for 1.6.0 per your comment above. Can
you please help me estimate the workload and find someone to help fix it
before we cut the 1.6 branch?
> CHECK failure if trying to recover nested containers but the framework
> checkpointing is not enabled.
> ----------------------------------------------------------------------------------------------------
>
> Key: MESOS-8416
> URL: https://issues.apache.org/jira/browse/MESOS-8416
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Reporter: Gilbert Song
> Assignee: Gilbert Song
> Priority: Blocker
> Labels: containerizer, mesosphere
>
> {noformat}
> I0108 23:05:25.313344 31743 slave.cpp:620] Agent attributes: [ ]
> I0108 23:05:25.313832 31743 slave.cpp:629] Agent hostname: vagrant-ubuntu-wily-64
> I0108 23:05:25.314916 31763 task_status_update_manager.cpp:181] Pausing sending task status updates
> I0108 23:05:25.323496 31766 state.cpp:66] Recovering state from '/var/lib/mesos/slave/meta'
> I0108 23:05:25.323639 31766 state.cpp:724] No committed checkpointed resources found at '/var/lib/mesos/slave/meta/resources/resources.info'
> I0108 23:05:25.326169 31760 task_status_update_manager.cpp:207] Recovering task status update manager
> I0108 23:05:25.326954 31759 containerizer.cpp:674] Recovering containerizer
> F0108 23:05:25.331529 31759 containerizer.cpp:919] CHECK_SOME(container->directory): is NONE
> *** Check failure stack trace: ***
> @ 0x7f769dbc98bd google::LogMessage::Fail()
> @ 0x7f769dbc8c8e google::LogMessage::SendToLog()
> @ 0x7f769dbc958d google::LogMessage::Flush()
> @ 0x7f769dbcca08 google::LogMessageFatal::~LogMessageFatal()
> @ 0x556cb4c2b937 _CheckFatal::~_CheckFatal()
> @ 0x7f769c5ac653 mesos::internal::slave::MesosContainerizerProcess::recover()
> {noformat}
> If the framework does not enable checkpointing, no slave state is
> checkpointed. However, containers are still checkpointed in the runtime
> dir, so recovering a nested container triggers the CHECK failure because
> its parent's sandbox dir is unknown.
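
The failure mode described above can be sketched in isolation. This is a minimal illustration only, not the Mesos code or the eventual fix: `Container`, `strictSandbox`, and `unrecoverableNested` are hypothetical names, and `std::optional` stands in for the `Option` type that `CHECK_SOME` inspects. It assumes a container recovered from the runtime dir has no sandbox directory when no agent state was checkpointed.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a container found in the runtime dir.
struct Container {
  std::string id;
  std::optional<std::string> parent;     // empty for top-level containers
  std::optional<std::string> directory;  // sandbox dir; empty if it was never checkpointed
};

// Strict lookup: mirrors the spirit of CHECK_SOME(container->directory).
// It aborts the process when the directory is unset.
inline std::string strictSandbox(const Container& c) {
  assert(c.directory.has_value());
  return *c.directory;
}

// Tolerant alternative (an assumption, not the actual fix): instead of
// aborting, report the nested containers whose parent is missing or has
// no known sandbox directory, so the caller can decide how to handle them.
std::vector<std::string> unrecoverableNested(
    const std::map<std::string, Container>& containers) {
  std::vector<std::string> result;
  for (const auto& [id, c] : containers) {
    if (!c.parent) {
      continue;  // top-level container, nothing to look up
    }
    auto it = containers.find(*c.parent);
    if (it == containers.end() || !it->second.directory) {
      result.push_back(id);
    }
  }
  return result;
}
```

Calling `strictSandbox` on a parent whose `directory` is empty aborts, which is analogous to the crash in the log; `unrecoverableNested` shows one way such containers could be detected up front instead.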