[ https://issues.apache.org/jira/browse/MESOS-473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinod Kone updated MESOS-473:
-----------------------------
Affects Version/s: 0.13.0, 0.12.0, 0.11.0, 0.10.0
> Freezer fails fatally when it is unable to write 'FROZEN' to freezer.state
> --------------------------------------------------------------------------
>
> Key: MESOS-473
> URL: https://issues.apache.org/jira/browse/MESOS-473
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.10.0, 0.11.0, 0.12.0, 0.13.0
> Reporter: Vinod Kone
> Assignee: Vinod Kone
>
> Observed this when running tests in a loop. The failing test was
> SlaveRecoveryTest.RecoverTerminatedExecutor.
> F0517 22:40:00.163806 9004 cgroups_isolator.cpp:1165] Failed to destroy cgroup mesos_test/framework_201305172240-1740121354-46893-8981-0000_executor_59f49d23-9b61-4d08-868c-87af1b06a019_tag_8be5f3f8-e0ce-40d6-83dc-9866a984cbb8: Failed to kill tasks in nested cgroups: Collect failed: Failed to write control 'freezer.state': Device or resource busy
> *** Check failure stack trace: ***
> @ 0x7facb0d080ed google::LogMessage::Fail()
> @ 0x7facb0d0dd57 google::LogMessage::SendToLog()
> @ 0x7facb0d0999c google::LogMessage::Flush()
> @ 0x7facb0d09c06 google::LogMessageFatal::~LogMessageFatal()
> @ 0x7facb0a96837 mesos::internal::slave::CgroupsIsolator::_killExecutor()
> @ 0x7facb0aaa6b0 std::tr1::_Mem_fn<>::operator()()
> @ 0x7facb0aabdce std::tr1::_Bind<>::operator()<>()
> @ 0x7facb0aabdfd std::tr1::_Function_handler<>::_M_invoke()
> @ 0x7facb0ab1043 std::tr1::function<>::operator()()
> @ 0x7facb0ab875e process::internal::vdispatcher<>()
> @ 0x7facb0ab9b98 std::tr1::_Bind<>::operator()<>()
> @ 0x7facb0ab9bed std::tr1::_Function_handler<>::_M_invoke()
> @ 0x7facb0c09059 std::tr1::function<>::operator()()
> @ 0x7facb0bcf54d process::ProcessBase::visit()
> @ 0x7facb0be43ca process::DispatchEvent::visit()
> @ 0x5fcd90 process::ProcessBase::serve()
> @ 0x7facb0bd8e3d process::ProcessManager::resume()
> @ 0x7facb0bd9688 process::schedule()
> @ 0x7facafcb473d start_thread
> @ 0x7facae698f6d clone
> The tasks in the cgroup are all either in uninterruptible sleep ('D') or
> traced ('T'):
> [vinod@smfd-bkq-03-sr4 framework_201305172240-1740121354-46893-8981-0000_executor_59f49d23-9b61-4d08-868c-87af1b06a019_tag_8be5f3f8-e0ce-40d6-83dc-9866a984cbb8]$ cat tasks | xargs ps -F -p
> UID    PID  PPID  C    SZ   RSS PSR STIME TTY STAT TIME CMD
> root 25761     1  0 91854 15648   4 22:39 ?   Dl   0:00 /home/vinod/mesos/build/src/.libs/lt-mesos-executor
> root 25802 25761  0 14734   544  13 22:39 ?   Ts   0:00 sleep 1000
> root 25804 25761  0 15961  1296   7 22:39 ?   D    0:00 /bin/bash /home/vinod/mesos/build/../src/scripts/killtree.sh -p 25802 -s 15 -g -x -v
> root 25814 25804  0 15961   224  14 22:39 ?   D    0:00 /bin/bash /home/vinod/mesos/build/../src/scripts/killtree.sh -p 25802 -s 15 -g -x -v
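To illustrate how the STAT column in the ps output is read, here is a minimal C++ sketch that classifies a STAT value by its first character. The names `TaskState` and `classify` are hypothetical and are not part of Mesos; the mapping follows the state codes documented in ps(1).

```cpp
#include <string>

// Hypothetical classification, for illustration only (not Mesos code).
enum class TaskState { Uninterruptible, StoppedOrTraced, Other };

// Classifies a ps STAT value such as "Dl" or "Ts" by its first character.
// Trailing characters ('l' = multi-threaded, 's' = session leader, ...)
// are modifiers and are ignored here.
TaskState classify(const std::string& stat)
{
  if (stat.empty()) {
    return TaskState::Other;
  }

  switch (stat[0]) {
    case 'D':  // Uninterruptible sleep (usually I/O).
      return TaskState::Uninterruptible;
    case 'T':  // Stopped (job control signal; older ps also reports traced).
    case 't':  // Stopped by a debugger during tracing (newer procps).
      return TaskState::StoppedOrTraced;
    default:
      return TaskState::Other;
  }
}
```

Applied to the listing above, "Dl" and "D" map to `Uninterruptible` and "Ts" maps to `StoppedOrTraced`, which matches the observation that every task in the cgroup is in one of those two states.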
> gdb hangs when trying to attach to the mesos executor, likely because it is
> in the 'D' state.
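As a rough illustration of a non-fatal way to handle the EBUSY seen in the log above, the following C++ sketch retries a control-file write a bounded number of times and returns the error to the caller instead of aborting. This is an assumption about one possible approach, not the fix adopted in Mesos; `writeControl`, `retryWhileBusy`, the attempt count and the delay are all hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <functional>
#include <string>
#include <thread>

#include <fcntl.h>
#include <unistd.h>

// Writes `value` to the control file at `path`.
// Returns 0 on success, otherwise an errno value describing the failure.
int writeControl(const std::string& path, const std::string& value)
{
  int fd = ::open(path.c_str(), O_WRONLY);
  if (fd < 0) {
    return errno;
  }

  ssize_t n = ::write(fd, value.data(), value.size());
  int error = 0;
  if (n < 0) {
    error = errno;
  } else if (static_cast<size_t>(n) != value.size()) {
    error = EIO;  // Treat a short write as a failure.
  }

  ::close(fd);
  return error;
}

// Calls `attempt` up to `maxAttempts` times, sleeping `delay` between
// tries, for as long as it reports EBUSY. Returns the last result:
// 0 on success, EBUSY if every try was busy, or the first other error.
int retryWhileBusy(
    const std::function<int()>& attempt,
    int maxAttempts,
    std::chrono::milliseconds delay)
{
  assert(maxAttempts > 0);

  int error = EBUSY;
  for (int i = 0; i < maxAttempts; ++i) {
    error = attempt();
    if (error != EBUSY) {
      return error;  // Success, or an error that retrying will not fix.
    }
    if (i + 1 < maxAttempts) {
      std::this_thread::sleep_for(delay);
    }
  }
  return error;
}

// Example use (hypothetical cgroup directory and retry parameters).
int freeze(const std::string& cgroupDir)
{
  return retryWhileBusy(
      [&]() { return writeControl(cgroupDir + "/freezer.state", "FROZEN"); },
      5,
      std::chrono::milliseconds(100));
}
```

The caller can then log the returned error and decide how to proceed, rather than hitting a fatal check the first time the write fails.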
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira