I'm still not sure exactly what the issue is here, but we have included a couple of GC-related fixes in 0.15.0-rc5. Are you willing to try that out?
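
On the question of how the GC delay can depend on disk usage: the usage docs only say the delay "may be shorter depending on the available disk usage". As a rough illustration of that idea, here is a minimal Python sketch. It is not the Mesos implementation. The function name, the linear scaling, and the 10% headroom value are all assumptions made for illustration.

```python
def gc_delay_for_usage(max_delay_secs, disk_usage, headroom=0.1):
    """Return a GC delay that shrinks as the disk fills up.

    Hypothetical helper, not a Mesos API.
    max_delay_secs: the configured maximum GC delay, in seconds.
    disk_usage: fraction of the disk currently in use, from 0.0 to 1.0.
    headroom: assumed fraction of the disk to keep free (illustrative value).
    """
    if not 0.0 <= disk_usage <= 1.0:
        raise ValueError("disk_usage must be between 0.0 and 1.0")
    # Fraction of the disk still free once the headroom is set aside.
    # Clamped at zero so a nearly full disk means "collect immediately".
    free_fraction = max(0.0, 1.0 - headroom - disk_usage)
    return max_delay_secs * free_fraction
```

Under these assumptions, a disk that is 95% full yields a delay of zero, so old directories become eligible for cleanup right away, while a mostly empty disk keeps them for close to the full configured delay.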
On Thu, Dec 26, 2013 at 10:56 AM, Thomas Petr <[email protected]> wrote:

> Hi,
>
> We're running Mesos 0.14.0-rc4 on CentOS from the mesosphere repository.
> Last week we had an issue where the mesos-slave process died due running
> out of disk space. [1]
>
> The mesos-slave usage docs mention the "[GC] delay may be shorter
> depending on the available disk usage." Does anyone have any insight into
> how the GC logic works? Is there a configurable threshold percentage or
> amount that will force it to clean up more often?
>
> If the mesos-slave process is going to die due to lack of disk space,
> would it make sense for it to attempt one last GC run before giving up?
>
> Thanks,
> Tom
>
>
> [1]
> Could not create logging file: No space left on device
> COULD NOT CREATE A LOGGINGFILE 20131221-120618.20562!F1221 12:06:18.978813
> 20567 paths.hpp:333] CHECK_SOME(mkdir): Failed to create executor directory
> '/usr/share/hubspot/mesos/slaves/201311111611-3792629514-5050-11268-18/frameworks/Singularity11/executors/singularity-ContactsHadoopDynamicListSegJobs-contacts-wal-dynamic-list-seg-refresher-1387627577839-1-littleslash-us_east_1e/runs/457a8df0-baa7-4d22-a5ac-ba5935ea6032'No
> space left on device
> *** Check failure stack trace: ***
> I1221 12:06:19.008946 20564 cgroups_isolator.cpp:1275] Successfully
> destroyed cgroup
> mesos/framework_Singularity11_executor_singularity-ContactsTasks-parallel-machines:6988:list-intersection-count:1387565552709-1387627447707-1-littleslash-us_east_1e_tag_fc028903-d303-468d-902a-dade8c22e206
> @ 0x7f2c806bcb5d google::LogMessage::Fail()
> @ 0x7f2c806c0b77 google::LogMessage::SendToLog()
> @ 0x7f2c806be9f9 google::LogMessage::Flush()
> @ 0x7f2c806becfd google::LogMessageFatal::~LogMessageFatal()
> @ 0x40f6cf _CheckSome::~_CheckSome()
> @ 0x7f2c804492e3
> mesos::internal::slave::paths::createExecutorDirectory()
> @ 0x7f2c80418a6d
> mesos::internal::slave::Framework::launchExecutor()
> @ 0x7f2c80419dd3
> mesos::internal::slave::Slave::_runTask()
> @ 0x7f2c8042d5d1 std::tr1::_Function_handler<>::_M_invoke()
> @ 0x7f2c805d3ae8 process::ProcessManager::resume()
> @ 0x7f2c805d3e8c process::schedule()
> @ 0x7f2c7fe41851 start_thread
> @ 0x7f2c7e78794d clone

