In this instance there were three old slave directories, and there are three corresponding log lines in the mesos-slave.INFO file:

I0708 11:24:52.023453 2425 slave.cpp:3499] Garbage collecting old slave 20150515-105200-84152492-5050-9915-S46
I0708 11:24:52.023923 2425 slave.cpp:3499] Garbage collecting old slave 20150217-184553-67375276-5050-18563-S74
I0708 11:24:52.023921 2428 gc.cpp:56] Scheduling '/mnt/mesos/mesos-slave/slaves/20150515-105200-84152492-5050-9915-S46' for gc 6.99999972599407days in the future
I0708 11:24:52.054704 2425 slave.cpp:3499] Garbage collecting old slave 20150515-105200-84152492-5050-9915-S22
I0708 11:24:52.054723 2424 gc.cpp:56] Scheduling '/mnt/mesos/mesos-slave/slaves/20150217-184553-67375276-5050-18563-S74' for gc 6.99999937182815days in the future
I0708 11:24:52.067934 2425 gc.cpp:56] Scheduling '/mnt/mesos/mesos-slave/slaves/20150515-105200-84152492-5050-9915-S22' for gc 6.99999922252444days in the future

This happens right after the recovery process finishes after the slave boots up.

I've looked at another slave that's currently at 99% disk capacity. That slave has been up since 27th May 2015, and it also has the "Garbage collecting old slave" log lines just after boot for ~6 days. Looking a little deeper into this slave's logs, this looks like an interesting error:

W0527 17:35:08.935755 1749 gc.cpp:139] Failed to delete '/mnt/mesos/mesos-slave/slaves/20150217-184553-67375276-5050-18563-S72': Directory not empty

I think I actually discussed this with BenH a while back. We're running 0.21.0 on this cluster.

Has anyone else seen this before? Using the standard `rm` Unix tool currently clears out the directories fine, running as the same user as the slave (root).

--
Tom Arnfeld
Senior Developer // DueDil

On Wed, Jul 8, 2015 at 7:00 PM, Vinod Kone <vinodk...@gmail.com> wrote:

> On Wed, Jul 8, 2015 at 10:54 AM, Tom Arnfeld <t...@duedil.com> wrote:
>
>> When this happens the old slave directories appear not to be tracked by
>> the mesos GC process, and stay around indefinitely. Over time if enough
>> full slave restarts happen (say, due to reconfiguration) the disks can be
>> completely filled and the mesos slave won't do anything about it.
>>
> This shouldn't happen. Old slave directories should be gc'ed by the slave
> based on their last modification time
> <https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L4059>. Do
> you see any log lines with "Garbage collecting old slave" ?
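
For anyone who wants to check their own slave logs for the same pattern, here is a minimal Python sketch. It is not part of Mesos; the function name `summarize_gc` is made up for illustration, and the regular expressions are inferred only from the log lines quoted in this thread (they assume the delay is printed in days, as it is above). It collects the directories that gc.cpp scheduled for deletion and the ones it later failed to delete.

```python
import re

# Patterns inferred from the log lines quoted in this thread only.
# Assumption: the gc delay is printed with a "days" suffix, as seen above.
SCHEDULED = re.compile(
    r"gc\.cpp:\d+\] Scheduling '([^']+)' for gc ([\d.]+)days in the future"
)
FAILED = re.compile(r"gc\.cpp:\d+\] Failed to delete '([^']+)': (.+)$")


def summarize_gc(lines):
    """Return (scheduled, failed) dicts keyed by directory path.

    scheduled maps path -> delay in days (float);
    failed maps path -> error message reported by the slave.
    """
    scheduled = {}
    failed = {}
    for line in lines:
        line = line.strip()
        match = SCHEDULED.search(line)
        if match:
            scheduled[match.group(1)] = float(match.group(2))
            continue
        match = FAILED.search(line)
        if match:
            failed[match.group(1)] = match.group(2)
    return scheduled, failed
```

Passing an open log file (for example the mesos-slave.INFO file mentioned above) to `summarize_gc` gives two dictionaries; any path that shows up in `failed` is a directory the slave tried and failed to remove, which is the case described in this thread.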