That's useful to know, thanks Vinod. I'll try and dig deeper.

On Mon, Sep 8, 2014 at 5:33 AM, Vinod Kone <vinodk...@gmail.com> wrote:
> On Sat, Sep 6, 2014 at 8:23 AM, Tom Arnfeld <t...@duedil.com> wrote:
>
>> If I try and manually remove the directory mentioned, it works fine. Is
>> this a known issue, or should I do a little more debugging? I've not tried
>> to reproduce it under specific conditions yet.
>
> This is surprising. GC does a recursive directory removal (see os::rmdir()
> in stout) using post-order traversal. Definitely some debugging is in order
> to see which directory failed and why. Does your sandbox contain any
> special files (other than directories and files) like mounts, devices etc?
>
>> As a side note, should mesos perhaps have some kind of retry mechanism for
>> GC? Also, will GC still run for an executor if the slave restarts after an
>> executor terminates but before the GC process runs?
>
> I don't know what the error was above but I doubt a retry would've helped
> here. And yes GC runs for a terminated executor when slave restarts.
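
One way to see which path fails and why is to replay a post-order removal by hand against a copy of the sandbox and record every error, after first listing anything that is not a plain file, directory, or symlink. The following is only a rough Python sketch of that idea, not the stout os::rmdir() implementation; the function names find_special_files and remove_tree_postorder are made up for illustration.

```python
import os
import stat


def find_special_files(root):
    """Return paths under root that are mounts, devices, FIFOs, sockets, etc."""
    special = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
            if stat.S_ISDIR(mode):
                if os.path.ismount(path):
                    special.append(path)
            elif not (stat.S_ISREG(mode) or stat.S_ISLNK(mode)):
                special.append(path)
    return special


def remove_tree_postorder(root):
    """Remove root recursively, children before parents.

    Returns a list of (path, error message) pairs for every removal that failed,
    instead of stopping at the first error.
    """
    failures = []
    # topdown=False yields a directory only after all of its subdirectories,
    # which gives the post-order traversal.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                os.unlink(path)
            except OSError as e:
                failures.append((path, e.strerror))
        for name in dirnames:
            path = os.path.join(dirpath, name)
            # Symlinks to directories are listed in dirnames but not walked,
            # so they have to be unlinked here.
            if os.path.islink(path):
                try:
                    os.unlink(path)
                except OSError as e:
                    failures.append((path, e.strerror))
        try:
            os.rmdir(dirpath)
        except OSError as e:
            failures.append((dirpath, e.strerror))
    return failures
```

Running find_special_files() on the sandbox before removal, then checking the list returned by remove_tree_postorder(), should narrow down whether a mount or device node is what blocks the removal. Run it on a copy, since it deletes whatever it is pointed at.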