Re: [MESOS-10007] random "Failed to get exit status for Command" for short-lived commands

2019-10-21 Thread Benjamin Mahler
Hi Charles, thanks for the thorough ticket and for surfacing it here; it had gone unnoticed amongst the JIRA noise. I replied on the ticket with a patch that should fix the issue, and we can discuss further there. Ben On Sat, Oct 19, 2019 at 7:35 AM Charles-François Natali

[MESOS-10007] random "Failed to get exit status for Command" for short-lived commands

2019-10-19 Thread Charles-François Natali
Hi, I'm wondering if there's anything I could do to help https://issues.apache.org/jira/browse/MESOS-10007 move forward? Basically, it's a race condition in the libprocess/command executor that causes spurious errors to be reported for short-lived tasks. I've got a detailed code path of the race and a