This is an automated email from the ASF dual-hosted git repository.

alexr pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/mesos.git

commit c5cf4d49f47579b5a6cb7afc2f7df7c8f51dc6d0
Author: Andrei Budnik <[email protected]>
AuthorDate: Tue Sep 18 19:10:20 2018 +0200

    Fixed broken pipe error in IOSwitchboard.

    We force IOSwitchboard to return a final response to the client for the
    `ATTACH_CONTAINER_INPUT` call after IO redirects are finished. In this
    case, we don't read remaining messages from the input stream. So the
    agent might send an acknowledgment for the request before IOSwitchboard
    has received remaining messages. We need to delay termination of
    IOSwitchboard to give it a chance to read the remaining messages.
    Otherwise, the agent might get `HTTP 500` "broken pipe" while
    attempting to write the final message.

    Review: https://reviews.apache.org/r/62187/
---
 src/slave/containerizer/mesos/io/switchboard.cpp | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/slave/containerizer/mesos/io/switchboard.cpp b/src/slave/containerizer/mesos/io/switchboard.cpp
index 1982d9b..b1bd0c1 100644
--- a/src/slave/containerizer/mesos/io/switchboard.cpp
+++ b/src/slave/containerizer/mesos/io/switchboard.cpp
@@ -1615,9 +1615,14 @@ IOSwitchboardServerProcess::acknowledgeContainerInputResponse()
   if (--numPendingAcknowledgments == 0) {
     // If IO redirects are finished or writing to `stdin` failed we want to
     // terminate ourselves (after flushing any outstanding messages from our
-    // message queue).
+    // message queue). Since IOSwitchboard might receive an acknowledgment for
+    // the `ATTACH_CONTAINER_INPUT` request before reading a final message from
+    // the corresponding connection, we need to delay our termination to give
+    // IOSwitchboard a chance to read the final message. Otherwise, the agent
+    // might get `HTTP 500` "broken pipe" while attempting to write the final
+    // message.
     if (!redirectFinished.future().isPending() || failure.isSome()) {
-      terminate(self(), false);
+      after(Seconds(1)).onAny([=]() { terminate(self(), false); });
     }
   }
   return http::OK();

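The ordering problem described in the commit message can be illustrated with a small, deterministic model. The sketch below is not libprocess and not Mesos code: `Loop`, `Event`, `finalMessageRead`, and the 10 ms arrival time of the final message are all hypothetical names and values chosen for illustration. It only models the idea that delaying termination after the acknowledgment lets a late message be read first.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical, simplified event loop running on virtual time.
struct Event {
  int timeMs;                // virtual time at which the event fires
  int seq;                   // tie-breaker that preserves scheduling order
  std::function<void()> fn;  // action to run
};

// Orders the priority queue so the earliest event is on top.
struct Later {
  bool operator()(const Event& a, const Event& b) const {
    return a.timeMs != b.timeMs ? a.timeMs > b.timeMs : a.seq > b.seq;
  }
};

class Loop {
public:
  // Schedule `fn` to run at virtual time `timeMs`.
  void at(int timeMs, std::function<void()> fn) {
    queue.push(Event{timeMs, nextSeq++, std::move(fn)});
  }

  // Run events in time order until none are left.
  void run() {
    while (!queue.empty()) {
      Event e = queue.top();  // copy before pop; `fn` may schedule more events
      queue.pop();
      now = e.timeMs;
      e.fn();
    }
  }

  int now = 0;

private:
  std::priority_queue<Event, std::vector<Event>, Later> queue;
  int nextSeq = 0;
};

// Returns true if the final message is read before the server terminates.
// `terminationDelayMs` plays the role of the delay added in the patch.
bool finalMessageRead(int terminationDelayMs) {
  Loop loop;
  bool terminated = false;
  bool read = false;

  // t = 0: the acknowledgment arrives and termination is scheduled.
  loop.at(0, [&]() {
    loop.at(loop.now + terminationDelayMs, [&]() { terminated = true; });
  });

  // t = 10 (assumed): the final message arrives on the connection.
  // If the server is already gone, the writer would see a broken pipe.
  loop.at(10, [&]() {
    if (!terminated) {
      read = true;
    }
  });

  loop.run();
  return read;
}
```

Under these assumptions, `finalMessageRead(0)` models the old behaviour (terminate immediately, so the final message is never read), while `finalMessageRead(1000)` models the patched behaviour with a one-second delay.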