[
https://issues.apache.org/jira/browse/MESOS-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664299#comment-16664299
]
Meng Zhu commented on MESOS-7434:
---------------------------------
Observed this again today on mac, the failure is still regarding the `cat`
command which leads to premature container termination. I will cross-post
[~kaysoky]'s comment from Greg's patch above for visibility:
bq. From the logs I've seen, the cat command seems to be exiting due to a pipe
closure.
bq.
bq. In the past, commands like this would be launched sharing the stdin of the
agent process (which in tests, is equal to the test process). But after the
introduction of the IO switchboard, there are more layers to consider:
bq.
bq. 1) If the container is launched with a tty_info (not the case in this
test), the stdin will come from a TTY.
bq. 2) In local mode, the stdin is shared with the parent process.
bq. 3) In normal mode (this test), the stdin will be a pipe to the IO
switchboard server process.
bq.
bq. Perhaps, when the agent gets restarted in the test, it ends up killing the
IO switchboard server somehow? The agent restart is a semi-graceful shutdown,
meaning it may call destructors. In an actual agent restart, there may not be
time to call destructors.
bq.
bq. So TL;DR: Investigate if the IO Switchboard server is dying in some test
runs.
> SlaveTest.RestartSlaveRequireExecutorAuthentication is flaky.
> -------------------------------------------------------------
>
> Key: MESOS-7434
> URL: https://issues.apache.org/jira/browse/MESOS-7434
> Project: Mesos
> Issue Type: Bug
> Environment: Debian 8
> CentOS 6
> other Linux distros
> Reporter: Greg Mann
> Priority: Major
> Labels: flaky, flaky-test, mesosphere
> Attachments: RestartSlaveRequireExecutorAuthentication is
> flaky_failure_log_centos6.txt,
> RestartSlaveRequireExecutorAuthentication_failure_log_debian8.txt,
> SlaveTest.RestartSlaveRequireExecAuth-Ubuntu-16.txt
>
>
> This test failure has been observed on an internal CI system. It occurs on a
> variety of Linux distributions. It seems that using {{cat}} as the task
> command may be problematic; see attached log file
> {{SlaveTest.RestartSlaveRequireExecutorAuthentication.txt}}.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)