Adar Dembo has submitted this change and it was merged.
( http://gerrit.cloudera.org:8080/10435 )

Change subject: KUDU-2427: retry more system calls on EINTR
......................................................................

KUDU-2427: retry more system calls on EINTR

In order to collect its own stack traces, Kudu periodically sends itself a
SIGUSR2. The diagnostics log initiates stack collection every 60s, as do
some service queue overflow events. In theory, the collection shouldn't
affect any ongoing syscalls because the SIGUSR2 signal handler is installed
with SA_RESTART; in practice, not all syscalls are restartable, and
precisely categorizing those that are and those that aren't is difficult. As
such, it's really important that we retry every interruptible syscall rather
than surfacing the EINTR up the call stack as a failure.
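
To make the retry idea concrete, here is a minimal sketch of a retry-on-EINTR wrapper. It is not the macro or helper this patch adds; the names RetryOnEintr and WriteFully are hypothetical and exist only for illustration. It assumes the usual POSIX convention that a failed call returns -1 and sets errno.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical helper (not from the patch): re-invoke `call` for as long as
// it fails with EINTR, then return its final result.
template <typename Func>
auto RetryOnEintr(Func call) -> decltype(call()) {
  decltype(call()) rc;
  do {
    rc = call();
  } while (rc == -1 && errno == EINTR);
  return rc;
}

// Hypothetical usage: write an entire buffer, tolerating both signal
// interruption and short writes. Returns false on any other error.
inline bool WriteFully(int fd, const char* buf, size_t len) {
  while (len > 0) {
    ssize_t n = RetryOnEintr([&] { return ::write(fd, buf, len); });
    if (n <= 0) return false;  // real error, or no progress was made
    buf += n;
    len -= static_cast<size_t>(n);
  }
  return true;
}
```

A wrapper like this only suits calls that are safe to reissue unchanged. close() on Linux is a well-known exception: the descriptor is released even when the call reports EINTR, so retrying it can close an unrelated descriptor.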

For whatever reason, this happens more frequently on Ubuntu 18.04, though
that may be because I've placed my test directory on tmpfs. For example, I
can easily repro a crash caused by a missing retry with the following
command line:

  bin/tablet_server-test --gtest_repeat=1000 --gtest_throw_on_failure \
    --diagnostics_log_stack_traces_interval_ms=100 \
    --unlock_experimental_flags --gtest_filter=*KUDU_177

This patch also fixes KUDU-2151.

Change-Id: I6cce03c4e1b2be32c1910382737526082fc99966
Reviewed-on: http://gerrit.cloudera.org:8080/10435
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin <aser...@cloudera.com>
---
M src/kudu/consensus/log_index.cc
M src/kudu/gutil/macros.h
M src/kudu/gutil/sysinfo.cc
M src/kudu/tools/tool_action_common.cc
M src/kudu/tools/tool_action_test.cc
M src/kudu/util/env-test.cc
M src/kudu/util/env_posix.cc
M src/kudu/util/net/socket.cc
M src/kudu/util/os-util.cc
M src/kudu/util/os-util.h
M src/kudu/util/pstack_watcher-test.cc
M src/kudu/util/pstack_watcher.cc
M src/kudu/util/semaphore.cc
M src/kudu/util/subprocess-test.cc
M src/kudu/util/subprocess.cc
15 files changed, 253 insertions(+), 138 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/10435

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6cce03c4e1b2be32c1910382737526082fc99966
Gerrit-Change-Number: 10435
Gerrit-PatchSet: 8
Gerrit-Owner: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
