Repository: kudu
Updated Branches:
  refs/heads/master ecd67486b -> 6de378296


rpc: hook up a callback for libev fatal errors

While troubleshooting a recent cluster issue, I found that the daemon had
run out of file descriptors. This caused libev to abort(), but the error
message was hard to find because libev's default handler just writes to
stderr.

Piping this through to a GLog FATAL is more likely to result in an
obvious log message.

It's difficult to write an automated test for this, but I tested it
manually by setting my open-file ulimit (ulimit -n) to 10 and running
rpc-test. This resulted in:

F0809 19:03:39.882194  3358 reactor.cc:108] LibEV fatal error: (libev)
error creating signal/async pipe: Too many open files [24]
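
The failure condition behind the manual test can be approximated in
isolation. The following is a minimal sketch, not part of this commit: it
lowers the soft RLIMIT_NOFILE limit inside the process (roughly what
`ulimit -n 10` does for a shell's children) and opens /dev/null until
open() fails. On Linux the failing call should report EMFILE, which is
the errno 24 seen in the log line. The name ExhaustFileDescriptors is
hypothetical and exists only for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper (not in Kudu): lowers the soft open-file limit to
// `limit`, opens /dev/null until open() fails, and returns the errno of
// the failing call. The previous soft limit is restored before returning.
int ExhaustFileDescriptors(rlim_t limit) {
  struct rlimit rl;
  if (getrlimit(RLIMIT_NOFILE, &rl) != 0) return errno;
  assert(limit <= rl.rlim_max);  // the soft limit cannot exceed the hard limit

  const rlim_t saved_soft = rl.rlim_cur;
  rl.rlim_cur = limit;
  if (setrlimit(RLIMIT_NOFILE, &rl) != 0) return errno;

  std::vector<int> fds;
  int err = 0;
  for (;;) {
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0) {
      err = errno;  // expected to be EMFILE once the limit is reached
      break;
    }
    fds.push_back(fd);
  }

  // Release everything we opened and put the soft limit back.
  for (int fd : fds) close(fd);
  rl.rlim_cur = saved_soft;
  setrlimit(RLIMIT_NOFILE, &rl);
  return err;
}
```

Any code that needs a new descriptor while the limit is in force, such as
libev creating its signal/async pipe, would fail in the same way.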

Change-Id: I5fa77237a40f43d6bb82e9f1ceecd31d52268f9d
Reviewed-on: http://gerrit.cloudera.org:8080/7633
Tested-by: Kudu Jenkins
Reviewed-by: Matthew Jacobs <m...@cloudera.com>
Reviewed-by: David Ribeiro Alves <davidral...@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/2a99bb3e
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/2a99bb3e
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/2a99bb3e

Branch: refs/heads/master
Commit: 2a99bb3e5864ae4ae4dd6d2dcdff557fed81aa1d
Parents: ecd6748
Author: Todd Lipcon <t...@apache.org>
Authored: Wed Aug 9 19:01:07 2017 -0700
Committer: Todd Lipcon <t...@apache.org>
Committed: Thu Aug 10 18:47:58 2017 +0000

----------------------------------------------------------------------
 src/kudu/rpc/reactor.cc | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/2a99bb3e/src/kudu/rpc/reactor.cc
----------------------------------------------------------------------
diff --git a/src/kudu/rpc/reactor.cc b/src/kudu/rpc/reactor.cc
index cf63672..b3b7ea2 100644
--- a/src/kudu/rpc/reactor.cc
+++ b/src/kudu/rpc/reactor.cc
@@ -96,6 +96,22 @@ Status ShutdownError(bool aborted) {
       Status::Aborted(msg, "", ESHUTDOWN) :
       Status::ServiceUnavailable(msg, "", ESHUTDOWN);
 }
+
+// Callback for libev fatal errors (eg running out of file descriptors).
+// Unfortunately libev doesn't plumb these back through to the caller, but
+// instead just expects the callback to abort.
+//
+// This implementation is slightly preferable to the built-in one since
+// it uses a FATAL log message instead of printing to stderr, which might
+// not end up anywhere useful in a daemonized context.
+void LibevSysErr(const char* msg) throw() {
+  PLOG(FATAL) << "LibEV fatal error: " << msg;
+}
+
+void DoInitLibEv() {
+  ev::set_syserr_cb(LibevSysErr);
+}
+
 } // anonymous namespace
 
 ReactorThread::ReactorThread(Reactor *reactor, const MessengerBuilder& bld)
@@ -620,6 +636,8 @@ Reactor::Reactor(shared_ptr<Messenger> messenger,
       name_(StringPrintf("%s_R%03d", messenger_->name().c_str(), index)),
       closing_(false),
       thread_(this, bld) {
+  static std::once_flag libev_once;
+  std::call_once(libev_once, DoInitLibEv);
 }
 
 Status Reactor::Init() {
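
The once-only registration in the Reactor constructor can be illustrated
without libev. What follows is a minimal sketch under stated assumptions,
not the Kudu code: g_syserr_cb is a hypothetical stand-in for libev's
process-wide error hook, and FakeReactor, DoInitHook, ReportSysErr, and
InitCallsAfterConstructing are illustrative names. It shows that a
function-local static std::once_flag makes the hook installation run
once, however many objects are constructed.

```cpp
#include <cassert>
#include <cstdio>
#include <mutex>

// Hypothetical stand-in for a library's process-wide fatal-error hook.
using SysErrCb = void (*)(const char* msg);
SysErrCb g_syserr_cb = nullptr;
int g_init_calls = 0;  // counts how many times the hook was installed

// Illustrative callback. A real one would log at FATAL severity and abort.
void ReportSysErr(const char* msg) {
  std::fprintf(stderr, "fatal error: %s\n", msg);
}

// Installs the hook. Intended to run exactly once per process.
void DoInitHook() {
  ++g_init_calls;
  g_syserr_cb = ReportSysErr;
}

class FakeReactor {
 public:
  FakeReactor() {
    // The flag is shared by every FakeReactor instance, so only the first
    // constructor call runs DoInitHook.
    static std::once_flag hook_once;
    std::call_once(hook_once, DoInitHook);
  }
};

// Constructs `n` reactors and returns the total number of hook installs.
int InitCallsAfterConstructing(int n) {
  for (int i = 0; i < n; ++i) {
    FakeReactor r;
    (void)r;
  }
  assert(n <= 0 || g_syserr_cb != nullptr);
  return g_init_calls;
}
```

Because std::call_once also synchronizes concurrent callers, the same
pattern should remain safe if several objects are constructed from
different threads at once.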
