they don't understand nonblocking fds. this is easily visible when you
have an alert script that sleeps for a while: during this time nagios
will happily consume all the CPU cycles it can get. on EAGAIN, you must
not loop over and over again without any form of sleeping...

the fix is to use poll, of course.
it has the nice side effect of lowering nagios' CPU usage considerably.
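
for illustration, a minimal standalone sketch of the same idea, outside
of nagios. set_nonblocking() and read_until_eof() are hypothetical names
made up for this example, not anything from base/utils.c. the point is
only that EINTR is retried immediately while EAGAIN waits in poll(2)
until the fd is readable, instead of spinning.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* hypothetical helper: switch fd to nonblocking mode, 0 on success */
static int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * hypothetical helper: read from a nonblocking fd until EOF or until
 * buf is full. returns the number of bytes read, or -1 on error.
 */
static ssize_t
read_until_eof(int fd, char *buf, size_t cap)
{
	size_t total = 0;

	while (total < cap) {
		ssize_t n = read(fd, buf + total, cap - total);

		if (n > 0) {
			total += (size_t)n;
			continue;
		}
		if (n == 0)
			break;		/* EOF */
		if (errno == EINTR)
			continue;	/* interrupted, retry right away */
		if (errno == EAGAIN) {
			struct pollfd pfd;

			/* no data yet: sleep in the kernel until there is */
			memset(&pfd, 0, sizeof(pfd));
			pfd.fd = fd;
			pfd.events = POLLIN;	/* input field; revents is output */
			if (poll(&pfd, 1, -1) == -1 && errno != EINTR)
				return -1;
			continue;
		}
		return -1;		/* unrecoverable error */
	}
	return (ssize_t)total;
}
```

with an infinite timeout (-1) the process uses no CPU while the writer
is asleep. poll also wakes up when the writer closes its end, and the
next read then returns 0.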

reported upstream. I think. their lists are weird.

$OpenBSD$
--- base/utils.c.orig   Fri Jan  9 14:20:10 2009
+++ base/utils.c        Fri Jan  9 14:28:24 2009
@@ -610,8 +610,16 @@ int my_system(char *cmd,int timeout,int *early_timeout
                                /* handle errors */
                                if(bytes_read==-1){
                                        /* we encountered a recoverable error, so try again */
-                                       if(errno==EINTR || errno==EAGAIN)
+                                       if(errno==EINTR)
                                                continue;
+                                       else if (errno == EAGAIN) {
+                                               struct pollfd   pfd;
+
+                                               pfd.fd = fd[0];
+                                               pfd.events = POLLIN;
+                                               poll(&pfd, 1, -1);
+                                               continue;
+                                               }
                                        else
                                                break;
                                        }


and the diff for our port

Index: Makefile
===================================================================
RCS file: /cvs/ports/net/nagios/nagios/Makefile,v
retrieving revision 1.33
diff -u -p -u -r1.33 Makefile
--- Makefile    24 Dec 2008 20:03:42 -0000      1.33
+++ Makefile    9 Jan 2009 15:10:24 -0000
@@ -5,7 +5,7 @@ COMMENT-web=    cgis and webpages for nagio
 
 V=             3.0.6
 DISTNAME=      nagios-${V}
-PKGNAME-main=  nagios-${V}
+PKGNAME-main=  nagios-${V}p0
 PKGNAME-web=   nagios-web-${V}
 CATEGORIES=    net
 
Index: patches/patch-base_utils_c
===================================================================
RCS file: patches/patch-base_utils_c
diff -N patches/patch-base_utils_c
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ patches/patch-base_utils_c  9 Jan 2009 15:10:24 -0000
@@ -0,0 +1,21 @@
+$OpenBSD$
+--- base/utils.c.orig  Fri Jan  9 14:20:10 2009
++++ base/utils.c       Fri Jan  9 14:28:24 2009
+@@ -610,8 +610,16 @@ int my_system(char *cmd,int timeout,int *early_timeout
+                               /* handle errors */
+                               if(bytes_read==-1){
+                                       /* we encountered a recoverable error, so try again */
+-                                      if(errno==EINTR || errno==EAGAIN)
++                                      if(errno==EINTR)
+                                               continue;
++                                      else if (errno == EAGAIN) {
++                                              struct pollfd   pfd;
++
++                                              pfd.fd = fd[0];
++                                              pfd.events = POLLIN;
++                                              poll(&pfd, 1, -1);
++                                              continue;
++                                              }
+                                       else
+                                               break;
+                                       }
 
