Hi Dennis,

First, I would upgrade to a recent release, either v2-stable or
v3-stable. The version you are using is heavily outdated, and
investigating the issue on that version simply makes no sense. Also, I
think chances are good that the problem will disappear with a newer version.

Rainer 

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Dennis Ordanov
> Sent: Tuesday, February 17, 2009 6:30 AM
> To: [email protected]
> Subject: [rsyslog] rsyslog eating FDs stops logging locally
> or remotely and eventually dies
> 
> Hello Everyone,
> 
> I use syslog to log locally and remotely via stunnel, which is bound to
> the loopback address. It seems syslog steadily uses up FDs until it
> runs into the per-process limit on my OpenBSD boxes, and then it stops
> either logging locally or forwarding traffic to stunnel. I don't know
> whether this is a problem with stunnel or with syslog, or how to tell,
> but something is causing it to open new file descriptors or preventing
> it from reusing existing ones.
> 
> 
> I can just restart syslogd with a cron job weekly and increase the
> file descriptor limit, but that's not really a path I want to go down
> if I don't have to.
> 
> I can run syslogd in debug mode if you think it will be useful, but it
> can take a week for this problem to occur...
> 
> I have four hypotheses for why this might be happening:
> 1) syslog's interaction with stunnel is causing it to use more and
> more FDs
> 2) regarding #1, if stunnel has a problem accepting connections, is
> overloaded, or cannot connect to the remote stunnel gateway, then
> maybe it's not accepting new connections?
> 3) something in my configuration is instigating this behaviour...?
> 4) none of the above
> 
> Here is a box that will soon be in a broken state:
> 
> r...@hostname# /usr/sbin/syslogd -v
> rsyslogd 1.12.2, compiled with:
>         FEATURE_REGEXP
>         FEATURE_LARGEFILE
>         SYSLOG_INET (Internet/remote support)
> r...@hostname# uname -a
> OpenBSD hostname 4.1 GENERIC#1435 i386
> 
> # ulimit -a
> time(cpu-seconds)    unlimited
> file(blocks)         unlimited
> coredump(blocks)     unlimited
> data(kbytes)         1048576
> stack(kbytes)        8192
> lockedmem(kbytes)    153844
> memory(kbytes)       460268
> nofiles(descriptors) 128
> processes            532
> 
> I am running ktrace on this pid until I see the process use another
> file descriptor; right now it looks like it is at 109.
> Come to think of it, maybe I should be tracing stunnel too?
> 
> r...@host# fstat -n |grep syslog
> USER     CMD          PID   FD MOUNT        INUM MODE       R/W    DV|SZ
> root     syslogd    20085   wd  0,0         2       40755  r      512
> root     syslogd    20085    0* unix dgram 0xd14f6a00
> root     syslogd    20085    1* internet stream tcp
> root     syslogd    20085    2* internet stream tcp
> root     syslogd    20085    3* internet stream tcp
> root     syslogd    20085    4* internet stream tcp
> root     syslogd    20085    5* internet stream tcp
> root     syslogd    20085    6* internet stream tcp
> root     syslogd    20085    7* internet stream tcp
> root     syslogd    20085    8* internet stream tcp
> root     syslogd    20085    9* internet stream tcp
> root     syslogd    20085   10* internet stream tcp
> root     syslogd    20085   11* internet stream tcp
> root     syslogd    20085   12* internet stream tcp
> root     syslogd    20085   13* internet stream tcp
> root     syslogd    20085   14* internet stream tcp
> root     syslogd    20085   15* internet stream tcp
> root     syslogd    20085   16* unix dgram 0xd14048c0
> root     syslogd    20085   17* internet dgram udp *:514
> root     syslogd    20085   18* internet stream tcp
> root     syslogd    20085   19* internet stream tcp
> root     syslogd    20085   20* internet stream tcp
> root     syslogd    20085   21* internet stream tcp
> root     syslogd    20085   22* internet stream tcp
> root     syslogd    20085   23* internet stream tcp
> root     syslogd    20085   24* internet stream tcp
> root     syslogd    20085   25* internet stream tcp
> root     syslogd    20085   26* internet stream tcp
> root     syslogd    20085   27* internet stream tcp
> root     syslogd    20085   28* internet stream tcp
> root     syslogd    20085   29* internet stream tcp
> root     syslogd    20085   30* internet stream tcp
> root     syslogd    20085   31* internet stream tcp
> root     syslogd    20085   32* internet stream tcp
> root     syslogd    20085   33* internet stream tcp
> root     syslogd    20085   34* internet stream tcp
> root     syslogd    20085   35* internet stream tcp
> root     syslogd    20085   36* internet stream tcp
> root     syslogd    20085   37* internet stream tcp
> root     syslogd    20085   38* internet stream tcp
> root     syslogd    20085   39* internet stream tcp
> root     syslogd    20085   40* internet stream tcp
> root     syslogd    20085   41* internet stream tcp
> root     syslogd    20085   42* internet stream tcp
> root     syslogd    20085   43* internet stream tcp
> root     syslogd    20085   44* internet stream tcp
> root     syslogd    20085   45* internet stream tcp
> root     syslogd    20085   46* internet stream tcp
> root     syslogd    20085   47* internet stream tcp
> root     syslogd    20085   48* internet stream tcp
> root     syslogd    20085   49* internet stream tcp
> root     syslogd    20085   50* internet stream tcp
> root     syslogd    20085   51* internet stream tcp
> root     syslogd    20085   52* internet stream tcp
> root     syslogd    20085   53* internet stream tcp
> root     syslogd    20085   54* internet stream tcp
> root     syslogd    20085   55* internet stream tcp
> root     syslogd    20085   56* internet stream tcp
> root     syslogd    20085   57* internet stream tcp
> root     syslogd    20085   58* internet stream tcp
> root     syslogd    20085   59* internet stream tcp
> root     syslogd    20085   60* internet stream tcp 0xd6906648 127.0.0.1:4392 --> 127.0.0.1:5140
> root     syslogd    20085   61* internet stream tcp
> root     syslogd    20085   62* internet stream tcp
> root     syslogd    20085   63  0,4    844952      100644  w    81695
> root     syslogd    20085   64* internet stream tcp
> root     syslogd    20085   65* internet stream tcp
> root     syslogd    20085   66  0,4    844952      100644  w    81695
> root     syslogd    20085   67* internet stream tcp
> root     syslogd    20085   68* internet stream tcp
> root     syslogd    20085   69  0,4    844984      100644  w       73
> root     syslogd    20085   70* internet stream tcp
> root     syslogd    20085   71* internet stream tcp
> root     syslogd    20085   72  0,4    844984      100644  w       73
> root     syslogd    20085   73* internet stream tcp
> root     syslogd    20085   74* internet stream tcp
> root     syslogd    20085   75* internet stream tcp
> root     syslogd    20085   76  0,4    844984      100644  w       73
> root     syslogd    20085   77* internet stream tcp
> root     syslogd    20085   78* internet stream tcp
> root     syslogd    20085   79* internet stream tcp
> root     syslogd    20085   80  0,4    844969      100644  w  3437673
> root     syslogd    20085   81* internet stream tcp
> root     syslogd    20085   82* internet stream tcp
> root     syslogd    20085   83  0,4    844976      100644  w      442
> root     syslogd    20085   84* internet stream tcp
> root     syslogd    20085   85* internet stream tcp
> root     syslogd    20085   86  0,4    844976      100644  w      442
> root     syslogd    20085   87* internet stream tcp
> root     syslogd    20085   88* internet stream tcp
> root     syslogd    20085   89  0,4    844930      100640  w    18747
> root     syslogd    20085   90* internet stream tcp
> root     syslogd    20085   91* internet stream tcp
> root     syslogd    20085   92  0,4    844936      100600  w       74
> root     syslogd    20085   93* internet stream tcp
> root     syslogd    20085   94* internet stream tcp
> root     syslogd    20085   95  0,4   2328711      100600  w    46522
> root     syslogd    20085   96* internet stream tcp
> root     syslogd    20085   97* internet stream tcp
> root     syslogd    20085   98  0,4    844972      100640  w      476
> root     syslogd    20085   99* internet stream tcp
> root     syslogd    20085  100* internet stream tcp
> root     syslogd    20085  101  0,4    844941      100640  w        0
> root     syslogd    20085  102* internet stream tcp
> root     syslogd    20085  103* internet stream tcp
> root     syslogd    20085  104  0,4    844935      100640  w        0
> root     syslogd    20085  105* internet stream tcp
> root     syslogd    20085  106* internet stream tcp
> root     syslogd    20085  107  0,4    844931      100600  w       74
> root     syslogd    20085  108* internet stream tcp 0xd694c7d4 127.0.0.1:19723 --> 127.0.0.1:5140
> root     syslogd    20085  109* internet stream tcp 0xd694c964 127.0.0.1:42849 --> 127.0.0.1:5140
> 
> 
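Because the problem can take a week to show up, it may help to summarise the fstat listing above by descriptor type at regular intervals and watch which category grows. The following is only a minimal Python sketch under stated assumptions: the helper names `classify` and `tally` are made up for illustration, the sample rows are copied from the listing above with wrapped lines rejoined, and the split of TCP sockets into "address shown" and "no address shown" simply mirrors what the listing prints rather than any confirmed meaning.

```python
from collections import Counter

# Sample rows copied from the fstat listing above (wrapped lines rejoined).
SAMPLE = """\
USER     CMD          PID   FD MOUNT        INUM MODE       R/W    DV|SZ
root     syslogd    20085   wd  0,0         2       40755  r      512
root     syslogd    20085    0* unix dgram 0xd14f6a00
root     syslogd    20085    1* internet stream tcp
root     syslogd    20085   17* internet dgram udp *:514
root     syslogd    20085   60* internet stream tcp 0xd6906648 127.0.0.1:4392 --> 127.0.0.1:5140
root     syslogd    20085   63  0,4    844952      100644  w    81695
root     syslogd    20085   66  0,4    844952      100644  w    81695
"""


def classify(line):
    """Return a coarse category for one fstat row, or None for other lines."""
    fields = line.lstrip("> ").split()  # tolerate mail quoting prefixes
    if len(fields) < 5 or not fields[2].isdigit():
        return None  # header row or a wrapped continuation line
    if not fields[3].rstrip("*").isdigit():
        return None  # 'wd', 'root' and similar are not numbered descriptors
    rest = fields[4:]
    if rest[:3] == ["internet", "stream", "tcp"]:
        # Hypothetical labels: they only record whether fstat printed an address.
        return "tcp (address shown)" if len(rest) > 3 else "tcp (no address shown)"
    if rest[:2] == ["internet", "dgram"]:
        return "udp"
    if rest[0] == "unix":
        return "unix"
    return "file"


def tally(text):
    """Count numbered descriptors per category in fstat output."""
    counts = Counter()
    for line in text.splitlines():
        kind = classify(line)
        if kind is not None:
            counts[kind] += 1
    return counts


if __name__ == "__main__":
    for kind, n in sorted(tally(SAMPLE).items()):
        print(f"{n:4d}  {kind}")
```

Feeding it the output of the same `fstat -n | grep syslog` command once a day, for example from cron, would show whether the growth is in TCP sockets, in log files, or in both.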
> r...@host# netstat -an
> Active Internet connections (including servers)
> Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
> tcp        0     32  172.20.20.51.22        remote.63117           ESTABLISHED
> tcp        0      0  172.20.20.51.38090     remote.443             ESTABLISHED
> tcp        0      0  127.0.0.1.5140         127.0.0.1.42849        ESTABLISHED
> tcp        0      0  127.0.0.1.42849        127.0.0.1.5140         ESTABLISHED
> tcp        0      0  172.20.20.51.19898     remote.443             ESTABLISHED
> tcp        0      0  127.0.0.1.5140         127.0.0.1.19723        ESTABLISHED
> tcp        0      0  127.0.0.1.19723        127.0.0.1.5140         ESTABLISHED
> tcp        0      0  172.20.20.51.5494      remote.443             ESTABLISHED
> tcp        0      0  127.0.0.1.5140         127.0.0.1.4392         ESTABLISHED
> tcp        0      0  127.0.0.1.4392         127.0.0.1.5140         ESTABLISHED
> tcp        0      0  *.22                   *.*                    LISTEN
> tcp        0      0  127.0.0.1.5140         *.*                    LISTEN
> tcp        0      0  127.0.0.1.587          *.*                    LISTEN
> tcp        0      0  127.0.0.1.25           *.*                    LISTEN
> tcp        0      0  *.37                   *.*                    LISTEN
> tcp        0      0  *.13                   *.*                    LISTEN
> tcp        0      0  *.113                  *.*                    LISTEN
> Active Internet connections (including servers)
> Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
> udp        0      0  172.20.20.51.2947      172.20.20.92.123
> udp        0      0  172.20.20.51.12927     172.20.20.91.123
> udp        0      0  *.514                  *.*
> udp        0      0  10.144.73.23.123       *.*
> udp        0      0  10.144.73.21.123       *.*
> udp        0      0  172.20.20.51.123       *.*
> udp        0      0  127.0.0.1.123          *.*
> udp        0      0  127.0.0.1.512          *.*
> 
> 
> r...@host# fstat|grep stunnel
> USER     CMD          PID   FD MOUNT        INUM MODE       R/W    DV|SZ
> _stunnel stunnel    32055 root /var      6141184 drwxr-xr-x   r      512
> _stunnel stunnel    32055   wd /var      6141184 drwxr-xr-x   r      512
> _stunnel stunnel    32055    0 /          166117 crw-rw-rw-  rw     null
> _stunnel stunnel    32055    1 /          166117 crw-rw-rw-  rw     null
> _stunnel stunnel    32055    2 /          166117 crw-rw-rw-  rw     null
> _stunnel stunnel    32055    3 pipe 0xe9505e10 state:
> _stunnel stunnel    32055    4 pipe 0xe9505e10 state:
> _stunnel stunnel    32055    5 /          165853 crw-rw-rw-  rw   crypto
> _stunnel stunnel    32055    6* internet stream tcp 0xd6906e18 127.0.0.1:5140 <-- 127.0.0.1:4392
> _stunnel stunnel    32055    7 pipe 0xe95057e0 state:
> _stunnel stunnel    32055    8 pipe 0xe95057e0 state:
> _stunnel stunnel    32055    9* internet stream tcp 0xd694cc84 127.0.0.1:5140
> _stunnel stunnel    32055   10* internet stream tcp 0xd694c644 172.20.20.51:5494 --> remote:443
> _stunnel stunnel    32055   11* internet stream tcp 0xd694c4b4 127.0.0.1:5140 <-- 127.0.0.1:19723
> _stunnel stunnel    32055   12* internet stream tcp 0xd694ce14 172.20.20.51:19898 --> remote:443
> _stunnel stunnel    32055   13* internet stream tcp 0xd694caf4 127.0.0.1:5140 <-- 127.0.0.1:42849
> _stunnel stunnel    32055   14* internet stream tcp 0xd68cb19c 172.20.20.51:38090 --> remote:443
> 
> Here is how it's getting started out of rc:
> 
> syslogd_flags="-h -i /var/run/syslog.pid -m 0 -r 514"  # flags for rsyslogd
> 
> Process Entries:
> # ps -axwww|egrep '[s]yslog|[s]tunnel'
> 32055 ??  Is      2:22.48 /usr/local/sbin/stunnel
> 20085 ??  Is      4:50.88 syslogd -h -i /var/run/syslog.pid -m 0 -r 514 -a /var/empty/dev/log
> 
> Here is the config:
> 
> /etc/rsyslog.conf
> # Template to include time received by the Admin Server when forwarded to the Data Center.
> # Juniper Messages are not passed with a timestamp.
> 
> $template MissingDate,"<%PRI%>%timegenerated% %HOSTNAME% %syslogtag%%msg%"
> 
> # Template to remove the syslog tag "root:" for the heartbeat and checks when forwarded to the Data Center.
> 
> $template NoSyslogTag,"<%PRI%>%timegenerated% %HOSTNAME% %msg%"
> 
> # Template to allow for easier reading of the Cisco logs.
> # Include a text designation for the type of Cisco equipment.
> # Start the message at position at offset 19 to strip out time stamp.
> 
> $template CiscoSW1,"%TIMESTAMP% %HOSTNAME% Switch1: %msg:19:500:drop-last-lf%\n"
> $template CiscoSW2,"%TIMESTAMP% %HOSTNAME% Switch2: %msg:19:500:drop-last-lf%\n"
> $template CiscoTS1,"%TIMESTAMP% %HOSTNAME% Term1: %msg:19:500:drop-last-lf%\n"
> 
> # Forward messages from the admin server heartbeat and checks based on message id
> :msg, contains, "NHB10001:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "CHK10002:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "CHK10003:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "CHK10004:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "CHK10005:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "ATH10006:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "ATH10007:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "ATH10008:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "ATH10009:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "ATH10011:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "ATH10012:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "ATH10015:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "ATH10016:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "ATH10017:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "CLP10020:"    @@127.0.0.1:5140;NoSyslogTag
> :msg, contains, "CLP10021:"    @@127.0.0.1:5140;NoSyslogTag
> 
> # Forward messages from juniper nodes based on message id
> :msg, contains, "ADM10310:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ADM20255:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ADM20928:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ADM22798:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ADM23046:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ADM24336:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ADM24337:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ARC22051:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ARC23037:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ARC23038:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ARC23039:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT10301:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT21060:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT21089:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT22677:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT22678:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT22696:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT23391:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT23551:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT23552:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT24080:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT24417:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "AUT24418:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20146:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20147:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20148:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20149:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20150:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20151:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20152:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20153:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20154:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20155:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20643:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20644:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR20645:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR24016:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR24019:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "ERR24076:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "LIC10200:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10062:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10087:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10088:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10089:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10090:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10091:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10092:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10093:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10094:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10298:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10299:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS10314:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS23041:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS23402:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS23409:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS24015:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS24020:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS24316:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS24317:"    @@127.0.0.1:5140;MissingDate
> :msg, contains, "SYS24348:"    @@127.0.0.1:5140;MissingDate
> 
> # Forward messages from F5 nodes based on lb hostnames
> :HOSTNAME, contains, "lb1-"                             @@127.0.0.1:5140
> :HOSTNAME, contains, "lb2-"                             @@127.0.0.1:5140
> 
> # Log F5 messages locally for archival purposes based on lb hostnames
> :HOSTNAME, contains, "lb1-"                             /var/log/f5.log
> :HOSTNAME, contains, "lb2-"                             /var/log/f5.log
> 
> # Log Cisco messages locally for archival purposes based on ip hostnames
> :HOSTNAME, contains, "172.20.20.101"                    /var/log/cisco.log
> :HOSTNAME, contains, "172.20.20.102"                    /var/log/cisco.log
> :HOSTNAME, contains, "172.20.20.227"                    /var/log/cisco.log
> 
> # Discard lb1/2 and cisco messages from further processing
> :HOSTNAME, contains, "lb1-"                             ~
> :HOSTNAME, contains, "lb2-"                             ~
> :HOSTNAME, contains, "172.20.20.101"                    ~
> :HOSTNAME, contains, "172.20.20.102"                    ~
> :HOSTNAME, contains, "172.20.20.227"                    ~
> 
> # Log local7 messages locally for archival purposes
> local7.*                                                /var/log/local7.log
> 
> *.notice;\
> auth,authpriv,cron,ftp,kern,lpr,mail,user,local7.none   /var/log/messages
> kern.debug;syslog,user.info                             /var/log/messages
> auth.info                                               /var/log/authlog
> authpriv.debug                                          /var/log/secure
> cron.info                                               /var/cron/log
> daemon.info                                             /var/log/daemon
> ftp.info                                                /var/log/xferlog
> lpr.debug                                               /var/log/lpd-errs
> mail.info                                               /var/log/maillog
> #uucp.info                                              /var/log/uucp
> 
> # Everyone gets emergency messages.
> *.emerg
> 
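One number that may be useful when reading the fstat listing is how many separate TCP forwarding actions the configuration above defines. The Python sketch below counts them per target. It rests on an assumption that is not confirmed anywhere in this thread, namely that each `@@host:port` action may hold its own TCP connection; the function name `count_tcp_forwards` and the sample rules (a small subset of the configuration above, written one rule per line) are purely illustrative.

```python
import re
from collections import Counter

# A few rules taken from the configuration above, one rule per line.
SAMPLE_CONF = """\
# Forward messages (a comment mentioning @@127.0.0.1:5140 is ignored)
$template NoSyslogTag,"<%PRI%>%timegenerated% %HOSTNAME% %msg%"
:msg, contains, "NHB10001:"    @@127.0.0.1:5140;NoSyslogTag
:msg, contains, "ADM10310:"    @@127.0.0.1:5140;MissingDate
:HOSTNAME, contains, "lb1-"    @@127.0.0.1:5140
:HOSTNAME, contains, "lb1-"    /var/log/f5.log
local7.*                       /var/log/local7.log
"""

# "@@host:port" with an optional ";template" suffix at the end of a rule line.
FORWARD_RE = re.compile(r"@@([^\s;:]+):(\d+)(?:;\S+)?\s*$")


def count_tcp_forwards(conf_text):
    """Count TCP forwarding actions per (host, port) target."""
    counts = Counter()
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "$")):
            continue  # skip blanks, comments and $-directives
        match = FORWARD_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    for (host, port), n in count_tcp_forwards(SAMPLE_CONF).items():
        print(f"{n} TCP forwarding action(s) to {host}:{port}")
```

Comparing that count with the number of TCP sockets fstat reports for syslogd would at least show whether descriptor use stays within the number of configured actions or keeps growing beyond it.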
> I've tried searching the net for anything to do with syslog and file
> descriptors and how these problems happen, and have come up with
> pretty much squat.
> 
> Thank you,
> Dennis O.
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com
> 