More info:
- The syslog server is running from a storage server.
- At May 20 23:30:52, one of our biggest dom0s went down, killing about 20-25 virtual servers.
- Most of our servers run openSUSE.

2012/5/22 Juan Jose Pavlik <[email protected]>

> Hi, I'm running this rsyslog version:
>
> bigbrother:/var/log/servidores/filomena # rsyslogd -v
> rsyslogd 5.8.5, compiled with:
>     FEATURE_REGEXP: Yes
>     FEATURE_LARGEFILE: No
>     GSSAPI Kerberos 5 support: Yes
>     FEATURE_DEBUG (debug build, slow code): No
>     32bit Atomic operations supported: Yes
>     64bit Atomic operations supported: No
>     Runtime Instrumentation (slow code): No
>
> on my centralized logging server, and I'm having some trouble I'd really
> like to figure out. I have around 170 servers, switches, and other devices
> logging to this server. Most of them just send auth.* logs, some Apache
> servers send their access and error logs, and the switches send warnings
> and errors. Sometimes the rsyslog queues fill up completely and it stops
> writing logs to disk. These are the exact logs of what happened:
>
> May 20 23:30:52 bigbrother rsyslogd-pstats: main Q: size=1 enqueued=6018511 full=0 maxqsize=1607
> May 20 23:31:12 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:31:12 bigbrother rsyslogd-pstats: main Q: size=83 enqueued=6018951 full=0 maxqsize=1607
> May 20 23:31:32 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:31:32 bigbrother rsyslogd-pstats: main Q: size=140 enqueued=6019008 full=0 maxqsize=1607
> May 20 23:31:52 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:31:52 bigbrother rsyslogd-pstats: main Q: size=146 enqueued=6019046 full=0 maxqsize=1607
> May 20 23:32:12 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:32:12 bigbrother rsyslogd-pstats: main Q: size=169 enqueued=6019101 full=0 maxqsize=1607
> May 20 23:32:32 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:32:32 bigbrother rsyslogd-pstats: main Q: size=333 enqueued=6019265 full=0 maxqsize=1607
> May 20 23:32:52 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:32:52 bigbrother rsyslogd-pstats: main Q: size=431 enqueued=6019395 full=0 maxqsize=1607
> May 20 23:33:12 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:33:12 bigbrother rsyslogd-pstats: main Q: size=469 enqueued=6019433 full=0 maxqsize=1607
> May 20 23:33:32 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:33:32 bigbrother rsyslogd-pstats: main Q: size=574 enqueued=6019538 full=0 maxqsize=1607
> May 20 23:33:52 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:33:52 bigbrother rsyslogd-pstats: main Q: size=603 enqueued=6019567 full=0 maxqsize=1607
> May 20 23:34:12 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:34:12 bigbrother rsyslogd-pstats: main Q: size=654 enqueued=6019650 full=0 maxqsize=1607
> May 20 23:34:32 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:34:32 bigbrother rsyslogd-pstats: main Q: size=687 enqueued=6019683 full=0 maxqsize=1607
> May 20 23:34:52 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 20 23:34:52 bigbrother rsyslogd-pstats: main Q: size=721 enqueued=6019717 full=0 maxqsize=1607
> May 20 23:48:13 bigbrother rsyslogd-pstats: main Q: size=5150 enqueued=6024786 full=0 maxqsize=5150
> May 20 23:48:33 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 ratelimit.discarded=0 ratelimit.numratelimiters=1505
> May 21 00:43:21 bigbrother rsyslogd-pstats: imuxsock: submitted=3898 ratelimit.discarded=0 ratelimit.numratelimiters=1506
> May 21 00:43:21 bigbrother rsyslogd-pstats: main Q: size=9986 enqueued=6033582 full=1596 maxqsize=10000
> May 21 00:51:20 bigbrother rsyslogd-pstats: main Q: size=10000 enqueued=6034304 full=2086 maxqsize=10000
>
> Once *size* reaches 10000 (the default maximum, as far as I know) things
> get complicated: rsyslog starts to drop logs and misbehave. The rsyslog
> configuration writes per-host files into /var/log/servidores/; it also
> sends some logs to another rsyslog server and to a PostgreSQL database
> running on another server. Two weeks ago I disabled sending logs to the
> PostgreSQL database, because I had this same problem and we lost too many
> hours of logs. Most of the servers send logs over TCP, and a few servers
> and other devices use UDP.
>
> Is there a way I can avoid this problem? Should I increase the main queue
> size? Use other queues? Any help would be great. Thanks.
>
> --
> Pavlik Salles Juan José
> Prosecretaría de Informática - UNC
> Área Redes y Servidores

--
Pavlik Salles Juan José
Prosecretaría de Informática - UNC
Área Redes y Servidores
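
In case it helps anyone reading the numbers above, here is a rough Python sketch (not part of rsyslog) that pulls the "main Q" counters out of the pstats lines and estimates how fast the queue was growing. The function names, the regular expression, and the assumed year 2012 are all made up for illustration; the two sample lines are copied from the log above, and the dequeue estimate assumes no messages were discarded in that interval (full=0).

```python
import re
from datetime import datetime

# Hypothetical pattern for the "main Q" pstats lines quoted in this thread.
# Other pstats lines (e.g. imuxsock) simply do not match and are skipped.
MAIN_Q_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ rsyslogd-pstats: main Q: "
    r"size=(?P<size>\d+) enqueued=(?P<enq>\d+) full=(?P<full>\d+) "
    r"maxqsize=(?P<maxq>\d+)"
)

# Two main Q lines and one imuxsock line taken from the log above.
SAMPLE_LINES = [
    "May 20 23:30:52 bigbrother rsyslogd-pstats: main Q: size=1 "
    "enqueued=6018511 full=0 maxqsize=1607",
    "May 20 23:31:12 bigbrother rsyslogd-pstats: imuxsock: submitted=3897 "
    "ratelimit.discarded=0 ratelimit.numratelimiters=1505",
    "May 20 23:34:52 bigbrother rsyslogd-pstats: main Q: size=721 "
    "enqueued=6019717 full=0 maxqsize=1607",
]


def parse_main_queue(lines, year=2012):
    """Return (timestamp, size, enqueued, full) for every main Q line.

    The log has no year, so one is assumed (2012, the year of the thread).
    """
    samples = []
    for line in lines:
        m = MAIN_Q_RE.match(line.strip())
        if m is None:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        samples.append((ts, int(m["size"]), int(m["enq"]), int(m["full"])))
    return samples


def summarize(samples):
    """Compare the first and last sample of the interval."""
    t0, size0, enq0, _ = samples[0]
    t1, size1, enq1, _ = samples[-1]
    minutes = (t1 - t0).total_seconds() / 60
    enqueued = enq1 - enq0
    growth = size1 - size0
    return {
        "minutes": minutes,
        "enqueued": enqueued,
        "queue_growth": growth,
        # Whatever was enqueued but is not still sitting in the queue
        # must have been dequeued (assuming nothing was discarded).
        "dequeued": enqueued - growth,
        "growth_per_minute": growth / minutes,
    }


if __name__ == "__main__":
    print(summarize(parse_main_queue(SAMPLE_LINES)))
```

Between 23:30:52 and 23:34:52 this gives 1206 messages enqueued and a net queue growth of 720, i.e. about 180 messages per minute piling up, so the outputs were draining far slower than messages were arriving in that window.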

