On Mon, Jan 13, 2014 at 9:22 AM, dormando <[email protected]> wrote:
> There's no forwarder from the central server, if I understand what you
> were saying. Also, given the strace of the select (which is selecting for
> nothing and waiting forever?), is it checking for something in between the
> select runs?
>

OK, the configs I think look good. What exactly it is waiting for is hard to
tell without context. Can you gather a debug log while it is in this state?
Details are here:

http://www.rsyslog.com/doc/troubleshoot.html

If a full debug log is too much to do, you may want to use debug on demand:

http://www.rsyslog.com/how-to-use-debug-on-demand/

If that's also impossible, at least many more strace lines PLUS information
on what the thread name is (obtained via top) may also help us get closer.

Rainer

> Edge servers -> central server -> disk on central server.
>
> The complete server:
>
> $ModLoad imuxsock
> $IMUXSockRateLimitInterval 0
> $ModLoad imklog
> $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
>
> $FileOwner root
> $FileGroup adm
> $FileCreateMode 0640
>
> IncludeConfig /etc/rsyslog.d/*.conf
>
> (then in an included config:)
>
> $DefaultNetstreamDriver gtls
>
> $MainMsgQueueType FixedArray
> $MainMsgQueueSize 1000000
> $MainMsgQueueWorkerThreads 4
> $MainMsgQueueDequeueBatchSize 2000
>
> $DefaultNetstreamDriverCAFile ca.pem
> $DefaultNetstreamDriverCertFile cert.pem
> $DefaultNetstreamDriverKeyFile key.pem
>
> $ModLoad imtcp
> $InputTCPMaxSessions 1000
> $InputTCPServerStreamDriverMode 1
> $InputTCPServerStreamDriverAuthMode anon
> $InputTCPServerRun 514
>
> $template SomeFile,"/data/log/some/some-%$NOW%.log"
> :programname, startswith, "some" -?SomeFile
> & stop
>
> # Then two more of the above with different names
>
> $template AuthFile,"/data/log/auth/auth-%$NOW%.log"
> $template CronFile,"/data/log/cron/cron-%$NOW%.log"
> $template DaemonFile,"/data/log/daemon/daemon-%$NOW%.log"
> $template KernFile,"/data/log/kern/kern-%$NOW%.log"
> $template MailFile,"/data/log/mail/mail-%$NOW%.log"
> $template UserFile,"/data/log/user/user-%$NOW%.log"
> $template SyslogFile,"/data/log/syslog/syslog-%$NOW%.log"
> auth,authpriv.* -?AuthFile
> cron.* -?CronFile
> daemon.* -?DaemonFile
> kern.* -?KernFile
> mail.* -?MailFile
> user.* -?UserFile
> *.*;auth,authpriv,daemon,kern,mail,user,cron.none -?SyslogFile
>
>
> Then for the client:
>
> $ModLoad imuxsock
> $IMUXSockRateLimitInterval 0
> $ModLoad imklog
> $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
>
> $FileOwner root
> $FileGroup adm
> $FileCreateMode 0640
>
> IncludeConfig /etc/rsyslog.d/*.conf
>
> (then in an included config:)
>
> $DefaultNetstreamDriverCAFile ca.pem
>
> $SystemLogRateLimitInterval 2
> $SystemLogRateLimitBurst 10000
>
> $ActionQueueType LinkedList
> $ActionQueueSize 40000
> $ActionQueueDiscardMark 38000
> $ActionQueueTimeoutEnqueue 0
> $ActionResumeRetryCount -1
>
> # set up the action
> $DefaultNetstreamDriver gtls
> $ActionSendStreamDriverMode 1
> $ActionSendStreamDriverAuthMode anon
> *.* @@(o)logs.somesite.net:514
>
> auth,authpriv.* /var/log/auth.log
> *.*;auth,authpriv.none -/var/log/syslog
> #cron.* /var/log/cron.log
> daemon.* -/var/log/daemon.log
> kern.* -/var/log/kern.log
> lpr.* -/var/log/lpr.log
> mail.* -/var/log/mail.log
> user.* -/var/log/user.log
> mail.info -/var/log/mail.info
> mail.warn -/var/log/mail.warn
> mail.err /var/log/mail.err
> *.=debug;\
> auth,authpriv.none;\
> news.none;mail.none -/var/log/debug
> *.=info;*.=notice;*.=warn;\
> auth,authpriv.none;\
> cron,daemon.none;\
> mail,news.none -/var/log/messages
>
>
> There's nothing the server should ever be blocking on so far as I can
> tell? It doesn't seem to self recover when it gets into this state, and no
> logs are being written.
>
> On Mon, 13 Jan 2014, Rainer Gerhards wrote:
>
> > You should post the complete config, so that we do not need to guess. My
> > *guess* is that you run a TCP forwarder on the main queue, the remote
> > machine blocks and the forwarder backs up.
> >
> > Rainer
> >
> >
> > On Mon, Jan 13, 2014 at 9:00 AM, dormando <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I've been experiencing a periodic hang in rsyslog 7.4.7 on a central
> > > logging node that we have.
> > >
> > > Everything runs fine for anywhere between 6-48 hours, then one thread will
> > > hang at 100% CPU, with strace showing:
> > > select(1, NULL, NULL, NULL, {82024, 797024}^C <unfinished ...>
> > >
> > > The selects are always similar. "1, NULL, NULL, NULL" and an increasing
> > > timeout.
> > >
> > > The configuration is minimal.. using the old syntax since we just recently
> > > upgraded and I'm lazy.
> > >
> > > $DefaultNetstreamDriver gtls
> > >
> > > $MainMsgQueueType FixedArray
> > > $MainMsgQueueSize 1000000
> > > $MainMsgQueueWorkerThreads 4
> > > $MainMsgQueueDequeueBatchSize 2000
> > >
> > > (then the DriverCAFile/etc defines)
> > >
> > > $ModLoad imtcp
> > > $InputTCPMaxSessions 1000
> > > $InputTCPServerStreamDriverMode 1
> > > $InputTCPServerStreamDriverAuthMode anon
> > > $InputTCPServerRun 514
> > >
> > > ... the rest are just log routing rules and the module imports required to
> > > make the above lines work.
> > >
> > > Host is running ubuntu 12.04.
> > >
> > > Clients are very similar. Works over TCP, not using RELP. Don't think any
> > > of the clients are hanging but I haven't checked too closely.
> > >
> > > Unfortunately the hang isn't frequent enough to do any brute force
> > > debugging. Any ideas what it could be, or what information I could collect
> > > to track it down faster? If I don't figure this out soon I'll have to try
> > > something else, and I don't want to do that :(
> > >
> > > Thanks!
> > > -Dormando
> > > _______________________________________________
> > > rsyslog mailing list
> > > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > > http://www.rsyslog.com/professional-services/
> > > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> > > DON'T LIKE THAT.
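
For the thread-name and strace information requested in the reply above, here is a minimal sketch of one way to collect it. It assumes Linux with /proc mounted and strace installed. The helper name thread_names and the output path /tmp/rsyslogd.strace are illustrative choices, not something from this thread.

```shell
#!/bin/sh
# Sketch only: list the threads of a process by name, so that a busy
# thread ID seen in "top -H" or in strace output can be matched to a name.
# Assumes Linux with /proc mounted; thread_names is a hypothetical helper.

# Print "TID NAME" for every thread of the process whose PID is $1.
thread_names() {
    for task in /proc/"$1"/task/*; do
        # Skip entries that do not exist or that vanished while iterating.
        [ -r "$task/comm" ] || continue
        printf '%s %s\n' "${task##*/}" "$(cat "$task/comm")"
    done
}

# Example usage against a running rsyslogd. Left commented out so the
# block has no side effects; the output path is an arbitrary choice.
#   pid=$(pidof rsyslogd)
#   thread_names "$pid"                              # TID-to-name mapping
#   top -H -b -n 1 -p "$pid"                         # per-thread CPU usage
#   strace -f -tt -o /tmp/rsyslogd.strace -p "$pid"  # trace all threads
```

The TID of the thread sitting at 100% CPU in the top output can then be looked up in the thread_names listing and in the strace file, which tags each line with the thread ID when run with -f.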

