> On Mon, Jan 13, 2014 at 9:22 AM, dormando <[email protected]> wrote:
>
> > There's no forwarder from the central server, if I understand what you
> > were saying. Also, given the strace of the select (which is selecting for
> > nothing and waiting forever?), is it checking for something in between the
> > select runs?
> >
> >
> OK, the configs I think look good. What exactly it is waiting for is hard
> to tell without context. Can you gather a debug log while it is in this
> state? Details are here:
>
> http://www.rsyslog.com/doc/troubleshoot.html
>
> If a full debug log is too much to do, you may want to use debug on demand:
>
> http://www.rsyslog.com/how-to-use-debug-on-demand/
>
> If that's also impossible, at least many more strace lines PLUS information
> on what the thread name is (obtained via top) may also help us get closer.
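For the thread-name part, a minimal Python sketch, assuming a Linux /proc
filesystem. The helper name `thread_names` and its `proc_root` parameter are
illustrative only, not part of rsyslog. It lists each thread id of a process
with its name, so the id of the spinning thread seen in `top -H` or passed to
`strace -p` can be matched to a name.

```python
import os


def thread_names(pid, proc_root="/proc"):
    """Map thread id -> thread name for one process, read from Linux procfs."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    names = {}
    for tid in sorted(os.listdir(task_dir), key=int):
        try:
            # Each thread exposes its name in /proc/<pid>/task/<tid>/comm.
            with open(os.path.join(task_dir, tid, "comm")) as f:
                names[int(tid)] = f.read().strip()
        except FileNotFoundError:
            # The thread exited between listing and reading; skip it.
            continue
    return names
```

Calling `thread_names(pid)` with the rsyslogd pid returns a dict that can be
pasted next to the strace lines.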

On-demand might work. It'll take a few days to iterate through this due to
the infrequency of the hang. I thought it was hung just now, but it started
working again... I guess it could also be intermittent. Maybe I'll set up
a script to watch for log gaps and snapshot the debug output.
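
A rough sketch of what such a watcher could look like, in Python. Everything
here is an assumption for illustration: the function names, the five-minute
gap threshold, and the polling interval are made up, and sending SIGUSR1 only
toggles debug output if rsyslogd was started with debug on demand enabled as
described in the how-to linked above. The log path has to be supplied by the
caller, since the file names on the central server change daily.

```python
import os
import signal
import time


def seconds_since_last_write(path, now=None):
    """Return how many seconds ago `path` was last modified."""
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime


def is_stalled(path, max_gap_seconds, now=None):
    """True if nothing has been written to `path` for longer than the gap."""
    return seconds_since_last_write(path, now) > max_gap_seconds


def toggle_debug(pid):
    # Assumption: rsyslogd runs in debug-on-demand mode, where SIGUSR1
    # switches debug output on or off. Without that mode this does nothing useful.
    os.kill(pid, signal.SIGUSR1)


def watch(path, pid, max_gap_seconds=300, poll_seconds=30):
    """Poll the log file and turn on debug output once a gap is detected."""
    while True:
        if is_stalled(path, max_gap_seconds):
            toggle_debug(pid)
            return
        time.sleep(poll_seconds)
```

The stall check is kept separate from the signalling so the gap logic can be
tried against any file without touching the running daemon.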

Thanks for your fast responses!


> > Edge servers -> central server -> disk on central server.
> >
> > The complete server:
> >
> > $ModLoad imuxsock
> > $IMUXSockRateLimitInterval 0
> > $ModLoad imklog
> > $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
> >
> > $FileOwner root
> > $FileGroup adm
> > $FileCreateMode 0640
> >
> > $IncludeConfig /etc/rsyslog.d/*.conf
> >
> > (then in an included config:)
> >
> > $DefaultNetstreamDriver gtls
> >
> > $MainMsgQueueType FixedArray
> > $MainMsgQueueSize 1000000
> > $MainMsgQueueWorkerThreads 4
> > $MainMsgQueueDequeueBatchSize 2000
> >
> > $DefaultNetstreamDriverCAFile ca.pem
> > $DefaultNetstreamDriverCertFile cert.pem
> > $DefaultNetstreamDriverKeyFile key.pem
> >
> > $ModLoad imtcp
> > $InputTCPMaxSessions 1000
> > $InputTCPServerStreamDriverMode 1
> > $InputTCPServerStreamDriverAuthMode anon
> > $InputTCPServerRun 514
> >
> > $template SomeFile,"/data/log/some/some-%$NOW%.log"
> > :programname, startswith, "some" -?SomeFile
> > & stop
> >
> > # Then two more of the above with different names
> >
> > $template AuthFile,"/data/log/auth/auth-%$NOW%.log"
> > $template CronFile,"/data/log/cron/cron-%$NOW%.log"
> > $template DaemonFile,"/data/log/daemon/daemon-%$NOW%.log"
> > $template KernFile,"/data/log/kern/kern-%$NOW%.log"
> > $template MailFile,"/data/log/mail/mail-%$NOW%.log"
> > $template UserFile,"/data/log/user/user-%$NOW%.log"
> > $template SyslogFile,"/data/log/syslog/syslog-%$NOW%.log"
> > auth,authpriv.*         -?AuthFile
> > cron.*              -?CronFile
> > daemon.*            -?DaemonFile
> > kern.*              -?KernFile
> > mail.*              -?MailFile
> > user.*              -?UserFile
> > *.*;auth,authpriv,daemon,kern,mail,user,cron.none       -?SyslogFile
> >
> >
> > Then for the client:
> >
> > $ModLoad imuxsock
> > $IMUXSockRateLimitInterval 0
> > $ModLoad imklog
> > $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
> >
> > $FileOwner root
> > $FileGroup adm
> > $FileCreateMode 0640
> >
> > $IncludeConfig /etc/rsyslog.d/*.conf
> >
> > (then in an included config:)
> >
> > $DefaultNetstreamDriverCAFile ca.pem
> >
> > $SystemLogRateLimitInterval 2
> > $SystemLogRateLimitBurst 10000
> >
> > $ActionQueueType LinkedList
> > $ActionQueueSize 40000
> > $ActionQueueDiscardMark 38000
> > $ActionQueueTimeoutEnqueue 0
> > $ActionResumeRetryCount -1
> >
> > # set up the action
> > $DefaultNetstreamDriver gtls
> > $ActionSendStreamDriverMode 1
> > $ActionSendStreamDriverAuthMode anon
> > *.* @@(o)logs.somesite.net:514
> >
> > auth,authpriv.*         /var/log/auth.log
> > *.*;auth,authpriv.none      -/var/log/syslog
> > #cron.*             /var/log/cron.log
> > daemon.*            -/var/log/daemon.log
> > kern.*              -/var/log/kern.log
> > lpr.*               -/var/log/lpr.log
> > mail.*              -/var/log/mail.log
> > user.*              -/var/log/user.log
> > mail.info           -/var/log/mail.info
> > mail.warn           -/var/log/mail.warn
> > mail.err            /var/log/mail.err
> > *.=debug;\
> >     auth,authpriv.none;\
> >     news.none;mail.none -/var/log/debug
> > *.=info;*.=notice;*.=warn;\
> >     auth,authpriv.none;\
> >     cron,daemon.none;\
> >     mail,news.none      -/var/log/messages
> >
> >
> > There's nothing the server should ever be blocking on, so far as I can
> > tell. It doesn't seem to self-recover when it gets into this state, and no
> > logs are being written.
> >
> > On Mon, 13 Jan 2014, Rainer Gerhards wrote:
> >
> > > You should post the complete config, so that we do not need to guess. My
> > > *guess* is that you run a TCP forwarder on the main queue, the remote
> > > machine blocks and the forwarder backs up.
> > >
> > > Rainer
> > >
> > >
> > > On Mon, Jan 13, 2014 at 9:00 AM, dormando <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I've been experiencing a periodic hang in rsyslog 7.4.7 on a central
> > > > logging node that we have.
> > > >
> > > > Everything runs fine for anywhere between 6 and 48 hours, then one thread
> > > > will hang at 100% CPU, with strace showing:
> > > > select(1, NULL, NULL, NULL, {82024, 797024}^C <unfinished ...>
> > > >
> > > > The selects are always similar. "1, NULL, NULL, NULL" and an increasing
> > > > timeout.
> > > >
> > > > The configuration is minimal, using the old syntax since we just recently
> > > > upgraded and I'm lazy.
> > > >
> > > > $DefaultNetstreamDriver gtls
> > > >
> > > > $MainMsgQueueType FixedArray
> > > > $MainMsgQueueSize 1000000
> > > > $MainMsgQueueWorkerThreads 4
> > > > $MainMsgQueueDequeueBatchSize 2000
> > > >
> > > > (then the DriverCAFile/etc defines)
> > > >
> > > > $ModLoad imtcp
> > > > $InputTCPMaxSessions 1000
> > > > $InputTCPServerStreamDriverMode 1
> > > > $InputTCPServerStreamDriverAuthMode anon
> > > > $InputTCPServerRun 514
> > > >
> > > > ... the rest are just log routing rules and the module imports required to
> > > > make the above lines work.
> > > >
> > > > Host is running Ubuntu 12.04.
> > > >
> > > > Clients are very similar. Works over TCP, not using RELP. Don't think any
> > > > of the clients are hanging but I haven't checked too closely.
> > > >
> > > > Unfortunately the hang isn't frequent enough to do any brute force
> > > > debugging. Any ideas what it could be, or what information I could collect
> > > > to track it down faster? If I don't figure this out soon I'll have to try
> > > > something else, and I don't want to do that :(
> > > >
> > > > Thanks!
> > > > -Dormando
> > > > _______________________________________________
> > > > rsyslog mailing list
> > > > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > > > http://www.rsyslog.com/professional-services/
> > > > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad
> > > > of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you
> > > > DON'T LIKE THAT.
> > > >