On Mon, Jan 13, 2014 at 9:22 AM, dormando wrote:

> There's no forwarder from the central server, if I understand what you
> were saying. Also, given the strace of the select (which is selecting for
> nothing and waiting forever?), is it checking for something in between the
> select runs?
>
>
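On the select itself: a select() call that passes no descriptor sets, as in
the strace line quoted further down, can only return when its timeout expires
(or a signal arrives), so it behaves like a plain sleep. A thread parked there
uses almost no CPU, which suggests (an inference, not something confirmed in
this thread) that the thread burning 100% is a different one from the thread
strace was attached to. A minimal Python sketch of that behaviour, for
Unix-like systems; the helper name sleep_via_select is made up for
illustration and is not rsyslog code:

```python
import select
import time


def sleep_via_select(seconds):
    """Call select() with no descriptor sets, so only the timeout matters.

    Returns (wall_seconds, cpu_seconds, ready_lists).
    Hypothetical helper for illustration; works on Unix-like systems.
    """
    wall_start = time.monotonic()
    cpu_start = time.process_time()
    # Nothing to watch: the call blocks in the kernel until the timeout ends.
    ready = select.select([], [], [], seconds)
    wall = time.monotonic() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu, ready


if __name__ == "__main__":
    wall, cpu, ready = sleep_via_select(0.2)
    print(f"wall={wall:.3f}s cpu={cpu:.3f}s ready={ready}")
```

Run as a script, this should report roughly 0.2 seconds of wall time against
close to zero CPU time, with three empty ready lists.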
OK, the configs look good, I think. What exactly it is waiting for is hard
to tell without context. Can you gather a debug log while it is in this
state? Details are here:

http://www.rsyslog.com/doc/troubleshoot.html

If a full debug log is too much to do, you may want to use debug on demand:

http://www.rsyslog.com/how-to-use-debug-on-demand/

If that's also impossible, many more strace lines PLUS the name of the
affected thread (obtained via top) may at least help us get closer.
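
To make the thread-name part concrete: top -H shows one row per thread, and
the same data can be read from /proc on Linux. The following Python sketch is
one way to collect it; the function name thread_cpu_table is hypothetical, and
the pid passed in is assumed to be that of the running rsyslogd.

```python
import os


def thread_cpu_table(pid):
    """Return [(tid, name, cpu_ticks)] for each thread of pid, busiest first.

    Linux only: reads /proc/<pid>/task/<tid>/comm and .../stat.
    Hypothetical helper for illustration, not part of rsyslog.
    """
    rows = []
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/comm") as f:
                name = f.read().strip()
            with open(f"{task_dir}/{tid}/stat") as f:
                stat = f.read()
        except OSError:
            continue  # the thread exited while we were scanning
        # The command name is wrapped in parentheses and may contain spaces,
        # so split after the last ")". The remaining fields start at field 3
        # (state); utime and stime are fields 14 and 15 of the full line.
        fields = stat.rsplit(")", 1)[1].split()
        utime, stime = int(fields[11]), int(fields[12])
        rows.append((int(tid), name, utime + stime))
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Demonstrates the call on this process; pass the rsyslogd pid instead.
    for tid, name, ticks in thread_cpu_table(os.getpid()):
        print(tid, name, ticks)
```

Sampling it twice a few seconds apart and comparing the tick counts shows
which thread is accumulating CPU time. Running strace with -f together with
-p <pid> then also covers that thread rather than only the main one.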

Rainer


> Edge servers -> central server -> disk on central server.
>
> The complete server:
>
> $ModLoad imuxsock
> $IMUXSockRateLimitInterval 0
> $ModLoad imklog
> $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
>
> $FileOwner root
> $FileGroup adm
> $FileCreateMode 0640
>
> $IncludeConfig /etc/rsyslog.d/*.conf
>
> (then in an included config:)
>
> $DefaultNetstreamDriver gtls
>
> $MainMsgQueueType FixedArray
> $MainMsgQueueSize 1000000
> $MainMsgQueueWorkerThreads 4
> $MainMsgQueueDequeueBatchSize 2000
>
> $DefaultNetstreamDriverCAFile ca.pem
> $DefaultNetstreamDriverCertFile cert.pem
> $DefaultNetstreamDriverKeyFile key.pem
>
> $ModLoad imtcp
> $InputTCPMaxSessions 1000
> $InputTCPServerStreamDriverMode 1
> $InputTCPServerStreamDriverAuthMode anon
> $InputTCPServerRun 514
>
> $template SomeFile,"/data/log/some/some-%$NOW%.log"
> :programname, startswith, "some" -?SomeFile
> & stop
>
> # Then two more of the above with different names
>
> $template AuthFile,"/data/log/auth/auth-%$NOW%.log"
> $template CronFile,"/data/log/cron/cron-%$NOW%.log"
> $template DaemonFile,"/data/log/daemon/daemon-%$NOW%.log"
> $template KernFile,"/data/log/kern/kern-%$NOW%.log"
> $template MailFile,"/data/log/mail/mail-%$NOW%.log"
> $template UserFile,"/data/log/user/user-%$NOW%.log"
> $template SyslogFile,"/data/log/syslog/syslog-%$NOW%.log"
> auth,authpriv.*         -?AuthFile
> cron.*              -?CronFile
> daemon.*            -?DaemonFile
> kern.*              -?KernFile
> mail.*              -?MailFile
> user.*              -?UserFile
> *.*;auth,authpriv,daemon,kern,mail,user,cron.none       -?SyslogFile
>
>
> Then for the client:
>
> $ModLoad imuxsock
> $IMUXSockRateLimitInterval 0
> $ModLoad imklog
> $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
>
> $FileOwner root
> $FileGroup adm
> $FileCreateMode 0640
>
> $IncludeConfig /etc/rsyslog.d/*.conf
>
> (then in an included config:)
>
> $DefaultNetstreamDriverCAFile ca.pem
>
> $SystemLogRateLimitInterval 2
> $SystemLogRateLimitBurst 10000
>
> $ActionQueueType LinkedList
> $ActionQueueSize 40000
> $ActionQueueDiscardMark 38000
> $ActionQueueTimeoutEnqueue 0
> $ActionResumeRetryCount -1
>
> # set up the action
> $DefaultNetstreamDriver gtls
> $ActionSendStreamDriverMode 1
> $ActionSendStreamDriverAuthMode anon
> *.* @@(o)logs.somesite.net:514
>
> auth,authpriv.*         /var/log/auth.log
> *.*;auth,authpriv.none      -/var/log/syslog
> #cron.*             /var/log/cron.log
> daemon.*            -/var/log/daemon.log
> kern.*              -/var/log/kern.log
> lpr.*               -/var/log/lpr.log
> mail.*              -/var/log/mail.log
> user.*              -/var/log/user.log
> mail.info           -/var/log/mail.info
> mail.warn           -/var/log/mail.warn
> mail.err            /var/log/mail.err
> *.=debug;\
>     auth,authpriv.none;\
>     news.none;mail.none -/var/log/debug
> *.=info;*.=notice;*.=warn;\
>     auth,authpriv.none;\
>     cron,daemon.none;\
>     mail,news.none      -/var/log/messages
>
>
> There's nothing the server should ever be blocking on so far as I can
> tell? It doesn't seem to self recover when it gets into this state, and no
> logs are being written.
>
> On Mon, 13 Jan 2014, Rainer Gerhards wrote:
>
> > You should post the complete config, so that we do not need to guess. My
> > *guess* is that you run a TCP forwarder on the main queue, the remote
> > machine blocks and the forwarder backs up.
> >
> > Rainer
> >
> >
> > On Mon, Jan 13, 2014 at 9:00 AM, dormando wrote:
> >
> > > Hi,
> > >
> > > I've been experiencing a periodic hang in rsyslog 7.4.7 on a central
> > > logging node that we have.
> > >
> > > Everything runs fine for anywhere between 6 and 48 hours, then one thread
> > > will hang at 100% CPU, with strace showing:
> > > select(1, NULL, NULL, NULL, {82024, 797024}^C <unfinished ...>
> > >
> > > The selects are always similar. "1, NULL, NULL, NULL" and an increasing
> > > timeout.
> > >
> > > The configuration is minimal.. using the old syntax since we just
> > > recently upgraded and I'm lazy.
> > >
> > > $DefaultNetstreamDriver gtls
> > >
> > > $MainMsgQueueType FixedArray
> > > $MainMsgQueueSize 1000000
> > > $MainMsgQueueWorkerThreads 4
> > > $MainMsgQueueDequeueBatchSize 2000
> > >
> > > (then the DriverCAFile/etc defines)
> > >
> > > $ModLoad imtcp
> > > $InputTCPMaxSessions 1000
> > > $InputTCPServerStreamDriverMode 1
> > > $InputTCPServerStreamDriverAuthMode anon
> > > $InputTCPServerRun 514
> > >
> > > ... the rest are just log routing rules and the module imports
> > > required to make the above lines work.
> > >
> > > Host is running Ubuntu 12.04.
> > >
> > > Clients are very similar. Works over TCP, not using RELP. Don't think
> > > any of the clients are hanging but I haven't checked too closely.
> > >
> > > Unfortunately the hang isn't frequent enough to do any brute force
> > > debugging. Any ideas what it could be, or what information I could
> > > collect to track it down faster? If I don't figure this out soon I'll
> > > have to try something else, and I don't want to do that :(
> > >
> > > Thanks!
> > > -Dormando
> > > _______________________________________________
> > > rsyslog mailing list
> > > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > > http://www.rsyslog.com/professional-services/
> > > What's up with rsyslog? Follow https://twitter.com/rgerhards
> > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a
> > > myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST
> > > if you DON'T LIKE THAT.
> > >
