Re: [rsyslog] rsyslog 5.8.5 problems

david Tue, 22 May 2012 10:59:53 -0700

On Tue, 22 May 2012, Juan Jose Pavlik wrote:

2012/5/22 Tomas Heinrich <[email protected]>

Hi. Just some quick notes.

Upgrade if you can. 5.8.5 is way too old.



I'm running OpenSuSE 12.1 and that's the rsyslog version that comes with
it; I don't like using software from outside the repositories. What version
should I try? I've heard something about bugs in this version, so this is
a good start.

rsyslog improves rapidly; unfortunately the distros upgrade very slowly and don't backport many of the fixes.


The current version 5 release is 5.8.11; you should upgrade to that.

On 05/22/2012 03:56 PM, Juan Jose Pavlik wrote:

Right after the queues filled up, it stopped sending logs to the second log
server too.


My guess was that it hangs on one action, which fills the main queue, which
slows message processing. But forwarding would be the one to suspect if
the other outputs are just plain files.



I thought that too, that's why I disabled the database writing (the db is
on a remote server). I don't think that the other rsyslog is the one
slowing it down. Maybe it is a network problem...?

Do you have batch writes enabled to the database server? It's very possible that rsyslog is just unable to write its output fast enough and that causes everything to stall. Configuring batch writes (and adjusting the batch size) will allow you to insert multiple logs in one transaction. In past tests I've been able to write 100+ logs in a transaction at approximately the same transactions/sec rate as writing single logs per transaction.
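As a rough sketch of what batching could look like in the legacy directive syntax, assuming the ompgsql output module is in use: the server name, database name, credentials, queue size and batch size below are placeholders, not values from this thread, and whether a whole batch is committed as one transaction depends on the output module and rsyslog version.

```
# Assumption: PostgreSQL output via ompgsql; all names below are placeholders.
$ModLoad ompgsql

# Give the database action its own in-memory queue so that a slow
# database does not back up the main message queue.
$ActionQueueType LinkedList
$ActionQueueSize 50000

# Hand up to 100 messages to the output module per dequeue operation.
$ActionQueueDequeueBatchSize 100

# Retry indefinitely if the database is unreachable.
$ActionResumeRetryCount -1

# server, database, user, password (hypothetical values)
*.* :ompgsql:db.example.com,Syslog,rsyslog,secret
```

The `$ActionQueue...` directives apply only to the next action defined after them, so they need to sit directly above the database action.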

What is the rate of logs that you are getting? I just write to flat files and post-process them (on a one-minute cron cycle), but I routinely handle 30K logs/second and have handled >92K logs in a single second.

I would need more details of your systems to point out other possible issues. For example, if you have your database on the same filesystem that rsyslog is writing to, and that filesystem is ext3, there's a chance that the fsync calls that your database is performing are stalling all I/O to the filesystem. Ext3 is pathologically bad at performing fsyncs and can stall all I/O for up to 30 seconds at a time. This sort of delay can cause a huge backlog to build up in rsyslog.

If you look at the different rsyslog threads (the 'H' option in top), you may see that one of them is going haywire trying to do something when you have trouble, or you may find that they all go to zero CPU when you have problems. What they do will indicate different sorts of problems. Based on what you are describing, I would guess they go to zero CPU, but you should check.


David Lang

 in my centralized logging server and I'm getting some troubles I'd really
love to figure out. I have around 170 servers/switches/other things logging
on this server; most of them just send auth.* logs, some Apaches send
their access and error logs, and switches send warnings and errors.
Sometimes the rsyslog queues get completely filled up and it stops writing
logs to disk. These are the exact logs of what happened:

Stops completely or just writes them incredibly slowly?


It writes them incredibly slowly, right.


 Once *size* reaches 10000 (the default max as far as I know) things get
complicated: rsyslog starts to drop logs and misbehave. The rsyslog

Dropping is the default action in case of congestion. Do you really see some
misbehavior?

What I see (I have munin graphs of the server):

- disk writing goes down, to almost zero.
- rsyslog queues start to grow dramatically fast.


 configuration writes per-host files into /var/log/servidores/; it also
sends some logs to another rsyslog server and to a PostgreSQL database
running on another server. Two weeks ago I disabled sending logs to the
PostgreSQL database, because I had this same problem and we lost too many
hours of logs. Most of the servers send logs over TCP, and a few servers
and other devices use UDP.

Is there a way I can avoid this problem? Should I increase the main queue
size? Use other queues? Any help would be great. Thanks.

Any particular size of a queue (or available RAM) is finite. If you can
identify the output that blocks the processing, put it in a separate queue
and configure enqueuing to have a short timeout. This should mitigate the
issue to some degree, though it is not an ideal solution.


How can I identify the blocking process? Any idea?


e.g.
$ActionQueueTimeoutEnqueue <milliseconds>
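A slightly fuller sketch of that idea, assuming the suspect output is the TCP forward to the second log server: the host name, queue size and 10 ms timeout are illustrative placeholders, not recommendations taken from this thread.

```
# Assumption: the forwarding action is the one that blocks.
# Give it a dedicated in-memory queue, separate from the main queue.
$ActionQueueType LinkedList
$ActionQueueSize 20000

# Wait at most 10 ms for room in this action's queue; after that the
# message is discarded for this action instead of stalling the sender.
$ActionQueueTimeoutEnqueue 10

# Keep retrying the remote server rather than suspending the action for good.
$ActionResumeRetryCount -1

# Forward everything over TCP (placeholder host and port).
*.* @@logs2.example.com:514
```

With this layout a stalled remote server can only fill its own queue; once that queue is full, messages for the forward are dropped after the timeout while the file outputs keep being served from the main queue. The trade-off is deliberate loss on the blocked output in exchange for keeping the others alive.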

This could also be a bug, but I can't recall all the issues all the way
back to 5.8.5. Do you encrypt the forwarded logs?

I'm not encrypting logs; I think it's a bug too.

Tomas
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
