On Tue, 22 May 2012, Juan Jose Pavlik wrote:
2012/5/22 Tomas Heinrich <[email protected]>
Hi. Just some quick notes.
Upgrade if you can. 5.8.5 is way too old.
I'm running OpenSuSE 12.1 and that's the rsyslog version that comes with
it; I don't like using software from outside the repositories. What version
should I try? I've heard something about bugs in this version, so this is
a good start.
rsyslog improves rapidly; unfortunately the distros upgrade very slowly
and don't backport many of the fixes.
The current version 5 release is 5.8.11; you should upgrade to that.
On 05/22/2012 03:56 PM, Juan Jose Pavlik wrote:
Right after the queues filled up, it stopped sending logs to the second log
server too.
My guess was that it hangs on one action, which fills the main queue, which
slows message processing. But forwarding would be the one to suspect if
the other outputs are just plain files.
I thought that too; that's why I disabled the database writing (the db is
on a remote server). I don't think that the other rsyslog is the one
slowing it down. Maybe it's a network problem...?
Do you have batch writes enabled to the database server? It's very
possible that rsyslog is just unable to write its output fast enough and
that causes everything to stall. Configuring batch writes (and adjusting
the batch size) will allow you to insert multiple logs in one transaction.
In past tests I've been able to write 100+ logs in a transaction at
approximately the same transactions/sec rate as writing single logs per
transaction.
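A minimal sketch of what batched database writes could look like in
legacy (v5-style) syntax. The host, database name, user, and password are
placeholders, and the batch size of 100 is only an illustration based on
the figure above. It assumes that the ompgsql output in the version you
run commits each dequeued batch as one transaction, which is worth
checking against the documentation for that release.

```
# Load the PostgreSQL output module.
$ModLoad ompgsql
# Give the database action its own queue so it can dequeue in batches.
$ActionQueueType LinkedList
# Upper bound on messages handed to the action per batch (illustrative).
$ActionQueueDequeueBatchSize 100
# Placeholder connection details: server, database, user, password.
*.* :ompgsql:db.example.com,Syslog,rsyslog,secret
```

The queue directives apply to the next action that follows them, so they
need to sit directly above the database action.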
What is the rate of logs that you are getting? I just write to flat files
and post-process them (on a one-minute cron cycle), but I routinely handle
30K logs/second and have handled >92K logs in a single second.
I would need more details of your systems to point out other possible
issues. For example, if you have your database on the same filesystem that
rsyslog is writing to, and that filesystem is ext3, there's a chance that
the fsync calls that your database is performing are stalling all I/O to
the filesystem. Ext3 is pathologically bad at performing fsyncs and can
stall all I/O for up to 30 seconds at a time. This sort of delay can cause
a huge backlog to build up in rsyslog.
If you look at the different rsyslog threads (the 'H' option in top), you
may see that one of them is going haywire trying to do something when you
have trouble, or you may find that they all go to zero cpu when you have
problems. What they do will indicate different sorts of problems. Based on
what you are describing, I would guess they go to zero CPU, but you should
check.
David Lang
in my centralized logging server and I'm getting some troubles I'd really
love to figure out. I have around 170 servers/switches/other things logging
to this server; most of them just send auth.* logs, some Apaches send the
access and error logs, and switches send warnings and errors. Sometimes
the rsyslog queues get completely filled up and it stops writing logs to
disk. These are the exact logs of what happened:
Stops completely or just writes them incredibly slowly?
It writes them incredibly slowly, right.
Once *size* reaches 10000 (the default max as far as I know) things get
complicated: rsyslog starts to drop logs and misbehave. The rsyslog
Dropping is a default action in case of congestion, do you really see some
misbehavior?
What I see is (I have munin graphs of the server):
- disk writing goes down, to almost zero.
- rsyslog queues start to grow dramatically fast.
configuration writes per-host files into /var/log/servidores/; it also
sends some logs to another rsyslog server and to a PostgreSQL database
running on another server. Two weeks ago I disabled sending logs to the
PostgreSQL database, because I had this same problem and we lost too many
hours of logs. Most of the servers send logs over TCP, and a few servers
and other devices use UDP.
Is there a way I can avoid this problem? Should I increase the main queue
size? Use other queues? Any help will be great. Thanks
Any particular size of a queue (or available ram) is finite. If you can
identify the output that blocks the processing, put it in a separate queue
and configure enqueuing to have a short timeout. This should mitigate the
issue to some degree, though it is not an ideal solution.
How can I identify the blocking process? Any idea?
e.g.
$ActionQueueTimeoutEnqueue <milliseconds>
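Put together, a separate queue with a short enqueue timeout for the
forwarding action might look roughly like the sketch below. The target
host name, the queue size, and the 10 ms timeout are assumptions chosen
for illustration, not tuned values. With an in-memory action queue like
this, messages that cannot be enqueued within the timeout are discarded
for that action instead of holding up the main queue.

```
# Give the next action its own in-memory queue, decoupled from the main queue.
$ActionQueueType LinkedList
# Maximum number of messages held for this action (illustrative value).
$ActionQueueSize 50000
# Wait at most 10 ms for space in a full action queue, then give up.
$ActionQueueTimeoutEnqueue 10
# Keep retrying the target indefinitely when it is unreachable.
$ActionResumeRetryCount -1
# Forward over TCP to the second log server (hypothetical host name).
*.* @@logserver2.example.com:514
```

The trade-off is deliberate: a slow or dead forwarding target costs you
the forwarded copies, while local file writing keeps going.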
This could also be a bug, but I can't recall all the issues all the way
back to 5.8.5. Do you encrypt the forwarded logs?
I'm not encrypting logs. I think it's a bug too.
Tomas
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards