A couple of questions and comments:

On Fri, 25 Feb 2011, Robert Gabriel wrote:

Hello all,

I'm very new to rsyslog, so please bear with me...

I have managed to get something going, below is what I wanted.

1. High throughput (> 200K).

200K what? 200K bytes/min, messages/min, bytes/sec, message/sec, ???

Using multiple queues.


2. Memory-type disk-assisted multiple queues (multiple action queues?).

How does $ActionQueue* fit in with multiple rulesets?
Is it possible with my existing setup?
I would like to spool to disk if any of the rulesets cannot forward by
TCP for whatever reason.
This reliability is quite important.

I see there is $RulesetCreateMainQueue available since 5.3.5+
but I cannot use this for now as I cannot sacrifice stability
on the RHEL/CentOS platform and 5.6.x packages won't be available
for quite some time (CentOS 6 is only on rsyslog-4.4.2-3.el6.i686.rpm!).

Sticking with the RHEL/CentOS versions is going to really hurt you. The performance improvements since that version are very significant, and Red Hat is never going to upgrade RHEL 6.0 to a newer rsyslog version (part of their 'stability' claim), so you are going to be stuck on that version until the next release of RHEL.
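For what it's worth, a disk-assisted action queue can be configured with the legacy $ActionQueue* directives on the v4/v5 versions in question; these apply to the next action defined, so each forwarding action can get its own queue even without per-ruleset main queues. A minimal sketch (hostname, port, queue name, and size limits are placeholders, not values from this thread):

```
# disk-assisted in-memory action queue, legacy directive syntax
$WorkDirectory /var/lib/rsyslog       # where spool files are written
$ActionQueueType LinkedList           # in-memory queue
$ActionQueueFileName fwdq1            # setting a file name enables disk assistance
$ActionQueueMaxDiskSpace 1g           # cap on spooled data
$ActionQueueSaveOnShutdown on         # persist queued messages across restarts
$ActionResumeRetryCount -1            # retry forever if the target is unreachable
*.* @@relay.example.com:514           # forward over plain TCP
```

The queue directives are reset after the action they precede, so this pattern can be repeated for each forwarding action that needs to spool to disk when its TCP target is down.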

4. Single/multiple files (option to switch on/off to troubleshoot).

In the long run, what seems to better? We have had a debate over this
but maybe users can offer real world experience please?

personally, I hate not being able to find something because logging wasn't enabled, so I lean towards the 'put everything in one file' mode.

5. Compression.

Compressed RELP seemed to work in the forwarding part (no errors in
rsyslog restart) but is this a desired config?
Should we be doing compression elsewhere if possible, like in an ASIC
in our Juniper firewall, to save precious bandwidth over our WAN link?
Has anyone tried OpenSSH compression and local port forwarding?

text like rsyslog handles does compress well in bulk, but individual messages are seldom large enough to be worth compressing. a compressed channel may help, but it may also delay messages, as the compressor tries to fill its buffer before compressing and sending.
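This is the trade-off with rsyslog's built-in per-message zlib compression in the forwarding syntax: each message is compressed individually, so short messages gain little. A sketch of the legacy syntax (hostname and port are placeholders):

```
# zlib-compress each forwarded message at level 9 ("z" option of omfwd)
*.* @@(z9)relay.example.com:514
```

A compressed transport channel (such as a compressing link or tunnel) compresses across messages and so achieves better ratios on bulk log traffic, at the cost of the buffering delay described above.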


6. Expression and property-based filters.

Which is faster/preferred?

property-based filters are significantly faster.
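For comparison, the two filter styles look like this (the match string and file path are made-up examples):

```
# property-based filter: fast, but limited to one property and operation
:msg, contains, "error"    /var/log/errors.log

# expression-based filter: more flexible, but slower to evaluate
if $msg contains 'error' then /var/log/errors.log
```

Property-based filters are restricted to a single property and comparison, so expression-based filters are still the tool for compound conditions.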

7. Store, filter and forward.

What is better in terms of OS throughput, single or multiple files?
We are looking at very high throughput like > 200K.
If I have ten ruleset inputs all writing to the same file what happens?
I'm interested in how concurrent writes are implemented.

there are versions where concurrent writes are not handled well at all; I don't know about the particular version you are using.

The relay will forward the separate TCP streams on to Splunk data
inputs (TCP ports), where they will be indexed.
We want to separate the streams into separate indexes to enable easier
analysis of different sourcetypes.
Normalisation could be done as far back as the log source, on the
collector or in Splunk itself.
The end point for the data will be in AlienVault to do SIEM and event
correlation.

how many different sources do you have?

there is a lot of overhead in splunk per index, and each index really wants its own chunk of memory so that it can efficiently update itself as new logs are inserted; this is something on the order of 10G per index by default.

splunk is pretty good about being able to search for subsets within an index. Instead of making each host its own index, consider splitting the indexes by type (Juniper vs. Linux).
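A by-type split could be sketched in Splunk's inputs.conf roughly as follows (ports, index names, and sourcetypes here are placeholders, not values from this thread):

```
# one TCP input per log type, each routed to its own index
[tcp://5140]
index = fw_juniper
sourcetype = juniper:firewall

[tcp://5141]
index = os_linux
sourcetype = linux:syslog
```

This keeps the number of indexes (and the per-index memory overhead mentioned above) proportional to the number of log types rather than the number of hosts.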

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
