On Mon, 3 Jun 2013, Roberto Giordani wrote:


Hello,
I'm working on a project where 20 servers RHEL 5.8 (with rsyslog 5.8.12) has 20 input files on input and send about 10.000 messages for second to one rsyslog server 7.2 version

The network is gigabit between client-->server and this is the daily nmon network traffic graphs


The first configuration was with queue file, but is was too slow, so I've used the LinkedList queue.

in my testing memory was the fastest (around 1000x as fast as a full disk queue)

The current client settings are
$InputFileName /file1.log
$InputFileTag TEST1
$InputFileStateFile file1
$InputFileSeverity debug
$InputFileFacility local6
$InputRunFileMonitor
$InputFilePersistStateInterval 10
....
....
....
############################################
$MaxMessageSize 9000

$MainMsgQueueType LinkedList
$MainMsgQueueSize 1000000
$MainMsgQueueWorkerThreads 20
$MainMsgQueueDequeueBatchSize 5000
$MainMsgQueueSaveOnShutdown on

$ActionQueueType LinkedList
$ActionQueueSize 2000000
$ActionQueueWorkerThreads 50
$ActionQueueDequeueBatchSize 5000
$ActionQueueSaveOnShutdown on
###########################################

$ActionResumeRetryCount -1
$ActionQueueTimeoutEnqueue 1
$ActionSendResendLastMsgOnReconnect on
$ActionQueueCheckpointInterval 1

local6.debug @@10.10.1.10:10514

The server settings are
##############################################
# Provides TCP syslog reception
$ModLoad imtcp
$InputTCPServerRun 10514


#### GLOBAL DIRECTIVES ####

# Use default timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat

# File syncing capability is disabled by default. This feature is usually not required,
# not useful and an extreme performance hit
$ActionFileEnableSync off

$umask 0007

if $fromhost-ip == "10.10.1.7" and $syslogfacility-text == "local6" and $syslogseverity-text == "debug" and $syslogtag == "TEST1" then action(type="omfile" DirOwner="user1" DirCreateMode="0750" FileCreateMode="0444" File="/rsyslog-data/file1.log")

if $fromhost-ip == ........

###############################

My questions are:
1)how to find the right combination of
Main queue ---> Action Queue---> receiver queue on rsyslog server ?

why do you need an Action Queue? it's only needed if you have some reason to think that that particular action is going to fall behind and you want to decouple it from the rest of the outputs.


2)Is it possible to increase incoming messages and handle quickly?

Yes, I've processed 380K logs/sec (gig-e limit) and others with 10G networks have processed 1M logs/sec

3)Why the output log on Centralized Rsyslog are still behind the source log on the client?

lots of possible reasons, but first off what are you looking at that's saying that the central servers are behind?

start off by seeing if you are having problems reading the data from the file. setup a test where you just read from the file and write back out to the file (I'd suggest starting without any queue directives at all)

then if that's fast enough, change it to send remotely and report what's happening

I've activated the pstats module to understand the queue status on clients and server.
After 6 hours running  the client has this report
.........
2013-06-03T22:23:16.708288+02:00 app01 rsyslogd-pstats: action 9 queue: size=2000000 enqueued=9327675 full=6831507 maxqsize=20
00000
2013-06-03T22:23:16.708297+02:00 app01 rsyslogd-pstats: main Q: size=2 enqueued=9648448 full=0 maxqsize=20395 2013-06-03T22:23:46.708367+02:00 app01 rsyslogd-pstats: imuxsock: submitted=323414 ratelimit.discarded=0 ratelimit.numratelimi
ters=29974
2013-06-03T22:23:46.708382+02:00 app01 rsyslogd-pstats: action 9 queue: size=2000000 enqueued=9340578 full=6844410 maxqsize=20
00000
2013-06-03T22:23:46.708390+02:00 app01 rsyslogd-pstats: main Q: size=2164 enqueued=9666464 full=0 maxqsize=20395 2013-06-03T22:24:16.708923+02:00 app01 rsyslogd-pstats: imuxsock: submitted=328198 ratelimit.discarded=0 ratelimit.numratelimi
ters=29986
2013-06-03T22:24:16.708941+02:00 app01 rsyslogd-pstats: action 9 queue: size=2000000 enqueued=9355649 full=6859481 maxqsize=20
00000
2013-06-03T22:24:16.708949+02:00 app01 rsyslogd-pstats: main Q: size=1364 enqueued=9686593 full=0 maxqsize=20395 2013-06-03T22:24:46.709300+02:00 app01 rsyslogd-pstats: imuxsock: submitted=333070 ratelimit.discarded=0 ratelimit.numratelimi
ters=29997
2013-06-03T22:24:46.709316+02:00 app01 rsyslogd-pstats: action 9 queue: size=2000000 enqueued=9365276 full=6869108 maxqsize=20
00000
2013-06-03T22:24:46.709323+02:00 app01 rsyslogd-pstats: main Q: size=2123 enqueued=9702047 full=0 maxqsize=20395 2013-06-03T22:25:16.709807+02:00 app01 rsyslogd-pstats: imuxsock: submitted=337951 ratelimit.discarded=0 ratelimit.numratelimi
ters=30009
2013-06-03T22:25:16.709823+02:00 app01 rsyslogd-pstats: action 9 queue: size=2000000 enqueued=9379492 full=6883324 maxqsize=20
00000
2013-06-03T22:25:16.709832+02:00 app01 rsyslogd-pstats: main Q: size=2 enqueued=9719723 full=0 maxqsize=20395 2013-06-03T22:25:46.709942+02:00 app01 rsyslogd-pstats: imuxsock: submitted=343014 ratelimit.discarded=0 ratelimit.numratelimi
ters=30021
2013-06-03T22:25:46.709980+02:00 app01 rsyslogd-pstats: action 9 queue: size=2000000 enqueued=9389640 full=6893472 maxqsize=20
00000


and the rsyslog server the following stats
.......
2013-06-03T23:05:15.898682+02:00 fsp01 rsyslogd-pstats: main Q: size=9978 enqueued=5032165 full=156941 discarded.full=0 discarded.nf=0 maxqsize=10000 2013-06-03T23:06:56.157199+02:00 fsp01 rsyslogd-pstats: main Q: size=9987 enqueued=5065134 full=157971 discarded.full=0 discarded.nf=0 maxqsize=10000 2013-06-03T23:08:30.657673+02:00 fsp01 rsyslogd-pstats: main Q: size=9972 enqueued=5096315 full=158942 discarded.full=0 discarded.nf=0 maxqsize=10000 2013-06-03T23:10:09.895850+02:00 fsp01 rsyslogd-pstats: main Q: size=9986 enqueued=5129162 full=159969 discarded.full=0 discarded.nf=0 maxqsize=10000 2013-06-03T23:11:42.488505+02:00 fsp01 rsyslogd-pstats: main Q: size=9973 enqueued=5159935 full=160933 discarded.full=0 discarded.nf=0 maxqsize=10000 2013-06-03T23:13:23.213800+02:00 fsp01 rsyslogd-pstats: main Q: size=9973 enqueued=5193246 full=161973 discarded.full=0 discarded.nf=0 maxqsize=10000 2013-06-03T23:14:58.833570+02:00 fsp01 rsyslogd-pstats: main Q: size=9970 enqueued=5224922 full=162962 discarded.full=0 discarded.nf=0 maxqsize=10000 2013-06-03T23:16:35.184133+02:00 fsp01 rsyslogd-pstats: main Q: size=9975 enqueued=5256863 full=163960 discarded.full=0 discarded.nf=0 maxqsize=10000 2013-06-03T23:18:13.992958+02:00 fsp01 rsyslogd-pstats: main Q: size=9991 enqueued=5289392 full=164977 discarded.full=0 discarded.nf=0 maxqsize=10000 2013-06-03T23:19:52.464473+02:00 fsp01 rsyslogd-pstats: main Q: size=9942 enqueued=5322013 full=165996 discarded.full=0 discarded.nf=0 maxqsize=10000

4)Why on the client enqueued value never decrease and full= is always different of 0?

well, once you have enqueued something, you don't go back in time and not enqueue it, so unless you restart rsyslog, that number is going to just keep climbing.

I'm planning to distribute the 20 client to 2 process of rsyslog server on different port on the same server but I think your help about the right combination of action queue on client and main queue on server.

On rsyslog documentation I've found a lot of info about to handle input queue (# thread, max messages and DequeuBatch) but I dont' know how to rsyslog should better work on receiver server. I've seen that the rsyslog 7.2 process on server use only one CPU each time, while the server has 4CPU and 12GB ram on 64bit O.S.
5)Some option during configuration process can change this behavior ?
$ ./configure --prefix=/usr/local/rsyslog7 --enable-gnutls --enable-imfile --enable-omruleset --enable-imptcp --enable-mmnormalize --enable-usertools --enable-imdiag --enable-diagtools --enable-impstats

I think you are trying too hard.

rsyslog will use one thread for each input, and one thread for each output by default (and possibly some others along the way if it sees a need), the fact that you aren't seeing it use more than 100% CPU indicates that you probably have some other bottleneck.

On the recieving side, have you disabled DNS lookups? that has a big hit in many cases.

when you look at top of the central box, hit 'H' to have it display the different threads as different lines, then you will see how much cpu is used for each one. After that you can track down which one is the bottleneck

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Reply via email to