did you capture top during this time?
since you are writing to NFS, iostat doesn't give us any useful info
at this point the queues is not showing that it ever filled up:
Thu Sep 26 15:30:40 2013: imuxsock: submitted=6 ratelimit.discarded=0
ratelimit.numratelimiters=6
Thu Sep 26 15:30:40 2013: action 1: processed=72558080 failed=0
Thu Sep 26 15:30:40 2013: action 2: processed=14243880 failed=0
Thu Sep 26 15:30:40 2013: action 3: processed=15743143 failed=0
Thu Sep 26 15:30:40 2013: action 4: processed=11382168 failed=0
Thu Sep 26 15:30:40 2013: action 5: processed=9334973 failed=0
Thu Sep 26 15:30:40 2013: action 6: processed=11212392 failed=0
Thu Sep 26 15:30:40 2013: action 7: processed=11048225 failed=0
Thu Sep 26 15:30:40 2013: action 8: processed=11177134 failed=0
Thu Sep 26 15:30:40 2013: imudp(*:514): submitted=160186128
Thu Sep 26 15:30:40 2013: imudp(*:514): submitted=0
Thu Sep 26 15:30:40 2013: main Q: size=3487134 enqueued=160186144 full=0
discarded.full=0 discarded.nf=0 maxqsize=7742133
now, one thing is that it looks like at the time you cut off the report, rsyslog
has not finished processing the message, there are still size=3487134 messages
in the queue, which is ~10 seconds worth of data
this shows everything working fine without a problem (once the logs are in
rsyslog), the peak queue size is about double what the current size is at the
end of the test, but it's still only about 1/3 the max queue size. This means
that we are no longer in the situation where the output and queue management is
out problem (which we were at one point earlier where the queues were filling
up)
how many messages did you send in this test? Do you know what the logs that
Spirent is sending look like over the wire? (if not, could you capture a sample
of each log with the template RSYSLOG_DebugFormat so that we can see exactly
what is being sent)
There are two different performance numbers that are interesting to know
1. what is the max rate that rsyslog can accept messages without loosing any,
assuming there is available queue.
2. what is the max rate that rsyslog can process and output messages
With a lossy protocol like UDP, the first rate needs to be significantly higher
than the second rate. In the real world, logs are very bursty, but if you
average the log throughput over even a 10 second window, the average logs/sec
drops drastically.
when benchmarking, you need to make sure that the test runs long enough to get
the system to a steady state. Given the fact that the max queue size is still
increasing at the end of your sample time, I'm not sure the test is running long
enough to show this
the grep testing that you are doing could be showing either one (depending on if
the timestamp that you are using for your grep is in the log message provided to
rsyslog, or if this is the timestamp that rsyslog is putting on the message.
given that this latest run shows no sign of the queue filling up, I suspect that
this timestamp is created by rsyslog. do you know exactly what the spirent is
sending out? I suspect it's not including the timestamp
At the risk of confusing things, in future tests, please do the following.
stop rsyslog, rm all output files, start rsyslog, start top, start generating
traffic (known number of packets at a known rate), wait until the system goes
idle, if you can, note how long after the spirent stops sending traffic it takes
for rsyslog to become idle, stop top, stop rsyslog.
if the test is running for multiple minutes, we may want to setup a process that
rolls the log files so we can see for sure how many are output in each minute,
but let's hold off on discussing the details of this a bit.
report samples from top during the flood, and after spirent stopped sending
messages.
count the total number of log messages in these files and compare them to what
should have been there (i.e. what spirent sent)
go ahead and do some sample greps for timestamps, or even better
cut -c -16 * |sort -n |uniq -c
This will show per-second counts. you can then decide if you want to send us the
full dump, or just a sample :-)
some of this you can do retroactively for this last test, some you must plan
ahead for :-)
There is one imudp option that you are not using, batchsize, this requires a
kernel >= 2.6.34 and glibc >= 2.12, so if you are using anything older this
won't make any difference. What OS version are your running again?
If you are running something new enough, run this same config with
batchsize="10" right after TimeRequery="10" and let's see what difference that
makes (changing nothing else in the config)
David Lang
~On Thu, 26 Sep 2013, Robert wrote:
Hello Rainer, David,
The spirent is set to send out 300k fps. Attached are the pstats, mpstats,
iostats., and its writing locally, this is what I get back:
[rcortiz@simon test]$ ls
f_ad f_fw f_mail f_networks f_pix f_router f_shib f_vm
[rcortiz@simon test]$ sudo grep 15:20:00 * | wc -l
259910
[rcortiz@simon test]$ sudo grep 15:20:10 * | wc -l
280150
pasting in the config again here (since your mail client mangles quoted text,
loosing line breaks)
#### MODULES ####
module(load="imuxsock") # needs to be done just once Roberto 8-9-13
module(load="imklog")
module(load="impstats" interval="30" log.syslog="off"
log.file="/local_path/pstats")
# Provides UDP syslog reception
module(load="imudp"
TimeRequery="10"
SchedulingPolicy="fifo"
SchedulingPriority="10") # needs to be done just once Roberto 8-9-13
input(type="imudp" port="514") # Roberto 8-9-13
#### GLOBAL DIRECTIVES ####
# Use imprecise-old-style timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$MainMsgQueueType FixedArray
$MainMsgQueueSize 20000000
$MainMsgQueueWorkerThreads 2
$MainMsgQueueDequeueBatchSize 1000
$InputUDPMaxSessions 40000000
#### RULES ####
:hostname, contains, "fdfw" {
action (type="omfile"
DirCreateMode="0700"
FileCreateMode="0644"
File="/local_path/f_fw"
FlushOnTXEnd="on"
IOBufferSize="8k"
)
}
:hostname, contains, "mail" {
action (type="omfile"
DirCreateMode="0700"
FileCreateMode="0644"
File="/local_path/f_mail"
FlushOnTXEnd="on"
IOBufferSize="8k"
)
}
:hostname, contains, "shib" {
action (type="omfile"
DirCreateMode="0700"
FileCreateMode="0644"
File="/local_path/f_shib"
FlushOnTXEnd="on"
IOBufferSize="8k"
)
}
:hostname, contains, "pdc" {
action (type="omfile"
DirCreateMode="0700"
FileCreateMode="0644"
File="/local_path/f_ad"
FlushOnTXEnd="on"
IOBufferSize="8k"
)
}
:hostname, contains, "networks" {
action (type="omfile"
DirCreateMode="0700"
FileCreateMode="0644"
File="/local_path/f_networks"
FlushOnTXEnd="on"
IOBufferSize="8k"
)
}
:hostname, contains, "rout" {
action (type="omfile"
DirCreateMode="0700"
FileCreateMode="0644"
File="/local_path/f_router"
FlushOnTXEnd="on"
IOBufferSize="8k"
)
}
:hostname, contains, "vm" {
action (type="omfile"
DirCreateMode="0700"
FileCreateMode="0644"
File="/local_path/f_vm"
FlushOnTXEnd="on"
IOBufferSize="8k"
)
}
:hostname, contains, "pix" {
action (type="omfile"
DirCreateMode="0700"
FileCreateMode="0644"
File="/local_path/f_pix"
FlushOnTXEnd="on"
IOBufferSize="8k"
)
}
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE
THAT.
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE
THAT.