Thanks everyone for the input; unfortunately I'm still stuck.
I have been monitoring the interface and the packets with several tools,
including iostat, mpstat, sar, and tcpdump.
Currently I am emulating 150k messages per second (mps) into this server, and
I watch the traffic on the server's interface with this command:
#tcpdump -i eth2.10 -nn | cut -c 1-8 | uniq -c
It usually gives me this output (packets captured per second):
147393 15:17:08
147350 15:17:09
147121 15:17:10
146842 15:17:11
146994 15:17:12
147337 15:17:13
144745 15:17:14
As soon as I start the rsyslog service, I get this:
131449 15:17:15
130728 15:17:16
129504 15:17:17
131348 15:17:18
130638 15:17:19
128985 15:17:20
133200 15:17:21
132211 15:17:22
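In case it helps pin down where the loss happens, here is a quick sketch of the
checks I am running to see whether the missing packets are dropped at the
NIC/kernel level or at the UDP socket (interface names assumed from the tcpdump
command above; eth2 is the parent of eth2.10):

# NIC/driver drop counters on the physical interface
ethtool -S eth2 | grep -i drop
# kernel per-interface RX drop counters
grep eth2 /proc/net/dev
# socket level: "packet receive errors" / RcvbufErrors should stay flat
# if rsyslog is draining its socket fast enough
netstat -su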
iotop and top -H show rsyslogd spreading work across all 8 worker threads that
I have configured in the .conf file.
iostat gives me the following; if I am reading this right, I/O should not be
the bottleneck:
09/17/2013 03:15:14 PM
avg-cpu: %user %nice %system %iowait %steal %idle
15.35 0.00 4.93 0.14 0.00 79.58
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 5.00 0.00 5.00 0.00 0.04 16.00 0.04 7.20 4.00 2.00
dm-0 0.00 0.00 0.00 10.00 0.00 0.04 8.00 0.08 7.60 2.00 2.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09/17/2013 03:15:15 PM
avg-cpu: %user %nice %system %iowait %steal %idle
15.45 0.00 3.86 0.14 0.00 80.54
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 6.00 0.00 4.00 0.00 0.04 20.00 0.02 5.25 4.00 1.60
dm-0 0.00 0.00 0.00 10.00 0.00 0.04 8.00 0.06 5.90 1.60 1.60
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09/17/2013 03:15:16 PM
avg-cpu: %user %nice %system %iowait %steal %idle
14.65 0.00 5.35 0.14 0.00 79.86
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 349.00 1.00 36.00 0.00 1.50 83.46 2.17 58.62 3.08 11.40
dm-0 0.00 0.00 1.00 385.00 0.00 1.50 8.00 28.71 74.38 0.30 11.40
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09/17/2013 03:15:17 PM
avg-cpu: %user %nice %system %iowait %steal %idle
13.04 0.00 5.59 0.29 0.00 81.09
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 10.00 0.00 6.00 0.00 0.06 21.33 0.05 8.00 4.33 2.60
dm-0 0.00 0.00 0.00 16.00 0.00 0.06 8.00 0.20 12.38 1.62 2.60
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
09/17/2013 03:15:18 PM
avg-cpu: %user %nice %system %iowait %steal %idle
9.08 0.00 9.22 0.14 0.00 81.56
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 7.00 0.00 5.00 0.00 0.05 19.20 0.05 10.60 5.80 2.90
dm-0 0.00 0.00 0.00 12.00 0.00 0.05 8.00 0.14 12.08 2.42 2.90
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
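To double-check the writer side, I am also planning to watch rsyslog's own disk
writes rather than the whole box; a sketch, assuming pidstat from sysstat is
installed and the daemon is named rsyslogd:

# per-process disk I/O for rsyslog only, one-second samples
pidstat -d -p "$(pidof rsyslogd)" 1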
mpstat across all cores shows the following, and if I am reading it correctly,
the CPUs do not appear to be running out of processing power:
03:17:13 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:17:14 PM all 14.33 0.00 0.96 0.14 0.00 2.20 0.00 0.00 82.37
03:17:14 PM 0 0.00 0.00 0.00 0.99 0.00 0.00 0.00 0.00 99.01
03:17:14 PM 1 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 98.00
03:17:14 PM 2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
03:17:14 PM 3 0.00 0.00 0.99 0.00 0.00 0.00 0.00 0.00 99.01
03:17:14 PM 4 57.47 0.00 0.00 0.00 0.00 0.00 0.00 0.00 42.53
03:17:14 PM 5 26.15 0.00 1.54 0.00 0.00 16.92 0.00 0.00 55.38
03:17:14 PM 6 23.08 0.00 2.20 0.00 0.00 3.30 0.00 0.00 71.43
03:17:14 PM 7 19.05 0.00 1.19 0.00 0.00 2.38 0.00 0.00 77.38
03:17:14 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:17:15 PM all 22.98 0.00 19.49 0.13 0.00 6.85 0.00 0.00 50.54
03:17:15 PM 0 16.33 0.00 25.51 0.00 0.00 0.00 0.00 0.00 58.16
03:17:15 PM 1 12.37 0.00 25.77 0.00 0.00 0.00 0.00 0.00 61.86
03:17:15 PM 2 15.31 0.00 22.45 0.00 0.00 0.00 0.00 0.00 62.24
03:17:15 PM 3 15.31 0.00 22.45 0.00 0.00 0.00 0.00 0.00 62.24
03:17:15 PM 4 50.00 0.00 19.57 0.00 0.00 1.09 0.00 0.00 29.35
03:17:15 PM 5 37.50 0.00 15.28 0.00 0.00 13.89 0.00 0.00 33.33
03:17:15 PM 6 31.91 0.00 18.09 0.00 0.00 4.26 0.00 0.00 45.74
03:17:15 PM 7 10.64 0.00 6.38 0.00 0.00 38.30 0.00 0.00 44.68
03:17:15 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:17:16 PM all 34.08 0.00 47.29 0.00 0.13 11.23 0.00 0.00 7.27
03:17:16 PM 0 31.63 0.00 63.27 0.00 0.00 0.00 0.00 0.00 5.10
03:17:16 PM 1 27.37 0.00 67.37 0.00 0.00 0.00 0.00 0.00 5.26
03:17:16 PM 2 34.04 0.00 60.64 0.00 0.00 0.00 0.00 0.00 5.32
03:17:16 PM 3 34.74 0.00 60.00 0.00 0.00 0.00 0.00 0.00 5.26
03:17:16 PM 4 44.83 0.00 44.83 0.00 0.00 0.00 0.00 0.00 10.34
03:17:16 PM 5 45.74 0.00 29.79 0.00 1.06 17.02 0.00 0.00 6.38
03:17:16 PM 6 38.95 0.00 38.95 0.00 0.00 11.58 0.00 0.00 10.53
03:17:16 PM 7 15.62 0.00 14.58 0.00 0.00 60.42 0.00 0.00 9.38
03:17:16 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:17:17 PM all 32.85 0.00 48.04 0.00 0.00 12.57 0.00 0.00 6.54
03:17:17 PM 0 30.30 0.00 65.66 0.00 0.00 0.00 0.00 0.00 4.04
03:17:17 PM 1 29.17 0.00 66.67 0.00 0.00 0.00 0.00 0.00 4.17
03:17:17 PM 2 33.68 0.00 60.00 1.05 0.00 0.00 0.00 0.00 5.26
03:17:17 PM 3 35.71 0.00 59.18 0.00 0.00 0.00 0.00 0.00 5.10
03:17:17 PM 4 39.36 0.00 43.62 0.00 1.06 5.32 0.00 0.00 10.64
03:17:17 PM 5 27.96 0.00 19.35 0.00 0.00 46.24 0.00 0.00 6.45
03:17:17 PM 6 35.79 0.00 38.95 0.00 0.00 16.84 0.00 0.00 8.42
03:17:17 PM 7 30.93 0.00 27.84 0.00 0.00 31.96 0.00 0.00 9.28
03:17:17 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:17:18 PM all 32.95 0.00 49.43 0.13 0.00 10.98 0.00 0.00 6.51
03:17:18 PM 0 28.57 0.00 65.31 1.02 0.00 0.00 0.00 0.00 5.10
03:17:18 PM 1 32.63 0.00 62.11 0.00 0.00 0.00 0.00 0.00 5.26
03:17:18 PM 2 28.57 0.00 65.31 0.00 0.00 0.00 0.00 0.00 6.12
03:17:18 PM 3 27.55 0.00 67.35 0.00 0.00 0.00 0.00 0.00 5.10
03:17:18 PM 4 37.11 0.00 34.02 0.00 0.00 20.62 0.00 0.00 8.25
03:17:18 PM 5 26.80 0.00 20.62 0.00 0.00 47.42 0.00 0.00 5.15
03:17:18 PM 6 38.00 0.00 40.00 0.00 0.00 14.00 0.00 0.00 8.00
03:17:18 PM 7 44.44 0.00 40.40 0.00 0.00 8.08 0.00 0.00 7.07
03:17:18 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:17:19 PM all 33.72 0.00 49.68 0.00 0.00 10.04 0.00 0.00 6.56
03:17:19 PM 0 29.29 0.00 65.66 0.00 0.00 0.00 0.00 0.00 5.05
03:17:19 PM 1 29.90 0.00 63.92 0.00 0.00 0.00 0.00 0.00 6.19
03:17:19 PM 2 29.90 0.00 64.95 0.00 0.00 0.00 0.00 0.00 5.15
03:17:19 PM 3 29.90 0.00 64.95 0.00 0.00 0.00 0.00 0.00 5.15
03:17:19 PM 4 30.53 0.00 35.79 0.00 0.00 24.21 0.00 0.00 9.47
03:17:19 PM 5 26.32 0.00 20.00 0.00 0.00 47.37 0.00 0.00 6.32
03:17:19 PM 6 42.86 0.00 43.88 0.00 0.00 5.10 0.00 0.00 8.16
03:17:19 PM 7 50.00 0.00 37.76 0.00 0.00 4.08 0.00 0.00 8.16
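Since the %soft time piles up on only a couple of cores, I am also checking how
the receive interrupts and softirq work are spread out; a rough sketch, where
the interface names and the RPS mask are assumptions for my box:

# which CPU(s) service the NIC's receive interrupts
grep eth2 /proc/interrupts
# how NET_RX softirq work is distributed across CPUs
grep NET_RX /proc/softirqs
# example only: let CPUs 0-3 share receive processing via RPS (mask 0xf)
echo f > /sys/class/net/eth2.10/queues/rx-0/rps_cpus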
I'm also monitoring the CPU with sar -u 1 1800, which gives me this CPU
summary:
03:15:11 PM CPU %user %nice %system %iowait %steal %idle
03:15:12 PM all 13.66 0.00 2.11 2.82 0.00 81.41
03:15:13 PM all 16.46 0.00 6.00 0.42 0.00 77.13
03:15:14 PM all 15.21 0.00 5.07 0.14 0.00 79.58
03:15:15 PM all 15.55 0.00 3.85 0.14 0.00 80.46
03:15:16 PM all 14.69 0.00 5.23 0.14 0.00 79.94
03:15:17 PM all 13.04 0.00 5.59 0.29 0.00 81.09
03:15:18 PM all 9.06 0.00 9.35 0.14 0.00 81.44
03:15:19 PM all 7.75 0.00 12.25 0.28 0.00 79.72
03:15:20 PM all 9.79 0.00 10.62 0.28 0.00 79.31
03:15:21 PM all 11.37 0.00 7.04 0.27 0.00 81.33
03:15:22 PM all 12.80 0.00 6.06 0.13 0.00 81.00
03:15:23 PM all 15.18 0.00 3.12 0.27 0.00 81.44
03:15:24 PM all 15.49 0.00 2.45 0.27 0.00 81.79
03:15:25 PM all 15.38 0.00 2.06 0.14 0.00 82.42
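Since sysstat is already collecting, I may also capture the network side with
sar so I can line it up against the CPU summary above; a sketch:

# interface packet rates (rxpck/s) per second
sar -n DEV 1 1800
# UDP datagram rates and errors (idgm/s, noport/s, idgmerr/s)
sar -n UDP 1 1800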
netstat -su shows me that I have a lot of UDP receive errors:
IcmpMsg:
InType0: 123461
InType3: 22
InType8: 59981
OutType0: 59981
OutType3: 95631
OutType8: 123482
Udp:
312075363 packets received
175019229 packets to unknown port received. <- Also, I am curious why this
does not break out the predefined port 514 (see the per-socket check after
this output).
11595319 packet receive errors
2255 packets sent
RcvbufErrors: 1552363
UdpLite:
IpExt:
InMcastPkts: 54
OutMcastPkts: 32
InBcastPkts: 39932
InOctets: 135829844880
OutOctets: 121832587
InMcastOctets: 11926
OutMcastOctets: 6174
InBcastOctets: 3995607
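As far as I understand, netstat -su only reports aggregate UDP counters, which
would explain why port 514 never shows up; the per-port numbers live in
/proc/net/udp with the port in hex (514 = 0x0202). A sketch of what I am
watching, with the column layout assumed from my kernel:

# watch whether the aggregate UDP error counters keep climbing
watch -n1 'netstat -su | grep -E "receive errors|RcvbufErrors"'
# per-socket view for port 514 (hex 0202): a growing rx_queue or a large
# trailing "drops" value means this socket's buffer is overflowing
grep ':0202' /proc/net/udp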
From all of this I am thinking that there is something not right with my
config file (many of these directives are commented out):
# Include all config files in /etc/rsyslog.d/
$IncludeConfig /etc/rsyslog.d/*.conf
# Set Buffer Size - default is 4k
$OMFileAsyncWriting on
# $OMFileFlushOnTXEnd on
# $OMFileFlushInterval 1
# $OMFileZipLevel 9
$OMFileIOBufferSize 1000k # modified 9-18-13
#Turn on Main Ruleset
#$RulesetCreateMainQueue on
# Set Main Message Queue Size - default is 10000
$MainMsgQueueType FixedArray #LinkedList
$MainMsgQueueSize 200000000
$MainMsgQueueWorkerThreads 8
# $MainMsgQueueWorkerTimeoutThreadShutdown -1
$MainMsgQueueDequeueBatchSize 1000
# $MainMsgQueueSaveOnShutdown on
$InputUDPMaxSessions 40000000
# $ActionOmrulesetRulesetName somename
$ActionQueueWorkerThreads 8
$ActionQueueSize 10000000
$ActionQueueType FixedArray #LinkedList - use asynchronous processing
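To rule out directives being silently ignored because of typos, I am also going
to run rsyslog's own config check; a sketch, assuming my version supports the
-N check option:

# validate the configuration without restarting the daemon;
# a non-zero exit code means something was not accepted
rsyslogd -N1 -f /etc/rsyslog.conf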
Sorry for the long email
Roberto
----- Original Message -----
From: Xuri Nagarin
Sent: 09/12/13 06:49 PM
To: rsyslog-users
Subject: Re: [rsyslog] performance tweaking
Do you have packet drop issues? You didn't say whether you are
receiving/sending over TCP/UDP, but you can check these stats:
- "netstat -s" and look for the collapsed packets line if you are using TCP.
If the number increments, then the application (rsyslog) that is supposed to
pick them up isn't keeping up.
- "netstat -s" and look for errors under the UDP stats if you are using UDP.
If the number keeps increasing, then you are not consuming packets fast
enough.
- Run "netstat -an | grep port" under "watch", where port is the tcp/udp port
you are receiving/sending over. If you see either the Recv-Q or Send-Q number
not going to zero, then you have a bottleneck in the application.
The bottleneck can be CPU, disk IO bandwidth, network, or the receiving
entity. CPU will be a bottleneck on a multi-core system if rsyslog isn't
threading well. Usually, rsyslog threads very well, in my experience. Memory
is also usually not a problem because modern servers have lots of RAM and
rsyslog isn't a particularly memory hungry app. You can also look for blocked
threads in /proc/pid/net/{udp|tcp}. The "tx_queue" or "rx_queue" fields can
provide information on what threads are causing drops. Your other friends are
iotop and mpstat.
On Thu, Sep 12, 2013 at 10:54 AM, Robert Ortiz <[email protected]> wrote:
> Thanks everyone for all the help, I don't seem to be dropping any more
> packets at 150k mps, but I am seeing, when I do a raw tcpdump on the
> interface, that whenever I start the rsyslog service the tcpdump drops a
> significant amount of packets. I modified my sysctl.conf to:
>
> net.core.rmem_default = 2097152
> net.core.wmem_default = 2097152
> net.core.rmem_max = 10485760
> net.core.wmem_max = 10485760
>
> The next phase of testing is sending logs to multiple locations. I have
> been looking around on how to make this happen, but I cannot seem to find
> any documentation. Is rsyslog capable of sending logs to multiple
> locations?
>
> Thanks
>
> Robert.
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE
THAT.