On 07/10/2015 21:42, [email protected] wrote:
> YyyyYYuIIIIIU
> Sent from my Verizon Wireless BlackBerry

Hmmmmmmmmmmmmmm, interesting reply. I'm wondering if it has something to
do with:

1. verizon
2. dodgy 3g
3. crapberry. oops, sorry: blackberry

Or maybe it's because y, u and i are in a row on the keyboard, shift and
enter are adjacent, and you have an over-friendly cat?

:-)

>
> -----Original Message-----
> From: Alan McKinnon <[email protected]>
> Date: Wed, 7 Oct 2015 20:39:42
> To: <[email protected]>
> Reply-to: [email protected]
> Subject: Re: [gentoo-user] strange TCP timeout errors
>
> On 07/10/2015 17:55, Grant wrote:
>>>>>>> I've attached a PNG from Munin showing the TCP timeout errors on my
>>>>>>> Gentoo server over the past month. The data is expressed in timeouts
>>>>>>> per second and that rate is shown to be steadily increasing over the
>>>>>>> past month. That seems strange to me. Munin doesn't show any other
>>>>>>> data point increasing like this over the time period. Any ideas?
>>>>>>>
>>>>>>> - Grant
>>>>>>>
>>>>>>
>>>>>> weird - does it reset on an interface restart or reboot?
>>>>>
>>>>> this would be my test #1
>>>>
>>>>
>>>> I rebooted and the rate of errors has dropped off to almost nothing.
>>>>
>>>>
>>>>>> Can you verify it's not an artefact within munin (how?)
>>>>>
>>>>> In theory, a misconfigured graph can do this. Munin can draw many
>>>>> different types of graph, including cumulative values. Even for a data
>>>>> type like this which is X events per unit time, if you tell munin to add
>>>>> them all up, it will do so and graph it.
>>>>>
>>>>> Quick test is to look at the graph config.
>>>>
>>>>
>>>> This graph lives in the "network" section of the munin web interface.
>>>> There is no matching section in /etc/munin/plugin-conf.d/munin-node so
>>>> it should be using the default config.
>>>>
>>>> Any ideas based on this new info?
>>>
>>> A few :-)
>>>
>>>
>>> I can't find the plugin that delivers that graph though. Maybe I just
>>> don't have it, maybe it comes from contrib/
>>>
>>> What's your USE for munin?
>>
>>
>> USE="apache cgi http mysql ssl syslog -asterisk -dhcpd -doc -ipmi
>> -ipv6 -irc -java -memcached -minimal -postgres (-selinux) {-test}"
>>
>>
>>> What do you have in "ls -al /etc/munin/plugins/" ?
>
>
> It's as I thought - your data is accurate but rrd has been given a
> completely wrong method to derive the graphs.
>
> Munin graphs for section "Network" do not have to be in a file called
> "network" - it's just a category and the plugin defines what web-page
> section it must be in. In your case, the relevant plugin is
> netstat_multi which doesn't often get installed. Its data source is
> "netstat -s" so grep that output for "timeout" to see it.
>
> Timeouts are cumulative counters, they do not get less till they wrap
> around. So to scale them, the plugin gets the rrd file to subtract
> previous reading from current reading and divide by the time interval to
> get the timeouts/sec. This is all done inside rrd when the data files
> are updated (it's quite a lot of magic)
>
> That plugin sets the graph type to DERIVE
> (/etc/munin/plugins/netstat_multi around line 190). I feel it should be
> GAUGE or COUNTER.
>
> The proper reference on rrd is
> http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html
> and the munin docs are
> https://munin.readthedocs.org/en/latest/index.html
>
> You must edit the plugin file and IIRC recreate the rrd, you will lose
> all past info (can't be helped).
>
>
> [snip ls output]
>
>
>> P.S. Any other good plugins you'd recommend?
>
> http://gallery.munin-monitoring.org/
>
> Monitoring is highly site-specific so recommendations aren't usually
> worth much, but that gallery has LOTS of contributed plugins
>

-- 
Alan McKinnon
[email protected]
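
P.S. For anyone following along, the "subtract previous reading from
current reading and divide by the time interval" step quoted above can be
sketched in a few lines of Python. This is only an illustration of the
arithmetic, not what rrd or the netstat_multi plugin actually runs: the
function name, the sample readings and the choice to report a shrinking
counter as None are all assumptions made up for this sketch.

```python
from typing import List, Optional, Tuple

# (unix timestamp in seconds, cumulative counter value); hypothetical type alias
Sample = Tuple[float, int]


def per_second_rates(samples: List[Sample]) -> List[Optional[float]]:
    """Turn cumulative counter readings into per-second rates.

    Each rate is (current - previous) / (seconds elapsed). A negative
    difference (for example the counter restarting from zero after a
    reboot) is reported as None instead of a negative rate. That is a
    choice made for this sketch, not a description of rrd internals.
    """
    rates: List[Optional[float]] = []
    for (t_prev, v_prev), (t_cur, v_cur) in zip(samples, samples[1:]):
        elapsed = t_cur - t_prev
        delta = v_cur - v_prev
        if elapsed <= 0 or delta < 0:
            rates.append(None)
        else:
            rates.append(delta / elapsed)
    return rates


# Made-up readings taken 300 seconds apart; the last one follows a reboot.
readings = [(0, 1000), (300, 1150), (600, 1225), (900, 12)]
print(per_second_rates(readings))
```

With a steady event rate the output stays flat even though the raw
counter keeps climbing, which is why a graph that rises steadily for a
month points at how the values are being derived rather than at the
counter itself.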

