OK. Alex suggested that I add some code to MRTG that writes the absolute poll values (that is, the actual counter value at the time of the poll) and the current timestamp to a file, and then try to correlate the spikes with the values in this new file.
To do so, I added the following code to the section of MRTG titled "update the RRD" (line 439 in my version, 2.9.12a; I'm using RRD):

    #Schuyler's debug edits...
    # Append one line per poll: human-readable time, epoch time, raw counters.
    open(SKY1, ">>/tmp/${router}-debug");
    print SKY1 scalar localtime($time);
    print SKY1 "\t$time\t$inlast\t$outlast\n";
    close(SKY1);
    #End Schuyler's edits...

(Note: I'm not a Perl expert - but hey it works GREAT and I didn't even have to use a hash! :) )

And sure enough, I got a hit tonight! It looks like MRTG is doing a poll and reporting the following data:

    ( Current Date               timestamp    inocts   outoctets )
    Thu Jun 28 17:26:31 2001     993763591    -1       -1

So, why the heck is it reporting a negative number??? I would think that if the poll is unsuccessful (for whatever reason - timeout?) then it would report a zero for the absolute value!

This error is pretty easy to trap (I'll let my mods run for a little while and see if the next spike is associated with a -1 absolute), but I guess my question is: how should we figure out what is causing the -1 poll? What happens if I force it to zero when I get a negative number? Will MRTG assume a counter wrap (bad), or will it know that there was a non-response (it happens with UDP, after all!) and that it should just zero out the average for that interval or (better) report the previous interval?

Anyway - I'll be glad to put this one to bed. Management was losing faith in ol' MRTG! :)

Schuyler

-----Original Message-----
From: Schuyler Bishop [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 27, 2001 3:58 PM
To: 'Alex van den Bogaerdt'
Cc: [EMAIL PROTECTED]
Subject: [mrtg] Re: Spikes in MRTG graphs and logs...

I haven't received any responses and I'm hoping this isn't because it's a bug someone hasn't pinned down yet. I'll help in finding out what's going on as much as I can. I'm going to put a script into the crontab that runs at the same time MRTG does and pings each device I'm monitoring, to see if there's a correlation between reachability (I'm monitoring across an upstream ISP's network) and these spikes.
Any other suggestions as to how I can track this one down?

Thanks!

Schuyler

-----Original Message-----
From: Schuyler Bishop [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 30, 2001 11:11 AM
To: 'Alex van den Bogaerdt'; Schuyler Bishop
Cc: [EMAIL PROTECTED]
Subject: [mrtg] Re: Spikes in MRTG graphs and logs...

All,

Note that I think I might have pinned down the difference. It seems that with the new version of the config, I'm using IP addresses as the identifier in MRTG, while in the old version I was using interface numbering. I've verified that interface numbering still produces spikes, even with the new version of MRTG.

Again, this seems to be related to using interface numbering instead of IP address numbering in cfgmaker. Excerpt from the two similar cfgmaker lines:

New with Interface numbering: (spikes still happen)

    ./cfgmaker --ifref=nr --ifdesc=alias --subdirs=HOSTNAME \
        --community=public --output=../cfgs/test1.cfg \
        --global workdir:/usr/local/apache/htdocs/mrtg/test \
        --global 'options[_]: growright,bits'

New with IP address numbering: (spikes don't happen)

    ./cfgmaker --ifref=ip --ifdesc=alias --subdirs=HOSTNAME \
        --community=public --output=../cfgs/test.cfg \
        --global workdir:/usr/local/apache/htdocs/mrtg/test \
        --global 'options[_]: growright,bits'

Schuyler

-----Original Message-----
From: Alex van den Bogaerdt [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 22, 2001 6:42 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: [mrtg] Re: Spikes in MRTG graphs and logs...

Schuyler Bishop wrote:
>
> Per request, here are the relevant codes with appropriate removals of
> private info. :) Note that the IPs have been changed to protect the
> innocent.

Sure. I even removed more and see just one real difference. The option line looks different but isn't (in this case). All other lines were just textual differences (if at all different).
> Old config: (2.8.12)
> Target[10.10.10.10.6]: 6:[EMAIL PROTECTED]

> New Config: (2.9.12a)
> Target[10.10.10.10_10.20.10.10]: /10.20.10.10:[EMAIL PROTECTED]:

What should happen is that MRTG browses the MIB for an interface with this IP address and uses the acquired interface number to get the statistics.

The question that comes to mind right away: is the interface number for the interface with IP address 10.20.10.10 indeed 6 at the moment? In other words, could it have changed over time (due to a reboot or so)? Even if the traffic looks more or less the same, it may be that you are monitoring a different interface.

--
   __________________________________________________________________
 / [EMAIL PROTECTED]                            [EMAIL PROTECTED]      \
|    work                                           private            |
| My employer is capable of speaking therefore I speak only for myself |
+----------------------------------------------------------------------+
| Technical questions sent directly to me will be nuked. Use the list. |
+----------------------------------------------------------------------+
| http://faq.mrtg.org/                                                 |
| http://rrdtool.eu.org --> tutorial                                   |
+----------------------------------------------------------------------+

--
Unsubscribe mailto:[EMAIL PROTECTED]
Archive     http://www.ee.ethz.ch/~slist/mrtg
FAQ         http://faq.mrtg.org
Homepage    http://www.mrtg.org
WebAdmin    http://www.ee.ethz.ch/~slist/lsg2.cgi

--
Unsubscribe mailto:[EMAIL PROTECTED]
Help        mailto:[EMAIL PROTECTED]
Archive     http://www.ee.ethz.ch/~slist/mrtg-developers
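
On the question raised in the thread about forcing a -1 poll result to zero: the following is a minimal sketch, in Python rather than MRTG's Perl, of why that could itself produce a spike. It is not MRTG's actual code. It assumes 32-bit octet counters, a generic "any decrease is a wrap" delta calculation, and the tab-separated debug line format written by the debug edits in this thread. All function names and the sample counter value of 1,000,000 are hypothetical.

```python
# Sketch only: a generic counter-delta calculation, NOT MRTG's implementation.
COUNTER_MAX = 2**32  # assumption: 32-bit SNMP octet counters


def parse_debug_line(line):
    """Split one debug line (date, epoch, in, out; tab-separated) into ints."""
    _date, ts, inlast, outlast = line.rstrip("\n").split("\t")
    return int(ts), int(inlast), int(outlast)


def naive_delta(prev, cur):
    """Counter difference that treats any decrease as a 32-bit wrap."""
    return cur - prev if cur >= prev else cur + COUNTER_MAX - prev


def guarded_rate(prev_sample, cur_sample):
    """Return octets/second, or None if either sample is a failed poll.

    Each sample is an (epoch_seconds, counter_value) pair. A negative
    counter value is treated as 'no answer' instead of as real data.
    """
    (t0, c0), (t1, c1) = prev_sample, cur_sample
    if c0 < 0 or c1 < 0 or t1 <= t0:
        return None
    return naive_delta(c0, c1) / (t1 - t0)


if __name__ == "__main__":
    ts, inlast, outlast = parse_debug_line(
        "Thu Jun 28 17:26:31 2001\t993763591\t-1\t-1\n"
    )
    prev = (ts - 300, 1_000_000)  # hypothetical previous poll, 5 minutes earlier

    # Forcing the failed poll to zero looks like a counter wrap:
    forced = naive_delta(prev[1], 0) / 300
    print("forced-to-zero rate:", forced, "octets/s")

    # Treating the negative value as unknown yields no data point at all:
    print("guarded rate:", guarded_rate(prev, (ts, inlast)))
```

Under these assumptions, zeroing the value turns one missed poll into an apparent transfer of almost 2^32 octets in a single interval, while treating the negative value as unknown simply leaves a gap. Note that the interval after the failed poll is also unknown in this sketch, because its previous sample is the -1.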
