Package: mrtg
Version: 2.16.2-3
Severity: important

With lenny, if MRTG cannot poll a device (because the connection to the
device is down), mrtg child processes pile up, running in parallel for a long
time before they time out, and leave the .log files corrupt.

Errors like the following get logged now and then:

SNMP Error:
no response received
SNMPv1_Session (remote host: "xxxx" [10.240.49.6].161)
                  community: "public"
                 request ID: 177908811
                PDU bufsize: 8000 bytes
                    timeout: 2s
                    retries: 5
                    backoff: 1)
 at /usr/share/perl5/SNMP_util.pm line 492

SNMPGET Problem for 1.3.6.1.4.1.9.2.1.57.0 1.3.6.1.4.1.9.2.1.58.0 sysUptime 
sysName on pub...@xxxx::::::v4only
 at /usr/bin/mrtg line 2207
2010-10-20 04:36:02: WARNING: skipping because at least the query for 
1.3.6.1.4.1.9.2.1.57.0 on  zicke.im.wurst-wasser.net did not su
2010-10-20 04:36:02: WARNING: no data for 
1.3.6.1.4.1.9.2.1.57&1.3.6.1.4.1.9.2.1.58:pub...@xxxx. Skipping furthe
2010-10-19 05:46:02: WARNING: could not match&get xxxx/ifInOctets for Descr 
'Dialer1'

The more interesting part is that the log files get corrupted, and not
necessarily those of the unreachable devices.

The following messages appear:

2010-10-20 08:16:02: WARNING: could not match&get xxxx/ifInOctets for Descr 
'eth0'
2010-10-20 09:25:10, Rateup WARNING: /usr/bin/rateup Can't remove xxxx.1.old 
updating log file
2010-10-20 09:25:10, Rateup WARNING: /usr/bin/rateup Can't rename xxxx.1.log to 
leela.1.old updating log file
2010-10-20 09:25:10, Rateup WARNING: /usr/bin/rateup Can't rename xxxx.1.tmp to 
leela.1.log updating log file
2010-10-20 09:25:10, Rateup WARNING: /usr/bin/rateup Can't rename xxxx.1.tmp to 
leela.1.log updating log file
2010-10-20 09:25:11, Rateup WARNING: /usr/bin/rateup could not read the primary 
log file for xxxx.1
2010-10-20 09:25:11, Rateup ERROR: /usr/bin/rateup found that xxxx.1's log file 
time of 1287551163 was greater than now (1287550862)
ERROR: Let's not do the time warp, again. Logfile unchanged.
2010-10-20 09:25:11, Rateup WARNING: /usr/bin/rateup could not read the primary 
log file for xxxx.1
2010-10-20 09:25:11, Rateup WARNING: /usr/bin/rateup Can't rename xxxx.1.log to 
leela.1.old updating log file
2010-10-20 09:25:11, Rateup WARNING: /usr/bin/rateup Can't remove xxxx.1.old 
updating log file
2010-10-20 09:25:12, Rateup ERROR: /usr/bin/rateup found that xxxx.1's log file 
time of 1287553563 was greater than now (1287551763)
ERROR: Let's not do the time warp, again. Logfile unchanged.
2010-10-20 09:25:12, Rateup ERROR: /usr/bin/rateup found that xxxx.1's log file 
time of 1287553563 was greater than now (1287552665)
ERROR: Let's not do the time warp, again. Logfile unchanged.
2010-10-20 09:25:12, Rateup ERROR: /usr/bin/rateup found that xxxx.1's log file 
time of 1287553862 was greater than now (1287546362)
ERROR: Let's not do the time warp, again. Logfile unchanged.
2010-10-20 09:25:13, Rateup ERROR: /usr/bin/rateup found that xxxx.1's log file 
time of 1287553862 was greater than now (1287553263)
ERROR: Let's not do the time warp, again. Logfile unchanged.
2010-10-20 09:25:13, Rateup WARNING: /usr/bin/rateup Can't remove xxxx.2.old 
updating log file
2010-10-20 09:25:13, Rateup ERROR: /usr/bin/rateup found that xxxx.1's log file 
time of 1287553862 was greater than now (1287547562)
ERROR: Let's not do the time warp, again. Logfile unchanged.
2010-10-20 09:25:14, Rateup ERROR: /usr/bin/rateup found that xxxx.2's log file 
time of 1287551163 was greater than now (1287546362)
ERROR: Let's not do the time warp, again. Logfile unchanged.
2010-10-20 09:25:14, Rateup ERROR: /usr/bin/rateup found that xxxx.1's log file 
time of 1287553862 was greater than now (1287549963)
ERROR: Let's not do the time warp, again. Logfile unchanged.
2010-10-20 09:25:13, Rateup WARNING: /usr/bin/rateup Can't rename xxxx.2.tmp to 
leela.2.log updating log file
2010-10-20 09:25:13, Rateup WARNING: /usr/bin/rateup Can't rename xxxx.2.log to 
leela.2.old updating log file
2010-10-20 09:25:13, Rateup WARNING: /usr/bin/rateup Can't rename xxxx.2.tmp to 
leela.2.log updating log file
2010-10-20 09:25:13, Rateup WARNING: /usr/bin/rateup Can't remove xxxx.2.old 
updating log file
2010-10-20 09:25:13, Rateup WARNING: /usr/bin/rateup Can't rename xxxx.2.log to 
leela.2.old updating log file
2010-10-20 09:25:13, Rateup WARNING: /usr/bin/rateup Can't rename xxxx.2.tmp to 
leela.2.log updating log file
2010-10-20 09:25:14, Rateup WARNING: /usr/bin/rateup could not read the primary 
log file for xxxx.2
2010-10-20 09:25:14, Rateup WARNING: /usr/bin/rateup The backup log file for 
xxxx.2 was invalid as well
2010-10-20 09:25:13, ERROR: StepTime does not match Avc      900. Please Report 
this.
2010-10-20 09:25:13, ERROR: StepTime does not match Avc     -600. Please Report 
this.
2010-10-20 09:25:13, ERROR: StepTime does not match Avc    -2100. Please Report 
this.
2010-10-20 09:25:13, ERROR: StepTime does not match Avc    -3600. Please Report 
this.
2010-10-20 09:25:13, ERROR: StepTime does not match Avc    -5100. Please Report 
this.
2010-10-20 09:25:13, ERROR: StepTime does not match Avc    -6600. Please Report 
this.
2010-10-20 09:25:13, ERROR: StepTime does not match Avc    -8100. Please Report 
this.
2010-10-20 09:25:13, ERROR: StepTime does not match Avc    -9600. Please Report 
this.
2010-10-20 09:25:13, ERROR: StepTime does not match Avc   -11100. Please Report 
this.

And so on.

Config snippet:

------------------8<-----------------

# Global configuration
WorkDir: /var/www/stats
Refresh: 300
Interval: 5
WriteExpires: No
Forks: 4

Target[xxxx.1]: \Dialer1:pub...@xxxx
MaxBytes[xxxx.1]: 125000000
Title[xxxx.1]: Stats: eth0 (1000MBit Strang)
PageTop[xxxx.1]: <h2>Traffic Analysis for Port Dialer1</h2><p>

------------------8<-----------------

The only fix so far is to delete both the corrupted .log and .old files and
lose all historical data, or to restore the files from the last known good
backup and accept a hole in the data from backup time to restore time.

It would be vital to lock the .log files somehow, so that writes are
serialized under all circumstances. mrtg runs as root (why?) and can therefore
write to /var/www/stats.
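
To illustrate the kind of serialization meant here, a minimal sketch in
Python, assuming an advisory flock() on a side-car lock file is acceptable.
This is not how rateup is implemented; the names locked() and update_log()
and the ".lock" suffix are made up for the example. Only the
.tmp/.log/.old rotation mirrors what the rateup warnings above mention.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked(path):
    """Hold an exclusive advisory lock on <path>.lock (hypothetical name)."""
    fd = os.open(path + ".lock", os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until any other holder releases
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def update_log(path, new_contents):
    """Write <base>.tmp, rotate <base>.log to <base>.old, install the new log.

    All three steps happen under one lock, so two writers cannot interleave
    their renames.
    """
    base, _ = os.path.splitext(path)      # "xxxx.1.log" -> "xxxx.1"
    tmp, old = base + ".tmp", base + ".old"
    with locked(path):
        with open(tmp, "w") as f:
            f.write(new_contents)
            f.flush()
            os.fsync(f.fileno())          # make sure the data is on disk first
        if os.path.exists(path):
            os.replace(path, old)         # keep the previous log as backup
        os.replace(tmp, path)             # atomic rename into place
```

With something like this, a second writer for the same target would simply
wait at flock() until the first one has finished its renames, instead of
finding the .tmp or .old file already gone.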

I remember that in sarge and etch mrtg logged a message about another mrtg
instance already running and then exited. In lenny I no longer see this.

I expect this error to happen as well whenever mrtg needs longer than
5 minutes (the configured Interval) to collect its data.
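
The "another instance is running" behaviour could be approximated by a
wrapper that takes a non-blocking lock before starting a run. The following
Python sketch is only an assumption about how such a guard might look, not
mrtg's own mechanism; try_instance_lock() and the lock path are hypothetical.

```python
import fcntl
import os


def try_instance_lock(lock_path):
    """Return an fd holding the lock, or None if another run still holds it."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # LOCK_NB makes flock() fail immediately instead of waiting
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd  # keep this fd open for the whole run; closing it drops the lock
```

A cron wrapper would call try_instance_lock() first and skip the run when it
returns None, so a poll that overruns its 5-minute slot is never joined by a
second one writing to the same .log files.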


-- System Information:
Debian Release: 5.0.6
  APT prefers proposed-updates
  APT policy: (500, 'proposed-updates'), (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.26-2-686 (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/bash

Versions of packages mrtg depends on:
ii  debconf [debcon 1.5.24                   Debian configuration management sy
ii  libc6           2.7-18lenny4             GNU C Library: Shared libraries
ii  libgd2-xpm      2.0.36~rc1~dfsg-3+lenny1 GD Graphics Library version 2
ii  libpng12-0      1.2.27-2+lenny4          PNG library - runtime
ii  libsnmp-session 1.12-1                   Perl support for accessing SNMP-aw
ii  perl            5.10.0-19lenny2          Larry Wall's Practical Extraction 
ii  perl-modules    5.10.0-19lenny2          Core Perl modules
ii  zlib1g          1:1.2.3.3.dfsg-12        compression library - runtime

mrtg recommends no packages.

Versions of packages mrtg suggests:
ii  apache2-mpm-prefork [htt 2.2.9-10+lenny8 Apache HTTP Server - traditional n
ii  elinks [www-browser]     0.11.4-3        advanced text-mode WWW browser
ii  links [www-browser]      2.1pre37-1.1    Web browser running in text mode
ii  lynx-cur [www-browser]   2.8.7dev9-2.1   Text-mode WWW Browser with NLS sup
pn  mrtg-contrib             <none>          (no description available)
ii  netsurf [www-browser]    1.2-1           Small portable web browser with CS

-- debconf information excluded


