Just doing some post-crash detective work on our web server and have
attributed it to a possible race condition combined with not enough swap
space. The symptoms were that memory usage and load went through the roof
(load average 116 vs 0.4 normally) and eventually the kernel started
killing off processes at random to fix the situation. Nice of sendmail to
report the load in the logs :-)

In /etc/cron.daily there's an 'rpm' job and an 'autorpm' job. autorpm is
not supplied with RedHat, but it's a standard add-on we typically use for
keeping systems up-to-date (instead of the official up2date utility, or
apt-rpm).

cron.daily kicks off at the usual 4am and starts the autorpm job, which has

# Start AutoRPM and tell it to wait up to 2 hours before actually
# looking for updates (backgrounds the process to avoid delaying
# other cron jobs)
/usr/sbin/autorpm --notty "auto --delay=7200" &

Now I looked in the code to see what they mean by 'up to 2 hours' and
yeah, it's effectively a random time between 0 and 2 hours.
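
To make that concrete, here is a rough shell sketch of what a delay of "up to 2 hours" amounts to. This is not autorpm's actual code; pick_delay is a hypothetical helper, and the optional second argument (a minimum wait) is my own addition. The point is that with no minimum, the delay can come out very short.

```shell
#!/bin/bash
# Hypothetical helper, not taken from autorpm.
# Usage: pick_delay MAX [MIN]
# Prints a delay in seconds with MIN <= delay < MAX (MIN defaults to 0).
# Bash's $RANDOM yields 0..32767, which covers a 7200 second span.
pick_delay() {
    local max=$1
    local min=${2:-0}
    echo $(( min + RANDOM % (max - min) ))
}

# Anywhere from 0 up to (but not including) 7200 seconds.
echo "no minimum:     $(pick_delay 7200)s"

# Same upper bound, but never sooner than 10 minutes.
echo "10 min minimum: $(pick_delay 7200 600)s"
```

A cron job would then do something like "sleep $(pick_delay 7200 600)" before touching the rpm database.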
Also kicked off at 4am is the rpm job, which has
rpm -qa --qf '%{name}-%{version}-%{release}.%{arch}.rpm\n' 2>&1 \
        | sort > /var/log/rpmpkgs

Now as you know, rpm doesn't like two things accessing its database at
the same time (at least not writing to it), which may possibly explain
this email:
***********************************************
AutoRPM 3.3 on <censored hostname> started Fri Feb 13 04:02:02 EST 2004
 
Delaying 4476 seconds...
Reading Auto-Ignore list... Done.
 
Comparing to locally installed RPMs
 
ERROR: rpm -qa is hanging!  Running rpm --rebuilddb to fix... 


I'm thinking maybe we'll just comment out the line in autorpm which calls
--rebuilddb and just deal with it manually if it ever occurs. Also, maybe
autorpm could check whether other things are accessing the rpm database
and/or delay a minimum of 10 minutes or so by default.
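
One way the "check if other things are accessing the rpm database" idea could be approximated from the outside is to have every cron job that touches rpm take the same lock first. A minimal sketch, assuming the flock(1) utility from util-linux is installed; the lock file path and the with_rpm_lock name are made up for illustration, and neither rpm nor autorpm knows about this lock, so it only helps if both cron scripts are wrapped.

```shell
#!/bin/bash
# Hypothetical lock file shared by the cron jobs that use the rpm database.
LOCKFILE=${LOCKFILE:-/var/lock/rpm-cron.lock}

# Usage: with_rpm_lock TIMEOUT COMMAND [ARGS...]
# Runs COMMAND while holding an exclusive lock on $LOCKFILE.
# If the lock cannot be taken within TIMEOUT seconds, flock gives up
# with a non-zero exit status and COMMAND is not run.
with_rpm_lock() {
    local timeout=$1
    shift
    flock -w "$timeout" "$LOCKFILE" "$@"
}

# Example for the package-list job (wait up to 5 minutes for the lock):
# with_rpm_lock 300 sh -c "rpm -qa | sort > /var/log/rpmpkgs"
#
# Example for the autorpm job:
# with_rpm_lock 300 /usr/sbin/autorpm --notty "auto --delay=7200"
```

Note that in the autorpm example the lock would be held for the whole run, including the random delay, so in practice the wait would be better done before taking the lock.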

-- 
---<GRiP>--- 
Grant Parnell - senior consultant
EverythingLinux services - the consultant's backup & tech support.
Web: http://www.everythinglinux.com.au/services      
We're also busybits.com.au and linuxhelp.com.au and elx.com.au.
Phone 02 8752 6622 to book service or discuss your needs.

-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
