Stephen,

On Dec 17, 2010, at 7:02 PM, Stephen John Smoogen wrote:

> On Thu, Dec 16, 2010 at 14:28, Ken Schumacher <[email protected]> wrote:
>> Greetings,
>> 
>> I have a repeatable problem on at least one of our SLF 4.4 systems.  Running 
>> commands like 'yum --check-update' seems to trigger some sort of memory 
>> leak.  The yum output gets as far as saying "Reading repository metadata in 
>> from local files", and a top listing in another window shows the memory use 
>> simply climbing.  The original window will not respond to a Ctrl-C.
> 
> 1) Various versions of yum do not respond to Ctrl-C, because interrupting
> yum mid-operation can leave the rpm package database in a bad state.

That's inconvenient in my current situation, but I understand the thinking 
behind it.  I can work around this by having a second window open allowing me 
to 'kill -15' the yum process once it gets into this bad state.
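For the record, the second-window workaround looks something like this (just a 
sketch; the only assumption is that the wedged process is literally named 
'yum'):

```shell
# From a second terminal: find the wedged yum process and send it
# SIGTERM (kill -15), since Ctrl-C in the original window is ignored.
pid=$(pgrep -x yum | head -n 1)
if [ -n "$pid" ]; then
    kill -15 "$pid"
fi
```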

> 2) Yum will use a lot of memory depending on how much is installed. Of
> course a lot is subjective and needs to be quantified. [100 mb was a
> lot on one system and nothing on another.]

I wait about 60 CPU seconds before killing the yum process.  According to 
'top', at that point it is using 100% of one CPU and it has already allocated 
itself 2 GB of memory.  On this cluster head node, that is just a bit over 10% 
of the node's memory, but I am concerned about letting it go on consuming 
memory for fear of interfering with other services on the node.
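To quantify the growth instead of eyeballing it in top, the resident set size 
can be sampled once per second; again a sketch, assuming the process is named 
'yum':

```shell
# Log the resident memory (RSS, in KB) of the running yum process once
# per second until it exits, so the growth curve can be recorded.
while pid=$(pgrep -x yum | head -n 1); [ -n "$pid" ]; do
    ps -o rss= -p "$pid"
    sleep 1
done
```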

I have checked the version of the yum and yum.conf RPMs on this node and 
compared to other systems we maintain.  We have other systems running those 
same versions without this memory consumption problem.  I have run yum using 
the '-d 5' flag to get some verbose debug output.  The last output before this 
memory consumption starts says:

   Reading repository metadata in from local files
   Setting up Package Sacks
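For anyone comparing notes, the version check was nothing exotic; something 
like the following, where 'yum-conf' is assumed to be the stock Scientific 
Linux config package name (adjust if yours differs):

```shell
# Record the exact yum and yum-conf package versions on this node so
# they can be compared against a node that does not show the problem.
# Guarded in case rpm is not available where this is run.
if command -v rpm >/dev/null 2>&1; then
    rpm -q yum yum-conf || true
fi
```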

> 3) 4.4 is really old. 4.8 is standard now and 4.9 will be out the door
> by summer (it will probably also be the last of the 4.x series, just as
> 3.9 was the last of the 3.x series.)

The node was originally installed with the LTS 4.4 release (Wilson).  Until 
recently, we had been running daily yum updates against the node, so all the 
necessary errata and security updates have been applied.  Because this is a 
cluster head node, we can't jump it up to a 5.x release without proper 
planning and scheduling of downtime, etc.  Our user base expects the release 
to remain stable, so such upgrades are carefully considered.

> 
>> We have had to disable the cron.daily yum update on the nodes because it 
>> was causing problems every night when it ran.  FWIW, I did try running a 
>> 'yum clean all' command.  That runs fine, but the next attempt to run a 
>> check-update suffers the same memory issue.
>> 
>> I've searched through the linux-users and scientific-linux-users archives 
>> and have not found anything like this reported already.  Has anyone seen 
>> this?

I appreciate the comments, but I still have not been able to determine the core 
problem here.  I am unable to run 'yum update' on these nodes until I figure 
out what is causing this "memory leak" problem.  Any other suggestions would be 
appreciated.  

My next step will be to compare the full list of RPMs installed on the nodes 
having this problem against similar nodes where I am not seeing the problem.
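That comparison can be scripted; a minimal sketch (the goodnode/badnode 
filenames are placeholders for the actual hostnames):

```shell
# On each node, dump a sorted list of installed packages to a file
# named after the host, guarded in case rpm is not on the PATH here.
if command -v rpm >/dev/null 2>&1; then
    rpm -qa | sort > "/tmp/rpms.$(hostname -s)"
fi
# After copying both lists to one machine, the differences are just:
#   diff /tmp/rpms.goodnode /tmp/rpms.badnode
```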

Ken S. 

> 
> 
> 
> -- 
> Stephen J Smoogen.
> "The core skill of innovators is error recovery, not failure avoidance."
> Randy Nelson, President of Pixar University.
> "Let us be kind, one to another, for most of us are fighting a hard
> battle." -- Ian MacLaren

==============================================================
Ken Schumacher  <[email protected]>  (o) 630-840-4579 (f) 630-840-3109
Computing Div/HPC  LQCD Group   Loc: WH8E   http://www.usqcd.org/fnal/
Fermi National Accelerator Lab; PO Box 500 MS 120 Batavia, IL 60510-0500
