Linux on 390 Port <[email protected]> wrote on 08/31/2006 10:50:32 AM:

> On Thursday, 08/31/2006 at 10:16 EST, James Melin
> <[EMAIL PROTECTED]> wrote:
> > I'm trying to discover what might possibly be running at 4:00 AM on all my
> > penguins that might cause excessive cumulative VM paging (we're over
> > committed real by 12-18%) at one time.
> >
> > Obviously something appears to be configured on all systems to do the same
> > thing at the same time.
>
> > I see nothing that I can think of that would cause excessive VM paging, nor
> > do I see any after the fact evidence that my Linuxes were doing excessive
> > paging. I would expect to have seen massive swapping internally to Linux
> > before a VM paging event that would come close to exhausting the page pool
> > because of excessive real and xstor usage.
>
> If CP can hold all of the Linux guests' active pages in real memory,
> then Linux can swap furiously while CP is not paging at all.  But if there
> is not enough real memory to hold all the needed pages used by all of the
> active guests, plus all the memory that CP needs for himself, then CP will
> start to page.  Linux swapping increases as processes within Linux start
> to contend for memory *within Linux*.  The two aren't directly related.
>

The point I was trying to make is that I'm not seeing anything within the
various Linux guests that happens at 4:00 AM (that's being logged properly,
mind, or that's in the cron daily or cron hourly) that doesn't also happen at
various other times of day, or that should have a huge impact on memory usage.
The ntp process update is the only thing I can see in the logs that the guests
have in common in the 4:00-5:00 interval. I'll be setting up some snapshot
traps. Do you have any feel for what 9 instances of NTPD being launched to get
the system time skew corrected might do to overhead?

> The problem you experienced is *exactly* why having all the guests' cron
> jobs fire at the same time is a Bad Idea.  It's like opening a can of cat
> food in a room full of hungry cats.  They all come running towards you at
> the same time, but you have only one can of catfood.

Yes, I know it's a bad idea. We don't have things that are terribly
consequential firing from crontab. We moved the one workload in the crontab
that I thought was the obvious candidate for this (RMFPM Archive) to a spread
spectrum schedule, so it no longer starts at the same time on every guest, and
it didn't ameliorate the problem. So more sleuthing to do.
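
For anyone wanting to do the same kind of spreading, here is a minimal
sketch of one way to do it, not what we actually run. It assumes each guest
has a distinct hostname and that the job can tolerate starting anywhere
within a window. The function name stagger_seconds and the one-hour default
window are made up for illustration.

```python
import hashlib


def stagger_seconds(hostname: str, window_seconds: int = 3600) -> int:
    """Map a hostname to a stable start offset in [0, window_seconds).

    Hypothetical helper: hashing the hostname gives every guest the same
    offset on every run, while different guests usually land at different
    points in the window (collisions are possible but harmless).
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % window_seconds
```

A wrapper script would call stagger_seconds(socket.gethostname()) and sleep
that long before launching the real job. Hashing the name rather than
picking a random delay keeps each guest's start time repeatable, which makes
the logs easier to compare from one day to the next.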

This one is being non-obvious. I may just have to watch the system live some
morning to see what it is doing if I can't set snapshots and get anything
meaningful.
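
As a rough idea of what a Linux-side snapshot trap could look like, the
sketch below reads the pswpin and pswpout counters from /proc/vmstat and
reports how much they moved between two samples. The function names are
made up for illustration, and it assumes a kernel that exposes /proc/vmstat.
It only shows swapping inside a guest; as noted above, CP paging is a
separate matter and would have to be watched from the VM side.

```python
def parse_vmstat(text: str) -> dict:
    """Parse 'name value' lines (the /proc/vmstat layout) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def read_vmstat(path: str = "/proc/vmstat") -> dict:
    """Take one snapshot of the kernel's VM counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())


def swap_delta(before: dict, after: dict) -> dict:
    """Pages swapped in and out between two snapshots."""
    return {
        key: after.get(key, 0) - before.get(key, 0)
        for key in ("pswpin", "pswpout")
    }
```

Taking a read_vmstat() snapshot every minute from about 3:45 to 5:15 and
logging swap_delta() for each pair of neighbouring samples would show
whether any guest is swapping internally during the window.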

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
