Re: Swap oddities

Marcy Cortes Mon, 29 Oct 2007 21:10:13 -0800

 
Thanks Vic!

JVM heap size and garbage collection seem to be under control.  Believe
me, this is well looked at by both us and IBM's finest, since it is a
huge app (15 IFLs at peak) :)... We've had our share of those problems
and badly written code there before...  The generational GC, new with
6.1, seems to make a *phenomenal* difference (and I don't say that
lightly, never having believed that perf knobs at the app level save
you much of anything at all), saving 13% in CPU and shaving 100ms off of
response times (on a 500ms response time transaction).  Unbelievable...
So I gotta believe they are close to as good as it gets for your average
programmer... Course there's always the outlier/different transaction
that could be coming in and gumming up the whole system...  And of
course WAS support says all of their leaks are fixed now :)) (and there
are some significant ones there, apparently, in fixpacks less than the
current 11 if you study the WAS 6. support site :) )..  They are saying
this is a "native memory" leak, not in the JVM heap, so the tracing that
is needed is totally invasive and therefore nearly impossible in our
env.  And that there is a possibility that it will *stabilize* at maybe
3-5 Gig, thus telling us what the virtual memory size should be....
(hard to believe when it was so happy, even overcommitted and probably
needing 1.3G, on WAS 5/sles8 at 1.5 Gig, but you know 64bit is
bigger :) )  We're leaving some up with larger swap sizes now, pending
the "stabilization" or near crash, whichever comes first.

Swapoff does appear to cause some long pauses, so can't do that in
production :(  Can't afford to lose even 1 second because that results
in ATMs not reaching back end systems...  We recycle weekly anyway for
DR reasons...so for now... We just need to make it 7 days without loss
of response time.
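
A rough way to judge whether a guest will make it through the 7 days is
to extrapolate swap growth linearly from two readings.  The sketch below
is only an illustration: the function name and all of the numbers are
made up, and a leak that really does "stabilize" would not grow in a
straight line.

```python
def hours_until_full(samples, swap_total_kb):
    """Estimate hours until swap is exhausted, assuming linear growth.

    samples: list of (hours_since_start, swap_used_kb), oldest first.
    Returns None when usage is flat or shrinking.
    """
    (t0, u0), (t1, u1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    rate_kb_per_hour = (u1 - u0) / (t1 - t0)
    if rate_kb_per_hour <= 0:
        return None  # stable or shrinking: no exhaustion predicted
    return (swap_total_kb - u1) / rate_kb_per_hour


# Hypothetical readings: 100000 kB used at start, 340000 kB a day later,
# on a guest with 2000000 kB of swap in total.
HYPOTHETICAL_SAMPLES = [(0, 100000), (24, 340000)]
WEEK_HOURS = 7 * 24

remaining = hours_until_full(HYPOTHETICAL_SAMPLES, 2000000)
survives_week = remaining is None or 24 + remaining >= WEEK_HOURS
```

With these invented numbers the growth rate is 10000 kB per hour, so the
remaining 1660000 kB lasts 166 hours after the second reading, which
comfortably covers the rest of the week.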

It probably does make more sense to keep adding vdisks & vm paging
volumes rather than dedicated disk for swapping.  At least they'll all
share that way ( clustered app with a few servers on each lpar)...

Now on the other hand... in our test environment, with probably 35 out
of 100 running WAS6, it becomes not an aberration but the norm for the
load we have there, unfortunately.  Luckily the paging system is so
robust (I think we hit 20K per sec to DASD in today's Monday morning
fun).  More experimentation is definitely needed there!

Marcy Cortes

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of
Vic Cross
Sent: Monday, October 29, 2007 7:29 PM
To: [EMAIL PROTECTED]
Subject: Re: [LINUX-390] Swap oddities

On Tue, 30 Oct 2007 06:19:33 am Marcy Cortes wrote:
> I'm not sure it is working as designed.

I never said it was a good design -- and perhaps I should have read your
earlier messages prior to saying that. :)  It does depend on your point
of view though -- it's another one of these aspects that betrays Linux's
single-system non-resource-sharing heritage.  In a non-shared
environment, keeping swap pages hanging around on disk is a good design
point in that it can realistically save costly I/O.  It's not so good
for us though.  :)

> Eventually, when we use up our
> swap, WAS crashes OOM (that's *our* real issue, at least our biggest 
> one anyway :).

Yes... and that's not going to be solved by CMM or creating different
swap VDISKs or anything like that.  The earlier hints about JVM heap
size and garbage collection and so on will be useful here.  I guess the
application is being checked for leaks as well -- or do your developers
write perfect code first-time-every-time too? ;-P

> But if we are able to swapoff/swapon and recover that space without 
> crashing WAS that kinda says to me that it didn't need it anyway - 
> course I haven't tried that whilst workload was running through...  
> Maybe it is destructive.

It might be, but as long as your Linux has more free virtual memory than
the amount of pages in use on the device you want to remove, you
*should* be able to do a swapoff without impact (things might get a
little sluggish for a few seconds while kswapd shuffles things around
though).  It would be nice to be able to tell accurately just how much
swap space is being used on a device -- /proc/meminfo is system-wide.
SwapCached in /proc/meminfo is a helpful indicator that counts the swap
space "hanging around" (you could try http://www.linuxweblog.com/meminfo
among heaps of other places for more info about what the numbers from
meminfo mean); if this number is low compared to your total available
swap then you're not likely to get much benefit from swapoff/swapon
cycles.
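
The rule of thumb above can be sketched in Python.  This is only an
illustration, not a tested tool: the device names and all numbers are
invented, and the helper names are hypothetical.  It assumes the usual
"Name: value kB" layout of /proc/meminfo and the per-device Size and
Used columns that /proc/swaps reports.  MemFree is used as a
conservative stand-in for free memory, since it ignores reclaimable
cache.

```python
def parse_meminfo(text):
    """Return a dict of /proc/meminfo fields (values in kB where given)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def parse_swaps(text):
    """Return {device: (size_kb, used_kb)} from /proc/swaps content."""
    devices = {}
    for line in text.splitlines()[1:]:  # first line is the column header
        cols = line.split()
        if len(cols) >= 5:
            devices[cols[0]] = (int(cols[2]), int(cols[3]))
    return devices


def swapoff_headroom_kb(meminfo, swaps, device):
    """Free memory plus free swap on the other devices, minus the pages
    currently in use on `device`.  Negative means swapoff looks risky."""
    size_kb, used_kb = swaps[device]
    free_on_device = size_kb - used_kb
    free_elsewhere = meminfo["MemFree"] + meminfo["SwapFree"] - free_on_device
    return free_elsewhere - used_kb


# Invented sample data; device names are hypothetical.
SAMPLE_MEMINFO = """MemTotal:  1536000 kB
MemFree:     20000 kB
SwapTotal:  500000 kB
SwapFree:   300000 kB
SwapCached:  15000 kB
"""
SAMPLE_SWAPS = """Filename Type Size Used Priority
/dev/dasdb1 partition 250000 200000 10
/dev/dasdc1 partition 250000 0 5
"""

meminfo = parse_meminfo(SAMPLE_MEMINFO)
swaps = parse_swaps(SAMPLE_SWAPS)
headroom = swapoff_headroom_kb(meminfo, swaps, "/dev/dasdb1")
```

On a live system the two texts would come from reading /proc/meminfo and
/proc/swaps.  A positive headroom only says the pages have somewhere to
go; it says nothing about how long the shuffle will take.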

> We plan to experiment some with the vm.swappiness and see if that
> helps.
> I guess in the very least, we can add enough vdisks and enough VM 
> paging packs to get through the week without a recycle until we figure 
> this out as long as response time & cpu savings remain this good with
> 6.1.

Good plan, although vm.swappiness is only likely to delay your swap
usage rather than eliminate it entirely (if something is asking for that
much memory, at some point it's going to have to get it from somewhere).
Of course if it delays heavy swapping long enough to get you through the
week then that's a win.
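
For experimenting with the setting, a minimal Python sketch follows.  It
assumes the standard /proc/sys/vm/swappiness file and the classic 0-100
range (newer kernels accept larger values); the function names are
hypothetical, and writing the real file needs root.  A change made this
way does not survive a reboot.

```python
from pathlib import Path

# Standard location of the tunable; the path argument below exists so
# the helpers can be tried against a scratch file first.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value as an int."""
    return int(Path(path).read_text().strip())


def set_swappiness(value, path=SWAPPINESS_PATH):
    """Write a new vm.swappiness value (assumed range 0-100)."""
    if not 0 <= value <= 100:
        raise ValueError("swappiness is assumed to be in the range 0-100")
    Path(path).write_text(f"{value}\n")
```

Lower values make the kernel less eager to swap out application pages,
which fits the point above: it postpones swap use but cannot remove the
demand for memory.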

While you've got this WAS issue you are *possibly* justified in throwing
a DASD swap device at the end of your line of VDISKs (I emphasise
possibly because I don't want to offend Rob et al too much).  Perhaps
the last thing you want would be to just keep adding VDISKs and VM page
packs until your VM paging system is consumed by leaked Linux memory.
You could do a nightly swapoff/swapon of some of the VDISKs to flush
things out and reduce the activity to the DASD swap.  I guess what I'm
saying is that you could think about this WAS problem as an aberration
rather than the normal operating mode for your system -- don't
jeopardise your entire environment for the sake of one problem system,
and be prepared to let best-practice slide a bit while you get the issue
sorted.  Of course you're in a much better position than me to decide if
your paging environment needs such protection.

I also transposed my client's problem onto your shop -- I thought you
were concerned about the number of pages allocated to VDISKs.  That's
why I mentioned the stuff about DELETE/DEFINE of your VDISK swaps.

Best of luck with the issue!

Cheerio,
Vic Cross

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions, send
email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or
visit http://www.marist.edu/htbin/wlvindex?LINUX-390
