I have seen this advice to avoid garbage collection in batch from IBMers
before. I don't understand it, and I am curious to know where it comes
from. I doubt it is endorsed by the JVM developers. I suspect it might
just be that in Java we can suddenly measure memory management overhead,
whereas in other languages it is much harder to measure.
Garbage collection is Java's way of returning unused memory for reuse.
You could reduce the memory management overhead of a batch C++ program
by removing all delete statements and increasing the virtual storage
available until it never ran out. You COULD, but no one would recommend
it as good practice. Overallocating the heap to avoid garbage collection
is basically the same thing.
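On the point about measurability: the JVM reports its own memory management costs directly, which C++ never did. Here is a minimal sketch using the standard GarbageCollectorMXBean API (collector names, counts, and times will vary by JVM and GC policy):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void main(String[] args) {
        // Create short-lived garbage so at least one collection is likely.
        for (int i = 0; i < 1_000_000; i++) {
            byte[] junk = new byte[1024];
        }
        // Each collector exposes a cumulative count and pause time.
        long totalCollections = 0;
        long totalPauseMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            totalCollections += gc.getCollectionCount();
            totalPauseMillis += gc.getCollectionTime();
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("Total: %d collections, %d ms%n",
                totalCollections, totalPauseMillis);
    }
}
```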
Applications tend to evolve and grow over time. If you deliberately set
up your application to avoid GC, you may be in for a rude shock when the
application grows and one day GC is triggered.
There can also be performance advantages from GC. GC moves objects
together in storage, making it much more likely that your application
data will be in the processor caches. If GC keeps your data in processor
cache it will perform much better than if it's scattered across a GB of
storage.
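The cache effect is easy to demonstrate in miniature. This is not GC compaction itself, just an analogous illustration of locality: summing the same values from a contiguous backing array versus pointer-chained nodes scattered through the heap. Timings vary by machine, so take the printed numbers as indicative only:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class Locality {
    static long sum(List<Integer> list) {
        long s = 0;
        for (int v : list) s += v;
        return s;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        List<Integer> contiguous = new ArrayList<>(n); // backing array: good locality
        List<Integer> scattered = new LinkedList<>();  // one node per element: poor locality
        for (int i = 0; i < n; i++) {
            contiguous.add(i);
            scattered.add(i);
        }
        long t0 = System.nanoTime();
        long s1 = sum(contiguous);
        long t1 = System.nanoTime();
        long s2 = sum(scattered);
        long t2 = System.nanoTime();
        System.out.printf("ArrayList  sum=%d in %d us%n", s1, (t1 - t0) / 1000);
        System.out.printf("LinkedList sum=%d in %d us%n", s2, (t2 - t1) / 1000);
    }
}
```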
On the other points:
Stress testing - memory management is usually an important factor in
application performance, so I'm not sure how valid any stress test that
avoided garbage collection would be (processor cache effects etc. as
much as GC overhead).
Memory leaks - this applies to any language - if the memory isn't
released, of course you need enough virtual storage to support it
between restarts.
Page outs - paging Java out is very different to paging out other
applications, due to the GC memory access pattern. Yes, any inactive
application will be subject to page out (maybe? I have seen some
information about page-fixed pages for Java - I don't know anything
about it though). What you don't want is portions of the heap paged out
from an active application. When Java performs a GC it is going to touch
every page in the heap* - so if you have 200MB paged out and an innocent
50 byte memory allocation triggers GC, it has to wait for 200MB of pages
to be paged in one by one before the allocation completes (assuming
Java/z/OS don't recognize and optimize this page-in pattern). This is
different to other languages, where pages are paged in one at a time as
required, and only if they hold active data.
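Rough arithmetic on that scenario, assuming a 4KB page size and a nominal 1ms per demand page-in (both are illustrative assumptions, not measured z/OS figures):

```java
public class PageInCost {
    public static void main(String[] args) {
        long pagedOutBytes = 200L * 1024 * 1024; // 200 MB of heap paged out
        long pageSize = 4096;                    // 4 KB pages (assumed)
        long pages = pagedOutBytes / pageSize;   // 51,200 demand page-ins
        double msPerPageIn = 1.0;                // assumed per-page latency
        // ~51 seconds of waiting, triggered by a 50 byte allocation
        System.out.printf("%d page-ins, roughly %.0f seconds before the allocation completes%n",
                pages, pages * msPerPageIn / 1000.0);
    }
}
```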
I'm not saying to economize on real storage - on the contrary. The
original poster asked about testing Java applications with a shortage of
real storage. My response is that the performance will probably be
unacceptable and it's not worth testing - just make sure you DO have
enough real storage for the application.
On this side track of heap size and garbage collection my advice is:
1) Do not fear garbage collection. It is part of a normal Java
application. It does need to be carefully tuned for response time
sensitive applications, but for these applications any paging of the
Java heap will likely be disastrous.
2) Do not allocate a heap so large that you risk paging instead of GC.
Paging is far worse than GC for Java performance.
n.b. the definition of "too large a heap" is a moving target. I would
say it is a heap with enough storage sitting inactive for long enough
that parts of it might be paged out. It would be unusual for a few
hundred MB in a normal batch job to be an issue.
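If you want to check what heap a job actually has - rather than what you think the JCL or JVM options set - the standard Runtime methods report it. A minimal sketch:

```java
public class HeapReport {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        // maxMemory() reflects the -Xmx ceiling; totalMemory() is what is
        // currently committed, which may still grow up to that ceiling.
        System.out.printf("Max heap (-Xmx):       %d MB%n", rt.maxMemory() / mb);
        System.out.printf("Committed heap:        %d MB%n", rt.totalMemory() / mb);
        System.out.printf("Free within committed: %d MB%n", rt.freeMemory() / mb);
    }
}
```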
Regards
Andrew Rowley
Black Hill Software
* My understanding of what happens. I'm happy to be corrected by someone
with more knowledge of GC strategies and internals.
On 6/08/2015 14:49, Timothy Sipples wrote:
I agree with Andrew Rowley's advice so long as it's properly understood to
be *general* advice -- "rules of thumb." There are some very interesting
exceptions. (Aren't there always? :-))
Regarding making the Java heap "too large," there are some use cases --
Java batch, notably -- where you really do want to make the heap "too
large," or at least slightly too large. If the JVM is transitory, and if
you can avoid any/all garbage collection during the transitory life of the
program, that might be a perfectly wonderful, optimal outcome. "It
depends." Another potential scenario is stress testing, perhaps during the
initial phases, when you're trying to understand the performance and
scalability characteristics of an application before allowing garbage
collection to "interfere" with your assessments. (Maybe you don't have the
best measurement tools?) Or you're simply trying to determine how much is
"too much," so you start with "too much" in your testing.
Maybe you have a defective application that's got a memory leak, and
garbage collection eventually cannot accomplish anything. The application
instance then abends. But to avoid restarting the application instance too
frequently you throw "too much" memory at the application instance(s) until
you and/or the vendor can fix the leak. (Been there.) (It depends on your
point of view what "too much" means in these cases. Theoretically such a
defective application requires an infinite amount of heap, so it can never
have "too much.")
There are situations when it can be perfectly reasonable to page out.
Examples: development and test environments, and cloned execution instances
when you don't need all the clones running but would like to have some
paged in as demand warrants. Basically anything/everything that is highly
transient, with temporary and occasional demand, but you want to avoid full
startup. It's really "thrashing" that you want to avoid. Though paging
might be necessary to produce thrashing, it's not sufficient.
All that said, I see way too many cases of operators/sysprogs/managers
perversely trying to economize on memory, some perhaps remembering the
"good old days" when "Hello World!" required only a few bytes. For better
or worse, that hasn't been true for at least a couple decades. Suck it up
and spend the memory. :-)
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN