> Right! Now we're getting somewhere... The amount of CPU time spent in
> GC depends on the amount of garbage etc., but the CPU time required for
> it does not get less when a parallel GC spreads the effort over
> multiple virtual CPUs (deliberate understatement - adding communication
> between threads will increase cost). And if you engage more virtual
> CPUs in the GC processing, those virtual CPUs will not run application
> threads at that time. So even when GC does not block, it then sucks out
> all CPU and you still don't run...
>
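
The arithmetic behind that point can be sketched in a few lines of Python. This is only a toy model, not a measurement of any JVM: it assumes a fixed amount of GC work and a made-up, linear coordination cost for every extra GC thread. The function name `parallel_gc_cost` and all the numbers are hypothetical.

```python
def parallel_gc_cost(gc_work_cpu_s, gc_threads, sync_overhead_s_per_thread=0.0):
    """Toy model: spread a fixed amount of GC work over several threads.

    Returns (total_cpu_s, elapsed_s). The linear overhead per extra thread
    is an assumption made for illustration only.
    """
    # Each additional thread adds some coordination cost (assumed linear).
    overhead_s = sync_overhead_s_per_thread * (gc_threads - 1)
    # Total CPU consumed never drops below the original amount of work.
    total_cpu_s = gc_work_cpu_s + overhead_s
    # Elapsed time shrinks only if every thread really gets a CPU.
    elapsed_s = total_cpu_s / gc_threads
    return total_cpu_s, elapsed_s


for n in (1, 2, 4):
    cpu_s, elapsed_s = parallel_gc_cost(2.0, n, sync_overhead_s_per_thread=0.1)
    print(f"{n} GC thread(s): CPU {cpu_s:.2f}s, elapsed {elapsed_s:.2f}s")
```

In this model the elapsed time of a collection goes down as threads are added, but the CPU consumed goes up, and those CPU seconds are not available to application threads.
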
> I know I simplified GC by saying that "all stops during GC" since
> there's clearly parts of the process that can be done while shop is
> open. I would expect the parallel GC threads to run with low priority
> so that they take over once the application thread has blocked itself.
> Such a setup would not benefit from more virtual CPUs than can run at
> the same time.

Yes, too simplified, and prone to cause some undue grief. Things you must know when using GC (remember, this is not z/OS RSM or VSM we are relating to here):

1) Throughput is the percentage of total time not spent in garbage collection, considered over long periods of time.
2) Pauses are the times when an application appears unresponsive because garbage collection is occurring. On web servers, pauses during garbage collection may be tolerable, or simply obscured by network latencies. But in interactive applications, even short pauses are negative.
3) Footprint is the working set of a process, measured in pages and cache lines.
4) Promptness is the time between when an object becomes dead and when the memory becomes available.
5) Generations have the most likely impact on performance, whether you have 2, 3 or 4 real or virtual CPUs. You have young and old, small and large. The sizing of one generation does not affect the collection frequency and pause times for another generation. A very large young generation may maximize throughput, but does so at the expense of footprint, promptness, and pause times.

There is no one right way to size generations. The best choice is determined by the way the application uses memory as well as by user requirements. For this reason the JVM's choice of garbage collector is not always optimal, and may be overridden by the user with command line options. I normally run a trace of the collector to see its effectiveness and judge it against response and user experience, tweaking as we go along.

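To make definitions 1) and 2) concrete, here is a minimal Python sketch. It assumes you have already pulled the individual pause durations out of a collector trace by hand; the function name `gc_summary` and the sample numbers are made up for illustration and do not come from a real trace.

```python
def gc_summary(pause_durations_s, wall_clock_s):
    """Return (throughput_pct, longest_pause_s) for one observation period.

    pause_durations_s: GC pause lengths in seconds (hypothetical input,
    e.g. copied by hand from a collector trace).
    wall_clock_s: length of the observation period in seconds.
    """
    total_gc_s = sum(pause_durations_s)
    # Throughput: share of total time NOT spent in garbage collection.
    throughput_pct = 100.0 * (wall_clock_s - total_gc_s) / wall_clock_s
    # Pauses: the longest one is what an interactive user notices.
    longest_pause_s = max(pause_durations_s, default=0.0)
    return throughput_pct, longest_pause_s


# Made-up sample: four collections during a 60 second window.
throughput, longest = gc_summary([0.05, 0.20, 0.05, 1.20], wall_clock_s=60.0)
print(f"throughput {throughput:.1f}%, longest pause {longest:.2f}s")
```

With numbers like these the throughput looks healthy while a single pause is still longer than a second, which is why the two metrics have to be judged separately.
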
Just looking at CPU activity, blocking and threads will get you nowhere, since the JVM is its own entity and should be treated with respect. You have a lot riding on understanding it.

Richard (Gaz) Gasiorowski
Solution Architect
CSC
3170 Fairview Park Dr., Falls Church, VA 22042
845-889-8533|Work|845-392-7889 Cell|[email protected]|www.csc.com

From: Rob van der Heij <[email protected]>
To: [email protected]
Date: 04/26/2011 05:34 AM
Subject: Re: multipl cpu's

On Tue, Apr 26, 2011 at 12:34 AM, rodgerd <[email protected]> wrote:
> On Mon, 25 Apr 2011 22:53:21 +0200, Rob van der Heij
> <[email protected]> wrote:
>
>> If it is non-blocking, why would one be concerned about the elapsed
>> time of GC and what would be the interest of having multiple threads
>> working in parallel on GC. Or would you really have allocation rates
>> so high that a single thread could not keep with it? Or should we
>> conclude that current implementations don't deploy such algorithms?
>
> Irrespective of whether a given GC algorithm is blocking or not, I would
> be concerned about elapsed time since it's CPU spent on GC, not on
> application tasks. Whether the JVM logs GC elapsed times isn't a
> particularly useful metric as to whether pauses are happening or not.

Rob
--
Rob van der Heij
Velocity Software
http://www.velocitysoftware.com/

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
----------------------------------------------------------------------
For more information on Linux on System z, visit
http://wiki.linuxvm.org/
