I will also mention that a lot of the time, when we benchmark object-heavy
algorithms in JRuby (which is most algorithms), the bottleneck is not
in GC as much as it is in allocation rates. Kirk Pepperdine ran some
numbers for me a few months back and calculated that for numeric
benchmarks we were basically saturating the memory pipeline to
*acquire* objects, even though GC times were negligible. It sounds
like you're cranking through a lot of data... could you simply be
asking too much of the memory subsystem? What sort of application is
it? Is it using
one of the object-heavy languages like JRuby or Clojure?
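
If you want a rough ceiling to compare against, a throwaway probe along
these lines (the class name and constants are made up for illustration)
measures how fast a single thread can acquire small objects while keeping
the live set tiny, so GC itself stays out of the way:

// AllocProbe.java -- illustrative single-threaded allocation-rate probe.
public class AllocProbe {
    public static void main(String[] args) {
        final int N = 50000000;          // 50M small allocations
        Object[] keep = new Object[64];  // small rolling live set
        long t0 = System.nanoTime();
        for (int i = 0; i < N; i++) {
            keep[i & 63] = new byte[32]; // ~48 bytes each with header/padding
        }
        double secs = (System.nanoTime() - t0) / 1e9;
        double mb = (N * 48.0) / (1024.0 * 1024.0);
        System.out.println(((byte[]) keep[0]).length); // keep the work live
        System.out.printf("~%.0f MB in %.1fs (~%.0f MB/s)%n", mb, secs, mb / secs);
    }
}

Comparing a number like that against the app's real allocation rate gives
a sense of how much headroom the memory pipeline has left.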

On Wed, May 5, 2010 at 1:28 PM, Charles Oliver Nutter
<[email protected]> wrote:
> OK, my next thought: is your young generation perhaps too big? You've
> probably already tried choking it down and letting GC run more often
> against a smaller young heap?
>
> If GC times for young gen are getting longer, something has to be
> changing. Finalizers? Weak/SoftReferences? You say you're not getting
> to the point of CMS running, but a large young gen can still take a
> long time to collect. Do less more often?
>
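
As an aside, HotSpot can break the cost of reference processing out per
collection, which would confirm or rule out the Finalizer/Reference theory
directly in the GC log (a sketch; flag availability can vary by build):

-XX:+PrintGCDetails        per-phase times for each minor collection
-XX:+PrintReferenceGC      time spent on Soft/Weak/Final/Phantom references

If those numbers grow in step with the young gen pause times, reference
processing is the culprit; if not, the extra time is elsewhere.
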
> On Wed, May 5, 2010 at 9:25 AM, Matt Fowles <[email protected]> wrote:
>> Charles~
>>
>> I settled on that after having run experiments varying the survivor
>> ratio and tenuring threshold.  In the end, I discovered that >99.9% of
>> the young garbage was collected under this setup, and each extra young
>> gen pass only reclaimed about 1% of the previously surviving objects.
>> So the trade-off just wasn't winning me anything.
>>
>> Matt
>>
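
For what it's worth, Matt's own numbers make the back-of-envelope case
clear.  Assuming roughly 4 GB/min of allocation, with >99.9% dying before
the first collection and each extra age reclaiming ~1% of survivors:

  promoted with threshold 0:    ~4 GB/min x 0.1%  =  ~4 MB/min
  saved per additional age:     ~1% of 4 MB/min   =  ~40 KB/min

Each extra tenuring age would save tens of KB/min of promotion while
re-copying the surviving megabytes through the survivor spaces on every
minor GC, so it is easy to see why raising the threshold didn't pay off.
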
>> On Tue, May 4, 2010 at 6:50 PM, Charles Oliver Nutter
>> <[email protected]> wrote:
>>> Why are you using -XX:MaxTenuringThreshold=0? That's basically forcing
>>> all objects that survive one collection to be promoted immediately,
>>> even if they just happen to be slightly longer-lived young objects.
>>> Using 0 seems like a bad idea.
>>>
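
To make the contrast concrete, a non-zero threshold would look something
like this (illustrative values, not a recommendation):

-XX:MaxTenuringThreshold=4       up to four minor GCs to die young
-XX:SurvivorRatio=8              survivor spaces sized to actually hold survivors
-XX:+PrintTenuringDistribution   log per-age survival so the setting can be checked

With threshold 0 (and SurvivorRatio=20000, i.e. essentially no survivor
space), everything that is live at its first minor GC goes straight to
the old generation.
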
>>> On Tue, May 4, 2010 at 4:46 PM, Matt Fowles <[email protected]> wrote:
>>>> All~
>>>>
>>>> I have a large app that produces ~4g of garbage every minute, and I am
>>>> trying to reduce the size of GC pause outliers.  About 99% of this data
>>>> dies young, but almost anything that survives one collection survives
>>>> for an indeterminately long time.  We are currently using the following
>>>> VM and options:
>>>>
>>>>
>>>> java version "1.6.0_16"
>>>> Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
>>>> Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
>>>>
>>>> -verbose:gc
>>>> -Xms32g -Xmx32g -Xmn4g
>>>> -XX:+UseParNewGC
>>>> -XX:ParallelGCThreads=4
>>>> -XX:+UseConcMarkSweepGC
>>>> -XX:ParallelCMSThreads=4
>>>> -XX:MaxTenuringThreshold=0
>>>> -XX:SurvivorRatio=20000
>>>> -XX:CMSInitiatingOccupancyFraction=60
>>>> -XX:+UseCMSInitiatingOccupancyOnly
>>>> -XX:+CMSParallelRemarkEnabled
>>>> -XX:MaxGCPauseMillis=50
>>>> -Xloggc:gc.log
>>>>
>>>>
>>>>
>>>> As you can see from the GC log, we never actually reach the point
>>>> where CMS kicks in (after app startup).  But our young gen
>>>> collections seem to take increasingly long as time goes by.
>>>>
>>>> One of the major metrics we have for measuring this system is latency
>>>> as measured from connected clients and as measured internally.  You
>>>> can see an attached graph of latency vs time for the clients.  It is
>>>> not surprising that the internal latency (green, labeled 'sb') is
>>>> lower than the network latency.  I assume this is in part because
>>>> VM safepoints are less likely to occur within our internal timing
>>>> markers.  But one can easily see that the external latency
>>>> measurements (blue, labeled 'network') display the same steady
>>>> increase over time.
>>>>
>>>> My hope is to tweak the young gen size and trade off GC frequency
>>>> against pause length; however, the steadily increasing GC times
>>>> persist regardless of the size I make the young generation.
>>>>
>>>> Has anyone seen this sort of behavior before?  Are there more switches
>>>> that I should try running with?
>>>>
>>>> Obviously, I am working in parallel to profile the app and reduce the
>>>> garbage load.  But if I still see this sort of problem, it is only a
>>>> question of how long the app must run before I see unacceptable
>>>> latency spikes.
>>>>
>>>> Matt
>>>>
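
On the "more switches" question, a low-risk set of additions for
diagnosing steadily growing minor-GC times (all available in 1.6-era
HotSpot, to the best of my knowledge) would be along these lines:

-XX:+PrintGCDetails                  per-phase breakdown of each collection
-XX:+PrintGCTimeStamps               correlate GC events with the latency graph
-XX:+PrintTenuringDistribution       per-age survivor sizes at each minor GC
-XX:+PrintGCApplicationStoppedTime   total stop-the-world time, not just GC time

None of these change collector behavior; they only make the gc.log rich
enough to show what is actually growing.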