> On node "172.16.107.46", I see the following:
>
> 21:53:27.192+0100: 1335393.834: [GC 1335393.834: [ParNew (promotion failed): 
> 319468K->324959K(345024K), 0.1304456 secs]1335393.964: [CMS: 
> 6000844K->3298251K(8005248K), 10.8526193 secs] 6310427K->3298251K(8350272K), 
> [CMS Perm : 26355K->26346K(44268K)], 10.9832679 secs] [Times: user=11.15 
> sys=0.03, real=10.98 secs]
> 21:53:38,174 GC for ConcurrentMarkSweep: 10856 ms for 1 collections, 
> 3389079904 used; max is 8550678528
>
> I have not yet tested the "XX:+DisableExplicitGC" switch.
>
> Is the right thing to do to decrease the CMSInitiatingOccupancyFraction 
> setting?

* Increasing the total heap size can definitely help; the only catch is
that it stops being a practical option if the increase you need is
unacceptably large.
* Decreasing the occupancy trigger can help, yes, but you will see
sharply diminishing returns as the trigger fraction approaches the
actual live size of data on the heap.
* I just re-checked your original message - you're on Cassandra 0.7? I
*strongly* suggest upgrading to 1.x. That holds true in general, but
it is also specifically relevant here: 1.x has significant improvements
in memory allocation behavior that reduce the probability and/or
frequency of promotion failures and full GCs.
* Increasing the size of the young generation can help by causing less
promotion to old-gen (see the cassandra.in.sh script, or the equivalent
for Windows).
* Increasing the number of parallel threads used by CMS can help CMS
complete its marking phase quicker, but at the cost of a greater
impact on the mutator (Cassandra).
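
To put rough numbers on the occupancy-trigger point, here is a minimal
sketch (my own illustration, not anything shipped with Cassandra) that
pulls the old-gen figures out of the "[CMS: before->after(capacity)"
part of the log you quoted. The function name and regex are
hypothetical, and it assumes the wrapped log line has been joined back
into one string.

```python
import re

# Matches the old-generation entry: [CMS: <before>K-><after>K(<capacity>K)
CMS_RE = re.compile(r"\[CMS:\s*(\d+)K->(\d+)K\((\d+)K\)")


def old_gen_fractions(log_text):
    """Return (occupancy_before, live_after) as fractions of old-gen capacity."""
    match = CMS_RE.search(log_text)
    if match is None:
        raise ValueError("no CMS entry found in log text")
    before_kb, after_kb, capacity_kb = (int(g) for g in match.groups())
    return before_kb / capacity_kb, after_kb / capacity_kb


# The relevant part of the quoted log, with the line wrapping undone.
LOG = (
    "[ParNew (promotion failed): 319468K->324959K(345024K), 0.1304456 secs]"
    "1335393.964: [CMS: 6000844K->3298251K(8005248K), 10.8526193 secs]"
)

before, live = old_gen_fractions(LOG)
print(f"old gen occupancy before the full GC: {before:.1%}")
print(f"live data left after the full GC:     {live:.1%}")
```

On that log line this works out to roughly 75% occupancy when the
promotion failure hit, and roughly 41% live data afterwards. The closer
you push the trigger toward that second number, the less each further
decrease buys you.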

I think the most important thing is: upgrade to 1.x before you run
these benchmarks. In particular, detailed tuning of GC issues is pretty
useless on 0.7 given the significant changes in 1.0. Don't even bother
spending time on this until you're on 1.0, unless this is about a
production cluster that you cannot upgrade for some reason.

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
