On 7/22/2013 6:45 AM, Markus Jelsma wrote:
> You should increase your ZK time out, this may be the issue in your case. You 
> may also want to try the G1GC collector to keep STW under ZK time out.

When I tried G1, the occasional stop-the-world GC actually got worse.  I
tried G1 after trying CMS, with no other tuning parameters.  The average
GC time went down, but whenever G1 had to fall back to a stop-the-world
collection, the pause was longer than what I had been seeing before.

Based on the GC statistics in jvisualvm and jstat, I didn't think I had
a problem.  The way I discovered that I had a problem was by looking at
my haproxy load balancer -- sometimes requests would be sent to a backup
server instead of my primary, because the ping request handler was
timing out on the LB health check.  The LB was set to time out after
five seconds.  When I went looking deeper with the GC log and some other
tools, I was seeing 8-10 second GC pauses.  G1 was showing me pauses of
12 seconds.
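
For anyone who wants to run the same kind of check, here is a minimal
sketch of scanning a GC log for pauses longer than the load balancer's
health check timeout.  It assumes the JVM was started with
-XX:+PrintGCApplicationStoppedTime, so the log contains "Total time for
which application threads were stopped" lines.  That flag, the function
name long_pauses, and the sample lines are assumptions made for
illustration; this is not necessarily the tooling I used.

```python
import re

# Matches the line HotSpot prints when -XX:+PrintGCApplicationStoppedTime
# is enabled (assumed here; other logging flags produce other formats).
STOPPED = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)


def long_pauses(lines, threshold_seconds=5.0):
    """Return every stopped-time value (in seconds) above the threshold."""
    pauses = []
    for line in lines:
        match = STOPPED.search(line)
        if match and float(match.group(1)) > threshold_seconds:
            pauses.append(float(match.group(1)))
    return pauses


# Made-up sample lines, for illustration only.
sample = [
    "Total time for which application threads were stopped: 0.0412000 seconds",
    "Total time for which application threads were stopped: 8.7340000 seconds",
    "Total time for which application threads were stopped: 12.1050000 seconds",
]

# The default threshold of 5.0 mirrors the five-second LB health check timeout.
print(long_pauses(sample))
```

To check a real log, pass an open file instead of the sample list, for
example long_pauses(open("gc.log")), where gc.log is a placeholder path.
Any value it reports is a pause long enough to fail a five-second health
check, even when the averages look fine.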

Now I use a heavily tuned CMS config, and there are no more LB switches
to a backup server.  I've put some of my own information about my GC
settings on my personal Solr wiki page:

http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning

I've got an 8GB heap on my systems running 3.5.0 (one copy of the index)
and a 6GB heap on those running 4.2.1 (the other copy of the index).

Summary: Just switching to the G1 collector won't solve GC pause
problems.  There's not a lot of G1 tuning information out there yet.  If
someone can come up with a good set of G1 tuning parameters, G1 might
become better than CMS.

Thanks,
Shawn
