Hi Peter,

 We were logging the GC output as per this before; we've since
 taken it out, but I think we'll put it back in.

 Apropos logging - I've found that with RMI to our boxes at
 EC2 I've had to do the ugly thing with this:
 -Djava.rmi.server.hostname=<external / public IP address>

 .. which then renders nodetool useless, as it can't talk to
 the localhost or internal IP address / hostname any more.
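
 A minimal sketch for checking which address RMI stubs would
 carry on a given box, assuming the documented default (the
 local host's IP address when java.rmi.server.hostname is not
 set). The class name RmiHostCheck is made up for illustration
 and is not part of Cassandra or nodetool:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper: reports the host name RMI would embed in stubs.
class RmiHostCheck {

    // Returns the java.rmi.server.hostname override if set, otherwise
    // the local host's IP address, which is the documented default.
    static String advertisedHost() {
        String override = System.getProperty("java.rmi.server.hostname");
        if (override != null && !override.isEmpty()) {
            return override;
        }
        try {
            return InetAddress.getLocalHost().getHostAddress();
        } catch (UnknownHostException e) {
            // The local host name could not be resolved on this machine.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("RMI stubs would advertise: " + advertisedHost());
    }
}
```

 Run with and without the -D flag above to see the difference
 between the internal and the external address.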


 The clocks - I'm pretty sure these are very accurate, but
 will investigate this tomorrow morning just in case there's
 some drift happening.


 We think we might have cracked the underlying problem
 though, and it might be similar to the 'behind the scenes
 swap thing'. Sadly, I suspect that such things might actually
 be happening. I thought that memory overcommit wasn't
 possible with Xen, only with VMware, but I guess they
 could have done all kinds of things with Xen by now over
 there.

 There's a spinlock problem that's been identified elsewhere
 where the JVM mis-detects the number of cores it has
 running - based on the underlying architecture - and so
 we've reverted to parallel GC and forced the number of
 threads:

        -XX:+UseParallelGC
        -XX:MaxGCPauseMillis=100
        -XX:ParallelGCThreads=3
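
 To see what the JVM believes it is running on, something
 like the following could be run on the affected instance.
 This is only a sketch: the class name CpuCheck is made up
 for illustration, and it reports what the JVM detects, not
 what the hypervisor actually provides:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: prints the core count and collectors the JVM reports.
class CpuCheck {

    // Number of processors the JVM thinks are available to it.
    static int detectedCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Names of the garbage collectors active in this JVM.
    static List<String> gcNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("JVM sees " + detectedCores() + " core(s)");
        System.out.println("Collectors in use: " + gcNames());
    }
}
```

 If the reported core count doesn't match what the instance
 type is supposed to have, setting -XX:ParallelGCThreads
 explicitly, as above, avoids relying on the detected value.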

 It *seems* to be working a bit better at the moment, but I'll
 be more comfortable with feeling optimistic after a night's
 worth of jobs have been thrown at it :)

 j.
