Dom, That is a GREAT play by play... Thanks!
Mark A. Kruger, CFG, MCSE
(402) 408-3733 ext 105
www.cfwebtools.com
www.coldfusionmuse.com
www.necfug.com

-----Original Message-----
From: Dominic Watson [mailto:[email protected]]
Sent: Thursday, November 19, 2009 9:09 AM
To: cf-talk
Subject: Re: Debugging Out of Memory errors, SeeFusion Memory / Active requests graph

An update on this, for my own reference as well as for people googling.

I have, following various people's advice, changed the memory profile of our application quite dramatically. Below I summarize what I changed and the tools I used to find the problems. I shall blog this properly later, no time now.

Firstly, after installing and using JConsole, I figured it was high time I learned about the JVM and its memory spaces. I found the following articles very useful:

http://www.slideshare.net/gengmao/inside-the-jvm-memory-management-and-troubleshooting
http://www.javaworld.com/javaworld/jw-01-2002/jw-0111-hotspotgc.html

I had also followed Paul's advice and increased New Gen space in the heap, which did have an immediate effect, just not as large as I had hoped - more on that in a bit.

Next I found error logs in the same directory as the jvm.config file (mine was {coldfusion}/runtime/bin); a partial dump of one is below. It turned out this was a reasonably common error, and not just to do with CF:

http://www.talkingtree.com/blog/index.cfm/2006/4/28/Understanding-HotSpot-in-Plain-English

This article recommended using the -Xint JVM option to turn off HotSpot optimization. I did this and the effect on performance was pretty disastrous (the effect can vary a lot depending on the nature of your application, apparently).
I searched some more and found an article suggesting that, to avoid the error, free 'Survivor' space needed increasing. This seemed in line with the state of memory at the time of my crash, so I went ahead and started tweaking, bringing me back to Paul's earlier advice but finally seeing me use more dramatic settings:

    -Xmn384m -XX:SurvivorRatio=1 -XX:TargetSurvivorRatio=25

This says: use 384MB for new gen, of which two thirds will be for Survivor space; also, the GC will aim to keep survivor space at least 75% empty (no more than 25% full).

I ended up with these figures after a whole heap of tweaking, which had already brought my memory spikes down. That had the benefit of clearly showing little jumps in Old Gen memory every 10 seconds or so. I tracked these jumps down using JConsole to spot the time of each jump and SeeFusion to see what templates were requested at those times. My tracking led me to the same page that appeared in my error log, a Verity search. I googled and found:

"Operations on collections (e.g., CFSearch and CFIndex) can cause growth over time."
http://kb2.adobe.com/cps/175/tn_17517.html

No workarounds, that's just the way it is. I noted the amount of memory being eaten by these searches (up to around 15MB) and ended with the JVM settings quoted above, the idea being that there should be enough new gen memory available to cope with these Verity searches (so that objects don't just get dumped into Old Gen all the time).

Finally, and most dramatically, I found some code that allowed a blank search against Verity. When I ran it, I found it ate 100-150MB of RAM each time. Using Subversion, I found out when this became possible and why. It turns out this code was committed and deployed the same day we started seeing the crash reports on our servers, and that it was an unnecessary change, fixing the symptom of a bug but not the cause. I reverted the change.
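For anyone wanting to check the new gen arithmetic behind -Xmn384m -XX:SurvivorRatio=1 -XX:TargetSurvivorRatio=25, here is a rough sketch in Python. It assumes the usual HotSpot reading of the flags: SurvivorRatio is Eden divided by ONE survivor space, there are two survivor spaces, and TargetSurvivorRatio is the percentage of a survivor space the GC aims to have in use after a collection. The function name is made up for illustration, and the real JVM rounds sizes to its own alignment, so treat the numbers as approximate.

```python
def new_gen_layout(new_gen_mb, survivor_ratio, target_survivor_ratio):
    """Approximate how HotSpot splits the new generation (sizes in MB).

    Hypothetical helper, not part of any JVM tooling. Assumes:
      - survivor_ratio = eden size / size of one survivor space
      - two survivor spaces, so new gen = (survivor_ratio + 2) survivor units
      - target_survivor_ratio = desired % of one survivor space in use
        after a young collection
    """
    survivor_each = new_gen_mb / (survivor_ratio + 2)
    eden = survivor_each * survivor_ratio
    return {
        "eden_mb": eden,
        "survivor_each_mb": survivor_each,
        "survivors_total_mb": survivor_each * 2,
        "survivor_fraction": (survivor_each * 2) / new_gen_mb,
        "target_used_mb": survivor_each * target_survivor_ratio / 100,
    }


# The settings from this message: -Xmn384m, SurvivorRatio=1, TargetSurvivorRatio=25
layout = new_gen_layout(384, 1, 25)
for name, value in layout.items():
    print(f"{name}: {value:.2f}")
```

Under those assumptions, Eden and each survivor space come out at 128MB, the two survivor spaces together take two thirds of new gen, and the GC targets roughly 32MB in use per survivor space, which leaves room for the ~15MB that the Verity searches were seen to eat.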
So now I have a very different JVM memory configuration on one of our live servers and have committed a code change that will alleviate some horrid memory abuse on all three. I have JConsole monitoring all three servers and can see the modified server's memory usage starkly contrasted against the other two. I shall keep them running for a while and see how things go...

Anyways, enough already.

Dominic

2009/11/18 Dominic Watson <[email protected]>
> Right, so changing Eden memory size had a slight effect on the memory
> usage pattern (spikes only going up to around 80% instead of 95%),
> but did not stop the crashing.
>
> Getting to grips with JConsole has been really useful / educating. I
> have been monitoring all three of our live servers overnight and one
> of the servers bombed this morning; I was then able to see what was
> going on in memory when the bomb occurred:
>
> * Old Gen - normal, using far less than peak
> * Perm gen - normal, sub 90MB
> * Eden - normal, far less than the 192MB allocated
> * Survivor - had been flat at max (20MB) for around 20 seconds. Normal
>   shape of this graph for the server is very spiky, peaking very
>   briefly and very often
>
> I then discovered the hs_err log for the crash (yes, I'm a server
> debugging noob) and it told me the following:
>
> #
> # An unexpected error has been detected by Java Runtime Environment:
> #
> # java.lang.OutOfMemoryError: requested 1160376 bytes for Chunk::new. Out of swap space?
> #
> # Internal Error (allocation.cpp:218), pid=11476, tid=12428
> # Error: Chunk::new
> #
> # Java VM: Java HotSpot(TM) Server VM (11.3-b02 mixed mode windows-x86)
> # If you would like to submit a bug report, please visit:
> # http://java.sun.com/webapps/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:328510

