Well, Barry, this may all be too much to be trying to cover in an email, but here goes.
One caution I'll offer is that you do want to be careful about just taking "a" thread dump and trying to make use of it.

First, you really need to compare two in a row, because a thread dump only shows which Java methods were running in each thread at a single moment in time. On its own, that's not useful. What matters is whether, at the next stack trace/thread dump, the thread is still at the same method. And that's complicated as well by your never knowing, for sure, that a given thread is still running the very same template that it was in the last thread dump.

Second, as readers will note, it makes for a HUGE email. The thread dump includes every thread in the entire Java environment, most of which do not concern us. You really just want the CF threads (typically jrpp- for external web server requests, web- for internal web server requests, or cfthread- for cfthread requests).

Third, even among all the available CF threads, only some are actually running at the time you request the thread dump, so one has to wade through them.

A far more valuable thing to do is instead to view the "hanging" threads interactively within FusionReactor, since you have it installed (as I can tell from the thread dump). With FR, you can look at the running requests at any time and, on confirming that one's been running for a few seconds, you can click on it to get a stack trace of only that request. More important, you can then hit the refresh button (on the page) to refresh the stack trace and confirm whether the request is indeed still hanging on a given method (further down, the stack trace shows which line of CFML was responsible for the Java code that's executing at that moment). That's where all this comes together: if you can see that a request is hung for an extended period of time at a given line of code, that's your smoking gun. (One gotcha when doing that sort of a refresh is that the request could have ended by the time you try to stack trace it a second time.
Fortunately, FusionReactor will tell you that, whereas SeeFusion will instead just presume to show you whatever's running on that thread, whether it's the same request or a newly running one, which could be very misleading.)

Going back to your example, I noticed that at least 2 of the threads that were in a native method related to running CF code were processing either a CFINCLUDE or a CFINVOKE. And again, in each case they were checking the getlastmodifieddate. I'm betting that your problems would go away if you enabled trusted cache, so that CF didn't check this on every page request. (But to be clear, it wouldn't stop ALL such checks. If the template cache isn't large enough for the volume of templates loaded into it, then old ones will get flushed, and new requests for files not in the cache would go through this process again.)

More important, all this doesn't really solve the real root problem: for some reason, when CF DOES need to do file access for the CFML source (to the SAN), it takes a long time. You'll need to study that. It could even be a networking issue between the CF server and the SAN device. (I've seen the same happen with extended ping times between a CF server and the database server.) When such interactions happen possibly hundreds of times a second, a ping time of even a bit more than a dozen milliseconds can be disastrous.

Hope that's helpful.

For those interested in more on interpreting stack traces and thread dumps, I gave a talk on that very subject at this year's cf.objective. I may repeat it on the CFMeetup, but until then I do have the slides and some notes at my site, carehart.org/presentations. I'm also considering coming down to this year's cf.objective ANZ and could offer the session there as well, and perhaps also a day-long pre-conference session on CF server troubleshooting.
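An aside for anyone who would still rather work from raw thread dumps: the comparison I described earlier (same CF thread, same method, in two dumps in a row) can be roughly scripted. What follows is only a minimal sketch in Python, not a tool that ships with CF or FusionReactor. The function names are made up for illustration, it assumes the usual HotSpot-style text dump (a quoted thread name followed by "at ..." lines), and it assumes that frames from compiled CFML show the template path and line number, such as "(C:\site\index.cfm:12)". It also can't get around the caveat above: a thread found at the same method in both dumps may by then be running a different request.

```python
import re

# Thread-name prefixes for CF request threads, as listed earlier in this email.
CF_PREFIXES = ("jrpp-", "web-", "cfthread-")

HEADER = re.compile(r'^"([^"]+)"')   # thread header line: "jrpp-12" prio=5 ...
FRAME = re.compile(r'^\s+at\s+(.+)$')  # stack frame line: <tab>at pkg.Class.method(...)
# Assumption: a frame from compiled CFML ends with the template path and line number.
CFML_FRAME = re.compile(r'\.cf[mc]:\d+\)')


def summarize(dump_text):
    """Return {thread_name: (top_frame, first_cfml_frame)} for CF threads only."""
    stacks = {}
    current = None
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            name = header.group(1)
            # Ignore every non-CF thread in the JVM.
            current = name if name.startswith(CF_PREFIXES) else None
            if current is not None:
                stacks[current] = []
            continue
        frame = FRAME.match(line)
        if frame and current is not None:
            stacks[current].append(frame.group(1).strip())

    summary = {}
    for name, frames in stacks.items():
        if not frames:
            continue
        cfml = next((f for f in frames if CFML_FRAME.search(f)), None)
        summary[name] = (frames[0], cfml)
    return summary


def stuck_candidates(first_dump, second_dump):
    """CF threads that sit at the same top frame and same CFML line in both dumps.

    Threads with no CFML frame at all (typically idle ones) are skipped.
    """
    before = summarize(first_dump)
    after = summarize(second_dump)
    return {
        name: pair
        for name, pair in before.items()
        if pair[1] is not None and after.get(name) == pair
    }
```

You would read two dumps taken a few seconds apart into strings and pass them to stuck_candidates(). Anything it returns is only a candidate worth a closer look, not proof of a hang.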
I'm sure Mark and the organizers would welcome hearing here whether you'd find such a session interesting (from those of you with the fortitude to have read this far!).

/charlie

> -----Original Message-----
> From: cfaussie@googlegroups.com [mailto:cfaus...@googlegroups.com] On Behalf Of BarryC
> Sent: Monday, May 17, 2010 11:59 PM
> To: cfaussie
> Subject: [cfaussie] Re: Coldfusion 9 and Windows server 2008 64bit
>
> Hi Charlie,
>
> Sandbox security is off (according to CF Administrator), but that's
> what I originally thought as well due to all the
> security.AccessController.doPrivileged calls. Unless there are other
> parameters at a config file level that are overriding the CF Admin
> options and invoking security related stuff?
> CF9 was installed from scratch.
>
> The trusted cache is off by default - we do not have that turned on.
>
> I've done a test already to eliminate the Network File Store by
> setting up a local copy on the server of the files and testing against
> that, and the results were the same.
> I'll take a look at the tool you suggested anyhow - thanks.
>
> Here is a thread dump, most of them are similar to this so this should
> be a good representation of what's going on.
> I'm only pointing the finger at Native method / OS related stuff
> because in previous thread dumps in our previous environment (CF 7)
> most of the stuff in thread dumps was code related (loops, queries,
> scope access)