Well, Barry, this may all be too much to try to cover in an email, but here
goes.

One caution I'll offer is that you want to be careful about taking just "a"
single thread dump and trying to make use of it.

First, you really need to compare two in a row. A single dump only shows
what Java methods were running in each thread at one moment in time, and
that alone is not useful. What matters is whether, at the next stack
trace/thread dump, the thread is still at the same method. And that's
complicated as well by your never knowing, for sure, that a given thread is
still running the very same template that it was in the last thread dump.

Second, as readers will note, it makes for a HUGE email. The thread dump
includes every thread in the entire Java environment, most of which do not
concern us. You really just want the CF threads (typically jrpp- for
external web server requests, web- for internal web server requests, or
cfthread- for cfthread requests).

Third, even among all the available CF threads, only some are actually
running at the time of your request for the thread dump, so one has to wade
through them.
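For anyone who does want to work from raw dumps, the three cautions above
can be sketched in a few lines of Python. This is only a rough illustration,
not a tool I'm claiming exists: it assumes a jstack-style text dump where
each thread starts with its name in double quotes and each frame is an
indented "at ..." line, and the function names (parse_dump, cf_threads,
possibly_hung) are made up for the example. It keeps only threads with the
CF name prefixes and reports those whose top frame is the same in two dumps.

```python
import re

# Thread-name prefixes for CF request threads, as listed above.
CF_PREFIXES = ("jrpp-", "web-", "cfthread-")

# Assumed dump layout: a header line starting with "thread name",
# followed by indented "at package.Class.method(...)" frame lines.
HEADER = re.compile(r'^"([^"]+)"')
FRAME = re.compile(r'^\s+at\s+(.+?)\s*$')


def parse_dump(text):
    """Return {thread_name: [frames, top frame first]} for one dump."""
    threads = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            threads[current] = []
            continue
        frame = FRAME.match(line)
        if frame and current is not None:
            threads[current].append(frame.group(1))
    return threads


def cf_threads(threads):
    """Keep only CF request threads that have at least one frame."""
    return {name: frames for name, frames in threads.items()
            if name.startswith(CF_PREFIXES) and frames}


def possibly_hung(first_dump, second_dump):
    """Return {thread_name: top_frame} for CF threads whose top frame
    is identical in both dumps. These are candidates only."""
    first = cf_threads(parse_dump(first_dump))
    second = cf_threads(parse_dump(second_dump))
    return {name: frames[0] for name, frames in first.items()
            if name in second and second[name][0] == frames[0]}
```

Even then the result is only a list of candidates. Idle pool threads parked
in a wait will match across dumps too, and a matching top frame does not
prove the thread is still running the same template, so you still have to
read the frames.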

A far more valuable thing to do is instead to view the "hanging" threads
interactively within FusionReactor, since you have it installed (as I can
tell from the thread dump). With FR, you can look at the running requests at
any time and, on confirming that one's been running for a few seconds, you
can click on it to get a stack trace only of that request.

More important, you can then hit the refresh button (on the page) to refresh
the stack trace and confirm whether the request is indeed still hanging on a
given method (further down, the stack trace shows what line of CFML was
responsible for the Java code that's executing at that moment). That's where
all this comes together: if you can see that a request is hung for an
extended period of time at a given line of code, that's your smoking gun.

(One gotcha when doing that sort of a refresh is that the request could have
ended when you try to stack trace it a second time. Fortunately,
FusionReactor will tell you that, whereas SeeFusion will instead just
presume to show you whatever's running on that thread, whether it's the same
or a newly running request, which could be very misleading.)

Going back to your example, I noticed that at least 2 of the threads that
were in a native method related to running CF code were processing either a
CFINCLUDE or a CFINVOKE. And again, in each case they were checking the
getlastmodifieddate. I'm betting that your problems would go away if you
enabled trusted cache, so that CF didn't check this on every page request.
(But to be clear, it wouldn't stop ALL such checks. If the template cache
isn't large enough for the volume of templates loaded into it, then old ones
will get flushed and new requests for files not in the cache would go
through this process again.)

More important, all this doesn't solve the real root problem, which is that
for some reason when CF DOES need to do file access for the CFML source (to
the SAN) it takes a long time. You'll need to study that. It could even be a
networking issue between the CF server and the SAN device. (I've seen the
same happen with extended ping times between a CF server and the database
server.) When such interactions happen possibly hundreds of times a second,
even a dozen milliseconds or more of ping time can be disastrous.
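To put rough numbers on that, here's a back-of-the-envelope sketch in
Python. The figures (200 checks a second, 12 ms each, 40 file checks in one
request) are made up purely for illustration and are not measurements from
your server, and the function names are just for the example. It also
assumes each check blocks its thread for the full round trip.

```python
def blocked_seconds_per_second(checks_per_second, latency_ms):
    """Total thread time spent waiting, per second of wall-clock time,
    assuming every check blocks its thread for the full latency."""
    return checks_per_second * latency_ms / 1000.0


def added_delay_ms(checks_per_request, latency_ms):
    """Extra time one request spends waiting if its checks run one
    after another (an assumption for this sketch)."""
    return checks_per_request * latency_ms


# Illustrative numbers only.
print(blocked_seconds_per_second(200, 12))  # thread-seconds of waiting per second
print(added_delay_ms(40, 12))               # milliseconds added to one request
```

With those made-up numbers, the server spends more than two thread-seconds
waiting on the file system for every second of real time, so two or three
request threads are tied up doing nothing else, and that one request picks
up nearly half a second of pure waiting.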

Hope that's helpful. For those interested in more on interpreting stack
traces and thread dumps, I gave a talk on that very subject at this year's
cf.objective. I may repeat it on the CFMeetup, but until then I do have the
slides and some notes at my site, carehart.org/presentations.

I'm also considering coming down to this year's cf.objective ANZ and could
offer the session there as well, and perhaps also a day-long pre-conference
session on CF server troubleshooting. I'm sure Mark and the organizers would
welcome hearing here if you think this would be an interesting topic (for
those of you with the fortitude to have read this far!)

/charlie


> -----Original Message-----
> From: cfaussie@googlegroups.com [mailto:cfaus...@googlegroups.com] On Behalf Of BarryC
> Sent: Monday, May 17, 2010 11:59 PM
> To: cfaussie
> Subject: [cfaussie] Re: Coldfusion 9 and Windows server 2008 64bit
> 
> Hi Charlie,
> 
> Sandbox security is off (according to CF Administrator), but that's
> what I originally thought as well due to all the
> security.AccessController.doPriveleged calls. Unless there are other
> parameters at a config file level that are overriding the CF Admin
> options and invoking security related stuff?
> CF9 was installed from scratch.
> 
> The trusted cache is off by default - we do not have that turned on.
> 
> I've done a test already to eliminate the Network File Store by
> setting up a local copy on the server of the files and testing against
> that, and the results were the same.
> I'll take a look at the tool you suggested anyhow - thanks.
> 
> Here is a thread dump, most of them are similar to this so this should
> be a good representation of what's going on.
> I'm only pointing the finger at Native method / OS related stuff
> because in previous thread dumps in our previous environment (CF 7)
> most of the stuff in thread dumps was code related (loops, queries,
> scope access)
> 
> 

