The server hangs, and this technique should help us discover which
process is hanging. How critical is it to execute this task in a timely
manner? That is, if a thread is hung, it is hung, and if we don't get to
it for, say, 5 or 10 minutes, we should still get the proper debug
output, no?

==================== 
Ronald West
Senior Applications Developer
PaperThin
617-471-4440 x219
[EMAIL PROTECTED]
 
CMSWatch recently ranked PaperThin's CommonSpot Content Server "Best
Overall Value" among leading content management solutions. Find out more
at www.paperthin.com.

-----Original Message-----
From: Steven Erat [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 24, 2003 4:46 PM
To: CF-Linux
Subject: RE: Too Many Files Open

In the stack trace you first sent, ColdFusion had compiled the page and
was trying to get a file handle to write out the actual class file to
/WEB-INF/cfclasses/.  This isn't really a runtime error because the
server never even got to run the requested template.   ColdFusion would
continue being able to serve pages that had already been requested and
compiled.

You mention crashing... Can you elaborate on how you define crashing?
Did the server stop responding, or did the process stop or restart?  I
think that you may have other issues going on, such as bottlenecks or
hung threads.

If it stopped responding, but was still up, then a thread dump is the
way to go.  
See:
http://www.macromedia.com/v1/handlers/index.cfm?ID=23523&Method=Full

A way to confirm that CF is still running is to call cfstat from a
terminal:  
/opt/coldfusionmx/bin/cfstat.sh 1

Then you should see output in columns.  This output is described in:
ftp://ftp2.allaire.com/support/cfstat.pdf . The fact that the columns
are outputting data is a sign that the server is running.  In fact, I'd
expect that you might see Reqs Running fixed at the value for
Simultaneous Requests in the CFAdmin, and Reqs Q'd would be growing.

When you get a thread dump, look for "Full Thread Dump" in the generated
output, which marks the start of the dump.  Then look for thread IDs
starting with "jrpp".  Of the jrpp threads, some should mention a
specific ColdFusion template and line number.

For example, here is one of the jrpp threads that was running during a
thread dump while CFMX was "hung".  This is a Windows example, but the
same applies to other platforms:

"jrpp-2928" prio=5 tid=0x2cedb280 nid=0x1294 runnable [0x368cf000..0x368cfdbc]
        at java.net.SocketInputStream.socketRead(Native Method)
        at java.net.SocketInputStream.read(Unknown Source)
        at macromedia.util.UtilSocketDataProvider.getArrayOfBytes(Unknown Source)
        at macromedia.util.UtilBufferedDataProvider.cacheNextBlock(Unknown Source)
        at macromedia.util.UtilBufferedDataProvider.getArrayOfBytes(Unknown Source)
        at macromedia.jdbc.sqlserver.SQLServerDepacketizingDataProvider.signalStartOfPacket(Unknown Source)
        at macromedia.util.UtilDepacketizingDataProvider.getArrayOfBytes(Unknown Source)
        at macromedia.util.UtilIntelligentBufferingDataProvider.getArrayOfBytes(Unknown Source)
        at macromedia.util.UtilByteOrderedDataReader.readString(Unknown Source)
        at macromedia.jdbc.sqlserver.tds.TDSRequest.getReturnedValue(Unknown Source)
        at macromedia.jdbc.sqlserver.tds.TDSRequest.getColumnDataForRow(Unknown Source)
        at macromedia.jdbc.sqlserver.tds.TDSRequest.getRow(Unknown Source)
        at macromedia.jdbc.sqlserver.tds.TDSRequest.completeRowProcessing(Unknown Source)
        at macromedia.jdbc.sqlserver.SQLServerImplResultSet.close(Unknown Source)
        at macromedia.jdbc.base.BaseResultSet.close(Unknown Source)
        at coldfusion.sql.Executive.getRowSet(Unknown Source)
        at coldfusion.sql.Executive.executeQuery(Unknown Source)
        at coldfusion.sql.Executive.executeQuery(Unknown Source)
        at coldfusion.sql.SqlImpl.execute(Unknown Source)
        at coldfusion.tagext.sql.QueryTag.doEndTag(Unknown Source)
        at cffoobar2ecfm1150536956.runPage(C:\Inetpub\wwwroot\foobar.cfm:7)
        at coldfusion.runtime.CfJspPage.invoke(Unknown Source)
        at coldfusion.tagext.lang.IncludeTag.doStartTag(Unknown Source)
        at coldfusion.runtime.CfJspPage._emptyTag(Unknown Source)
        at cfindex2ecfm2109127569.runPage(C:\Inetpub\wwwroot\index.cfm:87)
        at coldfusion.runtime.CfJspPage.invoke(Unknown Source)
        at coldfusion.tagext.lang.IncludeTag.doStartTag(Unknown Source)
        at coldfusion.filter.CfincludeFilter.invoke(Unknown Source)
        at coldfusion.filter.ApplicationFilter.invoke(Unknown Source)
        at coldfusion.filter.RequestMonitorFilter.invoke(Unknown Source)
        at coldfusion.filter.PathFilter.invoke(Unknown Source)
        at coldfusion.filter.ExceptionFilter.invoke(Unknown Source)
        at coldfusion.filter.BrowserDebugFilter.invoke(Unknown Source)
        at coldfusion.filter.ClientScopePersistenceFilter.invoke(Unknown Source)
        at coldfusion.filter.BrowserFilter.invoke(Unknown Source)
        at coldfusion.filter.GlobalsFilter.invoke(Unknown Source)
        at coldfusion.filter.DatasourceFilter.invoke(Unknown Source)
        at coldfusion.CfmServlet.service(Unknown Source)
        at jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:106)
        at jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42)
        at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:241)
        at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:527)
        at jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:198)
        at jrunx.scheduler.ThreadPool$DownstreamMetrics.invokeRunnable(ThreadPool.java:348)
        at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:451)
        at jrunx.scheduler.ThreadPool$UpstreamMetrics.invokeRunnable(ThreadPool.java:294)
        at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)

In the jrpp thread above, you see references to two ColdFusion
templates (file names changed to protect the innocent).  Specifically,
you see:

.....
at cffoobar2ecfm1150536956.runPage(C:\Inetpub\wwwroot\foobar.cfm:7)
.....
at cfindex2ecfm2109127569.runPage(C:\Inetpub\wwwroot\index.cfm:87)
at coldfusion.runtime.CfJspPage.invoke
.....

The trace is read from the bottom up.  Everything from the bottom until
the first CfJspPage.invoke call is the basic machinery for starting any
CFMX template.  Then you see C:\Inetpub\wwwroot\index.cfm:87, which means
that code in index.cfm was executing, and either the start or the end of
a tag was on line 87, or thereabouts.  Often the line is a closing tag.

The template index.cfm was then calling something in foobar.cfm, which
had a tag starting or ending on line 7.  So if you want to know exactly
what that thread was doing when the thread dump was taken, find
foobar.cfm and look at line 7 and the lines just above it.

If you keep reading up the stack trace, you'll next see
coldfusion.tagext.sql.QueryTag.doEndTag.  This confirms what I found on
line 7 of foobar.cfm: a closing query tag.  Further up you see
references to macromedia.jdbc.sqlserver and to
java.net.SocketInputStream.socketRead.
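
As a rough illustration of reading the trace from the bottom up, here is
a minimal Python sketch that pulls the template frames out of one thread's
stack and lists them in call order.  The function name template_chain and
the regular expression are assumptions made for this example; they are not
part of CFMX, and the sketch assumes each frame sits on a single line
(not wrapped by a mail client).

```python
import re

# Matches template frames such as:
#   at cfindex2ecfm2109127569.runPage(C:\Inetpub\wwwroot\index.cfm:87)
TEMPLATE_FRAME = re.compile(r"\.runPage\((?P<path>.+\.cfm):(?P<line>\d+)\)")


def template_chain(thread_text):
    """Return (template path, line number) pairs, outermost caller first."""
    chain = []
    for frame in thread_text.splitlines():
        match = TEMPLATE_FRAME.search(frame)
        if match:
            chain.append((match.group("path"), int(match.group("line"))))
    # The dump prints the innermost frame first, so reverse it to get
    # the bottom-up reading order described above.
    chain.reverse()
    return chain


# Abbreviated copy of the jrpp-2928 thread from the example above.
sample = r'''"jrpp-2928" prio=5 tid=0x2cedb280 nid=0x1294 runnable
        at java.net.SocketInputStream.socketRead(Native Method)
        at coldfusion.tagext.sql.QueryTag.doEndTag(Unknown Source)
        at cffoobar2ecfm1150536956.runPage(C:\Inetpub\wwwroot\foobar.cfm:7)
        at coldfusion.runtime.CfJspPage.invoke(Unknown Source)
        at cfindex2ecfm2109127569.runPage(C:\Inetpub\wwwroot\index.cfm:87)
        at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)'''

for path, line in template_chain(sample):
    print(f"{path} line {line}")
```

For the abbreviated sample, this lists index.cfm line 87 first and
foobar.cfm line 7 second, which matches the order in which the templates
were called.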

So this thread was hung reading or waiting for a result set from a query
to SQL Server.  In fact, by counting all the jrpp threads that had
specific mentions of a line of code (search for ".cfm:" to pick these
out in your own thread dump), I was able to account for each of the
running threads, which were maxed out at the value of "Simultaneous
Requests".  Once you get the hang of it you can quickly find exactly
which threads are the running threads.
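
The ".cfm:" search can also be scripted.  The following is only a sketch
under stated assumptions: the dump has been saved as plain text with one
frame per line, thread headers start with a quoted thread name, and the
helper names split_threads and running_templates as well as the sample
paths are made up for this illustration.

```python
import re

# A thread header starts with the quoted thread name, e.g. "jrpp-2928" prio=5 ...
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')


def split_threads(dump_text):
    """Map thread name -> list of stack frames found after 'Full thread dump'."""
    start = dump_text.lower().find("full thread dump")
    body = dump_text[start:] if start >= 0 else dump_text
    threads, current = {}, None
    for line in body.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group("name")
            threads[current] = []
        elif current is not None and line.strip().startswith("at "):
            threads[current].append(line.strip())
    return threads


def running_templates(dump_text):
    """Return {jrpp thread name: innermost frame that mentions '.cfm:'}."""
    result = {}
    for name, frames in split_threads(dump_text).items():
        if not name.startswith("jrpp"):
            continue
        cfm_frames = [frame for frame in frames if ".cfm:" in frame]
        if cfm_frames:
            result[name] = cfm_frames[0]
    return result


# Hypothetical, heavily shortened dump used only to exercise the functions.
dump = '''some earlier log output
Full thread dump:
"jrpp-1" prio=5 runnable
        at java.net.SocketInputStream.socketRead(Native Method)
        at cforders2ecfm1.runPage(/site/orders.cfm:7)
        at cfindex2ecfm2.runPage(/site/index.cfm:87)
"jrpp-2" prio=5 waiting
        at java.lang.Object.wait(Native Method)
"scheduler-0" prio=5 waiting
        at cfx2ecfm3.runPage(/site/x.cfm:1)
'''

running = running_templates(dump)
print(len(running), "jrpp thread(s) running a template")
for name, frame in running.items():
    print(name, frame)
```

The count printed here is the number to compare against the "Simultaneous
Requests" setting.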

In this example, I found that the 10 running jrpp threads were all hung
on the same thing, a query to SQL Server.  It turned out that there was a
firewall between the CFMX server and SQL Server, and that firewall was
occasionally having a hiccup.  This problem was resolved by moving the
database inside the DMZ on a different NIC and subnet, which removed the
firewall from the picture.  After that had been done, the server stopped
hanging several times per day and went on to stay up for at least a week
straight, as of the last I heard.

In the end, if it sounds like the server is not responding but is still
running, then perform a kill -3 (pid), wait a minute or two, then do it
again.  You should have two "Full Thread Dump"s.  Examine each thread
dump as I showed here.  If you find that the threads running in the
first dump are the same as those in the second dump (as indicated by the
jrpp id number), then those threads were running or hanging for the whole
time frame.  See what the threads were executing and what templates and
line numbers were actually running.  Look for patterns of activity, just
like the threads in the example above, which were all hung on a certain
database.
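
The comparison of the two dumps can be sketched in Python as well.  This
is an illustration, not a supported tool: it assumes each dump has been
copied into its own string with one frame per line, and the names
jrpp_locations and stuck_threads and the sample paths are hypothetical.

```python
import re

JRPP_HEADER = re.compile(r'^"(jrpp-\d+)"')
CFM_LOCATION = re.compile(r"\((.+\.cfm:\d+)\)")


def jrpp_locations(dump_text):
    """Map each jrpp thread to its innermost template:line, or None."""
    locations, current = {}, None
    for line in dump_text.splitlines():
        header = JRPP_HEADER.match(line)
        if header:
            current = header.group(1)
            locations[current] = None
        elif line.startswith('"'):
            current = None  # header of a non-jrpp thread; skip its frames
        elif current and locations[current] is None:
            found = CFM_LOCATION.search(line)
            if found:
                locations[current] = found.group(1)
    return locations


def stuck_threads(first_dump, second_dump):
    """Return jrpp threads sitting on the same template line in both dumps."""
    first = jrpp_locations(first_dump)
    second = jrpp_locations(second_dump)
    return {
        name: location
        for name, location in first.items()
        if location is not None and second.get(name) == location
    }


# Hypothetical dumps taken a minute or two apart.
first = '''Full thread dump:
"jrpp-10" prio=5 runnable
        at java.net.SocketInputStream.socketRead(Native Method)
        at cfa2ecfm1.runPage(/site/a.cfm:7)
"jrpp-11" prio=5 runnable
        at cfb2ecfm2.runPage(/site/b.cfm:3)
'''
second = '''Full thread dump:
"jrpp-10" prio=5 runnable
        at java.net.SocketInputStream.socketRead(Native Method)
        at cfa2ecfm1.runPage(/site/a.cfm:7)
"jrpp-12" prio=5 runnable
        at cfb2ecfm2.runPage(/site/b.cfm:3)
'''

print(stuck_threads(first, second))
```

Matching on both the jrpp id and the template line is a deliberate
choice: a thread id that shows up in both dumps but on different
templates has probably moved on to another request rather than hung.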






-----Original Message-----
From: Ronald West [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 24, 2003 3:25 PM
To: CF-Linux
Subject: RE: Too Many Files Open

One more thing ... 

Could this cause the server to crash if it happened a lot during one
day?


-----Original Message-----
From: Ronald West [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 24, 2003 3:15 PM
To: CF-Linux
Subject: RE: Too Many Files Open

Thanks again, Steve - big help


-----Original Message-----
From: Steven Erat [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 24, 2003 3:03 PM
To: CF-Linux
Subject: RE: Too Many Files Open

>>>  Is there still a document available with the Kernel recommendations
>>>  for MX on Linux?

How's this?:
http://www.macromedia.com/v1/handlers/index.cfm?ID=23524&Method=Full

I would say that for the best support you should stick with the default
kernel on the supported distributions.  See recommended hardware/software
in that technote.
                        
