Re: [ACFUG Discuss] Out of Memory?!?
All,

Again, thanks for the help. I found the real JRun logs (which is something I speculated about initially). With CentOS/CF9 the JRun logs are not under /opt/jrun4/; they are actually located at /opt/coldfusion9/runtime/logs/.

Needless to say, the JRun logs were helpful. It appears that nearly all of the 503 responses are due to incomplete file uploads. I am a little surprised that these errors are now being kicked out by JRun when in the past they made it to ColdFusion. Because it seems that Java is not handling the error condition, I believe the reason they no longer make it to ColdFusion might correspond to when I changed from the JVM installed with ColdFusion to a newer JVM installed separately. (I changed JVMs to get the latest security patches.)

Because the problems affect less than 1% of the files that are uploaded, and it appears that our resources are not maxed out when it happens, I am assuming most of the blame lies with network errors on the sender side. (Many of our end users are spread out across the country and use home internet connections of varying reliability.) I suggested to the site owner that I rewrite our old upload dialog (a plain form allowing up to 6 files to be uploaded at once) as an ajax-based solution that single-threads the uploads, sending one file at a time. I am hopeful that limiting uploads to a single file at a time will mean fewer network errors.

Of course, there are still the few 503 errors that happen without uploading data... There are not many, and at the moment none are in the JRun logs. I am hopeful that some of the changes I made in the last few days (including incorporating some of Charlie's suggested request changes) might have helped. But my pragmatism trumps optimism when it comes to computers, so I convinced the owner to get a Fusion Reactor subscription. If/when the situation happens again, I'll have a few more tools to help figure out what happened.
FYI, here is the log entry that led me to the upload determination...

08/13 12:41:48 error unexpected end of part
java.io.IOException: unexpected end of part
    at com.oreilly.servlet.multipart.PartInputStream.fill(PartInputStream.java:96)
    at com.oreilly.servlet.multipart.PartInputStream.read(PartInputStream.java:179)
    at com.oreilly.servlet.multipart.PartInputStream.read(PartInputStream.java:152)
    at com.oreilly.servlet.multipart.FilePart.write(FilePart.java:257)
    at com.oreilly.servlet.multipart.FilePart.writeTo(FilePart.java:215)
    at coldfusion.filter.FormScope.fillForm(FormScope.java:253)
    at coldfusion.filter.FusionContext.SymTab_initForRequest(FusionContext.java:408)
    at coldfusion.filter.GlobalsFilter.invoke(GlobalsFilter.java:33)
    at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:22)
    at coldfusion.filter.CachingFilter.invoke(CachingFilter.java:62)
    at coldfusion.filter.RequestThrottleFilter.invoke(RequestThrottleFilter.java:126)
    at coldfusion.CfmServlet.service(CfmServlet.java:201)
    at coldfusion.bootstrap.BootstrapServlet.service(BootstrapServlet.java:89)
    at jrun.servlet.FilterChain.doFilter(FilterChain.java:86)
    at coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServletFilter.java:42)
    at coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46)
    at jrun.servlet.FilterChain.doFilter(FilterChain.java:94)
    at jrun.servlet.FilterChain.service(FilterChain.java:101)
    at jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:106)
    at jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42)
    at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:286)
    at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:543)
    at jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:203)
    at jrunx.scheduler.ThreadPool$DownstreamMetrics.invokeRunnable(ThreadPool.java:320)
    at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:428)
    at jrunx.scheduler.ThreadPool$UpstreamMetrics.invokeRunnable(ThreadPool.java:266)
    at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)
08/13 12:41:48 error (JRun Service: ProxyService [jrun.servlet.jrpp.JRunProxyService@6988843a]) JRunPRoxyServer.invokeRunnable:
java.lang.IllegalStateException
    at jrun.servlet.JRunResponse.getWriter(JRunResponse.java:205)
    at jrun.servlet.JRunResponse.sendError(JRunResponse.java:597)
    at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:328)
    at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:543)
    at jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:203)
    at
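For anyone who wants to put a number on how often these aborted uploads occur, here is a minimal Python sketch (not something posted in the thread) that tallies them per day. It assumes each log entry starts with a line of the form "MM/DD HH:MM:SS error <message>", as in the entry quoted above; the function name and the sample lines are made up for illustration, and the real log layout may differ.

```python
import re
from collections import Counter

# Assumed entry header, based on the excerpt above: "MM/DD HH:MM:SS error <message>".
ENTRY = re.compile(r"^(\d{2}/\d{2}) (\d{2}:\d{2}:\d{2}) error (.*)$")


def count_incomplete_uploads(lines):
    """Return a Counter mapping 'MM/DD' to the number of aborted-upload entries."""
    per_day = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        # Only count entry headers whose message is the multipart parser's complaint;
        # the stack-trace lines that follow do not match the header pattern.
        if match and match.group(3).startswith("unexpected end of part"):
            per_day[match.group(1)] += 1
    return per_day


# Hypothetical sample in the assumed layout.
sample = [
    "08/13 12:41:48 error unexpected end of part",
    "java.io.IOException: unexpected end of part",
    "\tat com.oreilly.servlet.multipart.PartInputStream.fill(PartInputStream.java:96)",
    "08/13 12:41:48 error (JRun Service: ProxyService) JRunPRoxyServer.invokeRunnable:",
    "08/14 09:00:00 error unexpected end of part",
]
counts = count_incomplete_uploads(sample)
```

To run it against a real file, pass an open file handle instead of the sample list, since the function only needs an iterable of lines. Comparing the per-day counts with the number of 503 responses in the Apache log would show how much of the problem the upload failures actually explain.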
Re: [ACFUG Discuss] Out of Memory?!?
Thanks all for the insight...

And just as Charlie predicted, the event happened again without tripping the alert. One benefit: even though I was not actively watching what happened, I did have the server monitor running. The event happened when the server was only using 440MB, with 1.2GB free in the JVM allocation. So it definitely is not a memory issue. (It also happened in between GC cycles, so that isn't the issue either.)

As for the possibility of the CPU, I won't discount this, but I doubt it would be from CF. We do use CFDocument/CFPDF, which I know grab resources, but normally those pages are requested during the morning, and it actually happened twice last night at a time when I would not expect it. I'll have to gather more information.

I'm starting to think that the cause may be outside CF. I'm going to look at all the system logs and try to piece together exactly what was happening at the time of the event.

Another question though... Fusion Reactor monitors the entire system, not just CF, right? (i.e. it can track running system processes, not just what CF is doing.) If this is true, this may be the next step if my efforts are fruitless.

Thanks,
Frank

On 08/09/2013 12:55 AM, Charlie Arehart wrote:

Like you, I would think this is not memory related. I think that's just a really old error message, from the days when even the then Macromedia engineers could only throw up their hands and guess when something was amiss.

I recently saw this error message happening for a client where we found (since they were on IIS) that the jrun_iis6_wildcard.logs (in [ColdFusion9]\runtime\lib\wsconfig\nn\LogFiles) had indications of errors also. I realize you're on Apache, and you say you looked at all the logs, but did you check out those logs in that wsconfig dir and its subdirs? It's just a stab in the dark whether any log messages there (around the same time) will be useful.

I would focus on something making the CF instance not responsive. I know you said you raised the simultaneous request threads from 10 to 40, and it seemed fine at 10. But maybe you have new load, or a new problem that makes requests hang. As Ajas said, FR (or as you're using it, the CF Server Monitor) can show you any running requests (the CFSM only shows them if you turn on "start monitoring"). If you can be on when it happens you may be surprised what you find. If all 10 (or now 40) are hung, even if only for a while, that could lead to the error: not that CF is down, but that the connector thinks it can't be reached.

And as you noted in a later message, turning on the alerts will help (in either CFSM, again where "start monitoring" must be enabled for them to work, or in FR, or SeeFusion), as that will give you info even when you can't be watching the monitors.

Since you're using the CFSM, and you say you configured the alerts, did you confirm that you get the email they send? There's no test feature. What I do is set the memory alert to below the current memory used, which should trigger an alert within a few minutes. But then I turn that alert off. I find it useless, since the JVM (since 1.5) can often let used memory climb to the max before deciding to do a major GC, so you can get those memory alerts when there's no real problem, if indeed a GC at that point would have collected a lot of memory that was not really in use.

But I do recommend the slow server alert in the CFSM, or the running requests alert in FR. For almost everyone, if you have many requests running at once, that's a canary in the coal mine indicating that problems may be afoot. The question then is whether the alert shows many slow requests. If it just shows many fast ones, then that is just a sign of a lot of traffic, and if it's being handled fast, you need to increase the number of max simultaneous requests, and the alert level in whatever monitor you're using.

And be careful about setting the other values in request tuning so low (web services, Flash Remoting, and remote CFCs). There's never a harm in them being more than you need. But if they are less than you need, that could be where a bottleneck happens. I know you say you don't serve web services, but I've seen shops have their own CF pages calling their own CFCs as web services. And if that request limit was low, then that becomes a single-threading bottleneck. Or maybe you DO have code calling CFCs remotely (via ajax). As for Flash Remoting, the monitor (and FR) use it, and your own code may too (even if unexpectedly). Again, why constrict them? If you don't use them, there's no harm in them being larger (like 5 or 10 each).

Finally, note that you could have CF requests using either cfthread or reporting, and there are limits for each of those (configurable in the admin). And though you are not using CF Standard, I'll say for other readers that they could have all this sort of problem caused by using some tag that is itself single-threaded in CF
RE: [ACFUG Discuss] Out of Memory?!?
Good to hear. (Well, that's a sad part of my business: sometimes it helps to have another failure, but if it's new info that confirms or denies something, at least it's an advancement to the solution.)

As for your question, FR is mainly a monitor of CF (or whatever java based server process you install it into, whether a CFML server like CF, Railo, or Open BlueDragon, or a java servlet engine/jee server like jboss, tomcat, jetty, resin, etc.). Now, it also happens to watch the system-wide CPU, in addition to that within the instance, but that's all that it watches outside the instance it monitors.

Of course, I don't want to short-change FR. I love it and work with it almost daily, and help people use it to solve problems that may have plagued them for years. But that's the frank answer (pardon the pun) to your question.

/charlie

From: ad...@acfug.org [mailto:ad...@acfug.org] On Behalf Of Frank Moorman
Sent: Friday, August 09, 2013 2:38 PM
To: discussion@acfug.org
Subject: Re: [ACFUG Discuss] Out of Memory?!?

Thanks all for the insight... And just as Charlie predicted, the event happened again without tripping the alert.

-
To unsubscribe from this list, manage your profile @ http://www.acfug.org?fa=login.edituserform
For more info, see http://www.acfug.org/mailinglists
Archive @ http://www.mail-archive.com/discussion%40acfug.org/
List hosted by http://www.fusionlink.com
-
Re: [ACFUG Discuss] Out of Memory?!?
That's good progress, good to know. About FR, all I can say is, like Neo says in The Matrix, "iKnowKungFoo": with FR I can say, "I know troubleshooting/performance optimization." :-)

Ajas Mohammed / iUseDropbox(http://db.tt/63Lvone9)
http://ajashadi.blogspot.com

We cannot become what we need to be, remaining what we are.
No matter what, find a way. Because that's what winners do.
You can't improve what you don't measure.
Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives.

On Fri, Aug 9, 2013 at 3:32 PM, Charlie Arehart char...@carehart.org wrote:

Good to hear. (Well, that’s a sad part of my business: sometimes it “helps” to have another failure, but if it’s new info that confirms or denies something, at least it’s an advancement to the solution.)
Re: [ACFUG Discuss] Out of Memory?!?
FYI... This is what the user gets on their end:

Server Error
The server encountered an internal error and was unable to complete your request.
Application server is busy. Either there are too many concurrent requests or the server is still starting up.

Also, I have not received any CF template errors at the times the 503 errors occur, nor any java.outofmemory errors etc. The server is running and available except for the one or two requests. Looking in the Apache access log, there is no pattern to the pages that were requested (but I know that I need to dig deeper and see what pages were requested right before the errors occur).

Thanks in advance for any help,
Frank

On 08/08/2013 07:41 PM, Frank Moorman wrote:

All,

I'm trying to figure out the cause of a JRun "Out of Memory" error. I get the following in my logs:

[Thu Aug 08 14:40:14 2013] [notice] jrApache[2937: 31182] returning error page for JRun too busy or out of memory
[Thu Aug 08 15:50:09 2013] [notice] jrApache[1787: 63699] returning error page for JRun too busy or out of memory

It doesn't happen often (maybe once or occasionally twice a business day), but as everyone understands, users aren't happy when it happens to them.

This is a Linux box: 64-bit CentOS 6, CF9 Enterprise, 64-bit JVM version 1.7. (The JVM was installed separately from CF for security, and ColdFusion uses it.) I doubt it is actually an out of memory condition (though I could be wrong). The server has 6GB of physical memory and another 6GB of swap. It rarely needs to use swap (i.e. I have not observed it).

The JVM is given significant memory to use as well. It is a 64-bit JVM with a 1GB minimum heap and a 3GB maximum. When I look through the server monitor, it is normal to see 1 to 1.5GB allocated and between 100 and 750MB used. (I see a normal sawtooth pattern in the memory usage, so it looks like what I would expect from the garbage collection routine.) It does spike occasionally, but I have never seen it close to the 3GB max. (I've never even seen it hit 2GB used.)

The server is set for 40 template requests. (I recently upped it from 10 to see if that was the problem, and it still occurred with the same frequency.) Flash Remoting is set to 2, web services 1, CFC 1. (These remote settings are only set for the monitor, as the server does not provide any web services outside the running application.) JRun is set to 50 requests, and 1000 queued. (Enough to cover the CF requests.)

I looked at Charlie's blog... I have checked the logs, and other than the Apache error log (above) I do not see anything. I've checked the system /var/log/messages, and I've checked all the CF logs. (I also archived everything yesterday, and the CF logs are practically empty even after today's occurrence.) I did not find any of the JVM abort logs that Charlie mentioned in his blog. (I checked in the CF directory mentioned as well as the system logs and the actual JVM directory.) I also checked the JRun log (in /opt/jrun4/logs/cfusion-event.log) and was surprised because the only entries were from months ago. (Because of the age of the log, I'm curious whether I am looking in the right place for it.)

Does anyone have any ideas on what might be happening, or something else that I should check?

I have searched the web and found different ideas (even the rare "add more memory"). Another mentions the requests being overloaded, but I honestly do not believe that 10 simultaneous template requests was low for the traffic on this site. After quadrupling it, with the problem still occurring, it is even less likely.

I've seen some mention client variable storage, but the server is set to use cookies for that, not a database. While I do not use client storage, I know there are items like the last time visited etc., so I may just turn it off completely.

Another one I found interesting mentions a bug with MySQL drivers and the Maintain Connections setting, and suggests unchecking this box. I searched for this and found the bug mentioned; one site even speculated it was still a problem with CF9, but I could not find any details. Does anyone know of this issue? I've seen it mentioned, but with a lack of any details other than that it's bad to have that checked. (The page that mentioned it did say it ate memory.)

I'd love more ideas; I know this is not an easy or straightforward error. I may try removing the client storage next, but other ideas are welcome. (i.e. I'm not very convinced that the other things I found on the web will be effective.)

Thanks,
Frank
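Since the only trace of the problem at this point is the jrApache notice in the Apache error log, one way to start looking for a pattern is to pull out the timestamps of those notices and group them by hour. The following Python sketch is only an illustration under assumptions: it expects lines in exactly the format of the two entries quoted in the message above, the function names are hypothetical, and the day and month abbreviations are parsed assuming an English locale.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines shaped like the two notices quoted in the message above.
BUSY = re.compile(
    r"^\[(?P<stamp>[^\]]+)\] \[notice\] jrApache\[[^\]]*\] "
    r"returning error page for JRun too busy or out of memory"
)


def busy_events(lines):
    """Return the timestamps of all 'too busy or out of memory' notices."""
    events = []
    for line in lines:
        match = BUSY.match(line.strip())
        if match:
            # Timestamp layout as seen in the log, e.g. "Thu Aug 08 14:40:14 2013".
            events.append(datetime.strptime(match.group("stamp"), "%a %b %d %H:%M:%S %Y"))
    return events


def events_by_hour(events):
    """Count events per hour of day to show whether they cluster at certain times."""
    return Counter(event.hour for event in events)


# The two entries from the message, plus one unrelated (made-up) line.
sample = [
    "[Thu Aug 08 14:40:14 2013] [notice] jrApache[2937: 31182] returning error page for JRun too busy or out of memory",
    "[Thu Aug 08 15:50:09 2013] [notice] jrApache[1787: 63699] returning error page for JRun too busy or out of memory",
    "[Thu Aug 08 15:51:00 2013] [error] some unrelated message",
]
events = busy_events(sample)
hours = events_by_hour(events)
```

The resulting timestamps can then be lined up against the access log, /var/log/messages, or any other timestamped log to see what else was going on in the seconds before each event.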
Re: [ACFUG Discuss] Out of Memory?!?
My first question: do you have FusionReactor, or are you using CF's built-in monitoring? See what requests were running, or are running, when this happens.

Ajas Mohammed / iUseDropbox(http://db.tt/63Lvone9)
http://ajashadi.blogspot.com

On Thu, Aug 8, 2013 at 8:23 PM, Frank Moorman stretch...@franksdomain.net wrote:

FYI... This is what the user gets on their end:

Server Error
The server encountered an internal error and was unable to complete your request.
Application server is busy. Either there are too many concurrent requests or the server is still starting up.
Re: [ACFUG Discuss] Out of Memory?!?
I'm using the built-in server monitor. I do not have FusionReactor. However, that will be one of my top suggestions if I cannot figure it out. (The other suggestion would probably be to get the site owner to spring for Charlie's time.)

Unfortunately, I have not had my eye on the monitor at the time it happens... But I just configured the alerts section to start taking snapshots and to email me based on JVM memory usage or an unresponsive request.

On 08/08/2013 11:55 PM, Ajas Mohammed wrote:

My first question will be do you have 1. FusionReactor or are you using CF built in monitoring. See what requests were running or are running when this happens.
RE: [ACFUG Discuss] Out of Memory?!?
I'm looking at your error and seeing that it is merely reporting back that the server is busy but does not give a definitive cause. While it could be memory, I think I'd be looking for CPU usage issues. What does the CPU usage look like during this event? I ran into a similar problem a couple of years back when the company I worked for at the time went to virtual machines (albeit we were using 32 bit Windows and not Linux). A process was running that had nothing to do with Jrun or IIS/WWW that was sucking up the CPU all the way to 100% and there was no CPU left to process requests. I was getting the same 503 error that you get when you try to request a page from the server. Restarts only postponed the problem and our fix was simply to spin up a new virtual machine (a 64bit instance). We never could completely isolate the exact process that was eating up the CPU. Just a thought, hope you're able to find the problem quickly! From: ad...@acfug.org [mailto:ad...@acfug.org] On Behalf Of Ajas Mohammed Sent: Thursday, August 08, 2013 11:56 PM To: discussion@acfug.org Subject: Re: [ACFUG Discuss] Out of Memory?!? My first question will be do you have 1. FusionReactor or are you using CF built in monitoring. See what requests were running or are running when this happens. Ajas Mohammed / iUseDropbox(http://db.tt/63Lvone9) http://ajashadi.blogspot.com We cannot become what we need to be, remaining what we are. No matter what, find a way. Because thats what winners do. You can't improve what you don't measure. Quality is never an accident; it is always the result of high intention, sincere effort, intelligent direction and skillful execution; it represents the wise choice of many alternatives. On Thu, Aug 8, 2013 at 8:23 PM, Frank Moorman stretch...@franksdomain.netmailto:stretch...@franksdomain.net wrote: FYI... This is what the user gets on their end: Server Error The server encountered an internal error and was unable to complete your request. Application server is busy. 
Either there are too many concurrent requests or the server is still starting up. Also, I have not received any CF template errors at the times the 503 errors occur, nor any java.outofmemory errors etc. The server is running and available except for the one or two requests. Looking in the apache access log, There is no pattern to the pages that were requested, (but I know that I need to dig deeper and see what pages were requested right before the errors occur.) Thanks in advance for any help, Frank On 08/08/2013 07:41 PM, Frank Moorman wrote: All, I'm trying to figure out and determine a Jrun Out of Memory error. I get the following in my logs: [Thu Aug 08 14:40:14 2013] [notice] jrApache[2937: 31182] returning error page for JRun too busy or out of memory [Thu Aug 08 15:50:09 2013] [notice] jrApache[1787: 63699] returning error page for JRun too busy or out of memory It doesn't happen often, (maybe once or occasionally twice a business day) but as everyone understands, users aren't happy when it happens to them. This is a linux box, 64bit Centos 6, CF9 Enterprise, 64bit jvm version 1.7. (The jvm was installed separately from CF for security and coldfusion uses it.) I doubt it is actually an out of memory condition (though I could be wrong) The server has 6GB of physical memory and another 6GB of swap. It rarely needs to use swap. (i.e. I have not observed it.) The jvm is given significant memory to use as well. It is using a 64bit jvm with the settings of 1GB min JVM heap, as well as a 3GB max. When I look through the server monitor, it is normal to see 1 to 1.5GB allocated and between 100-750MB used. (I see a normal sawtooth pattern with the memory usage, so it looks like what I would expect from the garbage collection routing. It does spike occasionally but I have never seen it close to the 3GB max. (I've never even seen it hit 2GB used.) 
The server is set for 40 template requests. (I recently upped it from 10 to see if that was the problem, and it still occurred with the same frequency.) Flash remoting is set to 2, web services to 1, and CFC to 1. (These remote settings are only set for the monitor, as the server does not provide any web services outside the running application.) JRun is set to 50 requests and 1000 queued. (Enough to cover the CF requests.)

I looked at Charlie's blog... I have checked the logs, and other than the Apache error log (above) I do not see anything. I've checked the system /var/log/messages, and I've checked all the CF logs. (I also archived everything yesterday, and the CF logs are practically empty even after today's occurrence.) I did not find any of the JVM abort logs that Charlie mentioned in his blog. (I checked in the CF directory mentioned as well as the system logs and the actual JVM directory.) I also checked the JRun log (in /opt/jrun4/logs/cfusion-event.log) and was surprised because the only entries were months ago. (Because of the age of the log, I'm
RE: [ACFUG Discuss] Out of Memory?!?
Like you, I would think this is not memory related. I think that's just a really old error message, from the days when even the then-Macromedia engineers could only throw up their hands and guess when something was amiss.

I recently saw this error message happening for a client where we found (since they were on IIS) that the jrun_iis6_wildcard.logs (in [ColdFusion9]\runtime\lib\wsconfig\nn\LogFiles) had indications of errors also. I realize you're on Apache, and you say you looked at all the logs, but did you check out the logs in that wsconfig dir and its subdirs? It's just a stab in the dark whether any log messages there (around the same time) will be useful.

I would focus on something making the CF instance unresponsive. I know you said you raised the simultaneous request threads from 10 to 40, and it seemed fine at 10. But maybe you have new load, or a new problem that makes requests hang. As Ajas said, FR (or, as you're using it, the CF Server Monitor) can show you any running requests (the CFSM only shows them if you turn on "start monitoring"). If you can be on when it happens, you may be surprised what you find. If all 10 (or now 40) are hung, even if only for a while, that could lead to the error: not that CF is down, but that the connector thinks it can't be reached.

And as you noted in a later message, turning on the alerts will help (in either CFSM, again where "start monitoring" must be enabled for them to work, or in FR, or SeeFusion), as that will give you info even when you can't be watching the monitors. Since you're using the CFSM, and you say you configured the alerts, did you confirm that you get the email they send? There's no test feature. What I do is set the memory alert to below the current memory used, which should trigger an alert within a few minutes. But then I turn that alert off.
I find it useless, since the JVM (since 1.5) can often let used memory climb to the max before deciding to do a major GC, so you can get those memory alerts when there's no real problem, if indeed a GC at that point would have collected a lot of memory that was no longer really in use.

But I do recommend the slow server alert in the CFSM, or the running requests alert in FR. For almost everyone, having many requests running at once is a canary in the coal mine indicating that problems may be afoot. The question then is whether the alert shows many slow requests. If it just shows many fast ones, then that is just a sign of a lot of traffic, and if it's being handled fast, you need to increase the max simultaneous requests, and the alert level in whatever monitor you're using.

And be careful about setting the other values in request tuning so low (web services, flash remoting, and remote CFCs). There's never a harm in them being more than you need. But if they are less than you need, that could be where a bottleneck happens. I know you say you don't serve web services, but I've seen shops have their own CF pages calling their own CFCs as web services. If that request limit is low, it becomes a single-threading bottleneck. Or maybe you DO have code calling CFCs remotely (via ajax). As for flash remoting, the monitor (and FR) use it, and your own code may too (even if unexpectedly). Again, why constrict them? If you don't use them, there's no harm in them being larger (like 5 or 10 each).

Finally, note that you could have CF requests using either cfthread or reporting, and there are limits for each of those (configurable in the admin). And though you are not using CF Standard, I'll say for other readers that they could have all this sort of problem caused by using some tag that is itself single-threaded in CF Standard, as are many tags, including cfdocument, cfpdf, and more. That could cause a low-traffic site to still have hung requests.
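The "many slow requests versus many fast ones" question above can be written down as a small triage rule. This is only an illustrative sketch: the RunningRequest record, the triage function, and the thresholds (10 seconds counts as slow, 80% of the limit counts as busy, half the requests being slow counts as hung) are all made-up assumptions, and none of it reflects how the CFSM or FR actually evaluate their alerts.

```python
from dataclasses import dataclass


@dataclass
class RunningRequest:
    # Hypothetical record for one entry in a running-requests snapshot.
    template: str           # page being executed
    elapsed_seconds: float  # how long it has been running so far


def triage(snapshot, max_simultaneous, slow_after=10.0, busy_ratio=0.8):
    """Classify a snapshot of running requests.

    All thresholds are arbitrary illustrative defaults, not product settings.
    """
    # Few requests running compared to the limit: nothing to act on.
    if len(snapshot) < busy_ratio * max_simultaneous:
        return "ok"
    slow = [r for r in snapshot if r.elapsed_seconds >= slow_after]
    # Near the limit and mostly slow: something is making requests hang.
    if len(slow) >= len(snapshot) / 2:
        return "slow requests: investigate what is hanging"
    # Near the limit but requests are quick: plain traffic volume.
    return "busy but fast: consider raising the request limit"
```

For example, nine requests that have each been running for 45 seconds against a limit of 10 would land in the "slow requests" branch, while nine requests at 0.2 seconds each would land in "busy but fast".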
Let us know if any of that helps, or not. But yes, if it remains and you don't solve it, I am available for consulting, and with my satisfaction guarantee, you don't have to pay for time you don't feel is valuable.

/charlie