Hi Johan,

Does the same problem occur with mod_proxy (full HTTP) instead of mod_proxy_ajp?
We have encountered some problems with mod_proxy_ajp that were solved by using simple mod_proxy.

Cheers, buddy.
André

-----Original Message-----
From: Johan Cwiklinski <johan.cwiklin...@ajlsm.com>
Date: Wed, 22 Dec 2010 08:51:51
To: <users@cocoon.apache.org>
Reply-To: users@cocoon.apache.org
Subject: Tomcat6/Cocoon 2.1.10 using 100% CPU on windows

Hello,

I have a problem with a Cocoon 2.1.10 webapp running under Tomcat 6.0.26 on Windows Server 2003 64-bit with Oracle's JDK 1.6.0_21.

This application is installed on a 'background' server; an application on another server sends requests to it via AJP, using Apache mod_proxy_ajp.

For some reason, the application often eats 100% of the CPU; we then need to kill and restart Tomcat. The logs say absolutely nothing :(
I have not yet been able to reproduce the issue in my dev environment.

This application mainly uses some classes we've developed on top of Cocoon that will:
- search for an image file in some directories on different hard disks (mainly by testing each directory + image path and checking whether the file exists),
- retrieve and show the image,
- additionally use ImageMagick to resize, rotate, etc.

The 'main' class extends Cocoon's ResourceReader.

Using the jvisualvm tool provided with Oracle's JDK, I can observe that:
- ajp threads are sometimes running, and sometimes waiting; ok, that seems normal,
- when the 100% CPU issue occurs, some ajp threads keep running (they never get back to the waiting state). At the beginning, only one or two threads are affected; many more will be if we wait.

I can also observe that some threads (only a few, unfortunately) still behave normally.
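As an aside on the image lookup described earlier in this message (testing each directory + image path until the file exists), a minimal sketch of that step could look like the following. The class name `ImageLocator` and the method `findImage` are hypothetical and are not taken from the actual ImageMagickReader code, which is not shown in this thread.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cocoon or of the application discussed here.
public class ImageLocator {

    /**
     * Tests each root directory + relative image path in order and returns
     * the first candidate that exists as a regular file, or null if none does.
     */
    public static File findImage(List<File> roots, String relativePath) {
        for (File root : roots) {
            File candidate = new File(root, relativePath);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }

    // Usage: java ImageLocator <relativePath> <root1> [<root2> ...]
    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: ImageLocator <relativePath> <root>...");
            return;
        }
        List<File> roots = new ArrayList<File>();
        for (int i = 1; i < args.length; i++) {
            roots.add(new File(args[i]));
        }
        File found = findImage(roots, args[0]);
        System.out.println(found == null ? "not found" : found.getPath());
    }
}
```

Each call touches the file system once per root directory, so the cost grows with the number of directories tested.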
All running threads using our class (ImageMagickReader) seem to be more or less blocked in the super.setup or super.generate methods:

"ajp-9009-9" - Thread t...@65
   java.lang.Thread.State: RUNNABLE
        at java.util.HashMap.get(HashMap.java:303)
        at org.apache.cocoon.reading.ResourceReader.getLastModified(ResourceReader.java:242)
        at org.apache.cocoon.reading.ResourceReader.setupHeaders(ResourceReader.java:177)
        at org.apache.cocoon.reading.ResourceReader.setup(ResourceReader.java:157)
        at org.pleade.reading.ImageMagickReader.setup(ImageMagickReader.java:272)
        [...]

Line 242 of ResourceReader.java is:
final String systemId = (String) documents.get(request.getRequestURI());

"ajp-9009-8" - Thread t...@102
   java.lang.Thread.State: RUNNABLE
        at java.util.HashMap.transfer(HashMap.java:484)
        at java.util.HashMap.resize(HashMap.java:463)
        at java.util.HashMap.addEntry(HashMap.java:755)
        at java.util.HashMap.put(HashMap.java:385)
        at org.apache.cocoon.reading.ResourceReader.generate(ResourceReader.java:346)
        at org.pleade.reading.ImageMagickReader.generate(ImageMagickReader.java:584)
        [...]

Line 346 of ResourceReader.java is:
documents.put(request.getRequestURI(), inputSource.getURI());

Those two examples are taken from the first threads that never get released. I do not know whether it is possible for a HashMap to be somehow corrupted; or maybe the HTTP headers? I'm not even sure whether what we're seeing is the cause or the consequence of the issue :(

The same issue was observed in the past on another server, which is now running under GNU/Linux and now seems to be ok (about two weeks under Linux, and no more 100% CPU!).

We've tried several Tomcat and Java versions; that changes nothing.

The issue can occur after several hours of uptime, or after only a few minutes! If there are many connections, the issue occurs more often; but it is still present with just a few connections.

I really do not know where the problem could be... Is it our code? Is it Cocoon? Is it Tomcat?
Or, more probably, something one of them is doing that Windows dislikes?

It's difficult to know exactly when the problem happens (we've asked the system administrators but got no answer); so I've not yet tried to log in debug mode (well, I tried it once, but it is really verbose...).

Any ideas? I do not know what to try or where to look now :/

Many similar issues I could find on the web were related to a bug in tcnative under Tomcat 5.5; I guess that is resolved now, as I did not find any similar bug under Tomcat 6. I also found a few with Tomcat 6, but some were related to the apps, and the others were not resolved (at least there is no information on the forums/mailing lists saying that they were resolved and what the issue was).

You can take a look at the whole thread dump, taken after about 10-15 minutes of 100% CPU usage:
http://ouessant2.ajlsm.com/cocoon_app_cpu_issue

The two threads I gave as examples (ajp-9009-8 and ajp-9009-9) were started approximately when the server ran out of CPU, and are still in the same state 10-15 minutes later.

Thank you.

Regards,
-- 
Johan Cwiklinski
AJLSM
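On the question raised above of whether a HashMap can be "sort of corrupted": java.util.HashMap is documented as not synchronized, and when several threads modify the same instance without external synchronization, a resize can leave its internal bucket chains inconsistent. On JDK 6 this is known to be able to make get() or transfer() spin forever, which would be consistent with RUNNABLE threads that never leave those frames. Whether the `documents` map in ResourceReader is really shared between request threads is an assumption here; the message does not show its declaration. Under that assumption, a minimal sketch of a thread-safe URI cache could look like this (the class `UriCache` and its methods are hypothetical names, not Cocoon API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a request-URI -> system-id cache shared by all
// request threads. ConcurrentHashMap tolerates concurrent put/get, unlike a
// plain HashMap. Note that it rejects null keys and null values.
public class UriCache {

    private static final Map<String, String> DOCUMENTS =
            new ConcurrentHashMap<String, String>();

    public static void remember(String requestUri, String systemId) {
        DOCUMENTS.put(requestUri, systemId);
    }

    public static String lookup(String requestUri) {
        return DOCUMENTS.get(requestUri);
    }

    public static int size() {
        return DOCUMENTS.size();
    }

    // Small stress run: 8 threads each store and read back 10000 distinct keys.
    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[8];
        for (int t = 0; t < workers.length; t++) {
            final int id = t;
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < 10000; i++) {
                        String uri = "/img/" + id + "/" + i;
                        remember(uri, "file:" + uri);
                        lookup(uri);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        // All keys are distinct, so the cache should hold 8 * 10000 entries.
        System.out.println(size());
    }
}
```

Wrapping an existing map with Collections.synchronizedMap would be another option; either way the point is only that every thread goes through a map that is safe for concurrent access. This sketch does not by itself establish that the shared map is the cause of the 100% CPU usage.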