Re: [OT] Re: jvm exits without trace
Taylan,

On 3/30/2010 5:51 AM, Taylan Develioglu wrote:
> We're still having some crashes (segfaults), they may or may not be related to the previous ones. :( It seems libapr had something to do with it.
>
> [snip]
> # Problematic frame:
> # C [libapr-1.so.0+0x1c0a0]
> [snip]
> Stack: [0x41d86000,0x41da7000], sp=0x41da57d0, free space=7d0018k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C [libapr-1.so.0+0x1c0a0]
>
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j org.apache.tomcat.jni.Socket.accept(J)J+0

So, this happens in APR's Socket.accept method? I suspect that if something were wrong in that method, it would be apparent to roughly 50% of the world's web servers.

Do you have large electromagnetic flux sources anywhere near your servers? Your problems just don't seem to make any kind of sense whatsoever. Can you bring a server to your home and hit it with JMeter for a week or so and see if you can get it to crash?

Maybe you have toxic mold in your rack and the JVM is trying to commit suicide to put itself out of its misery...

-chris
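When comparing several crash reports, it can help to pull the library name and offset out of each "Problematic frame" line so that identical faulting addresses are easy to spot. The following is only a minimal sketch: the class name `ProblematicFrame` is made up for illustration, and the pattern assumes the native-frame form quoted in this thread (`C [library+0xOFFSET]`); Java frames are written differently and are deliberately not matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProblematicFrame {
    // Frame type letter, then "[library+0xOFFSET]", e.g. "# C [libapr-1.so.0+0x1c0a0]".
    private static final Pattern FRAME =
            Pattern.compile("([CJjVv])\\s+\\[([^\\]+]+)\\+0x([0-9a-fA-F]+)\\]");

    final char type;       // C = native code, J/j = Java code, V/v = VM code
    final String library;  // e.g. "libapr-1.so.0"
    final long offset;     // offset of the faulting instruction inside the library

    private ProblematicFrame(char type, String library, long offset) {
        this.type = type;
        this.library = library;
        this.offset = offset;
    }

    /** Returns the parsed frame, or null if the line is not in the expected form. */
    static ProblematicFrame parse(String line) {
        Matcher m = FRAME.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new ProblematicFrame(
                m.group(1).charAt(0), m.group(2), Long.parseLong(m.group(3), 16));
    }

    public static void main(String[] args) {
        ProblematicFrame f = parse("# C [libapr-1.so.0+0x1c0a0]");
        System.out.println(f.type + " " + f.library + " +0x" + Long.toHexString(f.offset));
    }
}
```

The offset alone does not name a function; it would still need to be resolved against the symbols of the exact library build that crashed.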
RE: [OT] Re: jvm exits without trace
From: Christopher Schultz [mailto:ch...@christopherschultz.net]
Subject: Re: [OT] Re: jvm exits without trace

> So, this happens in APR's Socket.accept method? I suspect that if something were wrong in that method, it would be apparent to roughly 50% of the world's web servers.

Depends. Try turning off IPv6 and see what happens. Some versions of glibc on some platforms seem to have problems with it.

 - Chuck
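The suggestion above most likely means disabling IPv6 at the operating-system level. A JVM-scoped alternative, which is an assumption here and not something proposed in the thread, is the standard `java.net.preferIPv4Stack` system property, passed as `-Djava.net.preferIPv4Stack=true` when the JVM starts. The sketch below (class name `Ipv4Preference` is hypothetical) only reports whether that property is set, which is handy for confirming that a startup script change actually reached the running JVM.

```java
public class Ipv4Preference {
    /** True when the system property java.net.preferIPv4Stack is set to "true". */
    static boolean prefersIpv4Stack() {
        // Boolean.getBoolean reads the named system property and parses it as a boolean.
        return Boolean.getBoolean("java.net.preferIPv4Stack");
    }

    public static void main(String[] args) {
        System.out.println("java.net.preferIPv4Stack=" + prefersIpv4Stack());
    }
}
```

Note that the property needs to be supplied on the command line to influence the networking stack; setting it later from application code is not a reliable way to change socket behaviour. Whether it affects sockets created by native connectors such as APR is not established in this thread.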
Re: [OT] Re: jvm exits without trace
>> [snip]
>> # Problematic frame:
>> # C [libapr-1.so.0+0x1c0a0]
>> [snip]
>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>> j org.apache.tomcat.jni.Socket.accept(J)J+0
>
> So, this happens in APR's Socket.accept method? I suspect that if something were wrong in that method, it would be apparent to roughly 50% of the world's web servers.

That is TC-Native's Socket.accept method.

If that happens at shutdown, then that is a known issue:
https://issues.apache.org/bugzilla/show_bug.cgi?id=48584

It is already fixed in native 1.1.20.

Best regards,
Konstantin Kolinko
Re: [OT] Re: jvm exits without trace
2010/4/1 Konstantin Kolinko knst.koli...@gmail.com:
>>> [snip]
>>> # Problematic frame:
>>> # C [libapr-1.so.0+0x1c0a0]
>>> [snip]
>>> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>>> j org.apache.tomcat.jni.Socket.accept(J)J+0
>>
>> So, this happens in APR's Socket.accept method? I suspect that if something were wrong in that method, it would be apparent to roughly 50% of the world's web servers.
>
> That is TC-Native's Socket.accept method.

Actually, yes, it is some APR method, called from TC-Native's Socket.accept().

I was a bit confused that BZ 48584 mentions only tcnative-1.dll, but that is because APR and TC-Native are statically linked into a single DLL in the Windows versions of TC-Native.

The error should be happening in APR (because another thread already destroyed one of the APR pools during shutdown -- see BZ 48584 for details).

There are certainly some native call frames between the Java class org.apache.tomcat.jni.Socket and that native frame of libapr-1.so (because Java code calls its native counterpart, and not APR directly), but those are not shown in the stack dump.

Best regards,
Konstantin Kolinko
Re: jvm exits without trace
On 27 March 2010 00:22, David Kerber dcker...@verizon.net wrote:
> BTW, after looking back at my development notes, the jdbc-odbc bridges, while a pain to set up, have better performance than the type 4 drivers that Sybase also offers.

That's frightening, given the extra layers of code for the ODBC bridge. If I were Sybase, I'd be ashamed of my code quality - or trying to implement the missing features that cause your application to behave better with the ODBC bridge.

- Peter
Re: jvm exits without trace
Peter Crowther wrote:
> On 27 March 2010 00:22, David Kerber dcker...@verizon.net wrote:
>> BTW, after looking back at my development notes, the jdbc-odbc bridges, while a pain to set up, have better performance than the type 4 drivers that Sybase also offers.
>
> That's frightening, given the extra layers of code for the ODBC bridge. If I were Sybase, I'd be ashamed of my code quality - or trying to implement the missing features that cause your application to behave better with the ODBC bridge.

Except that with the bridge, much of the work can be done in highly optimized C code (there's a separate driver for each platform), rather than in Java, which probably makes up for the extra layer. In addition, they say that ODBC is a native interface for SQLAnywhere, so there are fewer layers than most ODBC implementations would have.

D
Re: Re: jvm exits without trace
0x07fefb09 - 0x07fefb098000 C:\Windows\system32\rasadhlp.dll

VM Arguments:
jvm_args: -Dcatalina.base=E:\TomcatClients\Pelican -Dcatalina.home=C:\Program Files\Apache Software Foundation\Tomcat 5.5 -Djava.endorsed.dirs=C:\Program Files\Apache Software Foundation\Tomcat 5.5\common\endorsed -Djava.io.tmpdir=E:\TomcatClients\Pelican\temp -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.util.logging.config.file=C:\Program Files\Apache Software Foundation\Tomcat 5.5\conf\logging.properties -DWebSira.configFileName=E:\TomcatClients\Pelican\PelicanWebSIRA.properties vfprintf -Xms256m -Xmx256m
java_command: unknown
Launcher Type: generic

Environment Variables:
PATH=C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Program Files\Intel\DMIX;C:\Program Files\SAS\SharedFiles(32)\Formats;C:\Program Files\Sybase\SQL Anywhere 9\x64;C:\Program Files\Sybase\SQL Anywhere 9\win32;C:\Program Files\Sybase\Shared\win32
USERNAME=SAS2$
OS=Windows_NT
PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 23 Stepping 10, GenuineIntel

--- S Y S T E M ---

OS: Windows Server 2008 Build 6002 Service Pack 2
CPU:total 4 (4 cores per cpu, 1 threads per core) family 6 model 23 stepping 10, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1
Memory: 4k page, physical 8381784k(4859168k free), swap 16815840k(13318972k free)
vm_info: Java HotSpot(TM) 64-Bit Server VM (14.3-b01) for windows-amd64 JRE (1.6.0_17-b04), built on Oct 11 2009 00:46:08 by java_re with MS VC++ 8.0
time: Fri Mar 26 10:21:01 2010
elapsed time: 80912 seconds

On 3/22/2010 3:33 PM, Carl wrote:
> Dan,
>
> 6u18 did not work for us, crashed with the same regularity as 6u17. However, 6u7 has been running for two weeks without a failure (the others would fail between 15 minutes and 10 days runtime.)
>
> Thanks,
> Carl
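The hs_err excerpt above ends with `time: Fri Mar 26 10:21:01 2010` and `elapsed time: 80912 seconds`. Converting the elapsed time and subtracting it from the crash time gives the JVM start time, which is useful when lining the crash up against service or application logs. This is a small sketch with a made-up class name (`CrashUptime`); it assumes both values are in the server's local time and ignores any daylight-saving change in between.

```java
import java.time.Duration;
import java.time.LocalDateTime;

public class CrashUptime {
    /** Formats a duration as hours, minutes and seconds. */
    static String format(Duration d) {
        return String.format("%dh %02dm %02ds",
                d.toHours(), d.toMinutes() % 60, d.getSeconds() % 60);
    }

    /** JVM start time = crash time minus the elapsed time reported in the log. */
    static LocalDateTime startTime(LocalDateTime crashTime, Duration uptime) {
        return crashTime.minus(uptime);
    }

    public static void main(String[] args) {
        Duration uptime = Duration.ofSeconds(80912);                 // "elapsed time" from the log
        LocalDateTime crash = LocalDateTime.of(2010, 3, 26, 10, 21, 1); // "time" from the log
        System.out.println("uptime:  " + format(uptime));
        System.out.println("started: " + startTime(crash, uptime));
    }
}
```

For the values in the log this works out to an uptime of 22h 28m 32s, i.e. a start at 11:52:29 on 25 March 2010.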
Re: jvm exits without trace
David,

On 3/26/2010 1:40 PM, David kerber wrote:
> It looks like we may have had one of these this AM, but under somewhat different conditions: Windows server 2008, JVM 1.6.0_17 64-bit Server VM, TC 5.5.28. No tracks in any of the TC logs, and the Windows event viewer had the singularly un-useful "The Tomcat5 service terminated unexpectedly. It has done this 1 time(s)." I did find this jvm error log, though, which appears to point to my jdbc driver:

Yikes: whenever I see jdbcodbc I cringe!

> # EXCEPTION_ACCESS_VIOLATION (0xc005) at pc=0x775c8926, pid=4860, tid=4272
> [snip]
> #
> # JRE version: 6.0_17-b04
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (14.3-b01 mixed mode windows-amd64 )
> # Problematic frame:
> # C [ntdll.dll+0x48926]

Looks like whatever ntdll.dll is. That doesn't sound like a JDBC driver cause to me.

The stack trace shows all Java code except the call into ntdll.dll. Without knowing what function was being called, you can't really even speculate what the problem might be.

Looks like you've broken a kernel call, though :)

-chris
RE: jvm exits without trace
From: Christopher Schultz [mailto:ch...@christopherschultz.net]
Subject: Re: jvm exits without trace

> Yikes: whenever I see jdbcodbc I cringe!

As everyone should.

> Looks like whatever ntdll.dll is. That doesn't sound like a JDBC driver cause to me.

It's not the JDBC driver per se, but the JDBC driver called it. You pretty much have to go through ntdll.dll on all calls into the Windows kernel.

> The stack trace shows all Java code except the call into ntdll.dll.

Java code can't call ntdll.dll directly - that has to be done from some other native code.

> Looks like you've broken a kernel call, though :)

Just gave it a VA that's not valid for the process to access, so the Windows kernel blew up the thread with an access violation - as it should.

This appears to be completely unrelated to the previously discussed JVM terminations.

 - Chuck
Re: jvm exits without trace
On 3/26/2010 3:19 PM, Christopher Schultz wrote:
> David,
>
> On 3/26/2010 1:40 PM, David kerber wrote:
>> It looks like we may have had one of these this AM, but under somewhat different conditions: Windows server 2008, JVM 1.6.0_17 64-bit Server VM, TC 5.5.28. No tracks in any of the TC logs, and the Windows event viewer had the singularly un-useful "The Tomcat5 service terminated unexpectedly. It has done this 1 time(s)." I did find this jvm error log, though, which appears to point to my jdbc driver:
>
> Yikes: whenever I see jdbcodbc I cringe!

What alternative is there, without going to a java database? This is making calls to a database server elsewhere on the network, not something embedded in the app.

>> # EXCEPTION_ACCESS_VIOLATION (0xc005) at pc=0x775c8926, pid=4860, tid=4272
>> [snip]
>> #
>> # JRE version: 6.0_17-b04
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (14.3-b01 mixed mode windows-amd64 )
>> # Problematic frame:
>> # C [ntdll.dll+0x48926]
>
> Looks like whatever ntdll.dll is. That doesn't sound like a JDBC driver cause to me.

Other than the fact that it's part of the windows kernel/core, I don't know either. My guess was that the jodbc driver was making a bad call to the kernel.

> The stack trace shows all Java code except the call into ntdll.dll. Without knowing what function was being called, you can't really even speculate what the problem might be.
>
> Looks like you've broken a kernel call, though :)

Yep
RE: jvm exits without trace
From: David kerber [mailto:dcker...@verizon.net]
Subject: Re: jvm exits without trace

> What alternative is there, without going to a java database?

*Every* real database that I'm aware of supplies type 4 JDBC drivers; none use the mind-bogglingly bad JDBC-ODBC bridge. Choose your poison: Oracle, SQL Server, MySQL, PostgreSQL, ...

 - Chuck
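One quick way to tell which kind of driver an application is configured for is the JDBC URL: connections through the JDBC-ODBC bridge use the `jdbc:odbc:` prefix followed by an ODBC data source name, whereas type 4 drivers use a vendor-specific prefix such as `jdbc:mysql://` or `jdbc:postgresql://`. The check below is a minimal sketch; the class name `JdbcUrlKind`, the data source name and the host names are all invented for illustration.

```java
import java.util.Locale;

public class JdbcUrlKind {
    /** True when the URL addresses the JDBC-ODBC bridge rather than a pure-Java driver. */
    static boolean usesOdbcBridge(String jdbcUrl) {
        return jdbcUrl.toLowerCase(Locale.ROOT).startsWith("jdbc:odbc:");
    }

    public static void main(String[] args) {
        // Hypothetical URLs; only the prefixes matter here.
        String[] urls = {
            "jdbc:odbc:ExampleDsn",
            "jdbc:mysql://db.example.com:3306/app",
            "jdbc:postgresql://db.example.com/app"
        };
        for (String url : urls) {
            System.out.println(url + " -> bridge=" + usesOdbcBridge(url));
        }
    }
}
```

A bridge URL means every database call passes through native ODBC code inside the JVM process, which is why a fault there can take the whole JVM down instead of surfacing as a Java exception.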
Re: jvm exits without trace
On 3/26/2010 3:27 PM, Caldarale, Charles R wrote:
>> Yikes: whenever I see jdbcodbc I cringe!
>
> As everyone should.
>
>> Looks like whatever ntdll.dll is. That doesn't sound like a JDBC driver cause to me.
>
> It's not the JDBC driver per se, but the JDBC driver called it. You pretty much have to go through ntdll.dll on all calls into the Windows kernel.
>
>> The stack trace shows all Java code except the call into ntdll.dll.
>
> Java code can't call ntdll.dll directly - that has to be done from some other native code.
>
>> Looks like you've broken a kernel call, though :)
>
> Just gave it a VA that's not valid for the process to access, so the Windows kernel blew up the thread with an access violation - as it should.
>
> This appears to be completely unrelated to the previously discussed JVM terminations.

Ok, I'll accept that. I really didn't know, but wanted a second pair of eyes to look at it. I thought there was a chance it might be related because the failure was nearly (though not completely) silent.

Thanks for the comments!

D
Re: jvm exits without trace
On 3/26/2010 3:39 PM, Caldarale, Charles R wrote:
>> What alternative is there, without going to a java database?
>
> *Every* real database that I'm aware of supplies type 4 JDBC drivers; none use the mind-bogglingly bad JDBC-ODBC bridge. Choose your poison: Oracle, SQL Server, MySQL, PostgreSQL, ...

Oh, yeah. Brain fart on my part...

D
Re: [OT] Re: jvm exits without trace
On Tue, Mar 16, 2010 at 4:46 PM, Carl c...@etrak-plus.com wrote:
> My approach is to get something (a JVM) that works and then gradually change until it breaks. Then, I know what is causing the problem. To date, I haven't been able to get a JVM that works.

I have had a lot of issues finding stable JVMs since I moved our software from 1.5 to 1.6... 1.6 has been a mess for ages under our workload on CentOS. Code that ran fine under 1.5 would segfault in all sorts of random places on 1.6. I even tried the 1.7 openJDK early builds... they were even worse.

I was pleased to see that the most recent release

http://java.sun.com/javase/6/webnotes/6u18.html

contains _tons_ of bug fixes, including lots of crash fixes. Seriously... the last few releases have contained only a handful of fixes... this release has hundreds.

I'm testing it now, and so far, it looks promising. This may be the first 1.6 release that I've found that doesn't crash with my workload.

Unfortunately, I've never been able to pin down a sequence of events that would cause the crash on demand, so it's hard for me to verify that things are finally fixed, other than doing long-term load testing and waiting for the crashes to (hopefully) not happen.

Dan
Re: [OT] Re: jvm exits without trace
Dan,

6u18 did not work for us, crashed with the same regularity as 6u17. However, 6u7 has been running for two weeks without a failure (the others would fail between 15 minutes and 10 days runtime.)

Thanks,

Carl
RE: [OT] Re: jvm exits without trace
> -----Original Message-----
> From: Dan Armbrust [mailto:daniel.armbrust.l...@gmail.com]
> Sent: Monday, March 22, 2010 12:17 PM
> To: Tomcat Users List
> Subject: Re: [OT] Re: jvm exits without trace
>
> On Tue, Mar 16, 2010 at 4:46 PM, Carl c...@etrak-plus.com wrote:
>> My approach is to get something (a JVM) that works and then gradually change until it breaks. Then, I know what is causing the problem. To date, I haven't been able to get a JVM that works.
>
> I have had a lot of issues finding stable JVMs since I moved our software from 1.5 to 1.6... 1.6 has been a mess for ages under our workload on CentOS. Code that ran fine under 1.5 would segfault in all sorts of random places on 1.6. I even tried the 1.7 openJDK early builds... they were even worse.
>
> I was pleased to see that the most recent release http://java.sun.com/javase/6/webnotes/6u18.html contains _tons_ of bug fixes, including lots of crash fixes. Seriously... the last few releases have contained only a handful of fixes... this release has hundreds.

I'm running 1.6.0_18 under OpenSuSE 11.0-11.2 and I've only had one problem. It looks like a GLIBC error related to IPV6. I disabled IPV6 on the machine and it's been rock solid since. I don't use native connectors, or AJP.

I do about 900,000 hits with 4GB/Day spread across 3 servers and it's just rock steady. I run the tomcat instances for 2-3 weeks each before re-starting. For each machine, that's about 6 million hits with about 40 GB of transfer.

George Sexton
MH Software, Inc.
303 438-9585
www.mhsoftware.com
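The per-machine totals quoted above can be sanity-checked with a little arithmetic. The sketch below (class name `TrafficEstimate` is invented) assumes that both 900,000 hits and 4 GB are daily totals for the whole cluster, that load is spread evenly over the 3 servers, and that an instance runs for 14 to 21 days between restarts; none of these assumptions is confirmed in the message.

```java
public class TrafficEstimate {
    /** Hits handled by one server over the given number of days, assuming an even split. */
    static long hitsPerServer(long totalHitsPerDay, int servers, int days) {
        return totalHitsPerDay / servers * days;
    }

    /** Gigabytes transferred by one server over the given number of days, assuming an even split. */
    static double gigabytesPerServer(double totalGbPerDay, int servers, int days) {
        return totalGbPerDay * days / servers;
    }

    public static void main(String[] args) {
        for (int days : new int[] {14, 21}) {
            System.out.println(days + " days: "
                    + hitsPerServer(900_000L, 3, days) + " hits, "
                    + gigabytesPerServer(4.0, 3, days) + " GB per server");
        }
    }
}
```

Under these assumptions three weeks comes to 6,300,000 hits per server, in line with the "about 6 million" quoted. The transfer works out nearer 28 GB than 40 GB, so the daily figures are presumably rounded or the split is uneven.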
RE: [OT] Re: jvm exits without trace
Carl-

enable logging FileHandler and set level to finest, e.g.

org.apache.catalina.level=FINEST

http://tomcat.apache.org/tomcat-6.0-doc/logging.html

hth
Martin Gainty
Re: [OT] Re: jvm exits without trace
On 22/03/2010 22:12, Martin Gainty wrote:
> Carl-
> enable logging FileHandler and set level to finest e.g. org.apache.catalina.level=FINEST

Not recommended unless you actually want Tomcat to take over 20 minutes to start up. Added to which, that isn't going to help track down this JVM bug.

Carl, I think you are safe to ignore this suggestion.

Mark
Re: [OT] Re: jvm exits without trace
George,

I tried OpenSuSE with 6u17 and the results were about the same as Slackware 64, so the implication is that there is something in my app that is triggering the seg fault. My app is mostly JSPs running against MySQL with some AJAX and Flash (communicating through a servlet.) I do not use any native code nor do I use AJP. I do use SSL for everything. It has also failed on different hardware, although the OS was the same.

The argument against the problem being in my app is that the app runs fine (has for several years) on 32 bit Slackware and it now seems to be running fine on 6u7 (on Slackware 64.)

Now that I appear to have a stable system that I can always go back to, I would like to get to the root cause. My thought was that I could get the source code, compile it on the server and then wait for it to fail with a core file that could give me the answer (maybe.) However, I can only find source code for the 'open' fork, which may or may not be the same, so I am stymied again. I guess I just have to keep stabbing in the dark with the new JVMs until one happens to work... not very scientific.

Thanks,

Carl
Re: [OT] Re: jvm exits without trace
Here's a hs_err file after a crash I had yesterday.

We turned off some things in our code without restarting and the crashes have virtually stopped, but we do still get the odd one here and there where the application has not been restarted. It could be that the problem lingers and builds up over time, who knows.

It's a sigsegv in GCTaskThread. From the occupation in eden it looks like it happened during a scavenge (ParNew).

Maybe an expert in some dark cave could shed some more light on it.

On Tue, 2010-03-16 at 22:00 +0100, André Warnier wrote:
> Carl wrote:
>> My approach is to get something (a JVM) that works and then gradually change until it breaks. Then, I know what is causing the problem. To date, I haven't been able to get a JVM that works.
>
> I think we understand that, and agree. Our remarks were tongue in cheek, if that is the right expression. At the bottom of things, finding a bug in the most recent JVM would be much more globally important than finding it in your applications, particularly a bug that can cause the JVM to segfault.
Re: [OT] Re: jvm exits without trace
On 17 March 2010 10:16, Taylan Develioglu tdevelio...@ebuddy.com wrote:
> Here's a hs_err file after a crash I had yesterday.

The list usually strips attachments; could you paste it?

> It's a sigsegv in GCTaskThread. From the occupation in eden it looks like it happened during a scavenge (ParNew).

Ick. To roughly quote Dan Ingalls (who wrote several of the Smalltalk VMs, including their garbage collectors): studying the behaviour of objects in the garbage collector is like studying frogs using a blender. I've tried modifying Squeak's GC algorithm, and that pretty much sums up my experiences!

I'd expect seg faults in a GC thread to occur if, for some reason, pointers were getting corrupted on the stack or in the heap, or if a concurrency issue caused a read or write to the old location of an object after it had been moved. The GC is very sensitive to such things as it traces large portions of object memory, so may well stumble across corruption before any other thread does so.

- Peter
Re: [OT] Re: jvm exits without trace
On 17 March 2010 10:22, Taylan Develioglu tdevelio...@ebuddy.com wrote:
> Of course this works better if I really attach the file.
> [...]
> Java Threads: ( => current thread )
> 0x7f3d3c174000 JavaThread MSN-6488 daemon [_thread_in_native, id=28966, stack(0x42a4f000,0x42a7)]
> [...]

Are you *absolutely sure* you pin down or copy *everything* you access from all of your native code so that the Java garbage collector cannot move it?

- Peter
Re: jvm exits without trace
With parent I meant the main JVM process as opposed to forked processes or threads, sorry to confuse you there. Stracing the threads generates too much data to store, so I had to settle for the parent process.

To answer your other questions: the code is 100% pure Java. Why it causes this messy crash is still unclear, but development is working to figure it out.

I'll follow up when we find out more, but I'm not sure we're likely to dig into the root cause; working around it is more of a priority right now than debugging the JVM.

On Mon, 2010-03-15 at 17:08 +0100, Christopher Schultz wrote:
> Taylan,
>
> On 3/15/2010 10:19 AM, Taylan Develioglu wrote:
>> The cause for the crashes was in our own application code, we're currently investigating the exact reason.
>
> Yeah, I'd like to second Chuck's question: was it native code?
>
>> A strace of the parent process shows killed by sigsegv, why or how this can happen is still unclear.
>
> So, the parent was being killed? What was the parent of the JVM?
>
>> Thanks to everyone that gave their assistance.
>
> Definitely follow-up to let us all know what you've uncovered... this was certainly a weird situation.
>
> -chris
Re: jvm exits without trace
Taylan,

I have had a similar problem that is as yet unsolved (see the thread 'Tomcat dies suddenly'). In my case, the death left a core file which showed the JVM stopped with a seg fault. A week ago yesterday, we switched to the Sun 1.6.0_7 JVM (from 1.6.0_17 and 1.6.0_18), as Chuck suggested, and so far it is running even though we have had loads and usage similar to those that caused crashes in the past. Therefore, you might consider trying that JVM.

Hope I haven't jinxed myself by saying it is still up.

Thanks,

Carl

- Original Message -
From: Taylan Develioglu tdevelio...@ebuddy.com
To: Tomcat Users List users@tomcat.apache.org
Sent: Tuesday, March 16, 2010 7:41 AM
Subject: Re: jvm exits without trace

> The code is 100% pure java, why it causes this messy crash is still unclear but development is working to figure it out.
> [snip]
[OT] Re: jvm exits without trace
Hi,

The 'Tomcat dies suddenly' thread was almost as exciting as the Prison Break TV series. I couldn't wait to find out what the solution to the problem would be, and I must admit that just downgrading to a lower sub-sub-sub version of the JVM left me a bit disappointed. :)

Regards,
Ognjen

Carl wrote:
> Taylan,
>
> I have had a similar problem that is as yet unsolved (see the thread 'Tomcat dies suddenly'). In my case, the death left a core file which showed the JVM stopped with a seg fault. A week ago yesterday, we switched to the Sun 1.6.0_7 JVM (from 1.6.0_17 and 1.6.0_18), as Chuck suggested, and so far it is running even though we have had loads and usage similar to those that caused crashes in the past.
> [snip]
Re: [OT] Re: jvm exits without trace
Ognjen Blagojevic wrote:
> Hi,
>
> The 'Tomcat dies suddenly' thread was almost as exciting as the Prison Break TV series. I couldn't wait to find out what the solution to the problem would be, and I must admit that just downgrading to a lower sub-sub-sub version of the JVM left me a bit disappointed. :)

Well, I wouldn't call that a true solution; more of a workaround. At some point, they're going to have to be able to update to a more current version.

I'd love to see what the root cause is, if it's ever truly determined...

D
Re: [OT] Re: jvm exits without trace
Ognjen Blagojevic wrote:
> Hi,
>
> The 'Tomcat dies suddenly' thread was almost as exciting as the Prison Break TV series. I couldn't wait to find out what the solution to the problem would be, and I must admit that just downgrading to a lower sub-sub-sub version of the JVM left me a bit disappointed. :)

+1

It sounds quite like the standard tech support solution for Windows problems: de-install, re-install; and if that does not help, push the reset button.
Re: [OT] Re: jvm exits without trace
My approach is to get something (a JVM) that works and then gradually change it until it breaks. Then, I know what is causing the problem. To date, I haven't been able to get a JVM that works.

In my case, it might be something in my application that is causing the crash on 64-bit Slackware, as there are many people running 64-bit on Linux without a problem. If it is my application, then any 64-bit JVM I throw at it should crash. On the other hand, if the 1.6.0_7 JVM works, then it is likely a bug in one of the changes that bring the JVM to the current version.

Time will tell.

Thanks,

Carl

- Original Message -
From: André Warnier a...@ice-sa.com
To: Tomcat Users List users@tomcat.apache.org
Sent: Tuesday, March 16, 2010 3:20 PM
Subject: Re: [OT] Re: jvm exits without trace

> +1
>
> It sounds quite like the standard tech support solution for Windows problems: de-install, re-install; and if that does not help, push the reset button.
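Carl's approach of starting from a JVM that works and changing it until it breaks amounts to a bisection over JVM versions. A minimal sketch of that idea, assuming the candidate versions can be ordered oldest to newest and that every version after the first bad one also crashes. The candidate list and the `pretend_crashes` rule are hypothetical stand-ins for "run it in production and wait", not results reported in this thread:

```python
from typing import Callable, Optional, Sequence


def first_bad(versions: Sequence[str],
              crashes: Callable[[str], bool]) -> Optional[str]:
    """Return the earliest version for which crashes(version) is True.

    Assumes versions are ordered oldest to newest and that once one
    version crashes, all later ones do too. Returns None if none crash.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if crashes(versions[mid]):
            hi = mid        # first bad version is at mid or earlier
        else:
            lo = mid + 1    # first bad version is after mid
    return versions[lo] if lo < len(versions) else None


# Hypothetical candidate list, oldest to newest.
CANDIDATES = ["1.6.0_7", "1.6.0_10", "1.6.0_16", "1.6.0_17", "1.6.0_18"]

tried = []  # records which versions actually had to be tested


def pretend_crashes(version: str) -> bool:
    # Made-up rule for illustration only: treat update 10 and later as bad.
    tried.append(version)
    return int(version.split("_")[1]) >= 10


if __name__ == "__main__":
    print(first_bad(CANDIDATES, pretend_crashes))
    print(tried)
```

With five candidates the search needs three trials rather than five, which matters when each trial can mean a week of waiting in production. An intermittent crash can make a bad version look good, so each "works" verdict is only as strong as the time it was given.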
Re: [OT] Re: jvm exits without trace
Carl wrote:
> My approach is to get something (a JVM) that works and then gradually change until it breaks. Then, I know what is causing the problem. To date, I haven't been able to get a JVM that works.

I think we understand that, and agree. Our remarks were tongue in cheek, if that is the right expression.

At the bottom of things, finding a bug in the most recent JVM would be much more globally important than finding it in your applications, particularly a bug that can cause the JVM to segfault.
Re: jvm exits without trace
The cause of the crashes was in our own application code; we're currently investigating the exact reason.

A strace of the parent process shows it was killed by SIGSEGV. Why or how this can happen is still unclear.

Thanks to everyone that gave their assistance.

On Thu, 2010-03-11 at 15:40 +0100, Taylan Develioglu wrote:
> Hi Carl, thanks for the suggestion.
>
> I am going to try JVM 1.6.0_7 regardless of what I said before.
> [snip]
RE: jvm exits without trace
> From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
> Subject: Re: jvm exits without trace
>
> The cause for the crashes was in our own application code

Java or native code?

 - Chuck
Re: jvm exits without trace
Chuck,

On 3/15/2010 10:22 AM, Caldarale, Charles R wrote:
>> From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
>> Subject: Re: jvm exits without trace
>>
>> The cause for the crashes was in our own application code
>
> Java or native code?

Woah, where did Taylan's initial claim come from? Was that posted to the list?

-chris
RE: jvm exits without trace
> From: Christopher Schultz [mailto:ch...@christopherschultz.net]
> Subject: Re: jvm exits without trace
>
> On 3/15/2010 10:22 AM, Caldarale, Charles R wrote:
>>> The cause for the crashes was in our own application code
>>
>> Java or native code?
>
> Woah, where did Taylan's initial claim come from? Was that posted to the list?

Not sure what you mean by initial claim; today's message is here:
http://marc.info/?l=tomcat-user&m=126866276827939&w=2

The start of the thread was on 24 Feb:
http://marc.info/?l=tomcat-user&m=126700346225034&w=2

 - Chuck
Re: jvm exits without trace
Chuck,

On 3/15/2010 11:24 AM, Caldarale, Charles R wrote:
>> Woah, where did Taylan's initial claim come from? Was that posted to the list?
>
> Not sure what you mean by initial claim; today's message is here:
> http://marc.info/?l=tomcat-user&m=126866276827939&w=2

Hmm... that particular message was flagged as spam, so it didn't appear in the thread. :(

-chris
Re: jvm exits without trace
Taylan,

On 3/15/2010 10:19 AM, Taylan Develioglu wrote:
> The cause for the crashes was in our own application code, we're currently investigating the exact reason.

Yeah, I'd like to second Chuck's question: was it native code?

> A strace of the parent process shows killed by sigsegv, why or how this can happen is still unclear.

So, the parent was being killed? What was the parent of the JVM?

> Thanks to everyone that gave their assistance.

Definitely follow-up to let us all know what you've uncovered... this was certainly a weird situation.

-chris
Re: jvm exits without trace
Changing to JIO didn't help; the silent crashes continue.

I'm changing kernel versions now.

On Fri, 2010-03-05 at 10:45 +0100, Taylan Develioglu wrote:
> It's performing rather poorly compared to the APR connector. The number of threads required to handle the requests has gone up significantly across the board.
>
> Stability-wise, I don't have complaints yet. I'm keeping my fingers crossed.

On Fri, 2010-03-05 at 10:09 +0100, Pid wrote:
> On 05/03/2010 08:41, Taylan Develioglu wrote:
>> Pid, that would assume we had a working 1.6.10 version before that we replaced.
>
> That it would.
>
>> We've run 1.6.10 upwards successfully for a very long time. So I don't see the point in doing this.
>
> I must have missed that. How is the HTTP connector performing?
>
> p

On Wed, 2010-03-03 at 12:00 +0100, Pid wrote:
> On 03/03/2010 09:11, Taylan Develioglu wrote:
>> Downgrading to 1.6.0_16 did not help.
>>
>> I'm replacing the apr connector with http now.
>
> As Chuck mentioned in the other thread, significant changes occurred at 1.6.10, so trying the release before (1.6.7) might be necessary to establish a better determination.
>
> p

On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
> Taylan,
>
>> The failures we've seen are in anywhere between 8 hours to a week of runtime.
>
> The timing of the failures seems similar.
>
>> We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault.
>
> I have never seen any hs_* files but have seen core files where strace showed the JVM stopped on a seg fault.
>
>> We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps.
>
> I have used JDK 1.6.0_17 and 1.6.0_18 with the same results... have not tried 1.6.0_16. Please post your results of this trial.
>
>> Running tomcat on the foreground might show something, but then again I could be waiting for a month for it to happen.
>
> Yes, this has been part of my problem, as any time we change something, we have to wait a week for the server to fail. In one sense, I am fortunate that I have a little more flexibility than you. I have two servers (different hardware) but only need one in service at a time. Therefore, I always have one server I can test ideas on, although I have never been able to develop a meaningful stress test, i.e., the only way I can test a change is to put it in production.
>
> Thanks,
>
> Carl

- Original Message -
From: Taylan Develioglu tdevelio...@ebuddy.com
To: Tomcat Users List users@tomcat.apache.org
Sent: Wednesday, February 24, 2010 8:31 AM
Subject: Re: jvm exits without trace

> Hello Carl,
>
> The failures we've seen occur after anywhere between 8 hours and a week of runtime. Most of the machines have (still) been running for almost a month without failure. There are ~100 machines. Off the top of my head, I think we've had about 10+ failures now.
>
> We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault. But I don't know if the two are related.
>
> We also use JDK 1.6.0_18. I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps.
>
> It might be useful to note that the failures happen with Tomcat 6.0.20 as well as 6.0.24.
>
> As far as load is concerned, I haven't had a failure on an idle machine. The machines are well loaded, but only at a fraction of their limit with regard to load and CPU utilization.
>
> Most memory is committed to Tomcat: a 24G machine would have 18G allocated to heap, 128M to permgen, and some unspecified amount would get used by JNI for APR. About 4G remains free after taking the JVM itself into account. A 16G machine would have 12G allocated to the heap.
>
> Besides the fact that our apps heavily use NIO and MINA, I wouldn't say there's anything else noteworthy. There can be anywhere up to 1 concurrents on one machine.
>
> I had searched for coredumps, but no luck.
>
> Running Tomcat in the foreground might show something, but then again I could be waiting for a month for it to happen.

On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
> Taylan,
>
> I am the person who started the 'Tomcat dies suddenly' thread, which I still haven't resolved.
>
> I am curious about the pattern of failures you are experiencing because they may provide some clues to my problem. In my case, the system will run for 15 minutes to 10 days before failing (most of the time it is several days to a week). It appears to die from a seg fault in the JVM (I am using Sun
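The memory figures Taylan gives for the 24G machines can be checked with a little arithmetic. A rough sketch, assuming "G" and "M" mean GiB and MiB and treating the "about 4G free" figure as exact; the function name is made up for illustration:

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def unaccounted_bytes(physical: int, heap: int, permgen: int, free: int) -> int:
    """Memory not covered by heap, permgen, or free space.

    This remainder has to hold everything else: JVM overhead, thread
    stacks, native (JNI/APR) allocations, and the rest of the system.
    """
    return physical - heap - permgen - free


if __name__ == "__main__":
    # Figures from the message: 24G machine, 18G heap, 128M permgen, ~4G free.
    rest = unaccounted_bytes(24 * GIB, 18 * GIB, 128 * MIB, 4 * GIB)
    print(f"{rest / GIB:.3f} GiB left for JVM overhead and native memory")
```

Under those assumptions a little under 2 GiB is left for everything outside the Java heap and permgen, so the heap settings alone do not obviously exhaust the machine.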
Re: jvm exits without trace
A different kernel did not help either...

On Thu, 2010-03-11 at 11:37 +0100, Taylan Develioglu wrote:
> Changing to JIO didn't help, the silent crashes continue.
>
> I'm changing kernel versions now.
> [snip]
Re: jvm exits without trace
Taylan,

I am currently trying JVM 1.6.0_7 per Chuck's suggestion and, so far (4 days), it is working.

I started down the IBM JVM path but have abandoned that for now due to difficulties with the SSL implementation (some browsers would work and some wouldn't with seemingly the same setup).

Thanks,

Carl

- Original Message -
From: Taylan Develioglu tdevelio...@ebuddy.com
To: Tomcat Users List users@tomcat.apache.org
Sent: Thursday, March 11, 2010 6:13 AM
Subject: Re: jvm exits without trace

> a different kernel did not help either...
> [snip]
Re: jvm exits without trace
Hi Carl, thanks for the suggestion.

I am going to try JVM 1.6.0_7 regardless of what I said before.

Funny coincidence: I tried the IBM JVM as well and ran into a similar issue (part of our SSL implementation uses Sun-specific libraries).

On Thu, 2010-03-11 at 12:38 +0100, Carl wrote:
> Taylan,
>
> I am currently trying JVM 1.6.0_7 per Chuck's suggestion and, so far (4 days), it is working.
>
> I started down the IBM JVM path but have abandoned that for now due to difficulties with the SSL implementation (some browsers would work and some wouldn't with seemingly the same setup).
> [snip]
Re: jvm exits without trace
Sorry I wasn't clear. I didn't mean 2172 concurrent requests, just sessions. It hadn't occurred to me that the number of sessions does not necessarily equal the number of connections (duh). The number of established connections does indeed equal the number of threads, so what Chuck said was true.

On Tue, 2010-03-09 at 19:29 +0100, André Warnier wrote:
> Taylan Develioglu wrote:
> > Chuck, if that is true how can we explain I see only 637 busy threads on a server that is serving 2172 clients?
> Woaw! Can you give us your trick?
> > If every connection requires its own thread there should be 2172 threads.
> Seriously now: when a thread is finished serving a request, there is still some time during which the response bytes are cascading through the network to the clients. I think you need to define "serving 2172 clients" a bit more precisely before you can say this, no?
> > On Tue, 2010-03-09 at 16:40 +0100, Caldarale, Charles R wrote:
> > > From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
> > > Subject: RE: jvm exits without trace
> > > > where peak busy-threads used to be ~50 with APR, now it has become ~200 with JIO.
> > > To be expected when you have unlimited keep-alives configured. Each HTTP connection requires a separate thread with JIO, whereas the NIO and APR connectors use a single poller thread.
> > > - Chuck
> > This email has been scanned by the MessageLabs Email Security System. For more information please visit http://www.messagelabs.com/email
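To check the point about established connections matching busy threads on a given host, something like the following could be used. It is only a sketch: the function name count_established is made up for illustration, and it assumes netstat -tn style output where the fourth column is the local address and the sixth is the TCP state.

```shell
# Count ESTABLISHED TCP connections whose local port equals $1.
# Reads `netstat -tn`-style lines on stdin
# (columns: proto recv-q send-q local-address foreign-address state).
count_established() {
    awk -v port="$1" '
        $6 == "ESTABLISHED" {
            n = split($4, parts, ":")   # the port is the last ":"-separated field
            if (parts[n] == port) count++
        }
        END { print count + 0 }
    '
}

# Hypothetical usage on a live host:
#   netstat -tn | count_established 80
```

The resulting number can then be compared with the busy-thread count the connector reports; sessions do not enter into it.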
RE: jvm exits without trace
The switch is from APR to JIO. SSL practically doesn't get used. Almost all pages served are JSP or Java, very few static files are served, and keep-alive is on.

Where peak busy-threads used to be ~50 with APR, it has now become ~200 with JIO.

Here are the connector definitions for reference (no executor is used):

- APR:
<Connector port="80" protocol="org.apache.coyote.http11.Http11AprProtocol" compression="1024" keepAliveTimeout="6" maxKeepAliveRequests="-1" enableLookups="false" redirectPort="443" maxThreads="150" pollerSize="32768" />

- JIO:
<Connector port="80" protocol="org.apache.coyote.http11.Http11Protocol" compression="1024" connectionTimeout="1" keepAliveTimeout="6" maxKeepAliveRequests="-1" enableLookups="false" redirectPort="443" maxThreads="720" />

On Fri, 2010-03-05 at 19:13 +0100, Caldarale, Charles R wrote:
> From: Christopher Schultz [mailto:ch...@christopherschultz.net]
> Subject: Re: jvm exits without trace
> > I thought he said he was using APR, not NIO.
> He was, but IIRC, switched away from it to see if that would affect the outages. What we don't know is what was switched to - JIO or NIO. If it's JIO, there may be a lot of threads tied up handling persistent HTTP connections, possibly causing heap or other resource problems.
> - Chuck
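For a rough sense of what the higher thread ceiling could cost, the sketch below multiplies each connector's maxThreads by a per-thread stack size. It assumes the JAVA_STCK_SIZE=128K value posted elsewhere in this thread is passed to the JVM as -Xss, which the thread does not confirm; the function name stack_kib is made up for illustration.

```shell
# Upper bound on thread-stack reservation: threads * stack size, in KiB.
# Assumption: JAVA_STCK_SIZE=128K from this thread ends up as -Xss128k.
stack_kib() {
    echo $(( $1 * $2 ))
}

for max_threads in 150 720; do   # maxThreads of the APR and JIO connectors
    echo "maxThreads=$max_threads: $(stack_kib "$max_threads" 128) KiB"
done
```

This memory sits outside the Java heap, and the figure is only a ceiling, since it is reached only if every permitted thread is actually created.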
RE: jvm exits without trace
From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
Subject: RE: jvm exits without trace
> where peak busy-threads used to be ~50 with APR, now it has become ~200 with JIO.

To be expected when you have unlimited keep-alives configured. Each HTTP connection requires a separate thread with JIO, whereas the NIO and APR connectors use a single poller thread.

 - Chuck
RE: jvm exits without trace
Chuck, if that is true, how can we explain that I see only 637 busy threads on a server that is serving 2172 clients? If every connection requires its own thread, there should be 2172 threads.

On Tue, 2010-03-09 at 16:40 +0100, Caldarale, Charles R wrote:
> From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
> Subject: RE: jvm exits without trace
> > where peak busy-threads used to be ~50 with APR, now it has become ~200 with JIO.
> To be expected when you have unlimited keep-alives configured. Each HTTP connection requires a separate thread with JIO, whereas the NIO and APR connectors use a single poller thread.
> - Chuck
RE: jvm exits without trace
From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
Subject: RE: jvm exits without trace
> If every connection requires its own thread there should be 2172 threads.

Only if the client *chooses* to maintain the keep-alive. Browsers and other clients are free to terminate the connection any time they wish. All the server can do is deny the keep-alive; it can't initiate it.

 - Chuck
Re: jvm exits without trace
Taylan Develioglu wrote:
> Chuck, if that is true how can we explain I see only 637 busy threads on a server that is serving 2172 clients?

Woaw! Can you give us your trick?

> If every connection requires its own thread there should be 2172 threads.

Seriously now: when a thread is finished serving a request, there is still some time during which the response bytes are cascading through the network to the clients. I think you need to define "serving 2172 clients" a bit more precisely before you can say this, no?

> On Tue, 2010-03-09 at 16:40 +0100, Caldarale, Charles R wrote:
> > From: Taylan Develioglu [mailto:tdevelio...@ebuddy.com]
> > Subject: RE: jvm exits without trace
> > > where peak busy-threads used to be ~50 with APR, now it has become ~200 with JIO.
> > To be expected when you have unlimited keep-alives configured. Each HTTP connection requires a separate thread with JIO, whereas the NIO and APR connectors use a single poller thread.
> > - Chuck
Re: jvm exits without trace
Pid, that would assume we had a working 1.6.10 version before that we replaced. We've run 1.6.10 upwards succesfully for a very long time. So I don't see the point in doing this. On Wed, 2010-03-03 at 12:00 +0100, Pid wrote: On 03/03/2010 09:11, Taylan Develioglu wrote: Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector with http now. As Chuck mentioned in the other thread, significant changes occurred at 1.6.10, so trying the release before (1.6.7) might be necessary to establish a better determination. p On Wed, 2010-02-24 at 14:52 +0100, Carl wrote: Taylan, The failures we've seen are in anywhere between 8 hours to a week of runtime. The timing of the failures seems similar. We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault. I have never seen any hs_* files but have seen core files where strace showed the jvm stopped on a seg fault. We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps. I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not tried 1.6.0_16. Please post your results of this trial. Running tomcat on the foreground might show something, but then again I could be waiting for a month for it to happen. Yes, this has been part of my problem as anytime we change something, we have to wait a week for the server to fail. In one sense, I am fortunate that I have a little more flexibility than you. I have two servers (different hardware) but only need one in service at a time. Therefore, I always have one server I can test ideas on although I have never been able to develop a meaningful stress test, i.e., the only way I can test a change is to put it in production. 
Thanks, Carl - Original Message - From: Taylan Develioglutdevelio...@ebuddy.com To: Tomcat Users Listusers@tomcat.apache.org Sent: Wednesday, February 24, 2010 8:31 AM Subject: Re: jvm exits without trace Hello Carl, The failures we've seen are in anywhere between 8 hours to a week of runtime. Most of them have (still) been running for almost a month without failure. There are ~100 machines. From the top of my head, I think we've had about 10+ failures now. We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault. But I don't know if the two are related. We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps. It might be useful to note that the failures happen with tomcat 6.0.20 as well as 6.0.24. As far as load concerns, I haven't had a failure on an idle machines. The machines are well loaded, but only at a fraction limit in regards to load and cpu utilization. Most memory is commited to tomcat, where a 24G machine would have 18G allocated to heap, 128M to permgen and some unspecified amount would get used by jni for apr. About 4G remains free after calculating taking into account the jvm itsself. A 16G machine would have 12G allocated to the heap. Besides the fact that our apps heavily use nio and mina I wouldn't say there's anything else noteworthy. There can be anywhere up to 1 concurrents on one machine. I had searched for coredumps, but no luck. Running tomcat on the foreground might show something, but then again I could be waiting for a month for it to happen. On Wed, 2010-02-24 at 12:42 +0100, Carl wrote: Taylan, I am the person who started the Tomcat dies suddenly thread which I still haven't resolved. I am curious about the pattern of failures you are experiencing because they may provide some clues to my problem. 
In my case, the system will run for 15 minutes to 10 days before failing (most of the time it is several days to a week.) It appears to die from a seg fault in the JVM (I am using Sun 1.6.0_18 but have tried previous versions)... you may be able to see the cause of the failure from the core file (the core files on my systems were in several directories so you may have to do a 'find' to locate them.) Load may be a factor but the failures generally come after the load has been heavy for a while. I am running a couple of applications and it seems the failures are more frequent when people are hitting the additional apps (the primary app is always used, the remaining apps are used sporatically.) How does this compare to what you are experiencing? Thanks, Carl - Original Message - From: Taylan Develioglutdevelio...@ebuddy.com To: Tomcat Users Listusers@tomcat.apache.org;p...@pidster.com Sent: Wednesday, February 24, 2010 5
Re: jvm exits without trace
On 05/03/2010 08:41, Taylan Develioglu wrote: Pid, that would assume we had a working 1.6.10 version before that we replaced. That it would. We've run 1.6.10 upwards succesfully for a very long time. So I don't see the point in doing this. I must have missed that. How is the HTTP connector performing? p On Wed, 2010-03-03 at 12:00 +0100, Pid wrote: On 03/03/2010 09:11, Taylan Develioglu wrote: Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector with http now. As Chuck mentioned in the other thread, significant changes occurred at 1.6.10, so trying the release before (1.6.7) might be necessary to establish a better determination. p On Wed, 2010-02-24 at 14:52 +0100, Carl wrote: Taylan, The failures we've seen are in anywhere between 8 hours to a week of runtime. The timing of the failures seems similar. We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault. I have never seen any hs_* files but have seen core files where strace showed the jvm stopped on a seg fault. We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps. I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not tried 1.6.0_16. Please post your results of this trial. Running tomcat on the foreground might show something, but then again I could be waiting for a month for it to happen. Yes, this has been part of my problem as anytime we change something, we have to wait a week for the server to fail. In one sense, I am fortunate that I have a little more flexibility than you. I have two servers (different hardware) but only need one in service at a time. Therefore, I always have one server I can test ideas on although I have never been able to develop a meaningful stress test, i.e., the only way I can test a change is to put it in production. 
Thanks, Carl - Original Message - From: Taylan Develioglutdevelio...@ebuddy.com To: Tomcat Users Listusers@tomcat.apache.org Sent: Wednesday, February 24, 2010 8:31 AM Subject: Re: jvm exits without trace Hello Carl, The failures we've seen are in anywhere between 8 hours to a week of runtime. Most of them have (still) been running for almost a month without failure. There are ~100 machines. From the top of my head, I think we've had about 10+ failures now. We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault. But I don't know if the two are related. We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps. It might be useful to note that the failures happen with tomcat 6.0.20 as well as 6.0.24. As far as load concerns, I haven't had a failure on an idle machines. The machines are well loaded, but only at a fraction limit in regards to load and cpu utilization. Most memory is commited to tomcat, where a 24G machine would have 18G allocated to heap, 128M to permgen and some unspecified amount would get used by jni for apr. About 4G remains free after calculating taking into account the jvm itsself. A 16G machine would have 12G allocated to the heap. Besides the fact that our apps heavily use nio and mina I wouldn't say there's anything else noteworthy. There can be anywhere up to 1 concurrents on one machine. I had searched for coredumps, but no luck. Running tomcat on the foreground might show something, but then again I could be waiting for a month for it to happen. On Wed, 2010-02-24 at 12:42 +0100, Carl wrote: Taylan, I am the person who started the Tomcat dies suddenly thread which I still haven't resolved. I am curious about the pattern of failures you are experiencing because they may provide some clues to my problem. 
In my case, the system will run for 15 minutes to 10 days before failing (most of the time it is several days to a week.) It appears to die from a seg fault in the JVM (I am using Sun 1.6.0_18 but have tried previous versions)... you may be able to see the cause of the failure from the core file (the core files on my systems were in several directories so you may have to do a 'find' to locate them.) Load may be a factor but the failures generally come after the load has been heavy for a while. I am running a couple of applications and it seems the failures are more frequent when people are hitting the additional apps (the primary app is always used, the remaining apps are used sporatically.) How does this compare to what you are experiencing? Thanks, Carl - Original Message - From: Taylan Develioglutdevelio...@ebuddy.com To: Tomcat Users Listusers@tomcat.apache.org;p...@pidster.com Sent: Wednesday, February 24, 2010 5:09 AM Subject: Re: jvm exits without trace The GC log shows
Re: jvm exits without trace
It's performing rather poorly performance wise, compared to the apr connector. The number of threads required to handle the requests has gone up significantly over the board. Stability wise, I don't have complaints yet. I'm keeping my fingers crossed. On Fri, 2010-03-05 at 10:09 +0100, Pid wrote: On 05/03/2010 08:41, Taylan Develioglu wrote: Pid, that would assume we had a working 1.6.10 version before that we replaced. That it would. We've run 1.6.10 upwards succesfully for a very long time. So I don't see the point in doing this. I must have missed that. How is the HTTP connector performing? p On Wed, 2010-03-03 at 12:00 +0100, Pid wrote: On 03/03/2010 09:11, Taylan Develioglu wrote: Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector with http now. As Chuck mentioned in the other thread, significant changes occurred at 1.6.10, so trying the release before (1.6.7) might be necessary to establish a better determination. p On Wed, 2010-02-24 at 14:52 +0100, Carl wrote: Taylan, The failures we've seen are in anywhere between 8 hours to a week of runtime. The timing of the failures seems similar. We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault. I have never seen any hs_* files but have seen core files where strace showed the jvm stopped on a seg fault. We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps. I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not tried 1.6.0_16. Please post your results of this trial. Running tomcat on the foreground might show something, but then again I could be waiting for a month for it to happen. Yes, this has been part of my problem as anytime we change something, we have to wait a week for the server to fail. In one sense, I am fortunate that I have a little more flexibility than you. 
I have two servers (different hardware) but only need one in service at a time. Therefore, I always have one server I can test ideas on although I have never been able to develop a meaningful stress test, i.e., the only way I can test a change is to put it in production. Thanks, Carl - Original Message - From: Taylan Develioglutdevelio...@ebuddy.com To: Tomcat Users Listusers@tomcat.apache.org Sent: Wednesday, February 24, 2010 8:31 AM Subject: Re: jvm exits without trace Hello Carl, The failures we've seen are in anywhere between 8 hours to a week of runtime. Most of them have (still) been running for almost a month without failure. There are ~100 machines. From the top of my head, I think we've had about 10+ failures now. We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault. But I don't know if the two are related. We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps. It might be useful to note that the failures happen with tomcat 6.0.20 as well as 6.0.24. As far as load concerns, I haven't had a failure on an idle machines. The machines are well loaded, but only at a fraction limit in regards to load and cpu utilization. Most memory is commited to tomcat, where a 24G machine would have 18G allocated to heap, 128M to permgen and some unspecified amount would get used by jni for apr. About 4G remains free after calculating taking into account the jvm itsself. A 16G machine would have 12G allocated to the heap. Besides the fact that our apps heavily use nio and mina I wouldn't say there's anything else noteworthy. There can be anywhere up to 1 concurrents on one machine. I had searched for coredumps, but no luck. Running tomcat on the foreground might show something, but then again I could be waiting for a month for it to happen. 
On Wed, 2010-02-24 at 12:42 +0100, Carl wrote: Taylan, I am the person who started the Tomcat dies suddenly thread which I still haven't resolved. I am curious about the pattern of failures you are experiencing because they may provide some clues to my problem. In my case, the system will run for 15 minutes to 10 days before failing (most of the time it is several days to a week.) It appears to die from a seg fault in the JVM (I am using Sun 1.6.0_18 but have tried previous versions)... you may be able to see the cause of the failure from the core file (the core files on my systems were in several directories so you may have to do a 'find' to locate them.) Load may be a factor but the failures generally come after the load has been heavy for a while
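The 'find' that Carl mentions for locating core files could look roughly like this. It is a sketch only: the function name find_cores is made up, and the core* name pattern is an assumption, since the actual file name depends on the kernel's core_pattern setting.

```shell
# List regular files that look like core dumps under a directory tree.
# Assumption: dumps are named core or core.<pid>; adjust -name if the
# system's core_pattern says otherwise.
find_cores() {
    find "${1:-/}" -xdev -type f -name 'core*' 2>/dev/null
}

# Hypothetical usage:
#   find_cores /opt/tomcat
# Also worth checking in the shell that starts the JVM:
#   ulimit -c      # a value of 0 means no core file is written at all
```

If the search comes back empty, the core size limit is one of the first things to rule out before concluding that the JVM did not dump core.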
Re: jvm exits without trace
Taylan,

On 3/5/2010 4:45 AM, Taylan Develioglu wrote:
> It's performing rather poorly performance wise, compared to the apr connector. The number of threads required to handle the requests has gone up significantly over the board.

That's interesting. Do you have a lot of static resources being served?

-chris
RE: jvm exits without trace
From: Christopher Schultz [mailto:ch...@christopherschultz.net]
Subject: Re: jvm exits without trace
> Do you have a lot of static resources being served?

Or still allow keep-alives with the non-NIO connector?

 - Chuck
Re: jvm exits without trace
Chuck,

On 3/5/2010 10:07 AM, Caldarale, Charles R wrote:
> From: Christopher Schultz [mailto:ch...@christopherschultz.net]
> Subject: Re: jvm exits without trace
> > Do you have a lot of static resources being served?
> Or still allow keep-alives with the non-NIO connector?

I thought he said he was using APR, not NIO. I would have expected that JIO vs. APR would be roughly the same unless SSL or sendFile was getting significant use. NIO of course has a different threading model, so that would make sense.

-chris
Re: jvm exits without trace
On 05/03/2010 17:21, Christopher Schultz wrote:
> Chuck,
>
> On 3/5/2010 10:07 AM, Caldarale, Charles R wrote:
> > From: Christopher Schultz [mailto:ch...@christopherschultz.net]
> > Subject: Re: jvm exits without trace
> > > Do you have a lot of static resources being served?
> > Or still allow keep-alives with the non-NIO connector?
>
> I thought he said he was using APR, not NIO. I would have expected that JIO vs. APR would be roughly the same unless SSL or sendFile was getting significant use. NIO of course has a different threading model, so that would make sense.

The OP was using HTTP APR + SSL and is now trying the HTTP JIO + SSL connector. (It's not clear how much of the work is conducted over HTTPS.)

We are hoping that the HTTP connector doesn't exhibit the same fault, because then we'll know that the native stuff has an issue, probably between 1.1.17 and 1.1.20.

http://tomcat.apache.org/native-doc/miscellaneous/changelog.html

p

> -chris
RE: jvm exits without trace
From: Christopher Schultz [mailto:ch...@christopherschultz.net]
Subject: Re: jvm exits without trace
> I thought he said he was using APR, not NIO.

He was, but IIRC, switched away from it to see if that would affect the outages. What we don't know is what was switched to - JIO or NIO. If it's JIO, there may be a lot of threads tied up handling persistent HTTP connections, possibly causing heap or other resource problems.

 - Chuck
Re: jvm exits without trace
Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector with http now.

On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:
> I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not tried 1.6.0_16. Please post your results of this trial.

[snip]
Re: jvm exits without trace
On 03/03/2010 09:11, Taylan Develioglu wrote:
> Downgrading to 1.6.0_16 did not help. I'm replacing the apr connector with http now.

As Chuck mentioned in the other thread, significant changes occurred at 1.6.10, so trying the release before (1.6.7) might be necessary to establish a better determination.

p

[snip]
Re: jvm exits without trace
Hi Chris,

There's no doubt about it. The amount free is what's left after everything is taken into account: heap, JVM, JNI, permgen. And trust me, I'd like it to be the OOM killer, but it's not.

They could survive, but then I could throw away half of my RAM. I don't see any point in doing that (it doesn't fix the problem).

On Thu, 2010-02-25 at 22:38 +0100, Christopher Schultz wrote:
> Are you sure the rest of the JVM can fit into this space? I've heard of JVMs (particularly on Windows) that take a significant chunk of memory on top of the heap space requested on the command-line. Definitely check your system logs for OOM killer, here.
>
> What happens if you cut your heap in half? Can each machine in your (probably) cluster survive with less heap space?
Re: jvm exits without trace
Taylan,

On 2/24/2010 8:31 AM, Taylan Develioglu wrote:
> Most memory is committed to tomcat, where a 24G machine would have 18G allocated to heap, 128M to permgen and some unspecified amount would get used by jni for apr. About 4G remains free after taking into account the jvm itself. A 16G machine would have 12G allocated to the heap.

Are you sure the rest of the JVM can fit into this space? I've heard of JVMs (particularly on Windows) that take a significant chunk of memory on top of the heap space requested on the command-line. Definitely check your system logs for OOM killer, here.

What happens if you cut your heap in half? Can each machine in your (probably) cluster survive with less heap space?

-chris
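The question of whether the rest of the JVM fits can be put into rough numbers. The following is only a sketch built from the figures quoted above for the 24G machine; it assumes 1G = 1024M and rounds "about 4G" free to 4096M, and the variable names are illustrative, not from the original setup.

```shell
#!/bin/sh
# Figures quoted in the thread for a 24G machine; 1G is taken as 1024M (assumption).
TOTAL_MB=$((24 * 1024))      # physical RAM
HEAP_MB=18432                # heap, as set via -Xms/-Xmx
PERM_MB=128                  # permgen, as set via -XX:PermSize/-XX:MaxPermSize
FREE_MB=$((4 * 1024))        # "about 4G remains free", rounded (assumption)

# Memory left for everything that is not Java heap or permgen.
NON_HEAP_BUDGET_MB=$((TOTAL_MB - HEAP_MB - PERM_MB))

# What the JVM's native side, JNI/APR and the OS would be using
# if roughly 4G really is free.
IMPLIED_OTHER_MB=$((NON_HEAP_BUDGET_MB - FREE_MB))

echo "non-heap budget: ${NON_HEAP_BUDGET_MB}M"
echo "implied native + OS usage: ${IMPLIED_OTHER_MB}M"
```

With these inputs the non-heap budget is 6016M and the implied native plus OS usage is 1920M. On Linux, the resident size of the running JVM can be compared against this with `ps -o rss= -p <pid>`, which reports kilobytes.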
Re: jvm exits without trace
I thought I'd add the connector definitions too:

<Connector port="80"
           protocol="org.apache.coyote.http11.Http11AprProtocol"
           compression="1024"
           keepAliveTimeout="6"
           maxKeepAliveRequests="-1"
           enableLookups="false"
           redirectPort="443"
           maxThreads="150"
           pollerSize="32768"
           pollerThreadCount="4"/>

<Connector port="443"
           protocol="org.apache.coyote.http11.Http11AprProtocol"
           SSLEnabled="true"
           enableLookups="false"
           maxThreads="10"
           scheme="https"
           secure="true"
           SSLCertificateFile="/etc/ssl/private/something.crt"
           SSLCertificateKeyFile="/etc/ssl/private/something.key"
           SSLCACertificateFile="/etc/ssl/certs/ca.crt"/>

On Wed, 2010-02-24 at 10:23 +0100, Taylan Develioglu wrote:
> Hi,
>
> I have jvm's, running tomcat and our application, exiting mysteriously, and was wondering if anyone could give me some advice on how to debug this thing.
>
> There is nothing in catalina.out, nor our application logs, and no hotspot error file. GC log looks normal. No trace in system logs.
>
> I am left completely clueless :(, has anyone dealt with a problem like this before?
>
> Any help appreciated.
>
> - Tomcat 6.0.24
> - TC native 1.1.18
> - APR 1.3.9
> - Sun JDK 6u18
> - Debian Lenny, 2.6.31.10-amd64
>
> 2 servlets, one as ROOT.
> 2 HTTP connectors that use TCNative/APR.
>
> JAVA_OPTS ( ):
> -verbose:gc
> -Djava.awt.headless=true
> -Dsun.net.inetaddr.ttl=60
> -Dfile.encoding=UTF-8
> -Djava.io.tmpdir=$TMP_DIR
> -Djava.library.path=/usr/local/lib
> -Djava.endorsed.dirs=$CATALINA_BASE/endorsed
> -Dcatalina.base=$CATALINA_BASE
> -Dcatalina.home=$CATALINA_HOME
> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
> -Djava.util.logging.config.file=$CATALINA_BASE/conf/logging.properties
> -XX:+PrintGCDetails
> -Xloggc:$CATALINA_BASE/logs/gc.log
> -XX:+UseConcMarkSweepGC
> -XX:CMSInitiatingOccupancyFraction=70
> -Xms$JAVA_HEAP_SIZE
> -Xmx$JAVA_HEAP_SIZE
> -XX:NewSize=$JAVA_EDEN_SIZE
> -XX:MaxNewSize=$JAVA_EDEN_SIZE
> -XX:PermSize=$JAVA_PERM_SIZE
> -XX:MaxPermSize=$JAVA_PERM_SIZE
> -Xss$JAVA_STCK_SIZE
> -XX:+UseLargePages
Re: jvm exits without trace
On 24/02/2010 09:36, Taylan Develioglu wrote:
> I thought I'd add the connector definitions too
> [snip]
> -Xms$JAVA_HEAP_SIZE -Xmx$JAVA_HEAP_SIZE -XX:NewSize=$JAVA_EDEN_SIZE -XX:MaxNewSize=$JAVA_EDEN_SIZE -XX:PermSize=$JAVA_PERM_SIZE -XX:MaxPermSize=$JAVA_PERM_SIZE -Xss$JAVA_STCK_SIZE -XX:+UseLargePages

There are no actual heap size settings in the above. But you get a couple of points for trying.

Google "Linux Out Of Memory killer" or "OOM Killer" and then check the server logs carefully (e.g. /var/log/messages).

p
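Checking the server logs for the OOM killer could look roughly like the sketch below. The function name `oom_hits` is made up for illustration, and the search patterns are an assumption based on the wording the Linux kernel typically uses in such messages; the exact text varies between kernel versions.

```shell
#!/bin/sh
# oom_hits LOGFILE
# Print log lines that look like OOM-killer activity.
# The patterns are typical kernel phrasings, not an exhaustive list.
oom_hits() {
    grep -i -E 'out of memory|oom-killer|killed process' "$1"
}
```

For example, `oom_hits /var/log/messages` prints any matching lines. No output from the log alone does not rule the OOM killer out if the log has been rotated since the crash, so the rotated files are worth checking too.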
Re: jvm exits without trace
The GC log shows plenty of heap space left in all the spaces.

I purposely didn't bother replacing the variables because I figured they would not be relevant. But if you think they might provide clues they're as follows:

JAVA_HEAP_SIZE=18432M
JAVA_EDEN_SIZE=$(($(echo $JAVA_HEAP_SIZE|sed 's/M$\|G$//')/6))M
JAVA_PERM_SIZE=128M
JAVA_STCK_SIZE=128K

EDEN_SIZE is 1/6th of total heap.

And I said there was nothing in the system logs. But you get a couple of points for trying.

On Wed, 2010-02-24 at 10:44 +0100, Pid wrote:
> There's no actual heap size settings in the above. But you get a couple of points for trying.
>
> Google Linux Out Of Memory killer or OOM Killer and then check the server logs carefully. (e.g. /var/log/messages)
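To make the variable definitions above concrete, here is a small sketch of how the eden size works out for the 18432M heap. It uses `sed 's/[MG]$//'`, which is a portable way of writing the `M$\|G$` alternation from the original (that form relies on a GNU sed extension); `heap_number` is an illustrative name, not part of the original script.

```shell
#!/bin/sh
JAVA_HEAP_SIZE=18432M

# Strip the trailing unit letter so the value can be used in arithmetic.
heap_number=$(echo "$JAVA_HEAP_SIZE" | sed 's/[MG]$//')

# Eden is one sixth of the heap; the M suffix is appended unconditionally.
JAVA_EDEN_SIZE=$((heap_number / 6))M

echo "-Xms$JAVA_HEAP_SIZE -Xmx$JAVA_HEAP_SIZE -XX:NewSize=$JAVA_EDEN_SIZE -XX:MaxNewSize=$JAVA_EDEN_SIZE"
```

This yields an eden size of 3072M. Because the M suffix is hard-coded, a heap given in gigabytes (for example 18G) would come out as 3M with this expression, so the calculation is only meaningful when the heap size is written in megabytes.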
Re: jvm exits without trace
2010/2/24 Taylan Develioglu tdevelio...@ebuddy.com:
> Hi, I have jvm's, running tomcat and our application, exiting mysteriously, and was wondering if anyone could give me some advice on how to debug this thing.
>
> There is nothing in catalina.out, nor our application logs, and no hotspot error file. GC log looks normal. No trace in system logs.
>
> I am left completely clueless :(, has anyone dealt with a problem like this before?

There is currently a thread named "Tomcat dies suddenly". Look there for starters. While that is unlikely to be your case, most ideas for diagnosing such an issue are mentioned in the first dozen messages of that thread.

http://marc.info/?t=12632496092r=1w=2
http://marc.info/?t=12633901125r=1w=2
http://marc.info/?t=12647949758r=6w=2
http://marc.info/?t=12660960545r=1w=2

Best regards,
Konstantin Kolinko
Re: jvm exits without trace
Thank you Konstantin,

I've read the thread you mentioned. I should have mentioned that the mysterious exit happens on several different servers with different hardware and configuration, so it's very unlikely to be caused by a hardware issue. It's also not the OOM killer; as I mentioned before, I already investigated those possibilities.

I'm suspecting JNI with tomcat native and APR now. I believe native code outside the JVM could very well cause a crash like this, but my ignorance on the subject isn't helping.

I've had strange behavior with libapr 1.3 and apache on machines with Debian 5.0 that synchronize their clock using clock slew (ntpdate), so I have decreased the ntpdate frequency to see if that helps.

(As you can tell, I'm getting a bit desperate.)

On Wed, 2010-02-24 at 11:28 +0100, Konstantin Kolinko wrote:
> There is currently a thread named Tomcat dies suddenly Look there for starters.
> [snip]
Re: jvm exits without trace
Taylan Develioglu wrote:
> Hi, I have jvm's, running tomcat and our application, exiting mysteriously, and was wondering if anyone could give me some advice on how to debug this thing.
>
> There is nothing in catalina.out, nor our application logs, and no hotspot error file. GC log looks normal. No trace in system logs.
>
> I am left completely clueless :(, has anyone dealt with a problem like this before?

This sounds eerily like the problem discussed in the other current thread, "Tomcat dies suddenly". Maybe some of the things mentioned in that thread can give you some ideas?

In the meantime, have a look for core dumps, in the Tomcat bin directory for example. Or run Tomcat in a command window instead of as a daemon, and check if there is a segfault.
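Looking for core dumps might be sketched as follows. The function name `find_cores` is hypothetical, and the `core*` file name pattern is an assumption: the actual name and location of core files depend on the system's core pattern setting and on the working directory of the process.

```shell
#!/bin/sh
# find_cores DIR
# List regular files under DIR whose name starts with "core".
# Errors from unreadable directories are discarded.
find_cores() {
    find "$1" -type f -name 'core*' 2>/dev/null
}

# Core files are only written if the core size limit allows it. Before
# starting Tomcat in the foreground, the limit can be raised with:
#   ulimit -c unlimited
```

For example, `find_cores "$CATALINA_HOME/bin"` searches the Tomcat bin directory, and `find_cores /` searches the whole machine, which can take a while.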
Re: jvm exits without trace
Taylan,

I am the person who started the "Tomcat dies suddenly" thread, which I still haven't resolved. I am curious about the pattern of failures you are experiencing because they may provide some clues to my problem.

In my case, the system will run for 15 minutes to 10 days before failing (most of the time it is several days to a week). It appears to die from a seg fault in the JVM (I am using Sun 1.6.0_18 but have tried previous versions)... you may be able to see the cause of the failure from the core file (the core files on my systems were in several directories so you may have to do a 'find' to locate them).

Load may be a factor but the failures generally come after the load has been heavy for a while. I am running a couple of applications and it seems the failures are more frequent when people are hitting the additional apps (the primary app is always used, the remaining apps are used sporadically).

How does this compare to what you are experiencing?

Thanks,

Carl

----- Original Message -----
From: Taylan Develioglu tdevelio...@ebuddy.com
To: Tomcat Users List users@tomcat.apache.org; p...@pidster.com
Sent: Wednesday, February 24, 2010 5:09 AM
Subject: Re: jvm exits without trace

[snip]
Re: jvm exits without trace
Carl wrote:
> Taylan,
> ...
> How does this compare to what you are experiencing?

Well, I note at least the crashes without traces in the logs, and the common usage of SSL. Is Taylan getting any segfaults?
Re: jvm exits without trace
On 24/02/2010 11:16, Taylan Develioglu wrote:
> [snip]
> I'm suspecting jni with tomcat native and apr now, I believe native code outside the jvm could very well cause a crash like this but my ignorance on the subject isn't helping.
>
> I've had strange behavior with libapr 1.3 and apache on machines with debian 5.0 that synchronize their clock using clock slew (ntpdate) and decreased the ntpdate frequency to see if that helps.

Can you disable APR and use the alternative SSL configuration, or is that not possible?

Also, would it be possible to use an earlier 1.6 JVM* or perhaps even a completely different one? I can't remember, offhand, what (if any) results Carl had with other JVMs.

p

* Perhaps there's a subtle bug in recent releases of the JVM.

[snip]
Re: jvm exits without trace
Hello Carl,

The failures we've seen occur anywhere between 8 hours and a week of runtime. Most of the machines have (still) been running for almost a month without failure. There are ~100 machines. Off the top of my head, I think we've had about 10+ failures now.

We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV, indicating a page fault. But I don't know if the two are related.

We also use jdk 1.6.0_18. I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps. It might be useful to note that the failures happen with tomcat 6.0.20 as well as 6.0.24.

As far as load is concerned, I haven't had a failure on an idle machine. The machines are well loaded, but only at a fraction of their limit with regard to load and CPU utilization.

Most memory is committed to tomcat, where a 24G machine would have 18G allocated to heap, 128M to permgen and some unspecified amount would get used by JNI for APR. About 4G remains free after taking into account the JVM itself. A 16G machine would have 12G allocated to the heap.

Besides the fact that our apps heavily use nio and mina, I wouldn't say there's anything else noteworthy. There can be anywhere up to 1 concurrents on one machine.

I had searched for coredumps, but no luck. Running tomcat in the foreground might show something, but then again I could be waiting for a month for it to happen.

On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:
> [snip]
> How does this compare to what you are experiencing?
Re: jvm exits without trace
Taylan,

> The failures we've seen are in anywhere between 8 hours to a week of runtime.

The timing of the failures seems similar.

> We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault.

I have never seen any hs_* files but have seen core files where strace showed the jvm stopped on a seg fault.

> We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps.

I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not tried 1.6.0_16. Please post your results of this trial.

> Running tomcat on the foreground might show something, but then again I could be waiting for a month for it to happen.

Yes, this has been part of my problem as anytime we change something, we have to wait a week for the server to fail. In one sense, I am fortunate that I have a little more flexibility than you. I have two servers (different hardware) but only need one in service at a time. Therefore, I always have one server I can test ideas on although I have never been able to develop a meaningful stress test, i.e., the only way I can test a change is to put it in production.

Thanks,

Carl

----- Original Message -----
From: Taylan Develioglu tdevelio...@ebuddy.com
To: Tomcat Users List users@tomcat.apache.org
Sent: Wednesday, February 24, 2010 8:31 AM
Subject: Re: jvm exits without trace

Hello Carl, The failures we've seen are in anywhere between 8 hours to a week of runtime. Most of them have (still) been running for almost a month without failure. There are ~100 machines. From the top of my head, I think we've had about 10+ failures now. We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault. But I don't know if the two are related.
We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps. It might be useful to note that the failures happen with tomcat 6.0.20 as well as 6.0.24.

As far as load is concerned, I haven't had a failure on an idle machine. The machines are well loaded, but only at a fraction of their limit with regard to load and CPU utilization. Most memory is committed to tomcat, where a 24G machine would have 18G allocated to heap, 128M to permgen and some unspecified amount would get used by jni for apr. About 4G remains free after taking the jvm itself into account. A 16G machine would have 12G allocated to the heap.

Besides the fact that our apps heavily use nio and mina I wouldn't say there's anything else noteworthy. There can be anywhere up to 1 concurrents on one machine.

I had searched for coredumps, but no luck. Running tomcat in the foreground might show something, but then again I could be waiting for a month for it to happen.

On Wed, 2010-02-24 at 12:42 +0100, Carl wrote:

Taylan,

I am the person who started the "Tomcat dies suddenly" thread which I still haven't resolved. I am curious about the pattern of failures you are experiencing because they may provide some clues to my problem.

In my case, the system will run for 15 minutes to 10 days before failing (most of the time it is several days to a week.) It appears to die from a seg fault in the JVM (I am using Sun 1.6.0_18 but have tried previous versions)... you may be able to see the cause of the failure from the core file (the core files on my systems were in several directories so you may have to do a 'find' to locate them.)

Load may be a factor but the failures generally come after the load has been heavy for a while.
I am running a couple of applications and it seems the failures are more frequent when people are hitting the additional apps (the primary app is always used, the remaining apps are used sporadically.)

How does this compare to what you are experiencing?

Thanks,
Carl

----- Original Message -----
From: Taylan Develioglu tdevelio...@ebuddy.com
To: Tomcat Users List users@tomcat.apache.org; p...@pidster.com
Sent: Wednesday, February 24, 2010 5:09 AM
Subject: Re: jvm exits without trace

The GC log shows plenty of heap space left in all the spaces. I purposely didn't bother replacing the variables because I figured they would not be relevant. But if you think they might provide clues they're as follows:

JAVA_HEAP_SIZE=18432M
JAVA_EDEN_SIZE=$(($(echo $JAVA_HEAP_SIZE|sed 's/M$\|G$//')/6))M
JAVA_PERM_SIZE=128M
JAVA_STCK_SIZE=128K

EDEN_SIZE is 1/6th of total heap.

And I said there was nothing in the system logs. But you get a couple of points for trying.

On Wed, 2010-02-24 at 10:44 +0100, Pid wrote:

On 24/02/2010 09:36, Taylan Develioglu wrote:

I thought I'd add the connector definitions too, : Connector port=80 protocol=org.apache.coyote.http11
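For anyone checking the JAVA_EDEN_SIZE arithmetic quoted in this message, here is a rough sketch that reproduces it in plain POSIX shell. The function name eden_size is made up for illustration and is not part of the poster's startup script. Two things it works around, as far as I can tell: the `\|` alternation in the sed expression is a GNU extension, and with a heap given in G the original expression would strip the suffix, divide, and still append M (18G would come out as 3M). Treat this as an assumption-laden sketch, not the poster's actual setup.

```shell
#!/bin/sh
# Hypothetical helper (not from the original script): print 1/6 of a heap
# size as a megabyte value, accepting either an M or a G suffix.
eden_size() {
    heap=$1
    case "$heap" in
        *G) mb=$(( ${heap%G} * 1024 )) ;;  # convert gigabytes to megabytes
        *M) mb=${heap%M} ;;                # already in megabytes
        *)  echo "unsupported unit: $heap" >&2; return 1 ;;
    esac
    echo "$(( mb / 6 ))M"                  # integer division, as in the original
}

JAVA_HEAP_SIZE=18432M
JAVA_EDEN_SIZE=$(eden_size "$JAVA_HEAP_SIZE")
echo "heap=$JAVA_HEAP_SIZE eden=$JAVA_EDEN_SIZE"
```

With the 18432M heap from the message this gives an eden size of 3072M, which matches the "1/6th of total heap" description.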
Re: jvm exits without trace
It's possible, I'm going to try an earlier jvm first. u16 was the previous one running in production, will try moving back to u16. If that fails, removing APR is the next thing to try out. After that I'm going to try beating the dev team with a stick (I know you're reading this!). This is incredibly frustrating, thanks for all the help.

Can you disable APR, use the alternative SSL configuration or is that not possible? Also, would it be possible to use an earlier 1.6 JVM* or perhaps even a completely different one? I can't remember, offhand, what (if any) results Carl had with other JVMs.

p

* Perhaps there's a subtle bug in recent releases of the JVM.

On Wed, 2010-02-24 at 11:28 +0100, Konstantin Kolinko wrote:

2010/2/24 Taylan Develioglu tdevelio...@ebuddy.com:

Hi, I have JVMs, running tomcat and our application, exiting mysteriously, and was wondering if anyone could give me some advice on how to debug this thing. There is nothing in catalina.out, nor our application logs, and no hotspot error file. GC log looks normal. No trace in system logs. I am left completely clueless :( Has anyone dealt with a problem like this before?

There is currently a thread named "Tomcat dies suddenly". Look there for starters. While that is unlikely to be your case, most ideas for diagnosing such an issue are mentioned in the first dozen messages of that thread.
http://marc.info/?t=12632496092r=1w=2 http://marc.info/?t=12633901125r=1w=2 http://marc.info/?t=12647949758r=6w=2 http://marc.info/?t=12660960545r=1w=2 Best regards, Konstantin Kolinko - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: jvm exits without trace
I'll be sure to post an update if u16 resolves it. Or any other progress for that matter. In the meantime don't be shy either :)

On Wed, 2010-02-24 at 14:52 +0100, Carl wrote:

Taylan,

The failures we've seen occur anywhere between 8 hours and a week of runtime.

The timing of the failures seems similar.

We have also had failures with hotspot error files (hs_err) present, and the cause specified was indeed SIGSEGV indicating a page fault.

I have never seen any hs_* files but have seen core files where strace showed the jvm stopped on a seg fault.

We also use jdk 1.6.0_18, I'm downgrading the machines to 1.6.0_16 when the situation allows (during regular updates of the application, or a crash) to see if that helps.

I have used jdk 1.6.0_17 and 1.6.0_18 with the same results... have not tried 1.6.0_16. Please post your results of this trial.

Running tomcat in the foreground might show something, but then again I could be waiting for a month for it to happen.

Yes, this has been part of my problem as anytime we change something, we have to wait a week for the server to fail. In one sense, I am fortunate that I have a little more flexibility than you. I have two servers (different hardware) but only need one in service at a time. Therefore, I always have one server I can test ideas on although I have never been able to develop a meaningful stress test, i.e., the only way I can test a change is to put it in production.

Thanks,
Carl
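Since both posters mention hunting for core files and hs_err files after a crash, here is a small sketch of how that search might be scripted. The function name find_crash_files is invented for this example. It assumes HotSpot's default fatal error log name (hs_err_pid<pid>.log, written to the JVM's working directory unless configured otherwise) and the common core / core.<pid> naming; on Linux the actual core file name and location depend on the kernel's core_pattern setting, so adjust the patterns to your systems.

```shell
#!/bin/sh
# Hypothetical helper: list likely crash artifacts under a directory.
# The file name patterns are assumptions; see the note above.
find_crash_files() {
    find "$1" -type f \( -name 'core' -o -name 'core.*' \
        -o -name 'hs_err_pid*.log' \) 2>/dev/null
}

# A core size limit of 0 means the kernel will not write core files for
# processes started from this shell.
echo "core size limit: $(ulimit -c)"
```

For example, `find_crash_files /` would search the whole filesystem, much like the 'find' Carl describes, and running `ulimit -c unlimited` in the shell that starts Tomcat is one way to allow core files to be written at all.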