I've got a jserv compiled into apache and I'm using it to run a servlet that does an RMI call for ever doGet that it receives. While doing a stress test with the new native thread support, it looks like I've come across a bug related to memory management or thread creation. To do the stress test, I had two computers repeatedly doing a request on the servlet. Back to back, non stop. (It was a pretty unrealistic load for what the web server would do, but I'm also showing that apache on Linux can stand up to this abuse.) After doing 1000 hits in about 69 seconds, the vm would seg fault. The first time I did a trial I had a System.gc() at the bottom of the doGet to force a clean up and I got this as the stack trace: SIGSEGV 11* segmentation violation stackbase=BF1FFE7C, stackpointer=BF1FFA10 Full thread dump: "Thread-1002" (TID:0x40bd3d98, sys_thread_t:0x8183758, state:R, thread_t: t@1040375, sp:0x0 threadID:0x0, stack_base:0x0, stack_size:0x0) prio=5 "Thread-1001" (TID:0x40bd40d0, sys_thread_t:0x8109698, state:R, thread_t: t@1039350, sp:0x40fff334 threadID:0x1d1d, stack_base:0x40fffe7c, stack_size:0x200000) prio=10 java.lang.System.gc(System.java) EngineSnoopServlet.doGet(EngineSnoopServlet.java:125) javax.servlet.http.HttpServlet.service(HttpServlet.java:486) javax.servlet.http.HttpServlet.service(HttpServlet.java:575) org.apache.jserv.JServConnection.run(JServConnection.java:349) "ConnectionExpiration-10.0.4.50:18287" (TID:0x40bd2a90, sys_thread_t:0x81110d0, state:CW, thread_t: t@18450, sp:0xbd7ff830 threadID:0x191a, stack_base:0xbd7ffe7c, stack_size:0x200000) prio=10 This lead me to beleive the bug was in memory management since it appeared to be happening in the call for garbage collection. To test my theory, I did a trial again, and again it failed right after the 1000 hit with a this stack trace that looks very very similar to the first: SIGSEGV 11* segmentation violation stackbase=BF1FFE7C, stackpointer=BF1FFA10 Full thread dump: "Thread-1005" (TID:0x40bd3b78, sys_thread_t:0x812fa50, state:R, thread_t: t@1040375, sp:0x0 threadID:0x0, stack_base:0x0, stack_size:0x0) prio=5 "Thread-1004" (TID:0x40bd3b70, sys_thread_t:0x80c77d8, state:R, thread_t: t@1039350, sp:0x40fff334 threadID:0x232c, stack_base:0x40fffe7c, stack_size:0x200000) prio=10 java.lang.System.gc(System.java) EngineSnoopServlet.doGet(EngineSnoopServlet.java:125) javax.servlet.http.HttpServlet.service(HttpServlet.java:486) javax.servlet.http.HttpServlet.service(HttpServlet.java:575) org.apache.jserv.JServConnection.run(JServConnection.java:349) "ConnectionExpiration-10.0.4.50:18287" (TID:0x40bd3638, sys_thread_t:0x81cfe98, state:CW, thread_t: t@14350, sp:0xbdfff830 threadID:0x1f23, stack_base:0xbdfffe7c, stack_size:0x200000) prio=10 To see if something different would happen if I didn't call System.gc() I removed that from my code. Again, after 1000 hits, the VM seg faulted, only this time I got a different stack trace: Full thread dump: "Thread-1005" (TID:0x40c253d0, sys_thread_t:0x812f510, state:R, thread_t: t@1040375, sp:0x0 threadID:0x0, stack_base:0x0, stack_size:0x0) prio=5 "ConnectionExpiration-10.0.4.50:18287" (TID:0x40bd3630, sys_thread_t:0x81d0018, state:CW, thread_t: t@14350, sp:0xbdfff830 threadID:0x2629, stack_base:0xbdfffe7c, stack_size:0x200000) prio=10 sun.rmi.transport.tcp.TCPChannel$Reaper.run(TCPChannel.java:458) java.lang.Thread.run(Thread.java) "ConnectionExpiration-10.0.4.50:18010" (TID:0x40bd2528, sys_thread_t:0x81a12e8, state:CW, thread_t: t@13325, sp:0xbe1ff830 threadID:0x2627, stack_base:0xbe1ffe7c, stack_size:0x200000) prio=10 sun.rmi.transport.tcp.TCPChannel$Reaper.run(TCPChannel.java:458) java.lang.Thread.run(Thread.java) "Cleaner" (TID:0x40bd1220, sys_thread_t:0x81b9000, state:CW, thread_t: t@12300, sp:0xbe3ff820 threadID:0x2625, stack_base:0xbe3ffe7c, stack_size:0x200000) prio=10 sun.rmi.transport.DGCClient.run(DGCClient.java:621) java.lang.Thread.run(Thread.java) "LeaseRenewer" (TID:0x40bd10d8, sys_thread_t:0x81b8ee0, state:CW, thread_t: t@11275, sp:0xbe5ff820 threadID:0x2624, stack_base:0xbe5ffe7c, stack_size:0x200000) prio=10 sun.rmi.transport.DGCClient.doRenewal(DGCClient.java:677) sun.rmi.transport.DGCClient$LeaseRenewer.run(DGCClient.java:936) java.lang.Thread.run(Thread.java) "Reaper" (TID:0x40bd0fe0, sys_thread_t:0x81b7198, state:CW, thread_t: t@10250, sp:0xbe7ff830 threadID:0x2623, stack_base:0xbe7ffe7c, stack_size:0x200000) prio=10 sun.rmi.transport.Reaper.run(ObjectTable.java:199) java.lang.Thread.run(Thread.java) "ConnectionExpiration-10.0.4.50:1099" (TID:0x40bd0d18, sys_thread_t:0x81a2ba0, state:CW, thread_t: t@9225, sp:0xbe9ff830 threadID:0x2622, stack_base:0xbe9ffe7c, stack_size:0x200000) prio=10 sun.rmi.transport.tcp.TCPChannel$Reaper.run(TCPChannel.java:458) java.lang.Thread.run(Thread.java) "Thread-2" (TID:0x40bcc610, sys_thread_t:0x81113d0, state:R, thread_t: t@5125, sp:0xbf1ffa6c threadID:0x2613, stack_base:0xbf1ffe7c, stack_size:0x200000) prio=5 *current thread* It appears that something terrible is being hit right after the 1000th call (and maybe right around the 1000th thread creation?) Can anyone recreate this behavior? Keith, who's submiting this to jitterbug as well as to the list -- Keith T. Garner [EMAIL PROTECTED] STR Consultant http://www.str.com/ [EMAIL PROTECTED] "And to all the fly Linux coders out there workin on the Deuce-Deuce, Peace...." -- Jason Beatty