Hi Mark, On 25.09.2013 01:50, Mark Thomas wrote: > On 24/09/2013 16:19, Mark Thomas wrote: >> On 24/09/2013 08:16, Mark Thomas wrote: >>> On 23/09/2013 00:54, Rainer Jung wrote: >>> >>>> I agree that there's probably another problem further up the stack. >>> >>> I'm looking at this now. >> >> I've found a couple of issues. Not sure if either of these are the root >> cause of the remaining issues we see but I'll look into fixing them. >> >> - AprEndpoint.destroySocket() calls Poller.removeFromPoller() but that >> call isn't thread safe. I think we need a remove process similar to >> the add process. >> - sockets are not always removed from the timeout list when they are >> destroyed > > Fixed. > > Rainer, over to you.
The TestCoyoteAdapter now neither fails with tcnative 1.1.28 nor with the additional APR_NOTFOUND change. I did 50 runs for each of the two versions and also for bio/nio/apr and none failed. Good! I'm still a bit nervous about coupling the connection count decrement with the successful removal from the poller. I ran TestCoyoteAdapter with an additional log message which would tell me whether Poll.remove returned APR_NOTFOUND and also whether a remove wasn't successful for any reason. During 50 runs for bio/nio/apr each no such log output was written, so there were not unsuccessful or unneeded calls to Poll.remove. The TestCoyoteOutputStream still fails relatively often. This failure happens for bio, nio and apr. With the patched tcnative, I get 7/3/7 failures out of 50 runs each for bio/nio/apr. Except for one failure all of them are hangs. For apr the hang always happens in testNonBlockingWriteNoneBlockingWriteNone, for bio half of them in testNonBlockingWriteTwiceBlockingWriteOnce and the other half in testNonBlockingWriteNoneBlockingWriteNone, for nio one in testNonBlockingWriteOnceBlockingWriteOnce and one in testNonBlockingWriteTwiceBlockingWriteOnce, the third nio failure is a non-hang with failure: Testcase: testNonBlockingWriteTwiceBlockingWriteNone took 4.146 sec Caused an ERROR Bogus chunk size java.io.IOException: Bogus chunk size at sun.net.www.http.ChunkedInputStream.processRaw(ChunkedInputStream.java:319) at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:572) at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609) at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696) at java.io.FilterInputStream.read(FilterInputStream.java:133) at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3053) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at java.io.FilterInputStream.read(FilterInputStream.java:107) at org.apache.catalina.startup.TomcatBaseTest.methodUrl(TomcatBaseTest.java:264) at org.apache.catalina.startup.TomcatBaseTest.getUrl(TomcatBaseTest.java:219) at org.apache.catalina.connector.TestCoyoteOutputStream.doNonBlockingTest(TestCoyoteOutputStream.java:91) at org.apache.catalina.connector.TestCoyoteOutputStream.testNonBlockingWriteTwiceBlockingWriteNone(TestCoyoteOutputStream.java:54) The hangs, the common case, always have the following exception: Exception in thread "http-apr-127.0.0.1-auto-3-exec-3" java.lang.IllegalStateException: Calling [asyncDispatch()] is not valid for a request with Async state [READ_WRITE_OP] at org.apache.coyote.AsyncStateMachine.asyncDispatch(AsyncStateMachine.java:282) at org.apache.coyote.http11.Http11AprProcessor.actionInternal(Http11AprProcessor.java:483) at org.apache.coyote.http11.AbstractHttp11Processor.action(AbstractHttp11Processor.java:838) at org.apache.coyote.Request.action(Request.java:373) at org.apache.catalina.core.AsyncContextImpl.dispatch(AsyncContextImpl.java:234) at org.apache.catalina.core.AsyncContextImpl.dispatch(AsyncContextImpl.java:191) at org.apache.catalina.connector.TestCoyoteOutputStream$NonBlockingWriteServlet.doAsyncWrite(TestCoyoteOutputStream.java:143) at org.apache.catalina.connector.TestCoyoteOutputStream$NonBlockingWriteServlet.access$000(TestCoyoteOutputStream.java:107) at org.apache.catalina.connector.TestCoyoteOutputStream$NonBlockingWriteServlet$AsyncTask.run(TestCoyoteOutputStream.java:163) at org.apache.catalina.core.AsyncContextImpl$RunnableWrapper.run(AsyncContextImpl.java:568) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) The thread dumps shows the main thread hanging in waiting for an http answer: "main" prio=3 tid=0x00029800 nid=0x2 runnable [0xfdf7d000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) - locked <0xe6355af8> (a java.io.BufferedInputStream) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) - locked <0xe6099b20> (a sun.net.www.protocol.http.HttpURLConnection) at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468) at org.apache.catalina.startup.TomcatBaseTest.methodUrl(TomcatBaseTest.java:247) at org.apache.catalina.startup.TomcatBaseTest.getUrl(TomcatBaseTest.java:219) at org.apache.catalina.connector.TestCoyoteOutputStream.doNonBlockingTest(TestCoyoteOutputStream.java:91) at org.apache.catalina.connector.TestCoyoteOutputStream.testNonBlockingWriteNoneBlockingWriteNone(TestCoyoteOutputStream.java:44) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:38) Maybe we could set (if not already in place) a connect and a read timeout to let the tests proceed (which of course does not fix the failure). The pool threads do not seem to work on anything during the hang. Regards, Rainer --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org For additional commands, e-mail: dev-h...@tomcat.apache.org