Re: 7.0.22+ fd leak with APR/native
Has a bug been logged for this issue (what seems to be a file descriptor leak)? On Tue, Jan 3, 2012 at 1:17 PM, Mark Thomas ma...@apache.org wrote: I am trying to bring together all the information I have gleaned on this so far from the multiple threads to try and find the common factors. So far I have: - 7.0.21 is OK - 7.0.22 has an fd leak - 7.0.23 has an fd leak and may leak faster than 7.0.22 - occurs with APR/native - does not occur with BIO - has been observed in HTTP HTTPS - use of Comet does not trigger it - use of compression does not trigger it - separate connection and keep-alive timeouts does not trigger it It may be related to POST processing. I have tried (and so far failed) to reproduce this. I'll be looking at POST processing next. In the meantime, here are some further questions to try and narrow things down: 1. Does the application where this is observed make use of Servlet 3.0 async requests? 2. Does this leak occur when the NIO connector is used? 3. Are there any exceptions in the logs that weren't present in 7.0.21 or earlier? 4. Does the leak occur if sendfile is disabled? I also have reviewing the 7.0.21 to 7.0.22 changes on my todo list but there are quite a few as I was refactoring the connectors to reduce code duplication and ironically, reduce maintenance requirements, at the time. - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
I am curious to know if this leak is related to unix sockets, or the IPv6 file handles. I have seen a similar issue with the NIO HTTP handler, where it does not close some connections properly and they incarnate as file handles corresponding to unix sockets (all pointing to same inode number). Even 7.0.21 has problems. Thanks, Arijit On Fri, Jan 13, 2012 at 4:24 PM, Mike Wertheim m...@hyperreal.org wrote: Has a bug been logged for this issue (what seems to be a file descriptor leak)? On Tue, Jan 3, 2012 at 1:17 PM, Mark Thomas ma...@apache.org wrote: I am trying to bring together all the information I have gleaned on this so far from the multiple threads to try and find the common factors. So far I have: - 7.0.21 is OK - 7.0.22 has an fd leak - 7.0.23 has an fd leak and may leak faster than 7.0.22 - occurs with APR/native - does not occur with BIO - has been observed in HTTP HTTPS - use of Comet does not trigger it - use of compression does not trigger it - separate connection and keep-alive timeouts does not trigger it It may be related to POST processing. I have tried (and so far failed) to reproduce this. I'll be looking at POST processing next. In the meantime, here are some further questions to try and narrow things down: 1. Does the application where this is observed make use of Servlet 3.0 async requests? 2. Does this leak occur when the NIO connector is used? 3. Are there any exceptions in the logs that weren't present in 7.0.21 or earlier? 4. Does the leak occur if sendfile is disabled? I also have reviewing the 7.0.21 to 7.0.22 changes on my todo list but there are quite a few as I was refactoring the connectors to reduce code duplication and ironically, reduce maintenance requirements, at the time. - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
On 04/01/2012 13:59, Konstantin Kolinko wrote: 2012/1/4 Konstantin Kolinko knst.koli...@gmail.com: 2. The processSocket(desc[n*2+1], SocketStatus.DISCONNECT); call just above the fixed line. It looks like a NOOP, because processSocket(long,SocketStatus) has an if() that does not mention SocketStatus.DISCONNECT. I think it is one more way to leak sockets. This method is used only in comet requests. and I think that if() might be a cause of missing some ERROR events in comet. Agreed. Can the if() condition in processSocket(long,SocketStatus) be removed? Besides SocketStatus.DISCONNECT. there is one more status that is not mentioned in that if() and can lead to similar leaks: SocketStatus.ERROR. The if() was added in http://svn.apache.org/viewvc?view=revisionrevision=944518 I do not see a reason for this if(). The r944518 says about running with a SecurityManager so it should have just added a PrivilegedAction there. One more oddity with r944518 in AprEndpoint: 1. I do not understand why one has to set TCCL before calling getExecutor().execute(). We do not do this in other places. E.g. we do not do it in AprEndpoint#processSocket(long). I suspect because the Servlet 3 Async code was re-using bits of the Comet code (inconsistently in different ways on different connectors). 2. r944518 says about TCK failures, but TCK does not test comet and so it could not be a reason for the change in this method which is used only by comet. It is only used by Comet now but r944518 pre-dates r1001698 where there was a major refactoring of the connectors. So it looks that not only the if(), but the whole r944518 change in AprEndpoint has to be reverted. That looks OK to me. I'll do that but run the TCKs as a double check. Mark - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
2012/1/6 Mark Thomas ma...@apache.org: So it looks that not only the if(), but the whole r944518 change in AprEndpoint has to be reverted. That looks OK to me. I'll do that but run the TCKs as a double check. I am running TestCometProcessor tests with this very change just now. ;) (My plan is to reenable TestCometProcessor - that is revert http://svn.apache.org/viewvc?rev=1225632view=rev but with the following change in testCometConnectorStop(): - Do not ask reader thread whether END event message was received over the wire, but ask the servlet directly whether the last CometEvent that it processed was EventType.END. I think it would be more reliable. ) The NIO connector is OK, but I observe an issue with APR one: In testCometConnectorStop() the test fails: [[[ junit.framework.AssertionFailedError: No exception in writing thread at org.apache.catalina.comet.TestCometProcessor.testCometConnectorStop(TestCometProcessor.java:289) ]]] So no SocketException happens in the WriterThread. At the same time when I ask SimpleCometServlet for the last event that it received it is EventType.END. My thought is that the WriterThread does not see that the socket that it writes to has been closed. (It probably is not of much concern because it happens at shutdown, but it might be that we are still missing something). I thought that maybe during shutdown the #processSocket(long,SocketStatus) that uses an executor to run the event did not have chance to run, but because the END event was received by servlet I think that it did run. Best regards, Konstantin Kolinko - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
On 06/01/2012 20:13, Konstantin Kolinko wrote: 2012/1/6 Mark Thomas ma...@apache.org: So it looks that not only the if(), but the whole r944518 change in AprEndpoint has to be reverted. That looks OK to me. I'll do that but run the TCKs as a double check. I am running TestCometProcessor tests with this very change just now. ;) (My plan is to reenable TestCometProcessor - that is revert http://svn.apache.org/viewvc?rev=1225632view=rev but with the following change in testCometConnectorStop(): - Do not ask reader thread whether END event message was received over the wire, but ask the servlet directly whether the last CometEvent that it processed was EventType.END. I think it would be more reliable. ) I'm not keen on that idea since the point of the test was to ensure that the END event was received by the client, not that it was processed by the Server. The NIO connector is OK, but I observe an issue with APR one: In testCometConnectorStop() the test fails: [[[ junit.framework.AssertionFailedError: No exception in writing thread at org.apache.catalina.comet.TestCometProcessor.testCometConnectorStop(TestCometProcessor.java:289) ]]] So no SocketException happens in the WriterThread. At the same time when I ask SimpleCometServlet for the last event that it received it is EventType.END. My thought is that the WriterThread does not see that the socket that it writes to has been closed. (It probably is not of much concern because it happens at shutdown, but it might be that we are still missing something). I'd be happy for the lack of a WriterThread exception to trigger a log message rather than a test failure. Mark I thought that maybe during shutdown the #processSocket(long,SocketStatus) that uses an executor to run the event did not have chance to run, but because the END event was received by servlet I think that it did run. Best regards, Konstantin Kolinko - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
2012/1/7 Mark Thomas ma...@apache.org: On 06/01/2012 20:13, Konstantin Kolinko wrote: 2012/1/6 Mark Thomas ma...@apache.org: So it looks that not only the if(), but the whole r944518 change in AprEndpoint has to be reverted. That looks OK to me. I'll do that but run the TCKs as a double check. I am running TestCometProcessor tests with this very change just now. ;) (My plan is to reenable TestCometProcessor - that is revert http://svn.apache.org/viewvc?rev=1225632view=rev but with the following change in testCometConnectorStop(): - Do not ask reader thread whether END event message was received over the wire, but ask the servlet directly whether the last CometEvent that it processed was EventType.END. I think it would be more reliable. ) I'm not keen on that idea since the point of the test was to ensure that the END event was received by the client, not that it was processed by the Server. Is there any promise that it should happen? Won't the client see the same from closed socket? (Anyway ci failures showed that it does not happen reliably). The NIO connector is OK, but I observe an issue with APR one: In testCometConnectorStop() the test fails: [[[ junit.framework.AssertionFailedError: No exception in writing thread at org.apache.catalina.comet.TestCometProcessor.testCometConnectorStop(TestCometProcessor.java:289) ]]] So no SocketException happens in the WriterThread. At the same time when I ask SimpleCometServlet for the last event that it received it is EventType.END. My thought is that the WriterThread does not see that the socket that it writes to has been closed. (It probably is not of much concern because it happens at shutdown, but it might be that we are still missing something). I'd be happy for the lack of a WriterThread exception to trigger a log message rather than a test failure. Mark I thought that maybe during shutdown the #processSocket(long,SocketStatus) that uses an executor to run the event did not have chance to run, but because the END event was received by servlet I think that it did run. Best regards, Konstantin Kolinko - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
2012/1/4 Mark Thomas ma...@apache.org: On 03/01/2012 21:32, Mark Thomas wrote: On 03/01/2012 21:26, André Warnier wrote: Mark Thomas wrote: I am trying to bring together all the information I have gleaned on this so far from the multiple threads to try and find the common factors. snip/ Suggestion: the large POST requests mentioned by a couple of posters suggest file uploads, which may imply temporary files used to buffer the files being uploaded, no ? Probably not. That code is connector agnostic. There might be something in the APR InputBuffer though. The APR InputBuffer is untouched between 7.0.21 and 7.0.22. However, I did find an error in r1166072 that might explain the fd leak. I am trying to reproduce it at the moment. In the meantime, if someone could try the following patch that would be great: Index: java/org/apache/tomcat/util/net/AprEndpoint.java === --- java/org/apache/tomcat/util/net/AprEndpoint.java (revision 1226551) +++ java/org/apache/tomcat/util/net/AprEndpoint.java (working copy) @@ -1364,7 +1364,6 @@ } else { destroySocket(desc[n*2+1]); } - return true; } } } else if (rv 0) { Good catch. 1. I think to reproduce the leak here the Poll.pol() has to return several sockets where one has Poll.APR_POLLHUP flag set. The sockets after that one will be lost. (I do not see an easy way to get processSocket() to return false, so I think APR_POLLHUP is a more easy target to reproduce this.). 2. The processSocket(desc[n*2+1], SocketStatus.DISCONNECT); call just above the fixed line. It looks like a NOOP, because processSocket(long,SocketStatus) has an if() that does not mention SocketStatus.DISCONNECT. I think it is one more way to leak sockets. This method is used only in comet requests. Can the if() condition in processSocket(long,SocketStatus) be removed? Besides SocketStatus.DISCONNECT. there is one more status that is not mentioned in that if() and can lead to similar leaks: SocketStatus.ERROR. The if() was added in http://svn.apache.org/viewvc?view=revisionrevision=944518 I do not see a reason for this if(). The r944518 says about running with a SecurityManager so it should have just added a PrivilegedAction there. Best regards, Konstantin Kolinko - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
2012/1/4 Konstantin Kolinko knst.koli...@gmail.com: 2. The processSocket(desc[n*2+1], SocketStatus.DISCONNECT); call just above the fixed line. It looks like a NOOP, because processSocket(long,SocketStatus) has an if() that does not mention SocketStatus.DISCONNECT. I think it is one more way to leak sockets. This method is used only in comet requests. and I think that if() might be a cause of missing some ERROR events in comet. Can the if() condition in processSocket(long,SocketStatus) be removed? Besides SocketStatus.DISCONNECT. there is one more status that is not mentioned in that if() and can lead to similar leaks: SocketStatus.ERROR. The if() was added in http://svn.apache.org/viewvc?view=revisionrevision=944518 I do not see a reason for this if(). The r944518 says about running with a SecurityManager so it should have just added a PrivilegedAction there. One more oddity with r944518 in AprEndpoint: 1. I do not understand why one has to set TCCL before calling getExecutor().execute(). We do not do this in other places. E.g. we do not do it in AprEndpoint#processSocket(long). 2. r944518 says about TCK failures, but TCK does not test comet and so it could not be a reason for the change in this method which is used only by comet. So it looks that not only the if(), but the whole r944518 change in AprEndpoint has to be reverted. Best regards, Konstantin Kolinko - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
On Tue, Jan 3, 2012 at 1:17 PM, Mark Thomas ma...@apache.org wrote: I am trying to bring together all the information I have gleaned on this so far from the multiple threads to try and find the common factors. So far I have: - 7.0.21 is OK - 7.0.22 has an fd leak - 7.0.23 has an fd leak and may leak faster than 7.0.22 - occurs with APR/native - does not occur with BIO - has been observed in HTTP HTTPS - use of Comet does not trigger it - use of compression does not trigger it - separate connection and keep-alive timeouts does not trigger it It may be related to POST processing. I have tried (and so far failed) to reproduce this. I'll be looking at POST processing next. In the meantime, here are some further questions to try and narrow things down: 1. Does the application where this is observed make use of Servlet 3.0 async requests? No 2. Does this leak occur when the NIO connector is used? I haven't tried NIO. I only tried APR. 3. Are there any exceptions in the logs that weren't present in 7.0.21 or earlier? No 4. Does the leak occur if sendfile is disabled? Yes. I just disabled sendfile and the behavior is the same, I also have reviewing the 7.0.21 to 7.0.22 changes on my todo list but there are quite a few as I was refactoring the connectors to reduce code duplication and ironically, reduce maintenance requirements, at the time. - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
Mark Thomas wrote: I am trying to bring together all the information I have gleaned on this so far from the multiple threads to try and find the common factors. So far I have: - 7.0.21 is OK - 7.0.22 has an fd leak - 7.0.23 has an fd leak and may leak faster than 7.0.22 - occurs with APR/native - does not occur with BIO - has been observed in HTTP HTTPS - use of Comet does not trigger it - use of compression does not trigger it - separate connection and keep-alive timeouts does not trigger it It may be related to POST processing. I have tried (and so far failed) to reproduce this. I'll be looking at POST processing next. In the meantime, here are some further questions to try and narrow things down: 1. Does the application where this is observed make use of Servlet 3.0 async requests? 2. Does this leak occur when the NIO connector is used? 3. Are there any exceptions in the logs that weren't present in 7.0.21 or earlier? 4. Does the leak occur if sendfile is disabled? I also have reviewing the 7.0.21 to 7.0.22 changes on my todo list but there are quite a few as I was refactoring the connectors to reduce code duplication and ironically, reduce maintenance requirements, at the time. Suggestion: the large POST requests mentioned by a couple of posters suggest file uploads, which may imply temporary files used to buffer the files being uploaded, no ? - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
On 03/01/2012 21:26, André Warnier wrote: Mark Thomas wrote: I am trying to bring together all the information I have gleaned on this so far from the multiple threads to try and find the common factors. snip/ Suggestion: the large POST requests mentioned by a couple of posters suggest file uploads, which may imply temporary files used to buffer the files being uploaded, no ? Probably not. That code is connector agnostic. There might be something in the APR InputBuffer though. Mark - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
Re: 7.0.22+ fd leak with APR/native
On 03/01/2012 21:32, Mark Thomas wrote: On 03/01/2012 21:26, André Warnier wrote: Mark Thomas wrote: I am trying to bring together all the information I have gleaned on this so far from the multiple threads to try and find the common factors. snip/ Suggestion: the large POST requests mentioned by a couple of posters suggest file uploads, which may imply temporary files used to buffer the files being uploaded, no ? Probably not. That code is connector agnostic. There might be something in the APR InputBuffer though. The APR InputBuffer is untouched between 7.0.21 and 7.0.22. However, I did find an error in r1166072 that might explain the fd leak. I am trying to reproduce it at the moment. In the meantime, if someone could try the following patch that would be great: Index: java/org/apache/tomcat/util/net/AprEndpoint.java === --- java/org/apache/tomcat/util/net/AprEndpoint.java(revision 1226551) +++ java/org/apache/tomcat/util/net/AprEndpoint.java(working copy) @@ -1364,7 +1364,6 @@ } else { destroySocket(desc[n*2+1]); } -return true; } } } else if (rv 0) { Cheers, Mark - To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org