Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Having finally received the actual details of what the OP is doing in email #37 of this thread, I was struck by a simple thought. I have re-read the whole thread, and don't think/hope that I am about to say anything completely stupid.

> We develop software that routes millions of requests to dozens of Tomcat instances.

So you have your own software in front of many Tomcats that is responsible for distributing the load between multiple Tomcat instances.

> Yes, we can and do support connection throttling at a slight cost to safeguard a single Tomcat from receiving more connections than it can handle, but if Tomcat was able to not reset connections at the TCP level we could perform our task much better, and I do not think this will cause any problem to any other use cases of Tomcat - if we can just enable this behavior with a configuration parameter.

My simple thought was that it sounds like your code isn't working. You have more load than one Tomcat instance can handle, which overloads that instance. You are trying to write code to handle this situation, and seem convinced that the only solution is to alter Tomcat so that you can detect/handle this occurrence in a way that is easier for your software. You also state that when this happens, you will simply route to other Tomcat instances - the implicit assumption being that the other instances have spare capacity. If this is the case, why didn't your code route to those other instances in the first place? Surely this would obviate the need for any changes to Tomcat? What algorithm do you use to determine where to send the load?

> I do not understand the negativity here..

After writing comments such as

> If you can, try to understand what I said better.. Its ok to not accept this proposal and/or not understand it..

you really can't understand the negativity? Really? Are you sure?

Chris
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Chris,

On 11/14/12 11:02 AM, chris derham wrote:
> My simple thought was that it sounds like your code isn't working. You have more load than one tomcat instance can handle, which overloads that instance. You are trying to write code to handle this situation, and seem convinced that the only solution is to alter tomcat such that you can detect/handle this occurrence in a way that is easier for your software.

I think this is an accurate summary of the proposal. Honestly, it does make *some* sense, because the lb's job is to determine what is going on with the backend servers and distribute load. If one backend server is unhealthy, the lb needs to know about it.

I'm not sure that pushing HTTP connections through to the backend is the proper way to do that -- there *are* other ways to determine if Tomcat is under heavy load, and those options do not necessarily use HTTP request/response to test them. For instance, if you use AJP, you can use cping/cpong. You can probably also use JMX (but that requires that the client has a JMX client available to it, or that you do something like use HTTP requests to the manager webapp's JMXProxyServlet).

-chris

To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
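[Editor's illustration] To make the JMXProxyServlet suggestion above concrete, here is a rough sketch of a load-balancer health probe that polls the manager webapp over HTTP and parses a thread-pool attribute out of the plain-text reply. The probe URL, the connector name, and the exact response wording are assumptions (and the manager app normally requires authentication), so treat this as an illustration rather than a drop-in implementation.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class TomcatHealthProbe {

    // The JMXProxyServlet answers an attribute query with a plain-text line
    // roughly like:
    //   OK - Attribute get 'Catalina:type=ThreadPool,...' - currentThreadsBusy = 42
    // Extract the integer after the final '=' sign; return -1 for any line
    // that is not in that shape.
    static int parseBusyThreads(String line) {
        if (line == null) return -1;
        int eq = line.lastIndexOf('=');
        if (eq < 0) return -1;
        try {
            return Integer.parseInt(line.substring(eq + 1).trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Hypothetical probe: poll the manager webapp and read the first line of
    // the response. Connector name "http-bio-8080" is an assumption.
    static int queryBusyThreads(String host, int port) throws Exception {
        URL url = new URL("http://" + host + ":" + port
                + "/manager/jmxproxy?get=Catalina:type=ThreadPool,"
                + "name=%22http-bio-8080%22&att=currentThreadsBusy");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(2000);   // fail fast if the node is swamped
        conn.setReadTimeout(2000);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            return parseBusyThreads(in.readLine());
        }
    }
}
```

A balancer could call `queryBusyThreads` periodically and take a node out of rotation when the value approaches the connector's thread limit, without ever pushing client traffic through an overloaded instance.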
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/14/12 11:02 AM, chris derham wrote:
>> My simple thought was that it sounds like your code isn't working. You have more load than one tomcat instance can handle, which overloads that instance. You are trying to write code to handle this situation, and seem convinced that the only solution is to alter tomcat such that you can detect/handle this occurrence in a way that is easier for your software.
>
> I think this is an accurate summary of the proposal. Honestly, it does make *some* sense, because the lb's job is to determine what is going on with the backend servers and distribute load. If one backend server is unhealthy, the lb needs to know about it.

Thank you everyone for taking the time to comment on this thread and sharing your thoughts. I now realize that it would be better to avoid this situation from our end, as suggested, rather than trying to detect and overcome it.

regards
asankha
--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org
http://esbmagic.blogspot.com
RE: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
>> In addition, those clients will then get exactly the behaviour that you are complaining about: a successful connection and then a 'connection reset' when doing I/O.
>
> No, they will not get an ACK to the SYN packet, which is much better.

No, Asankha, you are completely wrong about this. Clients on the backlog queue have *already got* the SYN/ACK in reply to the SYN packet. That's *why* they are on the backlog queue. They have fully formed connections, and they have probably already sent at least part of their HTTP request. If you close the listening socket as per your bizarre suggestion, all the connections on the backlog queue will get closed, and the clients concerned will get 'connection reset' if they are still sending, or, even worse, premature EOFs when trying to receive the reply. You don't seem to know much about it.

Esmond Pitt
FACS
Author, 'java.rmi: the Guide to Remote Method Invocation' and 'Fundamental Networking in Java'
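[Editor's illustration] Esmond's point about the backlog can be demonstrated in a few lines of Java: the client's connect() succeeds, and it can even start writing its request, even though the server application never calls accept(). The OS completes the three-way handshake on the listener's behalf and parks the connection on the backlog. This is a self-contained sketch, not Tomcat code.

```java
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class BacklogDemo {

    // Returns true if a client completes the TCP handshake against a
    // listening socket whose owner never calls accept().
    static boolean handshakeCompletesWithoutAccept() throws Exception {
        // Ephemeral port, backlog of 5; deliberately never accept().
        ServerSocket server = new ServerSocket(0, 5);

        // connect() succeeds here: the kernel sends SYN/ACK and queues the
        // fully formed connection on the backlog.
        Socket client = new Socket("127.0.0.1", server.getLocalPort());
        boolean connected = client.isConnected();

        // The client can already send part of an HTTP request; the bytes sit
        // in kernel buffers waiting for the application to accept() and read.
        client.getOutputStream()
              .write("GET / HTTP/1.1\r\n".getBytes(StandardCharsets.US_ASCII));

        client.close();
        server.close();
        return connected;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("connected: " + handshakeCompletesWithoutAccept());
    }
}
```

Closing the server socket at this point is exactly what resets such queued, already-established connections.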
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/9/2012 1:41 PM, Christopher Schultz wrote:
> Closing the listening socket, as you seem to be now suggesting, is a very poor idea indeed: what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then?

I haven't been following this thread closely enough to comment on the proposed solution, but isn't preventing unintended usage of a port a systems administration problem? What happens when Tomcat is restarted?

-Terence Bandoian
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Esmond,

On 11/11/12 10:18 PM, Esmond Pitt wrote:
> To reiterate what Christopher said, if you close the listening socket because you think you can't service one extra client, you will lose all the connections on the backlog queue, which could be hundreds of clients, that you *can* service. In addition, those clients will then get exactly the behaviour that you are complaining about: a successful connection and then a 'connection reset' when doing I/O.

+1

> Note also that you cannot know how long the backlog queue is, nor how many clients are already in it. There is no possibility of this proposal being accepted.

+1

-chris
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Esmond,

> To reiterate what Christopher said, if you close the listening socket because you think you can't service one extra client, you will lose all the connections on the backlog queue, which could be hundreds of clients, that you *can* service.

I do not see a problem here. We develop software that routes millions of requests to dozens of Tomcat instances. For a single instance of Tomcat the problem is much simpler: it should just handle the maximum number of connections it is configured for. If each single Tomcat instance behaves slightly better (i.e. refuse vs reset), we can avoid making any guesses about whether Tomcat has crashed etc. when a TCP-level reset is received on the connection instead of a refusal, and we can provide much better failover support independent of whether a service or method is idempotent.

This context is different from clients making, say, JSP calls for a UI. The end clients connecting to us may use HTTP or HTTPS, with or without keep-alives etc. We will handle those connections, and then route, load balance and fail over against dozens of Tomcats. Now each connection we establish to a Tomcat will almost always be a well-kept keep-alive connection, which is re-used even for different requests originating from multiple external clients. So if we are managing say 10K connections with our clients, maybe 4K with keep-alive and 6K without, we will still use a limited number of keep-alive connections to each single Tomcat we load balance and fail over against.

Yes, we can and do support connection throttling at a slight cost to safeguard a single Tomcat from receiving more connections than it can handle, but if Tomcat were able to not reset connections at the TCP level we could perform our task much better, and I do not think this would cause any problem for any other use cases of Tomcat - if we could just enable this behavior with a configuration parameter.

> In addition, those clients will then get exactly the behaviour that you are complaining about: a successful connection and then a 'connection reset' when doing I/O.

No, they will not get an ACK to the SYN packet, which is much better. Otherwise they will get an ACK, and Tomcat will start receiving part or all of the payload - and then reset, which is nasty. This is the main difference.

> There is no possibility of this proposal being accepted.

I do not understand the negativity here.. I was wondering if I should take this discussion to the dev@ list since I've already discussed it on user@. I wish Tomcat had a Wiki or used JIRA where I could submit this as a proposal - maybe with a diagram/screenshots etc. - and let end users vote for it across a few months, until we find whether this solution has value.

regards
asankha
--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org
http://esbmagic.blogspot.com
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/12/2012 10:47 PM, Terence M. Bandoian wrote:
> On 11/9/2012 1:41 PM, Christopher Schultz wrote:
>> Closing the listening socket, as you seem to be now suggesting, is a very poor idea indeed: what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then?
>
> I haven't been following this thread closely enough to comment on the proposed solution but isn't preventing unintended usage of a port a systems administration problem? What happens when Tomcat is restarted?

Exactly.. also if a node is to restart, say on a cloud infra after a crash etc., and two processes are to fight for the same port, it would be quite unpredictable.. I wonder if there are any real users who let this happen on a production environment as described on this list.

regards
asankha
--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org
http://esbmagic.blogspot.com
RE: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Esmond Pitt esmond.p...@bigpond.com wrote:
> To reiterate what Christopher said, if you close the listening socket because you think you can't service one extra client, you will lose all the connections on the backlog queue, which could be hundreds of clients, that you *can* service.

Indeed. I'd like to hear the OP's response to this problem.

> In addition, those clients will then get exactly the behaviour that you are complaining about: a successful connection and then a 'connection reset' when doing I/O.

That is my understanding of what would happen as well. Again, I'd like to hear the OP's response.

> Note also that you cannot know how long the backlog queue is, nor how many clients are already in it. There is no possibility of this proposal being accepted.

It certainly looks that way at the moment.

Mark
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Asankha C. Perera asan...@apache.org wrote:
> I do not understand the negativity here..

You are trying to solve a problem no-one here recognises. You are ignoring and/or dismissing people that ask difficult questions or point out flaws in your proposal. Your posts - this one in particular - come across as very arrogant, and that never wins people over.

> I was wondering if I should take this discussion to the dev@ list since I've already discussed it on user@.

Please don't. Rather than trying to bypass the users list because no-one supports your view, you should be engaging with the list more and addressing the issues and questions raised. This belongs on users until there is agreement that something needs to change. If that happens, then the details of the how can be discussed on dev.

> I wish Tomcat had a Wiki

It does. I wish you'd bothered to look for it.

> or used JIRA where I could submit this as a proposal - maybe with a diagram/screen shots etc -

Jira is not a prerequisite for submitting an enhancement request. All ASF bug trackers - including the one Tomcat uses - support enhancement requests.

> and make end users vote for this across a few months - until we find that this solution has a value.

WTF? Make end users vote? Keep at this until everyone realises they were wrong and you were right? Seriously? You really don't seem to understand how Apache works at all.

I did hope that some value could emerge from this thread, such as a critique of my analysis (which demonstrated that the proposed solution is pointless) where the critique identified some scenarios where Tomcat could do better. However, as this thread continues and the probing questions are ignored, I am rapidly reaching the conclusion that continuing this discussion is a waste of time. I hope your reply to this invalidates that conclusion.

Mark
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi All,

I'm new to this mailing list. I have read the above thread but was not able to grasp concepts like load balancing and keep-alive. Can you please give me links where I can find information about them?

On Tue, Nov 13, 2012 at 6:32 AM, Mark Thomas ma...@apache.org wrote:
> [Mark's previous reply quoted in full - trimmed]
RE: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
From: selvakumar netaji [mailto:vvekselva...@gmail.com]
Subject: Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

> I'm new to this mailing list. I read the above mail list and not able to grasp the concepts like load balancing, keep alive. Can you please give me links where I can find links about it.

http://lmgtfy.com/?q=load+balancing
http://lmgtfy.com/?q=keep+alive

- Chuck
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 10/11/2012 04:52, Asankha C. Perera wrote:
> Hi Chris
>
>> processing 1 connection through completion (there are 99 others still running), re-binding, accepting a single connection into the application plus 100 others into the backlog, then choking again and dropping 100 connections, then processing another single connection. That's a huge waste of time unbinding and re-binding to the port, killing the backlog over and over again... and all for 1-connection-at-a-time pumping. Insanity.
>
> I'm sorry but you've misunderstood what I was saying. Yes, the example I used showed it for one connection, to make it easier to understand what I was proposing. But in reality you would not stop and start at each connection. Remember the two thresholds I was talking about? You could stop listening at 4K connections, and start listening again when the connections drop to say 3K - and these could be user-specified parameters based on the deployment. HTTP keep-alive from a load balancer in front would work extremely well under these conditions as established TCP connections are re-used. Any production grade load balancer could immediately fail over only the failing requests to another Tomcat when one is under too much load - and this would work even for non-idempotent services.

If there's an LB in front, it should be protecting the Tomcat instance from an excessive number of connections, no?

>> You want to add all this extra complexity to the code and, IMO, shitty handling of your incoming connections just so you can say well, you're getting 'connection refused' instead of hanging... isn't that better?. I assert that it is *not* better. Clients can set TCP handshake timeouts and survive. Your server will perform much better without all this foolishness.
>
> If you can, try to understand what I said better.. Its ok to not accept this proposal and/or not understand it..
>
> regards
> asankha
RE: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Asankha

To reiterate what Christopher said, if you close the listening socket because you think you can't service one extra client, you will lose all the connections on the backlog queue, which could be hundreds of clients, that you *can* service. In addition, those clients will then get exactly the behaviour that you are complaining about: a successful connection and then a 'connection reset' when doing I/O.

Note also that you cannot know how long the backlog queue is, nor how many clients are already in it. There is no possibility of this proposal being accepted.

EJP
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Esmond,

On 11/7/12 10:03 PM, Esmond Pitt wrote:
> I haven't said a word about your second program, that closes the listening socket. *Of course* that causes connection refusals, it can't possibly not, but it isn't relevant to the misconceptions about what OP_ACCEPT does that you have been expressing here and that I have been addressing.
>
> Closing the listening socket, as you seem to be now suggesting, is a very poor idea indeed: what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then?

+1

This is the TCP/IP equivalent of a busy-wait:

    while(!ready())
        ; // Check again!

Imagine the quite likely case where high load isn't just a single connection over the high-water mark. Let's say:

    active request processors = 100
    backlog                   = 100

This means that 200 simultaneous connections can get... somewhat well-defined behavior. Everyone else gets weirdness. Let's accept that for the time being.

Let's talk about 1000 simultaneous clients pounding on this service: the 200 lucky winners essentially get connections, all others get weirdness but will likely reconnect a short time later. If you just use the IP stack's backlog, then the queue gets processed by the OS: the Java code is super-simple (just accept() and wait) and incoming connections are essentially buffered by the TCP/IP stack's backlog. Basically, your application serves requests as fast as it and the OS can allow.

Instead, if you unbind and re-bind the port, you not only run the risk of losing your port (which I'll admit is fairly far-fetched, but it certainly could happen), you are potentially dropping 100 connections immediately from the backlog (what kind of experience do *those* clients get?), processing 1 connection through completion (there are 99 others still running), re-binding, accepting a single connection into the application plus 100 others into the backlog, then choking again and dropping 100 connections, then processing another single connection. That's a huge waste of time unbinding and re-binding to the port, killing the backlog over and over again... and all for 1-connection-at-a-time pumping. Insanity.

You want to add all this extra complexity to the code and, IMO, shitty handling of your incoming connections just so you can say well, you're getting 'connection refused' instead of hanging... isn't that better?. I assert that it is *not* better. Clients can set TCP handshake timeouts and survive. Your server will perform much better without all this foolishness. I have yet to see any performance data, but I suspect that throughput would go down substantially if this idea were to be implemented.

-chris
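[Editor's illustration] The "super-simple (just accept() and wait)" design Chris describes looks roughly like this. It is an illustrative sketch, not Tomcat's actual connector code: the application accepts in a loop as fast as it can, and any burst of incoming connections is absorbed by the kernel's backlog rather than by unbinding and re-binding the port.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class SimpleAcceptLoop {

    // Accept and serve up to maxRequests connections, one at a time.
    // While we are busy handling one connection, the OS queues further
    // arrivals on the listen backlog on our behalf.
    static void serve(ServerSocket server, int maxRequests) throws IOException {
        for (int i = 0; i < maxRequests; i++) {
            try (Socket s = server.accept();   // blocks; the kernel buffers the rest
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()));
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                out.println("echo: " + in.readLine());   // trivial request handling
            }
        }
    }

    // Demo helper: serve exactly one connection on a background thread and
    // perform one client round trip against it.
    static String roundTrip(String message) throws Exception {
        ServerSocket server = new ServerSocket(0, 100);  // backlog of 100
        Thread t = new Thread(() -> {
            try {
                serve(server, 1);
            } catch (IOException ignored) {
            }
        });
        t.start();
        try (Socket s = new Socket("127.0.0.1", server.getLocalPort());
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream()))) {
            out.println(message);
            return in.readLine();
        } finally {
            t.join();
            server.close();
        }
    }
}
```

The point of the sketch is what is *absent*: no close/re-open of the listener, no explicit queue management — the backlog does that work for free.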
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris

> processing 1 connection through completion (there are 99 others still running), re-binding, accepting a single connection into the application plus 100 others into the backlog, then choking again and dropping 100 connections, then processing another single connection. That's a huge waste of time unbinding and re-binding to the port, killing the backlog over and over again... and all for 1-connection-at-a-time pumping. Insanity.

I'm sorry but you've misunderstood what I was saying. Yes, the example I used showed it for one connection, to make it easier to understand what I was proposing. But in reality you would not stop and start at each connection. Remember the two thresholds I was talking about? You could stop listening at 4K connections, and start listening again when the connections drop to say 3K - and these could be user-specified parameters based on the deployment. HTTP keep-alive from a load balancer in front would work extremely well under these conditions, as established TCP connections are re-used. Any production grade load balancer could immediately fail over only the failing requests to another Tomcat when one is under too much load - and this would work even for non-idempotent services.

> You want to add all this extra complexity to the code and, IMO, shitty handling of your incoming connections just so you can say well, you're getting 'connection refused' instead of hanging... isn't that better?. I assert that it is *not* better. Clients can set TCP handshake timeouts and survive. Your server will perform much better without all this foolishness.

If you can, try to understand what I said better.. Its ok to not accept this proposal and/or not understand it..

regards
asankha
--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org
http://esbmagic.blogspot.com
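[Editor's illustration] The two-threshold scheme described above is a classic hysteresis gate: stop accepting at a high watermark, and resume only after the count has fallen to a lower one, so acceptance does not flap around a single value. A minimal sketch - the class name and the 4K/3K thresholds are purely illustrative, not anything Tomcat provides:

```java
// Hysteresis gate over a connection count: refuses new work above `high`
// and does not resume until the count has dropped back to `low`.
class AcceptGate {
    private final int high;       // stop accepting at this count (e.g. 4000)
    private final int low;        // resume accepting at this count (e.g. 3000)
    private boolean accepting = true;
    private int connections = 0;

    AcceptGate(int high, int low) {
        this.high = high;
        this.low = low;
    }

    void opened() {               // a connection was established
        connections++;
        if (connections >= high) {
            accepting = false;    // high watermark hit: stop listening
        }
    }

    void closed() {               // a connection finished
        connections--;
        if (connections <= low) {
            accepting = true;     // fell to the low watermark: resume
        }
    }

    boolean accepting() {
        return accepting;
    }
}
```

The gap between the two watermarks is what prevents the start/stop/start flapping the post warns about; the state only flips once per full swing between them.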
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 08/11/2012 03:35, Asankha C. Perera wrote:
> Hi Esmond
>
>> Closing the listening socket, as you seem to be now suggesting, is a very poor idea indeed:
>
> I personally do not think there is anything at all bad about turning it off. After all, if you are not ready to accept more, you should be clear and upfront about it, even at the TCP level. Having different thresholds to stop listening (say at 4K), and to resume (say at 2K) would ensure that you do not start acting weirdly by starting/stopping/starting/.. acceptance around just one value.
>
>> what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then?
>
> In reality I do not know of a single client production deployment that would allocate the same port to possibly conflicting services, that may grab another's port when it is suffering under load.

Just because it wouldn't cause a problem for a limited subset of Tomcat users - your clients - does not mean that it would not cause problems for other Tomcat users.

> I cannot see any other issues with turning off accepting - and I am curious to know if anyone else could share their views on this - considering real production deployments

The problems have already been explained to you. Another process could use the port.

Having reviewed this thread, the problem you seem to be trying to solve is this:
- a load-balancer is in use
- Tomcat is under load
- a client attempts a connection
- the connection is added to the TCP backlog
- Tomcat does not process the connection before it times out
- the connection is reset when it times out
- the client can't differentiate between the above and when an error occurs during processing resulting in a connection reset
- the client doesn't know whether to replay the request or not

First of all, it is extremely rare for Tomcat to reset a connection once processing has started. The only circumstances where I am aware that would happen are if Tomcat is shutting down and a long running request failed to complete, or if Tomcat crashes. All other error cases should receive an appropriate HTTP error code. In a controlled shutdown, load can be moved off the Tomcat node before it is shut down. That leaves differentiating a Tomcat crash during request processing from the request timing out in the backlog.

For GET requests this should be a non-issue, since GET requests are meant to be idempotent. GET requests can always be re-tried after a TCP reset. For POST requests, use of the 100 Continue status can enable the client to determine if the headers have been received. A TCP reset before the 100 Continue response means the request needs to be re-tried. A TCP reset after the 100 Continue response means it is unknown whether a retry is necessary (there is no way for the client to determine the correct answer in this case).

Given the above, I don't see any reason to change Tomcat's current behaviour.

Mark
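[Editor's illustration] The retry rule outlined above can be written down as a small client-side decision table. This is a hypothetical policy sketch - the enum and method names are invented for illustration and are not part of Tomcat or any HTTP client library:

```java
// Where, relative to the interim "HTTP/1.1 100 Continue" response, did the
// TCP reset occur?
enum ResetPoint { BEFORE_100_CONTINUE, AFTER_100_CONTINUE }

class RetryPolicy {

    // Decide whether a request may be safely replayed after a TCP reset,
    // following the rule in the post above:
    //  - GET is idempotent, so it is always safe to retry;
    //  - POST sent with "Expect: 100-continue" is safe to retry only if the
    //    reset arrived before the 100 Continue (the server never saw the
    //    body); after it, the outcome is unknowable, so don't auto-retry.
    static boolean safeToRetry(String method, ResetPoint reset) {
        if (method.equals("GET")) {
            return true;
        }
        if (method.equals("POST")) {
            return reset == ResetPoint.BEFORE_100_CONTINUE;
        }
        return false;   // be conservative for any other method
    }
}
```

Note the conservative default: for the ambiguous case (reset after 100 Continue) no policy can recover the answer, which is exactly why the post concludes there is nothing for Tomcat to change.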
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Mark

>> what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then?
>>
>> In reality I do not know of a single client production deployment that would allocate the same port to possibly conflicting services...
>
> Just because it wouldn't cause a problem for a limited subset of Tomcat users - your clients - does not mean that it would not cause problems for other Tomcat users. The problems have already been explained to you. Another process could use the port.

I would consider such a production deployment a risk - a Tomcat crash, or even a restart, might end up in a soup if another process starts using the port in the meantime..

> Having reviewed this thread the problem you seem to be trying to solve is this:
> - a load-balancer is in use
> - Tomcat is under load
> - a client attempts a connection
> - the connection is added to the TCP backlog
> - Tomcat does not process the connection before it times out
> - the connection is reset when it times out
> - the client can't differentiate between the above and when an error occurs during processing resulting in a connection reset
> - the client doesn't know whether to replay the request or not

Yes, this is correct.

> First of all, it is extremely rare for Tomcat to reset a connection once processing has started. [...] For GET requests this should be a non-issue since GET requests are meant to be idempotent. GET requests can always be re-tried after a TCP reset. For POST requests, use of the 100 Continue status can enable the client to determine if the headers have been received. [...] Given the above I don't see any reason to change Tomcat's current behaviour.

Ok, thank you for considering my proposal. I respect the decision of the Tomcat community. Hopefully someone else will find this thread useful in future, to understand the issue better and to overcome it.

regards
asankha
--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org
http://esbmagic.blogspot.com
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 08/11/2012 15:03, Asankha C. Perera wrote: Hi Mark what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then? In reality I do not know of a single client production deployment that would allocate the same port to possibly conflicting services, that may grab another's port when its suffering under load. Just because it wouldn't cause a problem for a limited subset of Tomcat users - your clients - does not mean that it would not cause problems for other Tomcat users. I cannot see any other issues of turning off accepting - and I am curious to know if anyone else could share their views on this - considering real production deployments The problems have already been explained to you. Another process could use the port. I would consider such production deployment as a risk - a Tomcat crash, or even a restart might end up in a soup if another process starts using the port in the mean time.. It is not uncommon for monitoring tools to attempt to (re)start a service when it is observed not to be listening on its designated port. p Having reviewed this thread the problem you seem to be trying to solve is this: - a load-balancer is in use - Tomcat is under load - a client attempts a connection - the connection is added to the TCP backlog - Tomcat does not process the connection before it times out - the connection is reset when it times out - the client can't differentiate between the above and when an error occurs during processing resulting in a connection reset - the client doesn't know whether to replay the request or not Yes, this is correct First of all, it is extremely rare for Tomcat to reset a connection once processing has started. The only circumstances where I am aware that would happen is if Tomcat is shutting down and a long running request failed to complete or if Tomcat crashes. All other error cases should receive an appropriate HTTP error code. 
In a controlled shut down load can be moved off the Tomcat node before it is shut down. That leaves differentiating a Tomcat crash during request processing and the request timing out in the backlog. For GET requests this should be a non-issue since GET requests are meant to be idempotent. GET requests can always be re-tried after a TCP reset. For POST requests, use of the 100 Continue status can enable the client to determine if the headers have been received. A TCP reset before the 100 continue response means the request needs to be re-tried. A TCP reset after the 100 continue response means it is unknown if a retry is necessary (there is no way for the client to determine the correct answer in this case). Given the above I don't see any reason to change Tomcat's current behaviour. Ok, thank you for considering my proposal. I respect the decision of the Tomcat community. Hopefully someone else will find this thread useful in the future to understand the issue better and to overcome it. regards asankha
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/09/2012 02:16 AM, Pid wrote: On 08/11/2012 15:03, Asankha C. Perera wrote: Hi Mark what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then? In reality I do not know of a single client production deployment that would allocate the same port to possibly conflicting services, that may grab another's port when it's suffering under load. Just because it wouldn't cause a problem for a limited subset of Tomcat users - your clients - does not mean that it would not cause problems for other Tomcat users. I cannot see any other issues of turning off accepting - and I am curious to know if anyone else could share their views on this - considering real production deployments The problems have already been explained to you. Another process could use the port. I would consider such production deployment as a risk - a Tomcat crash, or even a restart might end up in a soup if another process starts using the port in the mean time.. It is not uncommon for monitoring tools to attempt to (re)start a service when it is observed not to be listening on its designated port. But that could happen even now, if the backlog fills and connections are being reset as seen currently. cheers asankha -- Asankha C. Perera AdroitLogic, http://adroitlogic.org http://esbmagic.blogspot.com
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Mark maxThreads limits the number of concurrent threads available for processing requests. connection != concurrent request, primarily because of HTTP keep-alive. maxConnections can be used to limit the number of connections. Thanks for this insight.. I initially missed this when I went through the Tomcat source, but now spent some time trying to understand how it was expected to work. If you set maxConnections to your desired value and repeat your tests you will hopefully see different results. Depending on exactly how the load test is designed, acceptCount may still influence the results. It would be worth experimenting with different values for that as well (I'd suggest 100, 1 and 0). However when I tested with this, the same TCP resets were seen under load. After analyzing the source of the NioEndpoint more closely, I find that it only delays calling serverSock.accept() with Thread.sleep()'s - which is not going to help as shown in my first Java example.

    // Loop until we receive a shutdown command
    while (running) {
        // Loop if endpoint is paused
        while (paused && running) {
            state = AcceptorState.PAUSED;
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                // Ignore
            }
        }
        if (!running) {
            break;
        }
        state = AcceptorState.RUNNING;
        try {
            // if we have reached max connections, wait
            countUpOrAwaitConnection();
            SocketChannel socket = null;
            try {
                // Accept the next incoming connection from the server socket
                socket = serverSock.accept();
            } catch (IOException ioe) {
                // we didn't get a socket
                countDownConnection();

regards asankha -- Asankha C. Perera AdroitLogic, http://adroitlogic.org http://esbmagic.blogspot.com
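The countUpOrAwaitConnection()/countDownConnection() pair in the excerpt above is how the endpoint enforces maxConnections: the acceptor blocks before calling accept() once the limit is reached, leaving further connections in the TCP backlog. A minimal sketch of the same idea using a plain Semaphore (my own simplification; Tomcat actually uses its own LimitLatch class, not a Semaphore):

```java
import java.util.concurrent.Semaphore;

// Simplified stand-in for the connection-counting gate in Tomcat's endpoints.
// The acceptor thread calls countUpOrAwaitConnection() before accept() and
// blocks once maxConnections sockets are open; closing a connection releases
// a slot. Excess connections meanwhile queue in the TCP backlog.
public class ConnectionLimiter {

    private final Semaphore permits;

    public ConnectionLimiter(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    // Called by the acceptor before serverSock.accept(); blocks at the limit.
    public void countUpOrAwaitConnection() {
        permits.acquireUninterruptibly();
    }

    // Called when a connection is closed (or accept() failed).
    public void countDownConnection() {
        permits.release();
    }

    public int availableSlots() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        ConnectionLimiter limiter = new ConnectionLimiter(2);
        limiter.countUpOrAwaitConnection();
        limiter.countUpOrAwaitConnection();
        System.out.println("slots left: " + limiter.availableSlots()); // 0
        limiter.countDownConnection();
        System.out.println("slots left: " + limiter.availableSlots()); // 1
    }
}
```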
RE: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Esmond That wouldn't have any different effect to not calling accept() at all in blocking mode Clearly there is a difference. There isn't a difference. All that deregistering OP_ACCEPT does is prevent the application from calling accept(). It has exactly the same effect as thread-starving the accepting thread in blocking mode. I have written books on Java networking and I do know about this. Your 3-line program allows 1 connection at a time because of the backlog queue, as I have been explaining, and when the backlog queue fills up, as it does when the application doesn't call accept() fast enough, or at all, you get platform-dependent behaviour. There is nothing you can do about this in Java or indeed in C either. A program that created a ServerSocketChannel, didn't register it for OP_ACCEPT, and then called select(), would behave in exactly the same way. EJP
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/08/2012 04:57 AM, Esmond Pitt wrote: That wouldn't have any different effect to not calling accept() at all in blocking mode Clearly there is a difference. There isn't a difference. All that deregistering OP_ACCEPT does is prevent the application from calling accept(). It has exactly the same effect as thread-starving the accepting thread in blocking mode. I hope you actually checked the second program I shared [1], and tried it. What it does is simply not delay accept(), but stop accepting.

    if (key.isAcceptable()) {
        SocketChannel client = server.accept();
        client.configureBlocking(false);
        client.socket().setTcpNoDelay(true);
        client.register(selector, SelectionKey.OP_READ);
        System.out.println("I accepted this one.. but not any more now");
        key.cancel();
        key.channel().close();

When the server is ready to accept more messages, it re-binds to the listening socket and re-registers for OP_ACCEPT.

    server = ServerSocketChannel.open();
    server.socket().bind(new InetSocketAddress(8280), 0);
    server.configureBlocking(false);
    server.register(selector, SelectionKey.OP_ACCEPT);
    System.out.println("\nI am ready to listen for new messages now..");

I have written books on Java networking and I do know about this. Your 3-line program allows 1 connection at a time because of the backlog queue, as I have been explaining, and when the backlog queue fills up, as it does when the application doesn't call accept() fast enough, or at all, you get platform-dependent behaviour. There is nothing you can do about this in Java or indeed in C either. A program that created a ServerSocketChannel, didn't register it for OP_ACCEPT, and then called select(), would behave in exactly the same way. Sorry, I do not know how to explain it any better - I write code, try it.. [1] http://esbmagic.blogspot.com/2012/11/how-to-stop-biting-when-you-cant-chew.html cheers asankha -- Asankha C. Perera AdroitLogic, http://adroitlogic.org http://esbmagic.blogspot.com
RE: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Asankha I haven't said a word about your second program, that closes the listening socket. *Of course* that causes connection refusals, it can't possibly not, but it isn't relevant to the misconceptions about what OP_ACCEPT does that you have been expressing here and that I have been addressing. Closing the listening socket, as you seem to be now suggesting, is a very poor idea indeed: what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then? EJP
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Esmond I haven't said a word about your second program, that closes the listening socket. *Of course* that causes connection refusals, it can't possibly not, but it isn't relevant to the misconceptions about what OP_ACCEPT does that you have been expressing here and that I have been addressing. I was learning things while discussing this issue over the Tomcat list. I started out asking the Tomcat community why I saw the hard RST behavior, and then started looking at the source of Tomcat, and then referenced the HttpComponents project - where at first I believed it was turning off interest in OP_ACCEPT - an assumption I was wrong about - since I had looked up only the discussion threads of HttpComponents and not the source. Then I wrote the second program to illustrate how HttpComponents handled it after looking at its source code, and to answer the question posed by Chris on how it was done in HttpComponents. Since then I was basing my discussion around that second program, but I believe you were addressing issues from earlier - I apologize. Closing the listening socket, as you seem to be now suggesting, is a very poor idea indeed: I personally do not think there is anything at all bad about turning it off. After all, if you are not ready to accept more, you should be clear and upfront about it, even at the TCP level. Having different thresholds to stop listening (say at 4K), and to resume (say at 2K) would ensure that acceptance does not flap on and off around just one value. what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then? In reality I do not know of a single client production deployment that would allocate the same port to possibly conflicting services, that may grab another's port when it's suffering under load.
I cannot see any other issues of turning off accepting - and I am curious to know if anyone else could share their views on this - considering real production deployments regards asankha -- Asankha C. Perera AdroitLogic, http://adroitlogic.org http://esbmagic.blogspot.com
RE: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
That wouldn't have any different effect to not calling accept() at all in blocking mode, or to thread starvation such that the accept thread didn't get a run. It wouldn't make any difference to whether the client got a connection refused/reset. The backlog queue would still fill up in exactly the same way and the platform would then react to new incoming connections in exactly the same platform-dependent way. What's done by HttpComponents is essentially turn off interest in SelectionKey.OP_ACCEPT [1] if I remember [2].
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Esmond That wouldn't have any different effect to not calling accept() at all in blocking mode Clearly there is a difference. Please see the samples in [1] [2] and execute them to see this. The TestAccept1 below allows one to open more than one connection at a time, even when only one accept() call is made as has been explained in [1]

    import java.net.ServerSocket;
    import java.net.Socket;

    public class TestAccept1 {
        public static void main(String[] args) throws Exception {
            ServerSocket serverSocket = new ServerSocket(8280, 0);
            Socket socket = serverSocket.accept();
            Thread.sleep(300); // do nothing
        }
    }

[1] http://esbmagic.blogspot.com/2012/10/does-tomcat-bite-more-than-it-can-chew.html [2] http://esbmagic.blogspot.com/2012/11/how-to-stop-biting-when-you-cant-chew.html regards asankha -- Asankha C. Perera AdroitLogic, http://adroitlogic.org http://esbmagic.blogspot.com
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris My expectation from the backlog is: 1. Connections that can be handled directly will be accepted and work will begin 2. Connections that cannot be handled will accumulate in the backlog 3. Connections that exceed the backlog will get connection refused There are caveats, I would imagine. For instance, do the connections in the backlog have any kind of server-side timeouts associated with them -- that is, will they ever get discarded from the queue without ever being handled by the bound process (assuming the bound process doesn't terminate or anything weird like that)? Do the clients have any timeouts associated with them? Does the above *not* happen? On which platform? Is this only with NIO? I am not a Linux level TCP expert, but what I believe is that the TCP layer has its timeouts and older connection requests will get discarded from the queue etc. Typically a client will have a TCP level timeout as well, i.e. the time it will wait for the other party to accept its SYN packet. My testing has been primarily on Linux / Ubuntu. Leaving everything to the TCP backlog makes the end clients see nasty RSTs when Tomcat is under load instead of connection refused - and could prevent the client from performing a clean fail-over when one Tomcat node is overloaded. So you are eliminating the backlog entirely? Or are you allowing the backlog to work as expected? Does closing and re-opening the socket clear the existing backlog (which would cancel a number of waiting though not technically accepted connections, I think), or does it retain the backlog? Since you are re-binding, I would imagine that the backlog gets flushed every time there is a pause. I am not sure how the backlog would work under different operating systems and conditions etc. However, the code I've shared shows how a pure Java program could take better control of the underlying TCP behavior - as visible to its clients.
What about performance effects of maintaining a connector-wide counter of active connections, plus pausing and resuming the channel -- plus re-connects by clients that have been dropped from the backlog? What the UltraESB does by default is to stop accepting new connections after a threshold is reached (e.g. 4096) and remain paused until the active connections drop back to another threshold (e.g. 3073). Each of these parameters is user configurable and depends on the maximum number of connections each node is expected to handle. Maintaining connector-wide counts in my experience does not cause any performance effects, and neither do re-connects by clients - as what's expected in reality is for a hardware load balancer to forward requests that are refused by one node, to another node, which hopefully is not loaded. Such a fail-over can take place immediately, cleanly and without any cause of confusion even if the backend service is not idempotent. This is clearly not the case when a TCP/HTTP connection is accepted and then met with a hard RST after a part or a full request has been sent to it. I'm concerned that all of your bench tests appear to be done using telnet with a single acceptable connection. What if you allow 1000 simultaneous connections and test it under some real load so we can see how such a solution would behave? Clearly the example I shared was just to illustrate this with a pure Java program. We usually conduct performance tests over half a dozen open source ESBs with concurrency levels of 20, 40, 80, 160, 320, 640, 1280 and 2560 and payload sizes of 0.5, 1, 5, 10 and 100K bytes. You can see some of the scenarios here http://esbperformance.org. We privately conduct performance tests beyond 2560 to much higher levels. We used an HttpComponents-based EchoService as our backend service all this time, and it behaved very well with all load levels.
However some weeks back we accepted a contribution which was an async servlet to be deployed on Tomcat as it was considered more real world. The issues I noticed appeared when running high load levels over this servlet deployed on Tomcat, especially when the response was being delayed to simulate realistic behavior. Although we do not use Tomcat ourselves, our customers do. I am also not calling this a bug - but an area for possible improvement. If the Tomcat users, developers and the PMC think this is worthwhile to pursue, I believe it would be a good enhancement - maybe even a good GSoC project. As a fellow member of the ASF and a committer on multiple projects/years, I believed it was my duty to bring this to the attention of the Tomcat community. regards asankha -- Asankha C. Perera AdroitLogic, http://adroitlogic.org http://esbmagic.blogspot.com
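The dual-threshold scheme described above (pause accepting at a high-water mark such as 4096, resume only after draining to a low-water mark such as 3073) can be sketched as a small state machine. This is my own illustration of the idea, not UltraESB code:

```java
// Illustration of the dual-threshold pause/resume idea described above:
// stop accepting once active connections reach a high-water mark and resume
// only after they fall to a low-water mark, so acceptance does not flap
// on and off around a single value (classic hysteresis).
public class AcceptGate {

    private final int pauseAt;   // e.g. 4096 in the example above
    private final int resumeAt;  // e.g. 3073 in the example above
    private boolean paused = false;

    public AcceptGate(int pauseAt, int resumeAt) {
        this.pauseAt = pauseAt;
        this.resumeAt = resumeAt;
    }

    // Called with the current connection count; returns whether the
    // server should currently be accepting new connections.
    public synchronized boolean shouldAccept(int activeConnections) {
        if (!paused && activeConnections >= pauseAt) {
            paused = true;   // high-water mark reached: stop listening
        } else if (paused && activeConnections <= resumeAt) {
            paused = false;  // drained to low-water mark: resume
        }
        return !paused;
    }

    public static void main(String[] args) {
        AcceptGate gate = new AcceptGate(4096, 3073);
        System.out.println(gate.shouldAccept(100));  // true: well under limit
        System.out.println(gate.shouldAccept(4096)); // false: pause triggered
        System.out.println(gate.shouldAccept(3500)); // false: still draining
        System.out.println(gate.shouldAccept(3073)); // true: resumed
    }
}
```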
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Asankha C. Perera asan...@apache.org wrote: My testing has been primarily on Linux / Ubuntu. With which version of Tomcat? Mark
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Mark Thomas ma...@apache.org wrote: Asankha C. Perera asan...@apache.org wrote: My testing has been primarily on Linux / Ubuntu. With which version of Tomcat? And which connector implementation? Mark
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/07/2012 11:55 AM, Mark Thomas wrote: Mark Thomas ma...@apache.org wrote: Asankha C. Perera asan...@apache.org wrote: My testing has been primarily on Linux / Ubuntu. With which version of Tomcat? And which connector implementation? Tomcat 7.0.29 and possibly 7.0.32 too, but I believe it's common to all versions Connector config was already shared http://tomcat.10.n6.nabble.com/Handling-requests-when-under-load-ACCEPT-and-RST-vs-non-ACCEPT-tt4988693.html#a4988712 asankha -- Asankha C. Perera AdroitLogic, http://adroitlogic.org http://esbmagic.blogspot.com
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 07/11/2012 08:13, Asankha C. Perera wrote: On 11/07/2012 11:55 AM, Mark Thomas wrote: Mark Thomas ma...@apache.org wrote: Asankha C. Perera asan...@apache.org wrote: My testing has been primarily on Linux / Ubuntu. With which version of Tomcat? And which connector implementation? Tomcat 7.0.29 and possibly 7.0.32 too, but I believe it's common to all versions Connector config was already shared http://tomcat.10.n6.nabble.com/Handling-requests-when-under-load-ACCEPT-and-RST-vs-non-ACCEPT-tt4988693.html#a4988712 It appears you are trying to use maxThreads to limit the number of connections. That won't work. maxThreads limits the number of concurrent threads available for processing requests. connection != concurrent request, primarily because of HTTP keep-alive. maxConnections can be used to limit the number of connections. If you set maxConnections to your desired value and repeat your tests you will hopefully see different results. Depending on exactly how the load test is designed, acceptCount may still influence the results. It would be worth experimenting with different values for that as well (I'd suggest 100, 1 and 0). Mark
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris / Mark Or you could just read the configuration documentation for the connector. Hint: acceptCount - and it has been there since at least Tomcat 4. The acceptCount WAS being used, but was not being honored as an end user would expect in reality (See the configurations I've shared at the start) If HttpComponents works as the OP expects, I wonder if he'd be willing to give us the configuration he uses for *that*? Perhaps there is some kind of TCP option that HttpComponents is using that Tomcat does not. What's done by HttpComponents is essentially turn off interest in SelectionKey.OP_ACCEPT [1] if I remember [2]. Check the code of the DefaultListeningIOReactor.pause() and resume() [3] regards asankha [1] http://docs.oracle.com/javase/6/docs/api/java/nio/channels/SelectionKey.html#OP_ACCEPT [2] http://old.nabble.com/Controlling-%22acceptance%22-of-connections-tt27431279r4.html [3] http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/apidocs/org/apache/http/impl/nio/reactor/DefaultListeningIOReactor.html -- Asankha C. Perera AdroitLogic, http://adroitlogic.org http://esbmagic.blogspot.com
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Asankha, On 11/5/12 8:36 AM, Asankha C. Perera wrote: Hi Chris / Mark Or you could just read the configuration documentation for the connector. Hint: acceptCount - and it has been there since at least Tomcat 4. The acceptCount WAS being used, but was not being honored as an end user would expect in reality (See the configurations I've shared at the start) If HttpComponents works as the OP expects, I wonder if he'd be willing to give us the configuration he uses for *that*? Perhaps there is some kind of TCP option that HttpComponents is using that Tomcat does not. What's done by HttpComponents is essentially turn off interest in SelectionKey.OP_ACCEPT [1] if I remember [2]. Check the code of the DefaultListeningIOReactor.pause() and resume() [3] So I looked at all your references (including the incorrect reference to Javadoc instead of Java code) and not surprisingly the most informative was the http-components mailing list thread [1]. I have done some digging. First, evidently, acceptCount almost does not appear in the Tomcat source. Its real name is backlog if you want to do some searching. It's been in there forever. 
Second, all three connectors (APR, JIO, NIO) (through their appropriate Endpoint implementation classes) faithfully configure the backlog for their various sockets:

AprEndpoint:

    // Bind the server socket
    int ret = Socket.bind(serverSock, inetAddress);
    if (ret != 0) {
        throw new Exception(sm.getString("endpoint.init.bind",
                "" + ret, Error.strerror(ret)));
    }
    // Start listening on the server socket
    ret = Socket.listen(serverSock, getBacklog());

JioEndpoint:

    if (getAddress() == null) {
        serverSocket = serverSocketFactory.createSocket(getPort(),
                getBacklog());
    } else {
        serverSocket = serverSocketFactory.createSocket(getPort(),
                getBacklog(), getAddress());
    }

(Note: serverSocketFactory.createSocket calls new ServerSocket(port, backlog[, address]))

NioEndpoint:

    serverSock = ServerSocketChannel.open();
    socketProperties.setProperties(serverSock.socket());
    InetSocketAddress addr = (getAddress() != null ?
            new InetSocketAddress(getAddress(), getPort()) :
            new InetSocketAddress(getPort()));
    serverSock.socket().bind(addr, getBacklog());

So, barring some JVM bug, the backlog is being set as appropriately as possible. Third is the notion of playing with OP_ACCEPT on a selector. I'm no NIO expert, here, but I don't understand why adding OP_ACCEPT to the SelectionKey would change anything, here: the socket handles the backlog, and the behavior of the selector shouldn't affect the OS's TCP backlog. Doing so would be incredibly foolish: forcing the application to react to all incoming connections before they went into the backlog queue would essentially obviate the need for the backlog queue in the first place. If you can suggest something specific, here, I'd certainly be interested in what your suggestion is. So far, what I'm hearing is that it works with HttpComponents but I have yet to hear what it is. Are you saying that, basically, NIO sockets simply do not have a backlog, and we have to fake it using some other mechanism? 
-chris [1] http://old.nabble.com/Controlling-%22acceptance%22-of-connections-tt27431279r4.html
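The two bind styles quoted in the message above (JIO passing the backlog to the ServerSocket constructor, NIO passing it to bind()) can be reduced to a self-contained snippet. This is my own minimal sketch, not Tomcat code; it binds to port 0 (an ephemeral port) so it can run anywhere, with the backlog argument playing the role of Tomcat's acceptCount:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.channels.ServerSocketChannel;

// Minimal, self-contained version of the two bind styles quoted above.
public class BacklogBindDemo {

    public static boolean[] bindBoth(int backlog) {
        try {
            // JIO style: backlog goes straight to the ServerSocket constructor
            ServerSocket blocking = new ServerSocket(0, backlog);
            boolean blockingBound = blocking.isBound();
            blocking.close();

            // NIO style: backlog is passed to bind() on the underlying socket
            ServerSocketChannel nio = ServerSocketChannel.open();
            nio.socket().bind(new InetSocketAddress(0), backlog);
            boolean nioBound = nio.socket().isBound();
            nio.close();

            return new boolean[] { blockingBound, nioBound };
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = bindBoth(100);
        System.out.println("blocking bound: " + r[0] + ", nio bound: " + r[1]);
    }
}
```

In both cases the backlog is only a hint to the OS, which is free to adjust it, as noted elsewhere in the thread.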
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris First, evidently, acceptCount almost does not appear in the Tomcat source. It's real name is backlog if you want to do some searching. It's been in there forever. Yes, I found it too; but saw that it didn't perform what an 'end user' would expect from Tomcat. Second, all three connectors (APR, JIO, NIO) (through their appropriate Endpoint implementation classes) faithfully configure the backlog for their various sockets: ... So, barring some JVM bug, the backlog is being set as appropriately as possible. Although the backlog is set, you cannot depend on it alone to make Tomcat behave more gracefully when under too much load. As explained in my previous blog post, this is not because of a defect of Tomcat - but the way things work in reality, causing TCP and HTTP connections to be established, requests to be [partially] sent and subsequently face hard TCP resets. Third is the notion of playing with OP_ACCEPT on a selector. I'm no NIO expert, here, but I don't understand why adding OP_ACCEPT to the SelectionKey would change anything, here: the socket handles the backlog, and the behavior of the selector shouldn't affect the OS's TCP backlog. Doing so would be incredibly foolish: forcing the application to react to all incoming connections before they went into the backlog queue would essentially obviate the need for the backlog queue in the first place. If you can suggest something specific, here, I'd certainly be interested in what your suggestion is. So far, what I'm hearing is that it works with HttpComponents but I have yet to hear what it is. Are you saying that, basically, NIO sockets simply do not have a backlog, and we have to fake it using some other mechanism? Sure, I've written a pure Java example [1] that illustrates what I am proposing. It illustrates how you could turn off accepting new connections, and resume normal operations once load levels return to normal. 
[1] http://esbmagic.blogspot.com/2012/11/how-to-stop-biting-when-you-cant-chew.html regards asankha -- Asankha C. Perera AdroitLogic, http://adroitlogic.org http://esbmagic.blogspot.com
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Esmond You are correct. As I recently found out Tomcat and Java are not causing this explicitly, as I first thought. So there is no 'bug' to be fixed. But I believe there is an elegant way to refuse further connections when under load by turning off just the 'accepting' of new connections, and causing the client to see a 'connection refused' instead of allowing new connections, accepting requests and then resetting connections with a 'connection reset', preventing the client from a clean failover for non-idempotent requests. The Apache HttpComponents/NIO library already supports this, so it's something that Tomcat too can support if the community thinks it would be useful. cheers asankha -- Asankha C. Perera AdroitLogic, http://adroitlogic.org http://esbmagic.blogspot.com
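The proposal above - make an overloaded server produce a clean 'connection refused' by not listening at all, then resume later - can be demonstrated end-to-end in a few lines. This is my own self-contained illustration (not HttpComponents or Tomcat code), and note that it also exhibits the close-then-rebind window that the rest of the thread objects to:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.channels.ServerSocketChannel;

// Illustration of the pause/resume-by-rebinding idea: while "paused" the
// listening socket is closed, so clients get a clean connection refused;
// on "resume" the same port is re-bound. The gap between close() and the
// re-bind is exactly where another process could grab the port.
public class PauseResumeDemo {

    public static boolean[] run() {
        try {
            ServerSocketChannel server = ServerSocketChannel.open();
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0));
            int port = server.socket().getLocalPort();

            server.close(); // "pause": stop listening entirely

            boolean refusedWhilePaused = false;
            try (Socket s = new Socket("127.0.0.1", port)) {
                // connected unexpectedly: nothing should be listening
            } catch (IOException expected) {
                refusedWhilePaused = true; // client saw connection refused
            }

            // "resume": re-bind the same port (racy in real deployments!)
            ServerSocketChannel resumed = ServerSocketChannel.open();
            resumed.socket().setReuseAddress(true);
            resumed.socket().bind(new InetSocketAddress("127.0.0.1", port));

            boolean connectedAfterResume;
            try (Socket s = new Socket("127.0.0.1", port)) {
                connectedAfterResume = s.isConnected();
            }
            resumed.close();

            return new boolean[] { refusedWhilePaused, connectedAfterResume };
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("refused while paused: " + r[0]
                + ", connected after resume: " + r[1]);
    }
}
```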
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 02/11/2012 19:13, Asankha C. Perera wrote: Hi Esmond You are correct. As I recently found out Tomcat and Java is not causing this explicitly, as I first thought. So there is no 'bug' to be fixed. But I believe there is an elegant way to refuse further connections when under load by turning off just the 'accepting' of new connections, and causing the client to see a 'connection refused' instead of allowing new connections, accepting requests and then resetting connections with a 'connection reset', preventing the client from a clean failover for non-idempotent requests. Apache HttpComponents/NIO library already supports this, so its something that Tomcat too can support if the community thinks it would be useful. Or you could just read the configuration documentation for the connector. Hint: acceptCount - and it has been there since at least Tomcat 4. Mark
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Mark, On 11/2/12 3:51 PM, Mark Thomas wrote: On 02/11/2012 19:13, Asankha C. Perera wrote: Hi Esmond You are correct. As I recently found out Tomcat and Java is not causing this explicitly, as I first thought. So there is no 'bug' to be fixed. But I believe there is an elegant way to refuse further connections when under load by turning off just the 'accepting' of new connections, and causing the client to see a 'connection refused' instead of allowing new connections, accepting requests and then resetting connections with a 'connection reset', preventing the client from a clean failover for non-idempotent requests. Apache HttpComponents/NIO library already supports this, so its something that Tomcat too can support if the community thinks it would be useful. Or you could just read the configuration documentation for the connector. Hint: acceptCount - and it has been there since at least Tomcat 4. That's kind of what I was thinking, but getting information from the OP was like pulling teeth. I just gave up. Note that his configuration *does* include an acceptCount which is being changed from 1 to 1000. I think the problem is that the OS might be a little fuzzy with that value. If HttpComponents works as the OP expects, I wonder if he'd be willing to give us the configuration he uses for *that*? Perhaps there is some kind of TCP option that HttpComponents is using that Tomcat does not. -chris
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris

> Also, are you using a load balancer, or connecting directly to the EC2 instance? Do you have a public, static IP? If you use a static IP, Amazon proxies your connections. I'm not sure what happens if you use a non-static IP (which are public, but can change).

I was connecting locally to the same node over the local interface, both on EC2 and on my own machine. Since this had gone unresolved for some time, I investigated it a bit myself, first looking at the Coyote source code and then experimenting with plain Java sockets. It seems the issue is not really Tomcat resetting connections by itself, but rather Tomcat letting the underlying OS do it. It also seems it could be a bit difficult to prevent this with blocking sockets, but I hope what I've found investigating this issue will help others in future:

http://esbmagic.blogspot.com/2012/10/does-tomcat-bite-more-than-it-can-chew.html

regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org
http://esbmagic.blogspot.com
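[Editor's note: the "plain Java sockets" experiment described above can be sketched roughly as follows. The actual code is in the linked blog post and is not reproduced here; the class and method names below are illustrative, and what the failing connects look like depends on the platform, as discussed later in this thread.]

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BacklogProbe {

    /**
     * Binds a listener with a deliberately tiny backlog and never calls
     * accept(), then fires a burst of connect() attempts at it. Returns how
     * many attempts failed, illustrating that once the backlog queue is
     * full it is the kernel - not the Java code - that decides what the
     * client sees.
     */
    public static int probe(int attempts, int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1)) { // port 0 = ephemeral, backlog hint 1
            int failed = 0;
            for (int i = 0; i < attempts; i++) {
                try (Socket client = new Socket()) {
                    // On Linux, excess SYNs are silently dropped once the
                    // accept queue is full, so this connect() blocks until
                    // the timeout fires; on Windows the kernel sends an RST
                    // and connect() fails immediately instead.
                    client.connect(
                        new InetSocketAddress("127.0.0.1", server.getLocalPort()),
                        timeoutMillis);
                } catch (IOException e) {
                    failed++;
                }
            }
            return failed;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(probe(10, 500) + " of 10 connects failed");
    }
}
```

The first couple of connects complete (the handshake is finished by the kernel and the connections sit in the accept queue); the rest fail even though the Java code never resets anything itself.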
RE: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Asankha

What you are looking at is platform-dependent TCP behaviour. There is a 'backlog' queue of inbound connections that have been completed by the TCP stack but not yet accepted by the application via the accept() API. This is the queue whose length is specified by the backlog argument to the C listen() function (although the platform is free to adjust it either up or down, and generally does so). When the backlog queue fills, the behaviour for subsequent incoming connections is platform-dependent:

(a) Windows sends an RST;
(b) other platforms ignore the incoming connection, in the hope that the backlog will clear and the client will try again.

An RST sent because the backlog queue is full is indistinguishable from an RST sent because there is nothing listening at that port, so in either case the client should see 'connection refused' or possibly 'connection reset by peer'. Failure to reply at all is indistinguishable from a lost packet, so TCP should retry a few times before timing out and reporting a 'connection timeout'.

Whether Windows is correct in sending an RST is debatable, but it has been doing it for decades and it certainly isn't going to change. Tomcat, and indeed Java, have nothing to do with this behaviour, and expecting either to be modified to 'fix' it would be like keeping a dog and barking yourself.

EJP
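[Editor's note: on the Java side, the backlog EJP describes surfaces as the second argument to the java.net.ServerSocket constructor, which Tomcat-era connectors populate from the acceptCount setting. A minimal sketch showing that the value is only a hint to the OS:]

```java
import java.io.IOException;
import java.net.ServerSocket;

public class BacklogHint {
    public static void main(String[] args) throws IOException {
        // The second constructor argument becomes the backlog passed to the
        // underlying listen() call. The OS treats it purely as a hint:
        // Linux caps it at net.core.somaxconn, and other platforms round it
        // up or down as they see fit - which is why a configured
        // acceptCount of 1000 may not yield a queue of exactly 1000
        // pending connections.
        ServerSocket server = new ServerSocket(0, 1000); // port 0 = ephemeral
        System.out.println("listening on port " + server.getLocalPort());
        server.close();
    }
}
```

This is consistent with the earlier observation that "the OS might be a little fuzzy with that value": the JVM hands the number to the kernel and has no further say in it.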
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Asankha,

On 10/29/12 11:56 PM, Asankha C. Perera wrote:
> Hi Chris
>
>> Sorry, also what is your OS (be as specific as possible) and what JVM are you running on?
>
> Locally for the Wireshark capture I ran this on:
>
> asankha@asankha-dm4:~$ uname -a
> Linux asankha-dm4 3.2.0-31-generic #50-Ubuntu SMP Fri Sep 7 16:16:45 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
> asankha@asankha-dm4:~$ cat /etc/lsb-release
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=12.04
> DISTRIB_CODENAME=precise
> DISTRIB_DESCRIPTION=Ubuntu 12.04.1 LTS
> asankha@asankha-dm4:~$ java -version
> java version 1.6.0_33
> Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
> Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
>
> On EC2 nodes (c1.xlarge) I saw this with Ubuntu 10.10, with the same JDK on x64 platforms - but I believe this issue applies on any OS. I'm interested to know if Tomcat can refuse to accept a connection when overloaded - without accepting and then closing the ones it cannot handle.

Also, are you using a load balancer, or connecting directly to the EC2 instance? Do you have a public, static IP? If you use a static IP, Amazon proxies your connections. I'm not sure what happens if you use a non-static IP (which are public, but can change).

-chris
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Asankha,

On 10/29/12 9:20 AM, Asankha C. Perera wrote:
> During some performance testing I've seen that Tomcat resets accepted TCP connections when under load. I had seen this previously too [1], but was not able to analyze the scenario in detail earlier.

Please post your Connector configuration and let us know if you are using APR/native.

Thanks,
-chris
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris

<Connector port="9000" protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionTimeout="2" redirectPort="8443" maxKeepAliveRequests="1"
           processorCache="1" acceptCount="1" maxThreads="1"/>

I used the above on my notebook to reproduce the issue easily and get a clear Wireshark dump, but the configuration below also caused the same issue with a real load test on a larger EC2 node:

<Connector port="9000" protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionTimeout="2" redirectPort="8443" maxKeepAliveRequests="1"
           processorCache="2560" acceptCount="1000" maxThreads="300"/>

thanks
asankha

On 10/29/2012 09:43 PM, Christopher Schultz wrote:
> Please post your Connector configuration and let us know if you are using APR/native.

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org
http://esbmagic.blogspot.com
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Asankha,

On 10/29/12 12:29 PM, Asankha C. Perera wrote:
> Hi Chris
>
> <Connector port="9000" protocol="org.apache.coyote.http11.Http11NioProtocol"
>            connectionTimeout="2" redirectPort="8443" maxKeepAliveRequests="1"
>            processorCache="1" acceptCount="1" maxThreads="1"/>
>
> I used the above on my notebook to reproduce the issue easily and get a clear Wireshark dump, but the configuration below also caused the same issue with a real load test on a larger EC2 node:
>
> <Connector port="9000" protocol="org.apache.coyote.http11.Http11NioProtocol"
>            connectionTimeout="2" redirectPort="8443" maxKeepAliveRequests="1"
>            processorCache="2560" acceptCount="1000" maxThreads="300"/>

Sorry, also: what is your OS (be as specific as possible) and what JVM are you running on?

-chris
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris

> Sorry, also what is your OS (be as specific as possible) and what JVM are you running on?

Locally, for the Wireshark capture, I ran this on:

asankha@asankha-dm4:~$ uname -a
Linux asankha-dm4 3.2.0-31-generic #50-Ubuntu SMP Fri Sep 7 16:16:45 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
asankha@asankha-dm4:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.04
DISTRIB_CODENAME=precise
DISTRIB_DESCRIPTION=Ubuntu 12.04.1 LTS
asankha@asankha-dm4:~$ java -version
java version 1.6.0_33
Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)

On EC2 nodes (c1.xlarge) I saw this with Ubuntu 10.10, with the same JDK on x64 platforms - but I believe this issue applies on any OS. I'm interested to know if Tomcat can refuse to accept a connection when overloaded - without accepting and then closing the ones it cannot handle.

regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org
http://esbmagic.blogspot.com