Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/14/12 11:02 AM, chris derham wrote:
> My simple thought was that it sounds like your code isn't working. You have more load than one tomcat instance can handle, which overloads that instance. You are trying to write code to handle this situation, and seem convinced that the only solution is to alter tomcat such that you can detect/handle this occurrence in a way that is easier for your software.

I think this is an accurate summary of the proposal. Honestly, it does make *some* sense, because the lb's job is to determine what is going on with the backend servers and distribute load. If one backend server is unhealthy, the lb needs to know about it.

Thank you everyone for taking the time to comment on this thread and share your thoughts. I now realize that it would be better to avoid this situation from our end, as suggested, than to try to detect and overcome it.

regards
asankha
--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org
http://esbmagic.blogspot.com

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Esmond

> To reiterate what Christopher said, if you close the listening socket because you think you can't service one extra client, you will lose all the connections on the backlog queue, which could be hundreds of clients, that you *can* service.

I do not see a problem here. We develop software that routes millions of requests to dozens of Tomcat instances. For a single instance of Tomcat the problem is much simpler: it should just handle the maximum number of connections it is configured for. If each single Tomcat instance behaves slightly better (i.e. refuses instead of resetting), we can avoid making any guesses about whether Tomcat has crashed etc. when a TCP-level reset is received instead of a refusal, and can provide much better failover support independent of whether a service or method is idempotent.

This context is different from clients making, say, JSP calls for a UI. The end clients connecting to us may use HTTP or HTTPS, with or without keep-alives etc. We will handle those connections, and then route, load balance and fail over against dozens of Tomcats. Each connection we establish to a Tomcat will almost always be a well-kept keep-alive connection, which is re-used even for different requests originating from multiple external clients. So if we are managing say 10K connections with our clients - maybe 4K with keep-alive and 6K without - we will still use a limited number of keep-alive connections to each single Tomcat we load balance and fail over against.
Yes, we can and do support connection throttling, at a slight cost, to safeguard a single Tomcat from receiving more connections than it can handle. But if Tomcat were able to not reset connections at the TCP level, we could perform our task much better, and I do not think this would cause any problem for other use cases of Tomcat - if we could just enable this behavior with a configuration parameter.

> In addition, those clients will then get exactly the behaviour that you are complaining about: a successful connection and then a 'connection reset' when doing I/O.

No, they will not get an ACK to the SYN packet, which is much better. Otherwise they will get an ACK, and Tomcat will start receiving part or all of the payload - and then reset, which is nasty. This is the main difference.

> There is no possibility of this proposal being accepted.

I do not understand the negativity here.. I was wondering if I should take this discussion to the dev@ list, since I've already discussed it on user@. I wish Tomcat had a Wiki or used JIRA where I could submit this as a proposal - maybe with a diagram/screenshots etc. - and let end users vote on it across a few months, until we find whether this solution has value.

regards
asankha
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/12/2012 10:47 PM, Terence M. Bandoian wrote:
> On 11/9/2012 1:41 PM, Christopher Schultz wrote:
>> Closing the listening socket, as you seem to be now suggesting, is a very poor idea indeed: what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then?
> I haven't been following this thread closely enough to comment on the proposed solution but isn't preventing unintended usage of a port a systems administration problem? What happens when Tomcat is restarted?

Exactly.. also if a node is to restart, say on a cloud infrastructure after a crash etc., and two processes are to fight for the same port, it would be quite unpredictable.. I wonder if there are any real users who let this happen in a production environment as described on this list.

regards
asankha
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris

> processing 1 connection through completion (there are 99 others still running), re-binding, accepting a single connection into the application plus 100 others into the backlog, then choking again and dropping 100 connections, then processing another single connection. That's a huge waste of time unbinding and re-binding to the port, killing the backlog over and over again... and all for 1-connection-at-a-time pumping. Insanity.

I'm sorry, but you've misunderstood what I was saying. Yes, the example I used showed this for one connection, to make it easier to understand what I was proposing. But in reality you would not stop and start at each connection. Remember the two thresholds I was talking about? You could stop listening at 4K connections, and start listening again when the connection count drops to say 3K - and these could be user-specified parameters based on the deployment. HTTP keep-alive from a load balancer in front would work extremely well under these conditions, as established TCP connections are re-used. Any production-grade load balancer could immediately fail over only the failing requests to another Tomcat when one is under too much load - and this would work even for non-idempotent services.

> You want to add all this extra complexity to the code and, IMO, shitty handling of your incoming connections just so you can say "well, you're getting 'connection refused' instead of hanging... isn't that better?". I assert that it is *not* better. Clients can set TCP handshake timeouts and survive. Your server will perform much better without all this foolishness.

If you can, try to understand what I said better.. It's ok to not accept this proposal and/or not understand it..

regards
asankha
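[Editor's note] The two-threshold (high/low watermark) pause/resume idea described above is essentially hysteresis logic. A minimal illustrative sketch follows; the class name `AcceptGate` and the watermark values are invented for this example, and this is not UltraESB or Tomcat code.

```java
// Sketch of the two-threshold idea: stop accepting new connections at a
// high watermark and resume only after the active connection count falls
// back to a lower watermark, so the acceptor does not flap around a
// single limit value.
public class AcceptGate {
    private final int highWatermark; // stop accepting at this count
    private final int lowWatermark;  // resume accepting at this count
    private boolean accepting = true;

    public AcceptGate(int highWatermark, int lowWatermark) {
        this.highWatermark = highWatermark;
        this.lowWatermark = lowWatermark;
    }

    /** Update the gate with the current active connection count. */
    public synchronized void onCount(int activeConnections) {
        if (accepting && activeConnections >= highWatermark) {
            accepting = false;  // e.g. pause the acceptor / close the listener
        } else if (!accepting && activeConnections <= lowWatermark) {
            accepting = true;   // e.g. resume the acceptor / re-bind
        }
    }

    public synchronized boolean isAccepting() {
        return accepting;
    }

    public static void main(String[] args) {
        AcceptGate gate = new AcceptGate(4000, 3000);
        gate.onCount(4000);
        System.out.println(gate.isAccepting()); // false: paused at high watermark
        gate.onCount(3500);
        System.out.println(gate.isAccepting()); // false: still above low watermark
        gate.onCount(3000);
        System.out.println(gate.isAccepting()); // true: resumed
    }
}
```

With 4000/3000 watermarks, the gate stays paused while the count is between the two thresholds, which is what prevents the start/stop/start flapping discussed in the thread.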
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Mark

>>> what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then?
>> In reality I do not know of a single client production deployment that would allocate the same port to possibly conflicting services, that may grab another's port when it's suffering under load.
> Just because it wouldn't cause a problem for a limited subset of Tomcat users - your clients - does not mean that it would not cause problems for other Tomcat users.

>> I cannot see any other issues of turning off accepting - and I am curious to know if anyone else could share their views on this - considering real production deployments
> The problems have already been explained to you. Another process could use the port.

I would consider such a production deployment a risk - a Tomcat crash, or even a restart, might end up in a soup if another process starts using the port in the meantime..

> Having reviewed this thread the problem you seem to be trying to solve is this:
> - a load-balancer is in use
> - Tomcat is under load
> - a client attempts a connection
> - the connection is added to the TCP backlog
> - Tomcat does not process the connection before it times out
> - the connection is reset when it times out
> - the client can't differentiate between the above and when an error occurs during processing resulting in a connection reset
> - the client doesn't know whether to replay the request or not

Yes, this is correct.

> First of all, it is extremely rare for Tomcat to reset a connection once processing has started. The only circumstances where I am aware that would happen is if Tomcat is shutting down and a long running request failed to complete, or if Tomcat crashes. All other error cases should receive an appropriate HTTP error code. In a controlled shut down, load can be moved off the Tomcat node before it is shut down. That leaves differentiating a Tomcat crash during request processing and the request timing out in the backlog.
> For GET requests this should be a non-issue since GET requests are meant to be idempotent. GET requests can always be re-tried after a TCP reset. For POST requests, use of the 100 Continue status can enable the client to determine if the headers have been received. A TCP reset before the 100 Continue response means the request needs to be re-tried. A TCP reset after the 100 Continue response means it is unknown if a retry is necessary (there is no way for the client to determine the correct answer in this case).
> Given the above I don't see any reason to change Tomcat's current behaviour.

Ok, thank you for considering my proposal. I respect the decision of the Tomcat community. Hopefully someone else will find this thread useful in future, to understand the issue better and to overcome it.

regards
asankha
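[Editor's note] Mark's retry rules above can be captured as a small decision function. This is an illustrative sketch of the reasoning only - the class and method names are invented, and a real client would additionally need to track whether the "100 Continue" interim response had been seen before the reset.

```java
// Sketch of the retry rules described above, applied after a TCP reset:
// - GET (idempotent by definition) can always be retried
// - POST with "Expect: 100-continue": safe to retry only if the reset
//   arrived before the server's "100 Continue" interim response,
//   i.e. before the body could have been processed
// - otherwise the client cannot know whether the server acted on the request
public class RetryDecision {
    public enum Outcome { RETRY, UNKNOWN }

    public static Outcome afterReset(String method, boolean sawContinueResponse) {
        if ("GET".equals(method)) {
            return Outcome.RETRY;
        }
        if ("POST".equals(method) && !sawContinueResponse) {
            // reset before "100 Continue": server never took the body
            return Outcome.RETRY;
        }
        return Outcome.UNKNOWN; // no way to determine the correct answer
    }

    public static void main(String[] args) {
        System.out.println(afterReset("GET", true));   // RETRY
        System.out.println(afterReset("POST", false)); // RETRY
        System.out.println(afterReset("POST", true));  // UNKNOWN
    }
}
```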
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/09/2012 02:16 AM, Pid wrote:
> On 08/11/2012 15:03, Asankha C. Perera wrote:
>> Hi Mark
>> [...]
>>> The problems have already been explained to you. Another process could use the port.
>> I would consider such a production deployment a risk - a Tomcat crash, or even a restart, might end up in a soup if another process starts using the port in the meantime..
> It is not uncommon for monitoring tools to attempt to (re)start a service when it is observed not to be listening on its designated port.

But that could happen even now, if the backlog fills and connections are being reset, as seen currently.

cheers
asankha
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Mark

> maxThreads limits the number of concurrent threads available for processing requests. connection != concurrent request, primarily because of HTTP keep-alive. maxConnections can be used to limit the number of connections.

Thanks for this insight.. I initially missed this when I went through the Tomcat source, but have now spent some time trying to understand how it was expected to work.

> If you set maxConnections to your desired value and repeat your tests you will hopefully see different results. Depending on exactly how the load test is designed, acceptCount may still influence the results. It would be worth experimenting with different values for that as well (I'd suggest 100, 1 and 0).

However, when I tested with this, the same TCP resets were seen under load. After analyzing the source of the NioEndpoint more closely, I find that it only delays calling serverSock.accept() with Thread.sleep()s - which is not going to help, as shown in my first Java example:

    // Loop until we receive a shutdown command
    while (running) {
        // Loop if endpoint is paused
        while (paused && running) {
            state = AcceptorState.PAUSED;
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                // Ignore
            }
        }
        if (!running) {
            break;
        }
        state = AcceptorState.RUNNING;
        try {
            // if we have reached max connections, wait
            countUpOrAwaitConnection();
            SocketChannel socket = null;
            try {
                // Accept the next incoming connection from the server socket
                socket = serverSock.accept();
            } catch (IOException ioe) {
                // we didn't get a socket
                countDownConnection();

regards
asankha
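[Editor's note] For readers unfamiliar with the Tomcat code quoted above: countUpOrAwaitConnection() makes the acceptor block once the connection limit is reached, while the OS keeps completing handshakes into the listen backlog. A minimal analog of that limiting behavior - not Tomcat's actual implementation - can be sketched with a counting semaphore:

```java
import java.util.concurrent.Semaphore;

// Minimal analog (not Tomcat's actual code) of a maxConnections-style
// limit: the acceptor must take a permit before calling accept(), and a
// permit is returned when a connection closes. While no permit is
// available the acceptor blocks - but the kernel still queues new
// handshakes in the TCP backlog, which is exactly the behavior under
// discussion in this thread.
public class ConnectionLimiter {
    private final Semaphore permits;

    public ConnectionLimiter(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /** Called before accept(); blocks when the limit is reached. */
    public void countUpOrAwait() throws InterruptedException {
        permits.acquire();
    }

    /** Called when a connection is closed (or accept() failed). */
    public void countDown() {
        permits.release();
    }

    public int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) throws InterruptedException {
        ConnectionLimiter limiter = new ConnectionLimiter(2);
        limiter.countUpOrAwait();
        limiter.countUpOrAwait();
        System.out.println(limiter.available()); // 0: the next caller would block
        limiter.countDown();
        System.out.println(limiter.available()); // 1: one connection slot free
    }
}
```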
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/08/2012 04:57 AM, Esmond Pitt wrote:
>>> That wouldn't have any different effect to not calling accept() at all in blocking mode
>> Clearly there is a difference.
> There isn't a difference. All that deregistering OP_ACCEPT does is prevent the application from calling accept(). It has exactly the same effect as thread-starving the accepting thread in blocking mode.

I hope you actually checked the second program I shared [1], and tried it. What it does is not simply delay accept(), but stop accepting:

    if (key.isAcceptable()) {
        SocketChannel client = server.accept();
        client.configureBlocking(false);
        client.socket().setTcpNoDelay(true);
        client.register(selector, SelectionKey.OP_READ);
        System.out.println("I accepted this one.. but not any more now");
        key.cancel();
        key.channel().close();
    }

When the server is ready to accept more messages, it re-binds the listening socket and re-registers for OP_ACCEPT:

    server = ServerSocketChannel.open();
    server.socket().bind(new InetSocketAddress(8280), 0);
    server.configureBlocking(false);
    server.register(selector, SelectionKey.OP_ACCEPT);
    System.out.println("\nI am ready to listen for new messages now..");

> I have written books on Java networking and I do know about this. Your 3-line program allows 1 connection at a time because of the backlog queue, as I have been explaining, and when the backlog queue fills up, as it does when the application doesn't call accept() fast enough, or at all, you get platform-dependent behaviour. There is nothing you can do about this in Java or indeed in C either. A program that created a ServerSocketChannel, didn't register it for OP_ACCEPT, and then called select(), would behave in exactly the same way.

Sorry, I do not know how to explain it any better - I write code, try it..

[1] http://esbmagic.blogspot.com/2012/11/how-to-stop-biting-when-you-cant-chew.html

cheers
asankha
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Esmond

> I haven't said a word about your second program, that closes the listening socket. *Of course* that causes connection refusals, it can't possibly not, but it isn't relevant to the misconceptions about what OP_ACCEPT does that you have been expressing here and that I have been addressing.

I was learning things while discussing this issue over the Tomcat list. I started out asking the Tomcat community why I saw the hard RST behavior, then started looking at the source of Tomcat, and then referenced the HttpComponents project - where at first I believed it was turning off interest in OP_ACCEPT, an assumption I was wrong about, since I had looked up only the discussion threads of HttpComponents and not the source. Then I wrote the second program to illustrate how HttpComponents handled it after looking at its source code, and to answer the question posed by Chris on how it was done in HttpComponents. Since then I was basing my discussion around that second program, but I believe you were addressing issues from earlier - I apologize.

> Closing the listening socket, as you seem to be now suggesting, is a very poor idea indeed:

I personally do not think there is anything at all bad about turning it off. After all, if you are not ready to accept more, you should be clear and upfront about it, even at the TCP level. Having different thresholds to stop listening (say at 4K) and to resume (say at 2K) would ensure that you do not start acting weirdly by starting/stopping/starting/.. acceptance around just one value.

> what happens if some other process grabs the port in the meantime: what is Tomcat supposed to do then?

In reality I do not know of a single client production deployment that would allocate the same port to possibly conflicting services, that may grab another's port when it's suffering under load.
I cannot see any other issues with turning off accepting - and I am curious to know if anyone else could share their views on this, considering real production deployments.

regards
asankha
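[Editor's note] The close-and-rebind pattern under discussion can be demonstrated in a few lines of NIO. This is an illustrative sketch: port 0 lets the OS pick a free port for the demo, whereas a real server would re-bind a fixed port - and the window between close and re-bind is exactly where the "another process grabs the port" risk raised in this thread lives.

```java
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

// Sketch of the close-and-rebind pattern: close the listening socket to
// refuse new clients outright, then re-open and re-bind when ready again.
// SO_REUSEADDR is set so the re-bind does not fail if the old socket
// lingers. While closed, new SYNs get "connection refused" instead of
// queueing in the backlog.
public class CloseAndRebind {
    public static void main(String[] args) throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().setReuseAddress(true);
        server.socket().bind(new InetSocketAddress("127.0.0.1", 0), 100);
        int port = server.socket().getLocalPort();
        System.out.println("listening on port " + port);

        // Under load: stop accepting entirely.
        server.close();

        // Load has dropped: re-open and re-bind. Nothing guarantees the
        // port stayed free for us in the meantime - that is the risk.
        ServerSocketChannel reopened = ServerSocketChannel.open();
        reopened.socket().setReuseAddress(true);
        reopened.socket().bind(new InetSocketAddress("127.0.0.1", port), 100);
        System.out.println("re-bound: " + reopened.socket().isBound());
        reopened.close();
    }
}
```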
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Esmond

> That wouldn't have any different effect to not calling accept() at all in blocking mode

Clearly there is a difference. Please see the samples in [1] [2] and execute them to see this. TestAccept1 below allows one to open more than one connection at a time, even when only one accept() call is made, as has been explained in [1]:

    import java.net.ServerSocket;
    import java.net.Socket;

    public class TestAccept1 {
        public static void main(String[] args) throws Exception {
            ServerSocket serverSocket = new ServerSocket(8280, 0);
            Socket socket = serverSocket.accept();
            Thread.sleep(300); // do nothing
        }
    }

[1] http://esbmagic.blogspot.com/2012/10/does-tomcat-bite-more-than-it-can-chew.html
[2] http://esbmagic.blogspot.com/2012/11/how-to-stop-biting-when-you-cant-chew.html

regards
asankha
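[Editor's note] The point TestAccept1 makes - that the OS completes TCP handshakes into the listen backlog whether or not the application calls accept() - can also be shown from the client side. This is an illustrative sketch (invented for this note, using localhost and an ephemeral port), not code from the thread:

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Demonstrates that a client connect() succeeds even though the server
// never calls accept(): the kernel completes the three-way handshake
// and parks the connection in the listen backlog.
public class BacklogDemo {
    public static void main(String[] args) throws Exception {
        // Backlog of 5; the server deliberately never calls accept().
        ServerSocket server = new ServerSocket(0, 5);
        int port = server.getLocalPort();

        Socket client = new Socket();
        client.connect(new InetSocketAddress("127.0.0.1", port), 1000);
        // The handshake completed although accept() was never called.
        System.out.println("connected: " + client.isConnected());

        client.close();
        server.close();
    }
}
```

This is exactly why pausing accept() alone does not stop clients from "connecting": it only stops the application from seeing the connections the kernel has already established.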
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris

> My expectation from the backlog is:
> 1. Connections that can be handled directly will be accepted and work will begin
> 2. Connections that cannot be handled will accumulate in the backlog
> 3. Connections that exceed the backlog will get connection refused
> There are caveats, I would imagine. For instance, do the connections in the backlog have any kind of server-side timeouts associated with them -- that is, will they ever get discarded from the queue without ever being handled by the bound process (assuming the bound process doesn't terminate or anything weird like that)? Do the clients have any timeouts associated with them?
> Does the above *not* happen? On which platform? Is this only with NIO?

I am not a Linux-level TCP expert, but what I believe is that the TCP layer has its own timeouts, and older connection requests will get discarded from the queue etc. Typically a client will have a TCP-level timeout as well, i.e. the time it will wait for the other party to accept its SYN packet. My testing has been primarily on Linux / Ubuntu. Leaving everything to the TCP backlog makes the end clients see nasty RSTs when Tomcat is under load, instead of connection refused - and could prevent the client from performing a clean fail-over when one Tomcat node is overloaded.

> So you are eliminating the backlog entirely? Or are you allowing the backlog to work as expected? Does closing and re-opening the socket clear the existing backlog (which would cancel a number of waiting though not technically accepted connections, I think), or does it retain the backlog?

Since we are re-binding, I would imagine that the backlog gets flushed every time there is a pause. I am not sure how the backlog would work under different operating systems and conditions etc. However, the code I've shared shows how a pure Java program could take better control of the underlying TCP behavior, as visible to its clients.
> What about performance effects of maintaining a connector-wide counter of active connections, plus pausing and resuming the channel -- plus re-connects by clients that have been dropped from the backlog?

What the UltraESB does by default is to stop accepting new connections after a threshold is reached (e.g. 4096), and to remain paused until the active connection count drops back to another threshold (e.g. 3073). Each of these parameters is user-configurable, and depends on the maximum number of connections each node is expected to handle. Maintaining connector-wide counts in my experience does not cause any performance effects, nor do re-connects by clients - as what's expected in reality is for a hardware load balancer to forward requests that are refused by one node to another node, which hopefully is not loaded. Such a fail-over can take place immediately, cleanly and without any cause for confusion, even if the backend service is not idempotent. This is clearly not the case when a TCP/HTTP connection is accepted and then met with a hard RST after part or all of a request has been sent to it.

> I'm concerned that all of your bench tests appear to be done using telnet with a single acceptable connection. What if you allow 1000 simultaneous connections and test it under some real load so we can see how such a solution would behave.

Clearly, the example I shared was just to illustrate this with a pure Java program. We usually conduct performance tests over half a dozen open source ESBs with concurrency levels of 20, 40, 80, 160, 320, 640, 1280 and 2560, and payload sizes of 0.5, 1, 5, 10 and 100K bytes. You can see some of the scenarios here: http://esbperformance.org. We privately conduct performance tests beyond 2560 to much higher levels. We used an HttpComponents-based EchoService as our backend service all this time, and it behaved very well at all load levels.
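[Editor's note] The fail-over behavior described above - a "connection refused" from one node lets the client immediately and safely try the next - can be sketched as a simple client loop. This is illustrative only; the class and method names are invented for the example, and a real load balancer does considerably more.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

// Sketch of clean fail-over on "connection refused": because a refused
// node never accepted the connection, no request bytes reached it, so
// trying the next node is safe even for non-idempotent requests.
// (A reset *after* the request was partially sent gives no such guarantee,
// which is the core complaint in this thread.)
public class FailoverConnect {
    public static Socket connectToAny(List<InetSocketAddress> nodes, int timeoutMs)
            throws IOException {
        IOException last = null;
        for (InetSocketAddress node : nodes) {
            Socket s = new Socket();
            try {
                s.connect(node, timeoutMs);
                return s; // first node that completes the handshake wins
            } catch (IOException e) {
                last = e; // refused or timed out: safe to move on
                try { s.close(); } catch (IOException ignore) { }
            }
        }
        throw last != null ? last : new IOException("no nodes given");
    }
}
```

For example, if the first address in the list is a closed local port (connection refused) and the second is a listening socket, connectToAny returns a socket to the second node without any ambiguity about whether the first node processed anything.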
However, some weeks back we accepted a contribution of an async servlet to be deployed on Tomcat, as it was considered more real-world. The issues I noticed were when running high load levels over this servlet deployed on Tomcat, especially when the response was being delayed to simulate realistic behavior. Although we do not use Tomcat ourselves, our customers do. I am also not calling this a bug - but an area for possible improvement. If the Tomcat users, developers and the PMC think this is worthwhile to pursue, I believe it would be a good enhancement - maybe even a good GSoC project. As a fellow member of the ASF and a committer on multiple projects over the years, I believed it was my duty to bring this to the attention of the Tomcat community.

regards
asankha
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
On 11/07/2012 11:55 AM, Mark Thomas wrote:
> Asankha C. Perera <asan...@apache.org> wrote:
>> My testing has been primarily on Linux / Ubuntu.
> With which version of Tomcat? And which connector implementation?

Tomcat 7.0.29, and possibly 7.0.32 too, but I believe it's common to all versions. The connector config was already shared: http://tomcat.10.n6.nabble.com/Handling-requests-when-under-load-ACCEPT-and-RST-vs-non-ACCEPT-tt4988693.html#a4988712

asankha
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris / Mark

> Or you could just read the configuration documentation for the connector. Hint: acceptCount - and it has been there since at least Tomcat 4.

The acceptCount WAS being used, but was not being honored as an end user would expect in reality (see the configurations I've shared at the start).

> If HttpComponents works as the OP expects, I wonder if he'd be willing to give us the configuration he uses for *that*? Perhaps there is some kind of TCP option that HttpComponents is using that Tomcat does not.

What's done by HttpComponents is essentially to turn off interest in SelectionKey.OP_ACCEPT [1], if I remember [2]. Check the code of DefaultListeningIOReactor.pause() and resume() [3].

regards
asankha

[1] http://docs.oracle.com/javase/6/docs/api/java/nio/channels/SelectionKey.html#OP_ACCEPT
[2] http://old.nabble.com/Controlling-%22acceptance%22-of-connections-tt27431279r4.html
[3] http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/apidocs/org/apache/http/impl/nio/reactor/DefaultListeningIOReactor.html
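[Editor's note] What turning OP_ACCEPT interest on and off looks like at the raw NIO level can be sketched as follows. This is my own illustration, not HttpComponents code - and note the caveat raised later in this thread: while interest is off, the kernel still completes handshakes into the listen backlog.

```java
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Illustration (not HttpComponents code) of pausing and resuming
// interest in OP_ACCEPT on a selector. While interestOps is 0 the
// application stops being told about new connections, but the kernel
// still completes handshakes into the listen backlog.
public class AcceptInterestToggle {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.socket().bind(new InetSocketAddress("127.0.0.1", 0), 100);

        SelectionKey key = server.register(selector, SelectionKey.OP_ACCEPT);
        System.out.println(key.interestOps() == SelectionKey.OP_ACCEPT); // true

        key.interestOps(0); // "pause": selector no longer reports accepts
        System.out.println(key.interestOps()); // 0

        key.interestOps(SelectionKey.OP_ACCEPT); // "resume"
        System.out.println(key.interestOps() == SelectionKey.OP_ACCEPT); // true

        server.close();
        selector.close();
    }
}
```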
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris

> First, evidently, acceptCount almost does not appear in the Tomcat source. Its real name is backlog if you want to do some searching. It's been in there forever.

Yes, I found it too; but saw that it didn't perform what an 'end user' would expect from Tomcat.

> Second, all three connectors (APR, JIO, NIO) (through their appropriate Endpoint implementation classes) faithfully configure the backlog for their various sockets: ... So, barring some JVM bug, the backlog is being set as appropriately as possible.

Although the backlog is set, you cannot depend on it alone to make Tomcat behave more gracefully when under too much load. As explained in my previous blog post, this is not because of a defect in Tomcat, but because of the way things work in reality, causing TCP and HTTP connections to be established, requests to be [partially] sent, and then to face hard TCP resets.

> Third is the notion of playing with OP_ACCEPT on a selector. I'm no NIO expert, here, but I don't understand why adding OP_ACCEPT to the SelectionKey would change anything, here: the socket handles the backlog, and the behavior of the selector shouldn't affect the OS's TCP backlog. Doing so would be incredibly foolish: forcing the application to react to all incoming connections before they went into the backlog queue would essentially obviate the need for the backlog queue in the first place. If you can suggest something specific, here, I'd certainly be interested in what your suggestion is. So far, what I'm hearing is "it works with HttpComponents" but I have yet to hear what "it" is. Are you saying that, basically, NIO sockets simply do not have a backlog, and we have to fake it using some other mechanism?

Sure, I've written a pure Java example [1] that illustrates what I am proposing. It shows how you could turn off accepting new connections, and resume normal operations once load levels return to normal.
[1] http://esbmagic.blogspot.com/2012/11/how-to-stop-biting-when-you-cant-chew.html

regards
asankha
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Esmond

You are correct. As I recently found out, Tomcat and Java are not causing this explicitly, as I first thought, so there is no 'bug' to be fixed. But I believe there is an elegant way to refuse further connections when under load, by turning off just the 'accepting' of new connections, causing the client to see a 'connection refused' - instead of allowing new connections, accepting requests, and then resetting connections with a 'connection reset', which prevents the client from a clean failover for non-idempotent requests. The Apache HttpComponents/NIO library already supports this, so it is something Tomcat too could support, if the community thinks it would be useful.

cheers
asankha
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris

I was connecting locally to the same node over the local interface, both on EC2 and locally. Since this had gone unresolved for some time, I investigated it a bit myself, first looking at the Coyote source code, and then experimenting with plain Java sockets. It seems the issue is not really Tomcat resetting connections by itself, but rather letting the underlying OS do it. It seems it could be a bit difficult to prevent this with blocking sockets, but I hope what I've found investigating this issue will help others in future: http://esbmagic.blogspot.com/2012/10/does-tomcat-bite-more-than-it-can-chew.html

regards
asankha

On 10/31/2012 09:27 PM, Christopher Schultz wrote:
> Also, are you using a load balancer, or connecting directly to the EC2 instance? Do you have a public, static IP? If you use a static IP, Amazon proxies your connections. I'm not sure what happens if you use a non-static IP (which are public, but can change).
Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi All

During some performance testing I've seen that Tomcat resets accepted TCP connections when under load. I had seen this previously too [1], but was not able to analyze the scenario in detail earlier. As per this dump from Wireshark [2], it seemed like Tomcat ACKed the client request, accepted part of the request, and then suddenly decided to close the connection and hence RSTed it. What I would expect Tomcat to do instead is to refuse the connection when under load, and not accept and RST. The problem occurs for a client that would not know if a RSTed connection could be safely retried. If the connection was not accepted, a fail-over is straightforward. I hope to hear some details from the developer community, to understand this behavior better.

regards
asankha

[1] http://markmail.org/message/v7cpj6oqumtn5gtp
[2] http://troll.ws/image/6b38f283
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris

    <Connector port="9000" protocol="org.apache.coyote.http11.Http11NioProtocol"
               connectionTimeout="2" redirectPort="8443"
               maxKeepAliveRequests="1" processorCache="1"
               acceptCount="1" maxThreads="1"/>

I used the above on my notebook to reproduce the issue easily and get a clear Wireshark dump, but the configuration below also caused the same issue with a real load test on a larger EC2 node:

    <Connector port="9000" protocol="org.apache.coyote.http11.Http11NioProtocol"
               connectionTimeout="2" redirectPort="8443"
               maxKeepAliveRequests="1" processorCache="2560"
               acceptCount="1000" maxThreads="300"/>

thanks
asankha

On 10/29/2012 09:43 PM, Christopher Schultz wrote:
> Asankha,
> On 10/29/12 9:20 AM, Asankha C. Perera wrote:
>> During some performance testing I've seen that Tomcat resets accepted TCP connections when under load. I had seen this previously too [1], but was not able to analyze the scenario in detail earlier.
> Please post your Connector configuration and let us know if you are using APR/native.
> Thanks,
> -chris
Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT
Hi Chris

> Sorry, also what is your OS (be as specific as possible) and what JVM
> are you running on?

Locally, for the Wireshark capture, I ran this on:

    asankha@asankha-dm4:~$ uname -a
    Linux asankha-dm4 3.2.0-31-generic #50-Ubuntu SMP Fri Sep 7 16:16:45 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
    asankha@asankha-dm4:~$ cat /etc/lsb-release
    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=12.04
    DISTRIB_CODENAME=precise
    DISTRIB_DESCRIPTION="Ubuntu 12.04.1 LTS"
    asankha@asankha-dm4:~$ java -version
    java version "1.6.0_33"
    Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
    Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)

On EC2 nodes (c1.xlarge) I saw this with Ubuntu 10.10, with the same JDK on x64 platforms - but I believe the issue applies to any OS.

I'm interested to know if Tomcat can refuse to accept a connection when overloaded - without accepting and then closing the ones it cannot handle.

regards
asankha
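Whether a server even *can* cleanly refuse hinges on how the kernel treats a full listen backlog (which is roughly what Tomcat's acceptCount configures). The Python sketch below, assuming default Linux behavior (tcp_abort_on_overflow=0), shows the catch: once the backlog is full, further SYNs are silently dropped, so extra clients time out rather than seeing a clean "connection refused".

```python
import socket

def probe_full_backlog(backlog, attempts, timeout=0.5):
    """Listen with the given backlog and never call accept(), mimicking
    a server whose worker threads are all busy. Count how many client
    connects still complete (queued by the kernel) versus fail.
    Assumes default Linux semantics: SYNs to a full accept queue are
    silently dropped, so the excess clients time out."""
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", 0))
    srv.listen(backlog)              # roughly what acceptCount sets
    addr = srv.getsockname()

    completed, failed, clients = 0, 0, []
    for _ in range(attempts):
        c = socket.socket()
        c.settimeout(timeout)
        try:
            c.connect(addr)
            completed += 1
            clients.append(c)        # keep it open so the queue stays full
        except OSError:              # timeout (or refusal, OS-dependent)
            failed += 1
            c.close()
    for c in clients:
        c.close()
    srv.close()
    return completed, failed
```

On a typical Linux box this completes roughly backlog+1 connects and times the rest out - which is why an overloaded server cannot simply "refuse" at the TCP level once the backlog is exhausted; the kernel's only choices are queue, drop, or (if the application accepts and closes) reset.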
Re: Expected behavior of Tomcat under load
> Something puzzles me since your first post: ... What is this TCP
> CHECKSUM INCORRECT thing? This is the output of some protocol analyser
> thing, right?

Yes, it's a capture from tcpdump, analyzed by Wireshark.

> So it is totally independent of Tomcat or whatever. This packet is one
> that comes from whatever your client is, toward Tomcat. Why does it
> show that message? And if that message can be believed, is it then not
> normal that the protocol stack which receives that (bad) TCP packet
> would reject it, and break the connection?

I guess this is normal. I did a quick search and came across the following:

http://www.ethereal.com/lists/ethereal-dev/200406/msg00090.html
http://stackoverflow.com/questions/667848/java-socket-tcp-checksum-incorrect
http://wiki.wireshark.org/TCP_Checksum_Verification

This trace is from an EC2 node:

    ubuntu@ip-10-202-99-31:~/configs$ ethtool -k eth0
    Offload parameters for eth0:
    rx-checksumming: on
    tx-checksumming: on
    scatter-gather: on
    tcp-segmentation-offload: on
    udp-fragmentation-offload: off
    generic-segmentation-offload: off
    generic-receive-offload: off
    large-receive-offload: off
    ntuple-filters: off
    receive-hashing: off

thanks
asankha
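With tx-checksumming on, as in the ethtool output above, the NIC fills in the TCP checksum after tcpdump has already captured the outgoing packet, so "TCP CHECKSUM INCORRECT" on locally sent packets is a capture artifact, not real corruption. Two ways to confirm this (the interface name eth0 is an example; adjust for your system):

```shell
# Tell tshark/Wireshark not to validate TCP checksums, so the
# not-yet-computed checksums of offloaded packets are not flagged:
tshark -o tcp.check_checksum:FALSE -r capture.pcap

# Or disable transmit checksum offload on the NIC, so the kernel
# computes checksums before tcpdump sees the packets:
sudo ethtool -K eth0 tx off
```

If the warnings disappear after `ethtool -K eth0 tx off`, they were purely an artifact of offloading and unrelated to the RST problem.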
Re: Expected behavior of Tomcat under load
On 05/26/2011 09:50 PM, André Warnier wrote:
> Putting your answer together with the one from Chuck: I understand
> that if the tcpdump program runs on the same host as the one which is
> sending the packets, it may not be able to correctly see the TCP
> checksum, since it captures the packet before it goes out on the
> network, and it is the NIC which calculates and inserts the TCP
> checksum just before the packet is sent over the network. Right?
> But is this the case here? Where is/was the tcpdump program run,
> which captured these packets, as compared to the client and server
> systems?

I am quite certain this was from the ESB node, which was the client to Tomcat.

thanks
asankha
Expected behavior of Tomcat under load
Hi All

During some performance tests, we've seen that Tomcat resets TCP connections under high load. To reproduce this fairly consistently, configure a thread pool with a maximum of 300 threads on a default Tomcat 6.0.32, and then simulate 1280 ~ 2560 concurrent user requests from a different machine over a real network interface. I assume this could be reproduced with proportionately smaller numbers for both as well. The implementation uses an XFire SOAP service.

Tomcat refusing connections, taking longer to accept new connections, or taking longer to reply (causing a socket timeout) can all be expected under such load - but what we see are TCP resets of connections on which a client has already sent a full HTTP request. Is this the default behavior of Tomcat? The problem this presents is that the client cannot safely fail over to another instance, unlike with a refused connection or a connect timeout (i.e. a delay in accepting).

thanks
asankha

    No.     Time       Source        Destination   Protocol  Src Port  Dst Port  Info
    389961  37.056567  10.77.69.8    10.101.29.42  TCP       9062      8080      [SYN] Seq=0 Win=5792 [TCP CHECKSUM INCORRECT] Len=0 MSS=1460 TSV=363753 TSER=363574 WS=7
    391297  37.108766  10.101.29.42  10.77.69.8    TCP       8080      9062      [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=363383 TSER=363753 WS=7
    391298  37.108773  10.77.69.8    10.101.29.42  TCP       9062      8080      [ACK] Seq=1 Ack=1 Win=5888 [TCP CHECKSUM INCORRECT] Len=0 TSV=363758 TSER=363383
    391893  37.115809  10.77.69.8    10.101.29.42  HTTP      9062      8080      POST /xfire/xfire-service HTTP/1.1
    391894  37.115837  10.77.69.8    10.101.29.42  HTTP      9062      8080      Continuation or non-HTTP traffic [Packet size limited during capture]
    392677  37.125492  10.101.29.42  10.77.69.8    TCP       8080      9062      [RST] Seq=1 Win=0 Len=0
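The retry hazard can be made concrete with a short Python sketch (illustrative only; the request line mirrors the POST in the trace, and the in-memory `processed` list stands in for any non-idempotent side effect). The server reads the full request and performs its side effect *before* resetting, so a client that sees only the RST cannot tell whether the request was executed - blindly retrying would execute it twice.

```python
import socket
import struct
import threading

processed = []   # stands in for a non-idempotent side effect

def serve_once(srv):
    """Accept one connection, fully process the request, then reset the
    connection with SO_LINGER=0 instead of sending an HTTP response."""
    conn, _ = srv.accept()
    request = conn.recv(1024)        # the full POST arrives
    processed.append(request)        # side effect happens regardless
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))
    conn.close()                     # emits RST -- all the client sees

def call_service():
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    t = threading.Thread(target=serve_once, args=(srv,))
    t.start()
    cli = socket.create_connection(srv.getsockname(), timeout=2)
    cli.sendall(b"POST /xfire/xfire-service HTTP/1.1\r\n"
                b"Content-Length: 0\r\n\r\n")
    try:
        cli.recv(1024)
        outcome = "response"
    except ConnectionResetError:
        outcome = "reset"
    t.join()
    cli.close()
    srv.close()
    return outcome
```

After one call the client has observed only "reset", yet the side effect has happened exactly once - which is why a RST after a full request is so much harder to fail over from than a plain connection refusal.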