Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-14 Thread Asankha C. Perera



On 11/14/12 11:02 AM, chris derham wrote:

My simple thought was that it sounds like your code isn't working.
You have more load than one tomcat instance can handle, which
overloads that instance. You are trying to write code to handle
this situation, and seem convinced that the only solution is to
alter tomcat such that you can detect/handle this occurrence in a
way that is easier for your software.

I think this is an accurate summary of the proposal. Honestly, it does
make *some* sense because the lb's job is to determine what is going
on with the backend servers and distribute load. If one backend server
is unhealthy, the lb needs to know about it.
Thank you everyone for taking time to comment on this thread and sharing 
your thoughts


I now realize that it would be better to avoid this situation from our 
end, as suggested, than to try to detect and overcome it


regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-12 Thread Asankha C. Perera

Hi Esmond

To reiterate what Christopher said, if you close the listening socket
because you think you can't service one extra client, you will lose all the
connections on the backlog queue, which could be hundreds of clients, that
you *can* service.
I do not see a problem here. We develop software that routes millions of 
requests to dozens of Tomcat instances. For a single Tomcat instance the 
problem is much simpler: it should just handle the maximum number of 
connections it is configured for. If each Tomcat instance behaves slightly 
better (i.e. refuses rather than resets), we can avoid making guesses about 
whether Tomcat has crashed - guesses that are needed when a TCP-level reset 
is received instead of a refusal - and we can provide much better failover 
support, independent of whether a service or method is idempotent. This 
context is different from clients making, say, JSP calls for a UI.


The end clients connecting to us may use HTTP or HTTPS, with or without 
keep-alives etc. We will handle those connections, and then route, load 
balance and fail over against dozens of Tomcats. Now each connection we 
establish to a Tomcat will almost always be a well-kept keep-alive 
connection, which is re-used even for different requests originating from 
multiple external clients. So if we are managing, say, 10K connections with 
our clients - maybe 4K with keep-alive and 6K without - we will still use a 
limited number of keep-alive connections to each single Tomcat we load 
balance and fail over against.


Yes, we can and do support connection throttling, at a slight cost, to 
safeguard a single Tomcat from receiving more connections than it can 
handle. But if Tomcat were able to not reset connections at the TCP level, 
we could perform our task much better, and I do not think this would cause 
any problem for other use cases of Tomcat - if we could just enable this 
behavior with a configuration parameter

In addition, those clients will then get exactly the behaviour that you are
complaining about: a successful connection and then a 'connection reset'
when doing I/O.
No, they will not get an ACK to the SYN packet, which is much better. 
Otherwise they get an ACK, Tomcat starts receiving part or all of the 
payload, and the connection is then reset - which is nasty. This is the 
main difference



There is no possibility of this proposal being accepted.
I do not understand the negativity here.. I was wondering if I should 
take this discussion to the dev@ list, since I've already discussed it on 
user@. I wish Tomcat had a wiki or used JIRA where I could submit this 
as a proposal - maybe with a diagram/screenshots etc. - and let end 
users vote on it over a few months, until we find out whether this 
solution has value.


regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-12 Thread Asankha C. Perera

On 11/12/2012 10:47 PM, Terence M. Bandoian wrote:

On 11/9/2012 1:41 PM, Christopher Schultz wrote:

Closing the listening socket, as you seem to be now suggesting, is
a very poor idea indeed: what happens if some other process grabs
the port in the meantime: what is Tomcat supposed to do then?


I haven't been following this thread closely enough to comment on the 
proposed solution but isn't preventing unintended usage of a port a 
systems administration problem?  What happens when Tomcat is restarted?
Exactly.. also if a node has to restart, say on a cloud infrastructure 
after a crash, and two processes fight over the same port, the result 
would be quite unpredictable.. I wonder if any real users let this happen 
in a production environment as described on this list


regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-09 Thread Asankha C. Perera

Hi Chris

processing 1 connection through completion
(there are 99 others still running), re-binding, accepting a single
connection into the application plus 100 others into the backlog, then
choking again and dropping 100 connections, then processing another
single connection. That's a huge waste of time unbinding and
re-binding to the port, killing the backlog over and over again... and
all for 1-connection-at-a-time pumping. Insanity.
I'm sorry, but you've misunderstood what I was saying. Yes, the example I 
used showed it for one connection, to make it easier to understand what I 
was proposing. But in reality you would not stop and start at each 
connection. Remember the two thresholds I was talking about? You could 
stop listening at 4K connections, and start listening again when the 
connection count drops to, say, 3K - and these could be user-specified 
parameters based on the deployment.
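
To make the idea concrete, here is a rough sketch in plain Java (this is 
not Tomcat code - the port, the thresholds, the connection counter and the 
helper method are made up for illustration, and the read/write handling 
that would decrement the counter is omitted):

import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Rough sketch of the two-threshold idea - illustration only, not Tomcat code.
public class ThrottledAcceptorSketch {

    static final int STOP_ACCEPTING_AT = 4096;    // upper threshold
    static final int RESUME_ACCEPTING_AT = 3072;  // lower threshold
    static int activeConnections = 0;             // decremented by the (omitted) I/O handling

    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = openListener(selector);

        while (true) {
            selector.select(1000);
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isValid() && key.isAcceptable()) {
                    SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
                    if (client != null) {
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                        activeConnections++;
                    }
                }
            }
            selector.selectedKeys().clear();

            if (server != null && activeConnections >= STOP_ACCEPTING_AT) {
                server.close();   // new SYNs are refused from here on; note that
                server = null;    // anything still in the backlog is dropped
            } else if (server == null && activeConnections <= RESUME_ACCEPTING_AT) {
                server = openListener(selector);  // start accepting again
            }
        }
    }

    static ServerSocketChannel openListener(Selector selector) throws Exception {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(8280), 0);
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        return server;
    }
}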


HTTP keep-alive from a load balancer in front would work extremely well 
under these conditions, as established TCP connections are re-used. Any 
production-grade load balancer could immediately fail over only the 
failing requests to another Tomcat when one is under too much load - and 
this would work even for non-idempotent services.

You want to add all this extra complexity to the code and, IMO, shitty
handling of your incoming connections just so you can say "well,
you're getting 'connection refused' instead of hanging... isn't that
better?". I assert that it is *not* better. Clients can set TCP
handshake timeouts and survive. Your server will perform much better
without all this foolishness.
If you can, try to understand what I said better.. It's ok to not accept 
this proposal and/or not understand it..


regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-08 Thread Asankha C. Perera

Hi Mark

what happens if some other process grabs the port in the meantime:
what is Tomcat supposed to do then?

In reality I do not know of a single client production deployment that
would allocate the same port to possibly conflicting services, which might
grab another's port when it is suffering under load.

Just because it wouldn't cause a problem for a limited subset of Tomcat
users - your clients - does not mean that it would not cause problems
for other Tomcat users.

I cannot see any other issues with turning off accepting - and I am
curious to know if anyone else could share their views on this,
considering real production deployments

The problems have already been explained to you. Another process could
use the port.
I would consider such a production deployment a risk - a Tomcat crash, 
or even a restart, might end up in a soup if another process starts using 
the port in the meantime..

Having reviewed this thread the problem you seem to be trying to solve
is this:
- a load-balancer is in use
- Tomcat is under load
- a client attempts a connection
- the connection is added to the TCP backlog
- Tomcat does not process the connection before it times out
- the connection is reset when it times out
- the client can't differentiate between the above and when an error
occurs during processing resulting in a connection reset
- the client doesn't know whether to replay the request or not

Yes, this is correct

First of all, it is extremely rare for Tomcat to reset a connection once
processing has started. The only circumstances where I am aware that
would happen is if Tomcat is shutting down and a long running request
failed to complete or if Tomcat crashes. All other error cases should
receive an appropriate HTTP error code. In a controlled shut down load
can be moved off the Tomcat node before it is shut down. That leaves
differentiating a Tomcat crash during request processing and the request
timing out in the backlog.
For GET requests this should be a non-issue since GET requests are meant
to be idempotent. GET requests can always be re-tried after a TCP reset.

For POST requests, use of the 100 Continue status can enable the client
to determine if the headers have been received. A TCP reset before the
100 continue response means the request needs to be re-tried. A TCP
reset after the 100 continue response means it is unknown if a retry is
necessary (there is no way for the client to determine the correct
answer in this case).
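
Concretely, the exchange looks roughly like this (an illustration only, 
not a captured trace; the host and sizes are made up):

Client:  POST /service HTTP/1.1
         Host: backend.example.org
         Expect: 100-continue
         Content-Length: 1234

         (a reset seen here, before the 100 Continue, means the body was
          never sent and the request can safely be replayed)

Server:  HTTP/1.1 100 Continue

Client:  ...1234 bytes of request body...

         (a reset seen after the 100 Continue leaves the client unable to
          tell whether the request was processed)

Server:  HTTP/1.1 200 OK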

Given the above I don't see any reason to change Tomcat's current behaviour.
Ok, thank you for considering my proposal. I respect the decision of the 
Tomcat community.


Hopefully someone else will find this thread useful in future to 
understand the issue better and to overcome it


regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-08 Thread Asankha C. Perera

On 11/09/2012 02:16 AM, Pid wrote:

On 08/11/2012 15:03, Asankha C. Perera wrote:

Hi Mark

what happens if some other process grabs the port in the meantime:
what is Tomcat supposed to do then?

In reality I do not know of a single client production deployment that
would allocate the same port to possibly conflicting services, which might
grab another's port when it is suffering under load.

Just because it wouldn't cause a problem for a limited subset of Tomcat
users - your clients - does not mean that it would not cause problems
for other Tomcat users.

I cannot see any other issues with turning off accepting - and I am
curious to know if anyone else could share their views on this,
considering real production deployments

The problems have already been explained to you. Another process could
use the port.

I would consider such a production deployment a risk - a Tomcat crash,
or even a restart, might end up in a soup if another process starts using
the port in the meantime..

It is not uncommon for monitoring tools to attempt to (re)start a
service when it is observed not to be listening on its designated port.
But that could happen even now, if the backlog fills and connections are 
reset, as is seen currently


cheers
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-07 Thread Asankha C. Perera

Hi Mark

maxThreads limits the number of concurrent threads available for
processing requests. connection != concurrent request, primarily
because of HTTP keep-alive.

maxConnections can be used to limit the number of connections.
Thanks for this insight.. I initially missed this when I went through 
the Tomcat source, but have now spent some time trying to understand how 
it is expected to work

If you set maxConnections to your desired value and repeat your tests
you will hopefully see different results. Depending on exactly how the
load test is designed, acceptCount may still influence the results. It
would be worth experimenting with different values for that as well (I'd
suggest 100, 1 and 0).
However, when I tested with this, the same TCP resets were seen under 
load. After analyzing the source of the NioEndpoint more closely, I find 
that it only delays calling serverSock.accept() with Thread.sleep()s - 
which is not going to help, as shown in my first Java example.


// Loop until we receive a shutdown command
while (running) {

    // Loop if endpoint is paused
    while (paused && running) {
        state = AcceptorState.PAUSED;
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            // Ignore
        }
    }

    if (!running) {
        break;
    }
    state = AcceptorState.RUNNING;

    try {
        //if we have reached max connections, wait
        countUpOrAwaitConnection();

        SocketChannel socket = null;
        try {
            // Accept the next incoming connection from the server
            // socket
            socket = serverSock.accept();
        } catch (IOException ioe) {
            //we didn't get a socket
            countDownConnection();
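
(For reference, the kind of connector configuration such a test exercises 
would look roughly like the following - the values are only illustrative; 
the exact configuration was shared earlier in the thread:)

<Connector port="9000"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           maxThreads="300"
           maxConnections="4096"
           acceptCount="100"/>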

regards
asankha


--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-07 Thread Asankha C. Perera

On 11/08/2012 04:57 AM, Esmond Pitt wrote:

That wouldn't have any different effect to not calling accept() at all
in blocking mode

Clearly there is a difference.

There isn't a difference. All that deregistering OP_ACCEPT does is prevent
the application from calling accept(). It has exactly the same effect as
thread-starving the accepting thread in blocking mode.
I hope you actually checked the second program I shared [1], and tried 
it. What it does is not simply delay accept(), but stop accepting altogether.


   if (key.isAcceptable()) {
    SocketChannel client = server.accept();
    client.configureBlocking(false);
    client.socket().setTcpNoDelay(true);
    client.register(selector, SelectionKey.OP_READ);

    System.out.println("I accepted this one.. but not any more now");
    key.cancel();
    key.channel().close();


When the server is ready to accept more messages, it re-opens the 
listening socket, re-binds it, and re-registers for OP_ACCEPT.


   server = ServerSocketChannel.open();
   server.socket().bind(new InetSocketAddress(8280), 0);
   server.configureBlocking(false);
   server.register(selector, SelectionKey.OP_ACCEPT);
   System.out.println(\nI am ready to listen for new messages now..);



I have written books on Java networking and I do know about this. Your
3-line program allows > 1 connection at a time because of the backlog
queue, as I have been
connection at a time because of the backlog queue, as I have been
explaining, and when the backlog queue fills up, as it does when the
application doesn't call accept() fast enough, or at all, you get
platform-dependent behaviour. There is nothing you can do about this in Java
or indeed in C either. A program that created a ServerSocketChannel, didn't
register it for OP_ACCEPT, and then called select(), would behave in exactly
the same way.


Sorry, I do not know how to explain it any better - I've written the code, try it..

[1] 
http://esbmagic.blogspot.com/2012/11/how-to-stop-biting-when-you-cant-chew.html


cheers
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com





Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-07 Thread Asankha C. Perera

Hi Esmond
I haven't said a word about your second program, that closes the 
listening socket. *Of course* that causes connection refusals, it 
can't possibly not, but it isn't relevant to the misconceptions about 
what OP_ACCEPT does that you have been expressing here and that I have 
been addressing.
I was learning things while discussing this issue over the Tomcat list. 
I started out asking the Tomcat community why I saw the hard RST 
behavior, then started looking at the Tomcat source, and then referenced 
the HttpComponents project - where at first I believed it was turning off 
interest in OP_ACCEPT, an assumption I was wrong about, since I had 
looked at only the discussion threads of HttpComponents and not the 
source.


Then I wrote the second program, after looking at the HttpComponents 
source code, to illustrate how it handles this and to answer the question 
posed by Chris on how it was done in HttpComponents. Since then I have 
been basing my discussion on that second program, but I believe you were 
addressing issues from earlier - I apologize.
Closing the listening socket, as you seem to be now suggesting, is a 
very poor idea indeed:
I personally do not think there is anything at all bad about turning it 
off. After all, if you are not ready to accept more, you should be clear 
and upfront about it, even at the TCP level. Having different thresholds 
to stop listening (say at 4K) and to resume (say at 2K) would ensure 
that you do not start behaving erratically, starting/stopping/starting 
acceptance around just one value.
what happens if some other process grabs the port in the meantime: 
what is Tomcat supposed to do then?
In reality I do not know of a single client production deployment that 
would allocate the same port to possibly conflicting services, which might 
grab another's port when it is suffering under load.


I cannot see any other issues with turning off accepting - and I am 
curious to know if anyone else could share their views on this, 
considering real production deployments


regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com





Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-06 Thread Asankha C. Perera

Hi Esmond

That wouldn't have any different effect to not calling accept() at all in
blocking mode
Clearly there is a difference. Please see the samples in [1] & [2] and 
execute them to see this. The TestAccept1 below allows one to open more 
than one connection at a time, even when only one accept() call is made, 
as has been explained in [1]


import java.net.ServerSocket;
import java.net.Socket;

public class TestAccept1 {

public static void main(String[] args) throws Exception {
ServerSocket serverSocket = new ServerSocket(8280, 0);
Socket socket = serverSocket.accept();
Thread.sleep(300); // do nothing
}
}
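
For example, after starting TestAccept1 one can open three or four telnet 
connections to port 8280 from separate terminals; each TCP handshake 
completes even though accept() is called only once, because the extra 
connections simply sit in the listen backlog (passing 0 as the backlog 
makes the JVM fall back to an implementation-specific default).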

[1] 
http://esbmagic.blogspot.com/2012/10/does-tomcat-bite-more-than-it-can-chew.html
[2] 
http://esbmagic.blogspot.com/2012/11/how-to-stop-biting-when-you-cant-chew.html


regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com





Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-06 Thread Asankha C. Perera

Hi Chris

My expectation from the backlog is:

1. Connections that can be handled directly will be accepted and work
will begin

2. Connections that cannot be handled will accumulate in the backlog

3. Connections that exceed the backlog will get connection refused

There are caveats, I would imagine. For instance, do the connections in
the backlog have any kind of server-side timeouts associated with them
-- that is, will they ever get discarded from the queue without ever
being handled by the bound process (assuming the bound process doesn't
terminate or anything weird like that)? Do the clients have any timeouts
associated with them?

Does the above *not* happen? On which platform? Is this only with NIO?
I am not a Linux level TCP expert, but what I believe is that the TCP 
layer has its timeouts and older connection requests will get discarded 
from the queue etc. Typically a client will have a TCP level timeout as 
well, i.e. the time it will wait for the other party to accept its SYN 
packet. My testing has been primarily on Linux / Ubuntu.


Leaving everything to the TCP backlog makes the end clients see nasty 
RSTs when Tomcat is under load instead of connection refused - and could 
prevent the client from performing a clean fail-over when one Tomcat 
node is overloaded.

So you are eliminating the backlog entirely? Or are you allowing the
backlog to work as expected? Does closing and re-opening the socket
clear the existing backlog (which would cancel a number of waiting
though not technically accepted connections, I think), or does it retain
the backlog? Since you are re-binding, I would imagine that the backlog
gets flushed every time there is a pause.
I am not sure how the backlog would work under different operating 
systems and conditions etc. However, the code I've shared shows how a 
pure Java program could take better control of the underlying TCP 
behavior - as visible to its clients.

What about performance effects of maintaining a connector-wide counter
of active connections, plus pausing and resuming the channel -- plus
re-connects by clients that have been dropped from the backlog?
What the UltraESB does by default is to stop accepting new connections 
after a threshold is reached (e.g. 4096) and remain paused until the 
active connection count drops back to another threshold (e.g. 3073). Each 
of these parameters is user configurable and depends on the maximum 
number of connections each node is expected to handle. Maintaining 
connector-wide counts does not, in my experience, cause any performance 
effects, and neither do re-connects by clients - as what's expected in 
reality is for a hardware load balancer to forward requests that are 
refused by one node to another node, which hopefully is not loaded.
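
The bookkeeping itself is trivial - something along these lines (a sketch 
only, not the actual UltraESB code; the class and method names are made up):

import java.util.concurrent.atomic.AtomicInteger;

// Connector-wide connection counting with two thresholds (hysteresis).
class ConnectionGate {
    private static final int PAUSE_AT  = 4096;  // stop accepting at this count
    private static final int RESUME_AT = 3073;  // start accepting again at this count
    private final AtomicInteger active = new AtomicInteger();

    void onAccepted() { active.incrementAndGet(); }
    void onClosed()   { active.decrementAndGet(); }

    boolean shouldPauseAccepting()  { return active.get() >= PAUSE_AT; }
    boolean shouldResumeAccepting() { return active.get() <= RESUME_AT; }
}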


Such a fail-over can take place immediately, cleanly and without any 
cause for confusion, even if the backend service is not idempotent. This 
is clearly not the case when a TCP/HTTP connection is accepted and then 
met with a hard RST after part of or the whole request has been sent 
over it.

I'm concerned that all of your bench tests appear to be done using
telnet with a single acceptable connection. What if you allow 1000
simultaneous connections and test it under some real load so we can see
how such a solution would behave.
Clearly the example I shared was just to illustrate this with a pure 
Java program. We usually conduct performance tests over half a dozen 
open source ESBs with concurrency levels of 20, 40, 80, 160, 320, 640, 
1280 and 2560, and payload sizes of 0.5, 1, 5, 10 and 100K bytes. You can 
see some of the scenarios at http://esbperformance.org. We privately 
conduct performance tests beyond 2560 to much higher levels. We used an 
HttpComponents-based EchoService as our backend service all this time, 
and it behaved very well at all load levels. However, some weeks back we 
accepted a contribution of an async servlet to be deployed on Tomcat, as 
it was considered more real-world. The issues I noticed arose when 
running high load levels against this servlet deployed on Tomcat, 
especially when the response was being delayed to simulate realistic 
behavior.


Although we do not use Tomcat ourselves, our customers do. I am also not 
calling this a bug, but an area for possible improvement. If the 
Tomcat users, developers and the PMC think this is worthwhile to 
pursue, I believe it would be a good enhancement - maybe even a good 
GSoC project. As a fellow member of the ASF and a committer on multiple 
projects over the years, I believed it was my duty to bring this to the 
attention of the Tomcat community.


regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-06 Thread Asankha C. Perera

On 11/07/2012 11:55 AM, Mark Thomas wrote:

Mark Thomas ma...@apache.org wrote:


Asankha C. Perera asan...@apache.org wrote:


My testing has been primarily on Linux / Ubuntu.

With which version of Tomcat?

And which connector implementation?
Tomcat 7.0.29 and possibly 7.0.32 too, but I believe it's common to all 
versions


Connector config was already shared
http://tomcat.10.n6.nabble.com/Handling-requests-when-under-load-ACCEPT-and-RST-vs-non-ACCEPT-tt4988693.html#a4988712

asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-05 Thread Asankha C. Perera

Hi Chris / Mark

Or you could just read the configuration documentation for the
connector. Hint: acceptCount - and it has been there since at
least Tomcat 4.
The acceptCount WAS being used, but was not being honored as an end user 
would expect in reality (see the configurations I shared at the start)

If HttpComponents works as the OP expects, I wonder if he'd be willing
to give us the configuration he uses for *that*? Perhaps there is some
kind of TCP option that HttpComponents is using that Tomcat does not.
What HttpComponents does is essentially turn off interest in 
SelectionKey.OP_ACCEPT [1], if I remember correctly [2]. Check the code of 
DefaultListeningIOReactor.pause() and resume() [3]
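
Roughly, the calls involved are these (a sketch only - the reactor setup, 
the I/O event dispatch and the error handling a real server needs are 
omitted, and the no-arg constructor assumes a reasonably recent 
httpcore-nio):

import java.net.InetSocketAddress;

import org.apache.http.impl.nio.reactor.DefaultListeningIOReactor;

public class PauseResumeSketch {
    public static void main(String[] args) throws Exception {
        DefaultListeningIOReactor ioReactor = new DefaultListeningIOReactor();
        ioReactor.listen(new InetSocketAddress(8280));
        // ioReactor.execute(eventDispatch) would normally run on a worker thread

        // when the active connection count crosses the upper threshold:
        ioReactor.pause();    // stop accepting new connections

        // once it drops back below the lower threshold:
        ioReactor.resume();   // start accepting again

        ioReactor.shutdown();
    }
}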


regards
asankha


[1] 
http://docs.oracle.com/javase/6/docs/api/java/nio/channels/SelectionKey.html#OP_ACCEPT 

[2] 
http://old.nabble.com/Controlling-%22acceptance%22-of-connections-tt27431279r4.html
[3] 
http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/apidocs/org/apache/http/impl/nio/reactor/DefaultListeningIOReactor.html


--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-05 Thread Asankha C. Perera

Hi Chris

First, evidently, acceptCount almost does not appear in the Tomcat
source. Its real name is backlog if you want to do some searching.
It's been in there forever.
Yes, I found it too; but saw that it didn't perform what an 'end user' 
would expect from Tomcat.

Second, all three connectors (APR, JIO, NIO) (through their
appropriate Endpoint implementation classes) faithfully configure the
backlog for their various sockets:
...
So, barring some JVM bug, the backlog is being set as appropriately as
possible.
Although the backlog is set, you cannot depend on it alone to make 
Tomcat behave more gracefully when under too much load. As explained in 
my previous blog post, this is not because of a defect in Tomcat, but 
the way things work in reality: TCP and HTTP connections get established, 
requests are [partially] sent, and then face hard TCP resets.

Third is the notion of playing with OP_ACCEPT on a selector. I'm no
NIO expert, here, but I don't understand why adding OP_ACCEPT to the
SelectionKey would change anything, here: the socket handles the
backlog, and the behavior of the selector shouldn't affect the OS's
TCP backlog. Doing so would be incredibly foolish: forcing the
application to react to all incoming connections before they went into
the backlog queue would essentially obviate the need for the backlog
queue in the first place.

If you can suggest something specific, here, I'd certainly be
interested in what your suggestion is. So far, what I'm hearing is
that "it works with HttpComponents" but I have yet to hear what "it"
is. Are you saying that, basically, NIO sockets simply do not have a
backlog, and we have to fake it using some other mechanism?
Sure, I've written a pure Java example [1] that illustrates what I am 
proposing: how you could turn off accepting new connections, and resume 
normal operation once load levels return to normal.


[1] 
http://esbmagic.blogspot.com/2012/11/how-to-stop-biting-when-you-cant-chew.html


regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-02 Thread Asankha C. Perera

Hi Esmond

You are correct. As I recently found out, Tomcat and Java are not causing 
this explicitly, as I first thought. So there is no 'bug' to be fixed.


But I believe there is an elegant way to refuse further connections when 
under load, by turning off just the 'accepting' of new connections and 
causing the client to see a 'connection refused' - instead of allowing 
new connections, accepting requests, and then resetting them with a 
'connection reset', which prevents the client from a clean failover for 
non-idempotent requests. The Apache HttpComponents/NIO library already 
supports this, so it's something that Tomcat too could support if the 
community thinks it would be useful.


cheers
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-11-01 Thread Asankha C. Perera

Hi Chris

I was connecting to the same node over the local interface, both on EC2 
and locally.


Since this had gone unresolved for some time, I investigated it a bit 
myself, first looking at the Coyote source code, and then experimenting 
with plain Java sockets. It seems the issue is not really Tomcat 
resetting connections by itself, but rather letting the underlying OS do 
it. It seems it could be a bit difficult to prevent this with blocking 
sockets, but I hope what I've found investigating this issue will help 
others in the future


http://esbmagic.blogspot.com/2012/10/does-tomcat-bite-more-than-it-can-chew.html

regards
asankha

On 10/31/2012 09:27 PM, Christopher Schultz wrote:

Also, are you using a load balancer, or connecting directly to the EC2
instance? Do you have a public, static IP? If you use a static IP,
Amazon proxies your connections. I'm not sure what happens if you use
a non-static IP (which are public, but can change).


--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-10-29 Thread Asankha C. Perera

Hi All

During some performance testing I've seen that Tomcat resets accepted 
TCP connections when under load. I had seen this previously too [1], but 
was not able to analyze the scenario in detail earlier.


As per this dump from Wireshark [2], it seemed like Tomcat ACKed the 
client request, accepted part of the request, and then suddenly decided 
to close the connection and hence RST it. What I would expect Tomcat to 
do instead is to refuse the connection when under load, and not accept 
and then RST. The problem is that a client would not know whether a 
RST connection can safely be retried. If the connection was not 
accepted, a fail-over is straightforward.


Hope to hear some details from the developer community, to understand 
this behavior better


regards
asankha

[1] http://markmail.org/message/v7cpj6oqumtn5gtp
[2] http://troll.ws/image/6b38f283

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-10-29 Thread Asankha C. Perera

Hi Chris

<Connector port="9000"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionTimeout="2"
           redirectPort="8443"
           maxKeepAliveRequests="1"
           processorCache="1"
           acceptCount="1"
           maxThreads="1"/>

I used the above on my notebook to reproduce the issue easily and get a 
clear Wireshark dump, but the configuration below also caused the same 
issue with a real load test on a larger EC2 node:


<Connector port="9000"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionTimeout="2"
           redirectPort="8443"
           maxKeepAliveRequests="1"
           processorCache="2560"
           acceptCount="1000"
           maxThreads="300"/>

thanks
asankha

On 10/29/2012 09:43 PM, Christopher Schultz wrote:


Asankha,

On 10/29/12 9:20 AM, Asankha C. Perera wrote:

During some performance testing I've seen that Tomcat resets
accepted TCP connections when under load. I had seen this
previously too [1], but was not able to analyze the scenario in
detail earlier.

Please post your Connector configuration and let us know if you are
using APR/native.

Thanks,
-chris



--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Handling requests when under load - ACCEPT and RST vs non-ACCEPT

2012-10-29 Thread Asankha C. Perera

Hi Chris


Sorry, also what is your OS (be as specific as possible) and what JVM
are you running on?

Locally for the Wireshark capture I ran this on:
asankha@asankha-dm4:~$ uname -a
Linux asankha-dm4 3.2.0-31-generic #50-Ubuntu SMP Fri Sep 7 16:16:45 UTC 
2012 x86_64 x86_64 x86_64 GNU/Linux

asankha@asankha-dm4:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.04
DISTRIB_CODENAME=precise
DISTRIB_DESCRIPTION="Ubuntu 12.04.1 LTS"
asankha@asankha-dm4:~$ java -version
java version "1.6.0_33"
Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)

On EC2 nodes (c1.xlarge), I saw this with Ubuntu 10.10, with the same 
JDK on x64 platforms - but I believe this issue applies to any OS


I'm interested to know if Tomcat can refuse to accept a connection 
when overloaded - without accepting and closing the ones that it cannot 
handle.


regards
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com







Re: Expected behavior of Tomcat under load

2011-05-26 Thread Asankha C. Perera


Something puzzles me since your first post : 
...

What is this TCP CHECKSUM INCORRECT thing ?

This is the output of some protocol analyser thing, right ?

Yes, it's a capture from tcpdump, analyzed by Wireshark


So it is totally independent of Tomcat or whatever.
This packet is one that comes from whatever your client is, toward 
Tomcat.
Why does it show that message ? And if that message can be believed, 
is it then not normal that the protocol stack which receives that 
(bad) TCP packet would reject it, and break the connection ?


I guess this is normal. I did a quick search and came across the following:
http://www.ethereal.com/lists/ethereal-dev/200406/msg00090.html
http://stackoverflow.com/questions/667848/java-socket-tcp-checksum-incorrect
http://wiki.wireshark.org/TCP_Checksum_Verification

This trace is from an EC2 node
ubuntu@ip-10-202-99-31:~/configs$ ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off
ntuple-filters: off
receive-hashing: off

thanks
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com








Re: Expected behavior of Tomcat under load

2011-05-26 Thread Asankha C. Perera

On 05/26/2011 09:50 PM, André Warnier wrote:

Putting your answer together with the one from Chuck :

I understand that if the tcpdump program runs on the same host as the 
one which is sending the packets, it may not be able to correctly see 
the TCP checksum, since it captures the packet before it goes out on 
the network, and it is the NIC which calculates and inserts the TCP 
checksum just before the packet is sent over the network.

Right ?

But is this the case here ?
Where is/was the tcpdump program run, which captured these packets, as 
compared to the client and server systems ?
I am quite certain this was from the ESB node, which was the client to 
Tomcat..


thanks
asankha

--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com








Expected behavior of Tomcat under load

2011-05-25 Thread Asankha C. Perera

Hi All

During some performance tests, we've seen that Tomcat resets TCP 
connections under high load. To reproduce this rather consistently, a 
thread pool with a maximum of 300 threads can be configured on a default 
Tomcat 6.0.32, and then 1280 ~ 2560 concurrent user requests simulated 
from a different machine over a real network interface. I assume this 
could be reproduced with proportionately smaller numbers for both as 
well. The implementation uses an XFire SOAP service.


Tomcat refusing connections, or taking longer to accept new connections, 
or taking longer to reply (causing a socket timeout) can be expected 
under such load - but what we see are TCP resets of connections to which 
a client has already sent a full HTTP request.


Is this the default behavior of Tomcat? The problem this presents is 
that the client cannot safely fail over to another instance, unlike with 
a refused connection or a connect timeout (i.e. delay in accepting)


thanks
asankha


No.     Time       Source        Destination   Protocol  Src Port  Dst Port  Info
389961  37.056567  10.77.69.8    10.101.29.42  TCP       9062      8080      9062 > 8080 [SYN] Seq=0 Win=5792 [TCP CHECKSUM INCORRECT] Len=0 MSS=1460 TSV=363753 TSER=363574 WS=7
391297  37.108766  10.101.29.42  10.77.69.8    TCP       8080      9062      8080 > 9062 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=363383 TSER=363753 WS=7
391298  37.108773  10.77.69.8    10.101.29.42  TCP       9062      8080      9062 > 8080 [ACK] Seq=1 Ack=1 Win=5888 [TCP CHECKSUM INCORRECT] Len=0 TSV=363758 TSER=363383
391893  37.115809  10.77.69.8    10.101.29.42  HTTP      9062      8080      POST /xfire/xfire-service HTTP/1.1
391894  37.115837  10.77.69.8    10.101.29.42  HTTP      9062      8080      Continuation or non-HTTP traffic [Packet size limited during capture]
392677  37.125492  10.101.29.42  10.77.69.8    TCP       8080      9062      8080 > 9062 [RST] Seq=1 Win=0 Len=0



--
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com




