Rainer Jung wrote:
On 22.04.2015 at 11:58, Thomas Boniface wrote:
What concerns me the most is the CLOSE_WAIT on tomcat side because when an
fd peak appears the web application appears to be stuck. It feels like all
its connections are consumed and none can be established from nginx
anymore. Shouldn't the CLOSE_WAIT connections be recycled to receive new
connections from nginx ?
Just to clarify:
Every connection has two ends. In netstat the "local" end is left, the
"remote" end is right. If a connection is between processes both on the
same system, it will be shown in netstat twice. Once for each endpoint
being the "local" side.
CLOSE_WAIT for a connection between a (local) and b (remote) means that
b has closed the connection but a has not. Nothing closes a's end
automatically just because b has closed its end. If CLOSE_WAIT connections
pile up, then a and b disagree about when a connection should no longer be
used. E.g. they might have very different idle timeouts (Keep-Alive timeout
in HTTP speak), or one observed a problem that the other didn't observe.
When I did the counting for
Count IP:Port ConnectionState
8381 127.0.0.1:8080 CLOSE_WAIT
the "127.0.0.1:8080" was left in netstat output, so "local". It means
the other side (whatever is the other side of the connection, likely
nginx) has closed the connection already, but Tomcat has not.
And the total number of those connections:
Count IP:Port ConnectionState
8381 127.0.0.1:8080 CLOSE_WAIT
1650 127.0.0.1:8080 ESTABLISHED
indeed sums up to roughly the default maxConnections of 10000 mentioned by
Chris (10031, to be exact).
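For anyone who wants to redo this kind of counting, here is a minimal sketch in Python. It assumes Linux-style "netstat -ant" output with the columns Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State. The function name count_states and the sample lines are made up for illustration; they are not taken from the real system.

```python
from collections import Counter

def count_states(netstat_text, column="local"):
    """Count (address, state) pairs in `netstat -ant`-style output.

    column="local" groups by the left-hand (local) address,
    column="foreign" groups by the right-hand (remote) address.
    """
    idx = 3 if column == "local" else 4
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP line with a state.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        counts[(fields[idx], fields[5])] += 1
    return counts

# Invented sample data, only to show the expected layout.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 127.0.0.1:8080          127.0.0.1:51234         CLOSE_WAIT
tcp        0      0 127.0.0.1:8080          127.0.0.1:51235         CLOSE_WAIT
tcp        0      0 127.0.0.1:8080          127.0.0.1:51236         ESTABLISHED
tcp        0      0 127.0.0.1:51234         127.0.0.1:8080          FIN_WAIT2
"""

for (addr, state), n in count_states(SAMPLE).most_common():
    print(n, addr, state)
```

Grouping by the local column gives the Tomcat-side view (127.0.0.1:8080 on the left); grouping by the foreign column gives the view from whatever is connecting to port 8080.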
What I do not understand is that the same connections, looked at with
nginx as the local end, show totally different statistics:
Count IP:Port ConnectionState
20119 127.0.0.1:8080 SYN_SENT
4692 127.0.0.1:8080 ESTABLISHED
488 127.0.0.1:8080 FIN_WAIT2
122 127.0.0.1:8080 TIME_WAIT
13 127.0.0.1:8080 FIN_WAIT1
But maybe that's a problem to solve after you have fixed the CLOSE_WAIT (or
the 10000 limit) and redone the whole observation.
Pretty big numbers you have ...
Thomas,
to elaborate on what Rainer is writing above :
A TCP connection consists of 2 "pipes", one in each direction (client to server, server to
client).
From a TCP point of view, the "client" is the one which initially requests the
connection. The "server" is the one which "accepts" that connection. (This is different
from the more general idea of "server", as in "Tomcat server". When Tomcat accepts a HTTP
connection, it acts as "server"; when a Tomcat webapp establishes a connection with an
external HTTP server, the webapp (and by extension Tomcat) is the "client").
These 2 pipes can be closed independently of one another, but both need to be closed for
the connection to be considered as closed and able to "disappear".
When the client wants to close the connection, it will send a "close request" packet on
the client-to-server pipe.
The server receives this, and knows then that the client will not send anything anymore
onto that pipe. For a server application reading that pipe, this would result in the
equivalent of an "end of file" on that datastream.
In response to the client close request, the server is supposed to react by not sending
any more data onto the server-to-client pipe, and in turn to send a "close request" onto
that pipe.
Once these various close messages have been received and acknowledged by both sides of the
connection, the connection is considered as closed, and the resources associated with it
can be reclaimed/recycled/garbage collected etc.. ("closed" is like a virtual state; it
means that there is no connection).
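The close sequence described above can be tried out on the loopback interface. What follows is only a sketch in Python, not what Tomcat or nginx do internally; the message contents and variable names are invented for the demonstration. The client closes its sending pipe, the server sees this as an end of file, and the connection only goes away once the server closes its own side as well.

```python
import socket

# Listening socket on an ephemeral loopback port (port 0: the OS picks one).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

client.sendall(b"last request")
# The client's "close request": it will send nothing more on its pipe.
client.shutdown(socket.SHUT_WR)

# Server reads until recv() returns b"", the equivalent of end of file.
chunks = []
while True:
    chunk = server_side.recv(1024)
    if not chunk:
        break
    chunks.append(chunk)
received = b"".join(chunks)

# From here until server_side.close(), netstat would show the server's
# end of this connection in CLOSE_WAIT. The server-to-client pipe still works.
server_side.sendall(b"final reply")
server_side.close()  # the server's own "close request"

# Client reads the reply until it, too, sees end of file.
reply_chunks = []
while True:
    chunk = client.recv(1024)
    if not chunk:
        break
    reply_chunks.append(chunk)
reply = b"".join(reply_chunks)

client.close()
listener.close()
print(received, reply)
```

If the server_side.close() call is left out (and the socket object is kept alive), the server's end stays in CLOSE_WAIT for as long as the process holds on to it.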
But if one side fails to fulfill its part of that contract, the connection is still there,
and it remains there indefinitely, until something forcibly terminates it. And all the
resources tied to that connection also remain tied to it, and are subtracted from the
overall resources which the server has available to perform other tasks.
From a server point of view, the "ideal" situation is when all connections are actually
"active" and really being used to do something useful (sending or receiving data e.g.).
The worst situation is when there are many "useless" connections : connections in some
state or the other, not actually doing anything useful, but tying up resources
nevertheless. This can get to the point where some inherent limit is reached, and the
server cannot accept any more connections, although in theory it still has enough other
resources available which would allow it to process more useful transactions.
Most of the "TCP states" that you see in the netstat output are transient, and last only a
few milliseconds usually. They are just part of the overall "TCP connection lifecycle"
which is cast in stone and which you can do nothing about.
But, for example, if there is a permanent very high number of connections in the
CLOSE_WAIT state, that is not "normal".
See here for an explanation of these TCP states, in particular CLOSE_WAIT :
http://www.tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF-2.htm
According to Rainer's counts above, you have 1650 connections in the ESTABLISHED state
(and for the time being, let's suppose that these are actually busy doing something useful).
But you also have 8381 connections in the CLOSE_WAIT state. These are not doing anything
useful, but they are blocking resources on your server. One essential resource which they
are blocking, is that there is (currently) a maximum *total* of 10,000 connections which
can be in existence at any one time, and these CLOSE_WAIT connections are occupying
(uselessly) 8381 of these "slots" (84%).
The precise reason why there are this many connections in that state is not clear to us,
but my money is on either some misconfiguration of the nginx-tomcat connections, or some
flaw in the application.
One thing which you could try, and which might provide a clue, is to, in quick succession,
do :
1) a "netstat" command to see how many connections are in CLOSE_WAIT state
2) /force/ a GC for Tomcat (*).
3) the same netstat command again, to check how many CLOSE_WAIT connections there are now
(*) someone else here should be able to contribute the easiest way to achieve this
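The three steps could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe for this installation. It assumes Linux-style "netstat -ant" output, and that a JDK with the jcmd tool is available. "jcmd <pid> GC.run" is one way to request a GC from outside the JVM; it normally has to be run as the same user as Tomcat. The helper names and the 127.0.0.1:8080 default are for illustration only.

```python
import subprocess

def count_close_wait(netstat_text, local_addr="127.0.0.1:8080"):
    """Count CLOSE_WAIT lines whose local (left-hand) address matches."""
    n = 0
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[3] == local_addr and fields[5] == "CLOSE_WAIT":
            n += 1
    return n

def run_text(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

def close_wait_around_gc(tomcat_pid, run=run_text):
    """Steps 1-3: count CLOSE_WAIT, request a GC, count again."""
    before = count_close_wait(run(["netstat", "-ant"]))
    run(["jcmd", str(tomcat_pid), "GC.run"])  # assumption: jcmd is on the PATH
    after = count_close_wait(run(["netstat", "-ant"]))
    return before, after

# Usage (hypothetical pid):
#   before, after = close_wait_around_gc(12345)
#   print("CLOSE_WAIT before GC:", before, "after GC:", after)
```

The run parameter is there only so that the commands can be swapped out, for example to try the counting on saved netstat output.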