Hi André,

If I'm not mistaken: if the application in question is monitored in VisualVM (https://visualvm.java.net/) through JMX, you can trigger a forced GC from its monitoring console.
In order to do that, these startup parameters may be necessary on the Java application side:

-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=9010
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false

Thanks,
Neill

On Wed, Apr 22, 2015 at 3:02 PM, André Warnier <a...@ice-sa.com> wrote:
> Rainer Jung wrote:
>> Am 22.04.2015 um 11:58 schrieb Thomas Boniface:
>>> What concerns me the most is the CLOSE_WAIT on the Tomcat side, because
>>> when an fd peak appears the web application appears to be stuck. It feels
>>> like all its connections are consumed and none can be established from
>>> nginx anymore. Shouldn't the CLOSE_WAIT connections be recycled to
>>> receive new connections from nginx?
>>
>> Just to clarify:
>>
>> Every connection has two ends. In netstat, the "local" end is on the left
>> and the "remote" end is on the right. If a connection is between two
>> processes on the same system, it will be shown in netstat twice, once with
>> each endpoint as the "local" side.
>>
>> CLOSE_WAIT for a connection between a (local) and b (remote) means that b
>> has closed the connection but a has not. There is no automatism for a
>> closing it just because b has closed it. If CLOSE_WAIT connections pile
>> up, then a and b disagree about when a connection should no longer be
>> used. E.g. they might have very different idle timeouts (keep-alive
>> timeout, in HTTP speak), or one observed a problem that the other did not.
>>
>> When I did the counting for
>>
>> Count IP:Port ConnectionState
>> 8381 127.0.0.1:8080 CLOSE_WAIT
>>
>> the "127.0.0.1:8080" was on the left in the netstat output, so "local".
>> It means the other side of the connection (likely nginx) has already
>> closed it, but Tomcat has not.
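[Editor's note: as a sketch of what VisualVM's GC button does under the hood, the java.lang:type=Memory MXBean can be invoked over the remote JMX endpoint that Neill's startup flags enable. The class name, host, and port below are illustrative assumptions, not from the thread.]

```java
import java.lang.management.MemoryMXBean;
import javax.management.JMX;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class RemoteGc {

    /** Invoke gc() on the java.lang:type=Memory MBean over any JMX connection. */
    public static void forceGc(MBeanServerConnection conn) throws Exception {
        MemoryMXBean memory = JMX.newMXBeanProxy(
                conn, new ObjectName("java.lang:type=Memory"), MemoryMXBean.class);
        memory.gc(); // same effect as VisualVM's GC button (runs a full GC)
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical host/port; must match the jmxremote.port flag above.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9010/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            forceGc(connector.getMBeanServerConnection());
        }
    }
}
```

Note that with authenticate=false and ssl=false this endpoint is wide open; those flags are only reasonable on a host you fully control.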
>>
>> And the total number of those connections:
>>
>> Count IP:Port ConnectionState
>> 8381 127.0.0.1:8080 CLOSE_WAIT
>> 1650 127.0.0.1:8080 ESTABLISHED
>>
>> indeed sums up to the default maxConnections of 10000 mentioned by Chris.
>>
>> What I do not understand is that the same connections, looked at with
>> nginx as the local end, show totally different statistics:
>>
>> Count IP:Port ConnectionState
>> 20119 127.0.0.1:8080 SYN_SENT
>> 4692 127.0.0.1:8080 ESTABLISHED
>> 488 127.0.0.1:8080 FIN_WAIT2
>> 122 127.0.0.1:8080 TIME_WAIT
>> 13 127.0.0.1:8080 FIN_WAIT1
>>
>> But maybe that's a problem to solve after you have fixed the CLOSE_WAIT
>> issue (or the 10000 limit) and redone the whole observation.
>>
>> Pretty big numbers you have ...
>
> Thomas,
> to elaborate on what Rainer wrote above:
>
> A TCP connection consists of two "pipes", one in each direction (client to
> server, server to client).
> From a TCP point of view, the "client" is the side which initially requests
> the connection; the "server" is the side which "accepts" it. (This is
> different from the more general idea of a "server", as in "Tomcat server".
> When Tomcat accepts an HTTP connection, it acts as the "server"; when a
> Tomcat webapp establishes a connection to an external HTTP server, the
> webapp, and by extension Tomcat, is the "client".)
>
> These two pipes can be closed independently of one another, but both need
> to be closed for the connection to be considered closed and able to
> "disappear".
> When the client wants to close the connection, it sends a "close request"
> packet on the client-to-server pipe.
> The server receives this and knows that the client will not send anything
> more on that pipe. For a server application reading that pipe, this
> results in the equivalent of an "end of file" on that data stream.
> In response to the client's close request, the server is supposed to stop
> sending data on the server-to-client pipe, and in turn to send a "close
> request" on that pipe.
> Once these various close messages have been received and acknowledged by
> both sides, the connection is considered closed, and the resources
> associated with it can be reclaimed/recycled/garbage-collected etc.
> ("closed" is a virtual state; it means that there is no connection).
>
> But if one side fails to fulfill its part of that contract, the connection
> is still there, and it remains there until something forcefully terminates
> it. All the resources tied to that connection also remain tied to it, and
> are subtracted from the overall resources the server has available to
> perform other tasks.
> From a server's point of view, the "ideal" situation is when all
> connections are actually "active" and really being used to do something
> useful (e.g. sending or receiving data).
> The worst situation is many "useless" connections: connections in some
> state or other, not actually doing anything useful, but tying up resources
> nevertheless. This can get to the point where some inherent limit is
> reached and the server cannot accept any more connections, although in
> theory it still has enough other resources available to process more
> useful transactions.
>
> Most of the "TCP states" that you see in the netstat output are transient
> and usually last only a few milliseconds. They are just part of the
> overall "TCP connection lifecycle", which is cast in stone and which you
> can do nothing about.
> But, for example, a permanently very high number of connections in the
> CLOSE_WAIT state is not "normal".
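[Editor's note: André's half-close sequence can be observed directly with plain Java sockets. A minimal local sketch (class name and structure are mine): the client closes only its sending pipe, the server reads end-of-file, and until the server also closes, its endpoint sits in CLOSE_WAIT.]

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class HalfCloseDemo {

    /** Returns what the server reads after the client half-closes: -1 (EOF). */
    static int readAfterClientHalfClose() throws IOException {
        try (ServerSocket server = new ServerSocket(0); // ephemeral port
             Socket client = new Socket("localhost", server.getLocalPort());
             Socket accepted = server.accept()) {

            // Client closes only its sending pipe ("close request" / TCP FIN).
            client.shutdownOutput();

            // The server, reading the client-to-server pipe, sees end-of-stream.
            // Its end of the connection is now in CLOSE_WAIT until it closes too,
            // which try-with-resources does when this method returns.
            return accepted.getInputStream().read();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAfterClientHalfClose()); // prints -1
    }
}
```

If the server never closed its side (e.g. a leaked socket in the application), the CLOSE_WAIT entry would persist in netstat, which is exactly the pile-up pattern discussed here.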
>
> See here for an explanation of these TCP states, in particular CLOSE_WAIT:
>
> http://www.tcpipguide.com/free/t_TCPOperationalOverviewandtheTCPFiniteStateMachineF-2.htm
>
> According to Rainer's counts above, you have 1650 connections in the
> ESTABLISHED state (and for the time being, let's suppose that these are
> actually busy doing something useful).
> But you also have 8381 connections in the CLOSE_WAIT state. These are not
> doing anything useful, but they are blocking resources on your server. One
> essential resource they block: there is (currently) a maximum *total* of
> 10,000 connections which can be in existence at any one time, and these
> CLOSE_WAIT connections are uselessly occupying 8381 of those "slots" (84%).
>
> The precise reason why there are this many connections in that state is
> not clear to us, but my money is on either some misconfiguration of the
> nginx-Tomcat connections, or some flaw in the application.
>
> One thing you could try, which might provide a clue, is to do the
> following in quick succession:
> 1) a "netstat" command, to see how many connections are in the CLOSE_WAIT
> state
> 2) /force/ a GC for Tomcat (*)
> 3) the same netstat command again, to check how many CLOSE_WAIT
> connections there are now
>
> (*) someone else here should be able to contribute the easiest way to
> achieve this
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org