RE: CLOSE_WAIT and what to do about it
From: André Warnier [mailto:a...@ice-sa.com]

public void close() throws SomeException {
    putEndRequest();
    flush();
    socket = null;
}

flush() being another function which reads the socket until there's nothing left to read, and throws away the result. socket is a property of the object created by this class, obtained somewhere else from a java.net.Socket object. Looking at that code above, it is obvious that socket is open, until it is set to null, without previously doing a socket.close(). I don't know Java enough to know if this alone could cause that socket to be lingering until the GC, but I kind of suspect so.

Nice piece of detective work, André! Yes, that code's broken - the socket's no longer referenced but was never closed, so it will stay open until a GC tidies it up. $deity only knows what the original developer was thinking when they wrote that.

- Peter

- To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org
RE: CLOSE_WAIT and what to do about it
From: André Warnier [mailto:a...@ice-sa.com]
Subject: Re: CLOSE_WAIT and what to do about it

If these sockets disappear during a GC, then it must mean that they are still being referenced by some abandoned objects sitting on the Heap, which have not yet been reclaimed by the GC. Which probably means that the objects in question have gone out of scope, before the socket they used was properly close()'d.

Your analysis looks reasonable to me. There are some analysis tools that will examine a live heap (or dump thereof) and find the reachable and unreachable objects; jhat is a free one that comes with JDK 6:
http://java.sun.com/javase/6/webnotes/trouble/TSG-VM/html/tooldescr.html#gblfj

- Chuck

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.
Re: CLOSE_WAIT and what to do about it
Caldarale, Charles R wrote:
From: André Warnier [mailto:a...@ice-sa.com]
Subject: Re: CLOSE_WAIT and what to do about it

Relatedly, does there exist any way to force a given JVM process to do a full GC interactively, but from a Linux command-line ?

Found a command line tool that will do what you want: http://code.google.com/p/jmxsh/

I've used it to trigger a GC in Tomcat via the following steps.

1) Start Tomcat with the following options:
   -Dcom.sun.management.jmxremote.port=port
   -Dcom.sun.management.jmxremote.authenticate=false
   -Dcom.sun.management.jmxremote.ssl=false
   (You can, of course, set the authentication and SSL options as needed.)
2) Start jmxsh from the directory its jar is in with this:
   java -jar jmxsh*.jar
3) Enter the following commands (but not the bracketed bits):
   jmx_connect -h localhost -p port
   [blank line to enter browse mode]
   5 [selects java.lang]
   1 [selects the Memory mbean]
   5 [performs a GC]

The doc for jmxsh indicates the above steps should be scriptable, but I haven't tried that. It is likely that you could use jmx_connect with a different kind of service and avoid opening up an RMI port; if I figure that out, I'll let you know.

Hi.
Thanks a million for providing the above info. That jmxsh program is really useful. I don't really know what I'm doing here, but I can at least more or less figure out what happens.

To recall, my original issue is that I have some Java applications (among which a Tomcat webapp and a couple of stand-alone Java daemon-like programs) which apparently leave an ever-increasing number of sockets lingering in a CLOSE_WAIT state. And I was wondering if it was possible, as one test, to force the JVM running these applications to perform a GC, right now, from the outside. Well, it is.

Following is a trace of a session with jmxsh, with one of these applications.
Initial socket situation :

r...@arthur:/home/star/xml# netstat -pan | grep CLOSE
tcp6   0  0 :::127.0.0.1:48267  :::127.0.0.1:11002  CLOSE_WAIT  7618/java
tcp6  12  0 :::127.0.0.1:36936  :::127.0.0.1:11002  CLOSE_WAIT  7816/java
tcp6  12  0 :::127.0.0.1:50322  :::127.0.0.1:11002  CLOSE_WAIT  7816/java

r...@arthur:/home/star/xml# ps -ef | grep 7618
root 7618 1 1 14:32 pts/3 00:00:15 ./java -server -Dcom.sun.management.jmxremote.port=11201 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Xms64M -Xmx64M -Dpgm=STARWeb -jar /home//web4/java/xyz.jar -c /home/star/web4/config -p 11101

The above is the process which I am going to stress, in the sense of communicating with it, which has the result of having it itself open a TCP connection with another server listening on port 11002, then closing this socket (in principle), and this multiple times. (As you see, the program was started with the jmxremote options allowing later communication with jmxsh.)

Now some interactions with the application pid=7618 ...

Situation later on :

r...@arthur:/home/star/xml# netstat -pan | grep CLOSE
tcp6   0  0 :::127.0.0.1:55798  :::127.0.0.1:11002  CLOSE_WAIT  7618/java
tcp6   0  0 :::127.0.0.1:57029  :::127.0.0.1:11002  CLOSE_WAIT  7618/java
tcp6   0  0 :::127.0.0.1:48267  :::127.0.0.1:11002  CLOSE_WAIT  7618/java
tcp6   0  0 :::127.0.0.1:56781  :::127.0.0.1:11002  CLOSE_WAIT  7618/java
tcp6  12  0 :::127.0.0.1:36936  :::127.0.0.1:11002  CLOSE_WAIT  7816/java
tcp6  12  0 :::127.0.0.1:58341  :::127.0.0.1:11002  CLOSE_WAIT  7816/java
tcp6   0  0 :::127.0.0.1:32972  :::127.0.0.1:11002  CLOSE_WAIT  7618/java
tcp6  12  0 :::127.0.0.1:50322  :::127.0.0.1:11002  CLOSE_WAIT  7816/java

So this application indeed left a number of sockets in the CLOSE_WAIT state.

Now triggering a GC with jmxsh :

a...@arthur:~$ java -jar bin/jmxsh-R4.jar
jmxsh v1.0, Tue Jan 22 17:23:12 GMT+01:00 2008
Type 'help' for help. Give the option '-?' to any command for usage help.
Starting up in shell mode.
% jmx_connect -h localhost -p 11201
Connected to service:jmx:rmi:///jndi/rmi://localhost:11201/jmxrmi.
%
Entering browse mode.

Available Domains:
  1. java.util.logging
  2. JMImplementation
  3. java.lang

SERVER: service:jmx:rmi:///jndi/rmi://localhost:11201/jmxrmi

Select a domain: 3

Available MBeans:
  1. java.lang:type=Compilation
  2. java.lang:type=MemoryManager,name=CodeCacheManager
  3. java.lang:type=GarbageCollector,name=Copy
  4. java.lang:type=MemoryPool,name=Eden Space
  5. java.lang:type=Runtime
  6. java.lang:type=ClassLoading
  7. java.lang:type=MemoryPool,name=Survivor Space
  8. java.lang:type
Re: CLOSE_WAIT and what to do about it
Caldarale, Charles R wrote:
From: André Warnier [mailto:a...@ice-sa.com]
Subject: Re: CLOSE_WAIT and what to do about it

If these sockets disappear during a GC, then it must mean that they are still being referenced by some abandoned objects sitting on the Heap, which have not yet been reclaimed by the GC. Which probably means that the objects in question have gone out of scope, before the socket they used was properly close()'d.

Your analysis looks reasonable to me. There are some analysis tools that will examine a live heap (or dump thereof) and find the reachable and unreachable objects; jhat is a free one that comes with JDK 6:
http://java.sun.com/javase/6/webnotes/trouble/TSG-VM/html/tooldescr.html#gblfj

All right, I have done that too. I generated a Heap dump using

jmap -heap:format=b pid

That gave me a file heap.bin of some 4.5 MB. I then used the jhat program to open it. jhat launches itself by default as a webserver on port 7000, which you can access using a normal browser.

That's where my problem starts however, because being a mere Java fiddler I don't really know what I am looking at, and what to look for. I did a lot of guesswork anyway, and using my knowledge of the application more than the links, I came upon the name of a class that looks like it is responsible for opening/closing the sockets that remain in CLOSE_WAIT. I found the following function in the class :

public void close() throws SomeException {
    putEndRequest();
    flush();
    socket = null;
}

flush() being another function which reads the socket until there's nothing left to read, and throws away the result. socket is a property of the object created by this class, obtained somewhere else from a java.net.Socket object.

Looking at that code above, it is obvious that socket is open, until it is set to null, without previously doing a socket.close(). I don't know Java enough to know if this alone could cause that socket to be lingering until the GC, but I kind of suspect so.
How does a Java expert look at that ?
RE: CLOSE_WAIT and what to do about it
From: André Warnier [mailto:a...@ice-sa.com]
Subject: Re: CLOSE_WAIT and what to do about it

Looking at that code above, it is obvious that socket is open, until it is set to null, without previously doing a socket.close(). I don't know Java enough to know if this alone could cause that socket to be lingering until the GC, but I kind of suspect so.

For not being that familiar with Java, you've done an admirable job of tracking this down. What you've found certainly looks like the cause of the problem; the class you encountered appears to be a wrapper for a plain java.net.Socket, and whoever wrote it simply missed putting in a socket.close() call. Perhaps this was originally developed on an older JVM with more frequent non-generational garbage collection, so the problem wasn't noticed then.

- Chuck
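For illustration, here is a minimal sketch of how the close() method quoted in this thread could release the socket. The vendor's class is not available, so everything here is an assumption: the class name SocketWrapper is hypothetical, putEndRequest() and flush() are empty or simplified placeholders for the vendor's methods, and IOException stands in for SomeException. The only point is that socket.close() runs, in a finally block, before the reference is dropped.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical stand-in for the vendor's wrapper class; all names are illustrative.
public class SocketWrapper {
    private Socket socket;

    public SocketWrapper(Socket socket) {
        this.socket = socket;
    }

    // Placeholder for the vendor's protocol-level "end of request" message.
    private void putEndRequest() throws IOException {
        // protocol-specific, deliberately left empty in this sketch
    }

    // Placeholder for the vendor's flush(): discard input that is already buffered.
    private void flush() throws IOException {
        InputStream in = socket.getInputStream();
        byte[] buf = new byte[1024];
        while (in.available() > 0 && in.read(buf) != -1) {
            // throw the data away
        }
    }

    public void close() throws IOException {
        if (socket == null) {
            return; // already closed, nothing to do
        }
        try {
            putEndRequest();
            flush();
        } finally {
            try {
                socket.close(); // the call missing from the original method
            } finally {
                socket = null; // drop the reference only after closing
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ServerSocket server = new ServerSocket(0); // any free local port
        try {
            Socket client = new Socket("127.0.0.1", server.getLocalPort());
            SocketWrapper wrapper = new SocketWrapper(client);
            wrapper.close();
            System.out.println("closed: " + client.isClosed());
        } finally {
            server.close();
        }
    }
}
```

Closing inside finally means the descriptor is released even if the end-of-request exchange throws, so the socket no longer depends on a later GC to leave CLOSE_WAIT.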
Re: CLOSE_WAIT and what to do about it
Caldarale, Charles R wrote:
From: André Warnier [mailto:a...@ice-sa.com]
Subject: Re: CLOSE_WAIT and what to do about it

Looking at that code above, it is obvious that socket is open, until it is set to null, without previously doing a socket.close(). I don't know Java enough to know if this alone could cause that socket to be lingering until the GC, but I kind of suspect so.

For not being that familiar with Java, you've done an admirable job of tracking this down. What you've found certainly looks like the cause of the problem; the class you encountered appears to be a wrapper for a plain java.net.Socket, and whoever wrote it simply missed putting in a socket.close() call. Perhaps this was originally developed on an older JVM with more frequent non-generational garbage collection, so the problem wasn't noticed then.

I was standing on the shoulders of giants. Thanks for the help.
RE: CLOSE_WAIT and what to do about it
From: André Warnier [mailto:a...@ice-sa.com]
Subject: Re: CLOSE_WAIT and what to do about it

Relatedly, does there exist any way to force a given JVM process to do a full GC interactively, but from a Linux command-line ?

Found a command line tool that will do what you want: http://code.google.com/p/jmxsh/

I've used it to trigger a GC in Tomcat via the following steps.

1) Start Tomcat with the following options:
   -Dcom.sun.management.jmxremote.port=port
   -Dcom.sun.management.jmxremote.authenticate=false
   -Dcom.sun.management.jmxremote.ssl=false
   (You can, of course, set the authentication and SSL options as needed.)
2) Start jmxsh from the directory its jar is in with this:
   java -jar jmxsh*.jar
3) Enter the following commands (but not the bracketed bits):
   jmx_connect -h localhost -p port
   [blank line to enter browse mode]
   5 [selects java.lang]
   1 [selects the Memory mbean]
   5 [performs a GC]

The doc for jmxsh indicates the above steps should be scriptable, but I haven't tried that. It is likely that you could use jmx_connect with a different kind of service and avoid opening up an RMI port; if I figure that out, I'll let you know.

- Chuck
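The same GC request can probably also be sent without jmxsh, using the JMX client classes that ship with the JDK. The following is only a sketch under assumptions: the class name RemoteGc and its command-line arguments are made up for illustration, the target JVM is assumed to have been started with the jmxremote options from step 1 (no authentication, no SSL), and the service URL follows the form that jmxsh prints when it connects.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper: asks a JVM to run a GC through its Memory MBean.
public class RemoteGc {

    // Invokes the no-argument "gc" operation of the java.lang:type=Memory MBean.
    public static void triggerGc(MBeanServerConnection connection) throws Exception {
        ObjectName memory = new ObjectName(ManagementFactory.MEMORY_MXBEAN_NAME);
        connection.invoke(memory, "gc", null, null);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java RemoteGc <host> <jmxremote-port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        // Same URL shape that jmxsh reports after jmx_connect.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            triggerGc(connector.getMBeanServerConnection());
            System.out.println("GC requested on " + host + ":" + port);
        } finally {
            connector.close();
        }
    }
}
```

Run from an SSH console as, for example, java RemoteGc localhost followed by the jmxremote port; comparing netstat output before and after shows whether the lingering sockets were only waiting for finalization.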
Re: CLOSE_WAIT and what to do about it
Skimmed quickly through your post there while working, so forgive me if this is irrelevant.

CLOSE_WAIT is a state where the connection has been closed by the remote end at the TCP/IP level, but the local application (in this case Java) has not closed its socket descriptor yet.

As a coincidence we just fixed this very same issue in our application, which uses the httpclient library. There is a known issue with the httpclient library where sockets are not closed after the connection ends (issue or feature, you be the judge); we worked around this by explicitly calling a close ourselves. If httpclient is used, that could be the culprit. See http://www.nabble.com/tcp-connections-left-with-CLOSE_WAIT-td13757202.html for a better description.

Rgds,
Taylan

André Warnier wrote:

Hi.
As a follow-up on another thread originally entitled "apache/tomcat communication issues (502 response)", I'd like to pursue the CLOSE_WAIT subject. Sorry if this post is a bit long, I want to make sure that I do provide all the necessary information.

Like the original poster, I am seeing on my systems a fair number of sockets apparently stuck for a long time in the CLOSE_WAIT state. (Sometimes several hundreds of them). They seem to predominantly concern Tomcat and other java processes, but as Alan pointed out previously and I confirm, my perspective is slanted, because we use a lot of common java programs and webapps on our servers, and the ones mostly affected talk to each other and come from the same vendor. Unfortunately also, I do not have the sources of these programs/webapps available, and will not get them, and I can't do without these programs.

It has been previously established that a socket in a long-time-lingering CLOSE_WAIT status is due to one or the other side of a TCP connection not properly closing its side of the connection when it is done with it. I also surmise (without having a definite proof of this), that this is essentially bad, as it ties up some resources that could be otherwise freed.
I have also been told or discovered that, our servers being Linux Debian servers, programs such as ps, netstat and lsof can help in determining precisely how many such lingering sockets there are, and who the culprit processes are (to some extent).

In our case, we know which are the programs involved, because we know which ones open a listening socket and on what fixed port, and we also know which are the other processes talking to them. But, as mentioned previously, we do not have the source of these programs and will not get them, but cannot practically do without them for now. But we do have full root control of the Linux servers where these programs are running.

So my question is : considering the situation above, is there something I can do locally to free these lingering CLOSE_WAIT sockets, and under which conditions ? (I must admit that I am a bit lost among the myriad options of lsof)

For example, suppose I start with a netstat -pan command and I see the display below (sorry for the line-wrapping). I see a number of sockets in the CLOSE_WAIT state, and for those I have a process-id, which I can associate to a particular process. For example, I see this line :

tcp6 12 0 :::127.0.0.1:41764 :::127.0.0.1:11002 CLOSE_WAIT 29649/java

which tells me that there is a local process 29649/java, with a local socket port 41764 in the CLOSE_WAIT state, related to another socket #11002 on the same host. On the other hand, I see this line :

tcp 0 0 127.0.0.1:11002 127.0.0.1:41764 FIN_WAIT2 -

which shows a local socket on port 11002, related to this other local socket port #41764, with no process-id/program displayed. What does that tell me ?

I also know that the process-id 29649 corresponds to a local java process, of the daemon variety, multi-threaded.
That program talks to another known server program, written in C, of which instances are started on an ad-hoc basis by inetd, and which listens on port 11002 (in fact it is inetd who does, and it passes this socket on to the process it forks, I understand that).

(The link with Tomcat is that I also see frequently the same situation, where the process owning the CLOSE_WAIT socket is Tomcat, more specifically one webapp running inside it. It's just that in this particular snapshot it isn't.)

What it looks like to me in this case, is that at some point one of the threads of process #29649 opened a client socket #41764 to the local inetd port #11002; that inetd then started the underlying server process (the C program); that the underlying C program then at some point exited; but that process #29649 never closes its side of the connection with port #11002.

Can I somehow detect this condition, and force the offending thread of process #29649 to close that socket (or just force this thread to exit) ?

I realise this may be a complex question, and that the answers may be
RE: CLOSE_WAIT and what to do about it
From: André Warnier [mailto:a...@ice-sa.com]

It has been previously established that a socket in a long-time-lingering CLOSE_WAIT status is due to one or the other side of a TCP connection not properly closing its side of the connection when it is done with it. I also surmise (without having a definite proof of this), that this is essentially bad, as it ties up some resources that could be otherwise freed.

At the very least it'll tie up a kernel data structure for the socket itself. I don't know modern Linux kernels well enough to know how buffers are allocated, but I suspect you won't be wasting much memory on buffers as they'll be allocated on-demand. You're probably talking tens to low hundreds of bytes for each one of these. You will also be consuming resources in whichever program is not closing the sockets correctly.

So my question is : considering the situation above, is there something I can do locally to free these lingering CLOSE_WAIT sockets, and under which conditions ? For example, I see this line :

tcp6 12 0 :::127.0.0.1:41764 :::127.0.0.1:11002 CLOSE_WAIT 29649/java

which tells me that there is a local process 29649/java, with a local socket port 41764 in the CLOSE_WAIT state, related to another socket #11002 on the same host. On the other hand, I see this line :

tcp 0 0 127.0.0.1:11002 127.0.0.1:41764 FIN_WAIT2 -

which shows a local socket on port 11002, related to this other local socket port #41764, with no process-id/program displayed. What does that tell me ?

The process that was on port 11002 closed its end of the socket and sent a FIN. Process 29649 hasn't closed its end of the socket yet.

I also know that the process-id 29649 corresponds to a local java process, of the daemon variety, multi-threaded.
That program talks to another known server program, written in C, of which instances are started on an ad-hoc basis by inetd, and which listens on port 11002 (in fact it is inetd who does, and it passes this socket on to the process it forks, I understand that).

The local Java process may have a resource leak. It appears not to have closed the socket it was using to communicate with the server. A possible reason for the lack of a PID on port 11002 is that the socket was handed across from inetd to the C daemon - not sure about this.

What it looks like to me in this case, is that at some point one of the threads of process #29649 opened a client socket #41764 to the local inetd port #11002; that inetd then started the underlying server process (the C program); that the underlying C program then at some point exited; but that process #29649 never closes its side of the connection with port #11002.

Agree.

Can I somehow detect this condition, and force the offending thread of process #29649 to close that socket (or just force this thread to exit) ?

Threads are flows of control. Threads do not reference objects other than from their stack and any thread-local storage - and there are plenty of other places that can hold onto objects! The socket may well be referenced from an object on the heap (not the stack) that's ultimately referenced by a static variable in a class, for example, in which case zapping a thread may well do nothing. You need to find out what, if anything, is holding onto the socket.

If you have some way of forcing that Java process to collect garbage, you should do so. It's possible for sockets that haven't been close()d to hang around, unreferenced but not yet garbage collected. A full GC would collect any of these, finalizing them as it does and hence closing the socket. If a full GC doesn't close the socket, some other object is still referencing it.
If a full GC doesn't clear the problem, you may need to go in with some memory-tracing tool and find out what's holding onto the socket. It's a long, long time since I had to do this in Java, so I have no idea of the appropriate tools - my brain's telling me Son of Strike, which is for the .Net CLR and *definitely* wrong!

Does that help? Or is it clear as mud?

- Peter
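The half-closed situation described above can be reproduced in a few lines of Java. This is a sketch under assumptions, not the vendor's code: the class name HalfCloseDemo and the method drainAndClose are invented for illustration, and a throwaway ServerSocket on the loopback interface plays the part of the inetd-started server. When the peer closes and sends its FIN, the reading side sees end-of-stream (read() returns -1); until it calls close() itself, its socket sits in CLOSE_WAIT.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo of reacting to the peer's FIN by closing our own side.
public class HalfCloseDemo {

    // Reads until the peer's FIN shows up as end-of-stream (-1), then closes
    // our side of the connection. Returns the number of bytes read before EOF.
    public static int drainAndClose(Socket socket) throws IOException {
        int total = 0;
        try {
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        } finally {
            socket.close(); // without this call the socket stays in CLOSE_WAIT
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        ServerSocket server = new ServerSocket(0); // any free local port
        try {
            Socket client = new Socket("127.0.0.1", server.getLocalPort());
            Socket accepted = server.accept();
            accepted.getOutputStream().write("bye".getBytes("US-ASCII"));
            accepted.close(); // server side sends FIN; client is now half-closed
            int read = drainAndClose(client);
            System.out.println("bytes read before EOF: " + read
                    + ", client closed: " + client.isClosed());
        } finally {
            server.close();
        }
    }
}
```

Pausing the program just before drainAndClose() and running netstat -pan should show the client port in CLOSE_WAIT and the server port in FIN_WAIT2, the same pair of lines quoted in this thread.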
Re: CLOSE_WAIT and what to do about it
Peter Crowther wrote: [...]

Does that help? Or is it clear as mud?

For no-java-expert-me, it is indeed of the hazy category. But it helps a lot, in the sense of adding a +3 in the column "get back to the vendor and ask them to fix their code". ;-)

Thanks.
Re: CLOSE_WAIT and what to do about it
Peter Crowther wrote: [...]

If you have some way of forcing that Java process to collect garbage, you should do so. It's possible for sockets that haven't been close()d to hang around, unreferenced but not yet garbage collected. A full GC would collect any of these, finalizing them as it does and hence closing the socket. If a full GC doesn't close the socket, some other object is still referencing it.

Hopping on that idea, and still considering the "try something from the outside, without modifying the code" kind of view :

This process is started as a daemon, with a java command-line. Is it possible to add some arguments to that command-line to induce the JVM to do a GC more often ? (I don't think that in this case it would have a very negative impact on performance.) It currently starts without any -D switches at all on the command-line, basically :

path/to/java/java -jar theapp.jar

The same question for the related Tomcat webapp (which I suspect of having the same issue). But in that case I do have to be a bit more careful regarding the performance impact, although this webapp is pretty much all that is running in this Tomcat. And that Tomcat (on some of our systems) starts under jsvc, and I don't really know where to set the parameters for that one under Linux.

Relatedly, does there exist any way to force a given JVM process to do a full GC interactively, but from a Linux command-line ? I have full access to these systems, but usually only in SSH console mode, and I don't know if there is any kind of graphical GUI installed or accessible on them.

Basically, I'd like to see if triggering a GC reduces this number of lingering sockets.
RE: CLOSE_WAIT and what to do about it
From: André Warnier [mailto:a...@ice-sa.com]

This process is started as a daemon, with a java command-line. Is it possible to add some arguments to that command-line to induce the JVM to do a GC more often ?

http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html - I don't think so, although the RMI option under Explicit Garbage Collection might work.

The same question for the related Tomcat webapp (which I suspect of having the same issue). But in that case I do have to be a bit more careful regarding the performance impact, although this webapp is pretty much all that is running in this Tomcat.

That one's easy. Add another webapp with one page. When the page is requested, call System.gc(). Job done!

Relatedly, does there exist any way to force a given JVM process to do a full GC interactively, but from a Linux command-line ?

I'm not aware of one, but I'm not an expert. I await the experts' comments with interest!

- Peter
RE: CLOSE_WAIT and what to do about it
From: André Warnier [mailto:a...@ice-sa.com]
Subject: Re: CLOSE_WAIT and what to do about it

Relatedly, does there exist any way to force a given JVM process to do a full GC interactively, but from a Linux command-line ?

I haven't found one yet, but there are numerous command-line monitoring utilities included with the JDK that display all sorts of GC information, using the same connection mechanism as JConsole. Since JConsole can force a GC in a JVM it's monitoring, doing it from the command line is feasible. Might have to do a little coding...

- Chuck
Re: CLOSE_WAIT and what to do about it
Hi André,

I didn't fully read all responses, so I hope I don't repeat too much (or worse, contradict statements contained in other replies).

On 08.04.2009 12:32, André Warnier wrote:

Like the original poster, I am seeing on my systems a fair number of sockets apparently stuck for a long time in the CLOSE_WAIT state. (Sometimes several hundreds of them). They seem to predominantly concern Tomcat and other java processes, but as Alan pointed out previously and I confirm, my perspective is slanted, because we use a lot of common java programs and webapps on our servers, and the ones mostly affected talk to each other and come from the same vendor. Unfortunately also, I do not have the sources of these programs/webapps available, and will not get them, and I can't do without these programs.

It has been previously established that a socket in a long-time-lingering CLOSE_WAIT status is due to one or the other side of a TCP connection not properly closing its side of the connection when it is done with it.

CLOSE_WAIT says the other side shut down the connection. TCP connections are allowed to stay for an arbitrary time in half-closed state. In general a TCP connection can be used in a duplex way. But assume one end has finished communication (sending data). Then it can already close its side of the connection.

The nice TCP state diagram is contained in the fundamental book of Stevens, and can be seen e.g. at http://www.cse.iitb.ac.in/perfnet/cs456/tcp-state-diag.pdf

As you can see, CLOSE_WAIT on one end always implies FIN_WAIT2 on the other end (except when between the two ends there's yet another component that interferes with the communication, like maybe a firewall). In the special situation where both ends of the communication are on the same system, one finds each connection twice, once from the point of view of each side of the connection. It is always important to think about which end one is looking at, when interpreting the two lines.
I also surmise (without having a definite proof of this), that this is essentially bad, as it ties up some resources that could be otherwise freed. I have also been told or discovered that, our servers being Linux Debian servers, programs such as ps, netstat and lsof can help in determining precisely how many such lingering sockets there are, and who the culprit processes are (to some extent).

True.

In our case, we know which are the programs involved, because we know which ones open a listening socket and on what fixed port, and we also know which are the other processes talking to them. But, as mentioned previously, we do not have the source of these programs and will not get them, but cannot practically do without them for now. But we do have full root control of the Linux servers where these programs are running.

The details may depend on the protocols used, and sometimes you can get information about timeouts you can set in the application, like idle timeouts for persistent connections.

So my question is : considering the situation above, is there something I can do locally to free these lingering CLOSE_WAIT sockets, and under which conditions ? (I must admit that I am a bit lost among the myriad options of lsof)

I would say no, if you can't change the application and its developer didn't provide any configuration options. CLOSE_WAIT, from the point of view of TCP, is a legitimate state without any built-in timeout.

For example, suppose I start with a netstat -pan command and I see the display below (sorry for the line-wrapping). I see a number of sockets in the CLOSE_WAIT state, and for those I have a process-id, which I can associate to a particular process. For example, I see this line :

tcp6 12 0 :::127.0.0.1:41764 :::127.0.0.1:11002 CLOSE_WAIT 29649/java

which tells me that there is a local process 29649/java, with a local socket port 41764 in the CLOSE_WAIT state, related to another socket #11002 on the same host.
On the other hand, I see this line :

tcp 0 0 127.0.0.1:11002 127.0.0.1:41764 FIN_WAIT2 -

which shows a local socket on port 11002, related to this other local socket port #41764, with no process-id/program displayed. What does that tell me ?

My interpretation (not 100% sure): I am not sure what your OS shows in netstat after closing the local side of a connection, more precisely whether the pid is still shown or is removed. Depending on the answer, either we have a simple one-sided shutdown, or even a process exit. In both cases the process owning port 41764 (pid 29649) didn't have any reason to use the established connection in the meantime, so it didn't realise that the connection is only half-open. As soon as it tried to use it, it should/would detect that and most likely (if programmed correctly) close it.

I also know that the process-id 29649 corresponds to a local java process, of the daemon variety, multi-threaded. That program talks to another known server