[ 
https://issues.apache.org/jira/browse/HBASE-11142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993260#comment-13993260
 ] 

Andrew Purtell commented on HBASE-11142:
----------------------------------------

A socket enters CLOSE_WAIT when the remote end has closed its side of the 
connection (it has sent a FIN packet) but the local application has not yet 
closed its socket. Port 50010 is the default HDFS DataNode data transfer port 
(dfs.datanode.address). Perhaps the master's DFS client is caching sockets for 
reuse but somehow not expiring the entries or closing the sockets. 
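
To make the state concrete, here is a minimal sketch in plain java.net, not 
the DFS client code and not a reproduction of the leak. The class and method 
names are made up for illustration. It opens a loopback connection, closes the 
"remote" end, and shows that the local end sees end-of-stream while still 
holding its descriptor open. While the local socket stays open like this, 
tools such as lsof or netstat would be expected to list it as CLOSE_WAIT.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; none of these names come from HBase or HDFS.
public class CloseWaitDemo {

    /**
     * Returns true if the local socket observed the peer's close
     * (end-of-stream) while the local socket itself was still open.
     */
    public static boolean peerClosedButLocalOpen() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Socket local = new Socket(loopback, server.getLocalPort());
            try {
                // Guard against blocking forever if something goes wrong.
                local.setSoTimeout(5000);

                // The "remote" side accepts and closes at once, sending a FIN.
                server.accept().close();

                // read() returning -1 means the peer's FIN has arrived.
                int result = local.getInputStream().read();

                // The local socket has not been closed yet, so its file
                // descriptor is still held by this process.
                return result == -1 && !local.isClosed();
            } finally {
                // Closing the local socket is what lets it leave CLOSE_WAIT.
                local.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("peer closed, local still open: "
                + peerClosedButLocalOpen());
    }
}
```

If a cache held on to sockets like "local" above without ever calling close(), 
each one would keep a file descriptor, which would be consistent with the 
growing handle count in the report. That is only a hypothesis at this point.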

> Taking snapshots can leave sockets on the master stuck in CLOSE_WAIT state
> --------------------------------------------------------------------------
>
>                 Key: HBASE-11142
>                 URL: https://issues.apache.org/jira/browse/HBASE-11142
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.2, 0.99.0, 0.96.1.1, 0.98.2
>            Reporter: Andrew Purtell
>
> As reported by Hansi Klose on user@. 
> {quote}
> We use a script to take snapshots on a regular basis and delete old ones.
> We noticed that the web interface of the HBase master was no longer working 
> because of too many open files.
> The master had reached its open file limit of 32768.
> When I ran lsof I saw that there were a lot of TCP CLOSE_WAIT handles open 
> with the regionserver as target.
> On the regionserver there is just one connection to the HBase master.
> I can see that the count of CLOSE_WAIT handles grows each time
> I take a snapshot. When I delete one, nothing changes.
> Each time I take a snapshot there are 20 - 30 new CLOSE_WAIT handles.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
