[ 
https://issues.apache.org/jira/browse/HBASE-24155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113749#comment-17113749
 ] 

Mark Robert Miller commented on HBASE-24155:
--------------------------------------------

It took me a bit longer, but I ended up tracking this down further. 
Raising the socket cache size and expiration for HDFS helped a fair amount, 
but about 50% of the original number of sockets were still being created. I 
traced a lot of them to *ReplicationSourceWALReader* and its reset to look for 
additional data to read.
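
For anyone who wants to watch the TIME_WAIT count while the tests run, here is 
a rough sketch of one way to do it on Linux. It is not the tooling used for the 
numbers above; the class and method names are made up for illustration, and it 
assumes the usual /proc/net/tcp layout, where the fourth column is the socket 
state in hex and 06 means TIME_WAIT.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of HBase or HDFS.
final class TimeWaitCounter {
    // Hex state code the Linux kernel reports for TIME_WAIT in /proc/net/tcp.
    private static final String TIME_WAIT = "06";

    /** Counts rows whose state column (fourth field) is TIME_WAIT. */
    static long countTimeWait(List<String> lines) {
        return lines.stream()
            .map(String::trim)
            .map(line -> line.split("\\s+"))
            // The header row and blank lines never match "06" in field 3.
            .filter(fields -> fields.length > 3 && TIME_WAIT.equals(fields[3]))
            .count();
    }

    public static void main(String[] args) throws IOException {
        long total = 0;
        // IPv4 and IPv6 sockets are listed in separate files.
        for (String file : new String[] {"/proc/net/tcp", "/proc/net/tcp6"}) {
            Path path = Paths.get(file);
            if (Files.isReadable(path)) {
                total += countTimeWait(Files.readAllLines(path));
            }
        }
        System.out.println("TIME_WAIT sockets: " + total);
    }
}
```

Running it in a loop before and after a config change, or around a single test 
class, should make it easier to see which tests contribute most of the sockets.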

> When running the tests, a tremendous number of connections are put into 
> TIME_WAIT.
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-24155
>                 URL: https://issues.apache.org/jira/browse/HBASE-24155
>             Project: HBase
>          Issue Type: Test
>          Components: test
>            Reporter: Mark Robert Miller
>            Priority: Major
>
> When you run the test suite and monitor the number of connections in 
> TIME_WAIT, it appears that a very large number of connections do not end up 
> with a proper connection close lifecycle or perhaps proper reuse.
> Given that connections can stay in TIME_WAIT for 1-4 minutes depending on OS/Env, 
> running the tests faster or with more tests in parallel increases the 
> TIME_WAIT connection buildup. Some tests spin up a very, very large number of 
> connections and if the wrong ones run at the same time, this can also greatly 
> increase the number of connections put into TIME_WAIT. This can have a 
> dramatic effect on performance (as it can take longer to create a new 
> connection) or cause tests to fail outright or time out.
> In my experience, a much, much smaller number of connections in a test suite 
> would end up in TIME_WAIT when connection handling is all correct.
> Notes to come in comments below.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
