after replication

Bernd Fehling (Commented) (JIRA) Wed, 04 Apr 2012 05:12:51 -0700

    [ https://issues.apache.org/jira/browse/SOLR-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246196#comment-13246196 ]


Bernd Fehling commented on SOLR-3280:
-------------------------------------

Sorry, I can't narrow it down any further. It may have been a "network hiccup", 
or the computing center was reconfiguring something on the network; I don't 
know. There is nothing in the Solr logs, the replication just hangs. The old 
index stays in use and keeps serving requests.
I found this through the server sys logs, because the space used by the index 
in the data directory had doubled for longer than 1 day. One slave had this in 
August and October last year (Solr 3.3), the other slave in October (Solr 3.3) 
and January this year (Solr 3.5). After I saw the CLOSE_WAIT connection with 
netstat and forced it to close, the system went back to normal operation: it 
started a new searcher with the new index, closed the old searcher, and deleted 
the old index.
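
For anyone who wants to check a slave the same way, here is a minimal sketch 
that filters saved netstat output for CLOSE_WAIT entries. It assumes the 
six-column layout printed by Linux "netstat -tn" (Proto, Recv-Q, Send-Q, local 
address, foreign address, state); the class and method names are made up for 
this example and are not part of Solr.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class CloseWaitScan {

    /** Returns the TCP lines of netstat output whose state column is CLOSE_WAIT. */
    static List<String> closeWaitLines(String netstatOutput) {
        List<String> hits = new ArrayList<>();
        for (String line : netstatOutput.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            // Assumed layout: Proto Recv-Q Send-Q Local Foreign State
            if (cols.length >= 6
                    && cols[0].startsWith("tcp")
                    && "CLOSE_WAIT".equals(cols[5])) {
                hits.add(line.trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java CloseWaitScan.java <file with 'netstat -tn' output>");
            return;
        }
        String out = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (String line : closeWaitLines(out)) {
            System.out.println(line);
        }
    }
}
```

Save the output of "netstat -tn" to a file on the slave and pass the file name 
as the only argument. An entry that points at the master's port and is still 
listed long after replication finished is a candidate for the stuck connection 
described above.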


                
> to many / sometimes stale CLOSE_WAIT connections from SnapPuller during / 
> after replication
> -------------------------------------------------------------------------------------------
>
>                 Key: SOLR-3280
>                 URL: https://issues.apache.org/jira/browse/SOLR-3280
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 3.5, 3.6, 4.0
>            Reporter: Bernd Fehling
>            Assignee: Robert Muir
>            Priority: Minor
>         Attachments: SOLR-3280.patch
>
>
> Sometimes there are too many CLOSE_WAIT connections, some of them stale, 
> left over on the SLAVE server during/after replication.
> Normally GC should clean this up, but this is not always the case.
> Also, if a CLOSE_WAIT connection is hanging, the new replication won't load.
> The dirty workaround so far is to fake a TCP connection as root to that 
> connection and close it. 
> After that the new replication will load, the old index and searcher are 
> released, and the system will
> return to normal operation.
> Background:
> The SnapPuller is using Apache httpclient 3.x and uses the 
> MultiThreadedHttpConnectionManager.
> After use, the manager keeps a connection (in CLOSE_WAIT) for further 
> requests.
> This is done by calling releaseConnection. But if a connection is stuck, it 
> is no longer available and a new
> connection from the pool is used.
> Solution:
> After calling releaseConnection clean up with closeIdleConnections(0).
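
The mechanics behind the background and solution above can be illustrated 
with plain sockets. This is only a stand-in, not the SnapPuller code and not 
the httpclient API: it shows that once the remote side has closed, the local 
socket stays open (the state netstat reports as CLOSE_WAIT) until the local 
side closes it too, which is the effect that closing idle connections after 
releaseConnection is meant to have. The class and method names are invented 
for this example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class CloseWaitDemo {

    /**
     * Opens a loopback connection, lets the server side close first, and
     * reports whether the client saw end-of-stream while its own socket
     * was still open.
     */
    static boolean peerClosedButLocalOpen() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Socket client = new Socket(loopback, server.getLocalPort());
            try {
                // The remote side accepts and closes right away, sending a FIN.
                server.accept().close();
                // read() returns -1 once the FIN has arrived. From here on the
                // client socket sits in CLOSE_WAIT until the client closes it.
                boolean sawEof = client.getInputStream().read() == -1;
                return sawEof && !client.isClosed();
            } finally {
                // Closing locally is what ends CLOSE_WAIT. A pool that only
                // marks the connection as reusable never reaches this step.
                client.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("peer closed, local socket still open: "
                + peerClosedButLocalOpen());
    }
}
```

A pooled connection that has merely been released is in the position of the 
client socket before the finally block: nothing closes it locally, so it can 
linger. Closing idle connections with a timeout of 0 right after the release 
trades connection reuse for a guaranteed local close.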

