[ https://issues.apache.org/jira/browse/SOLR-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374779#comment-15374779 ]
Shai Erera commented on SOLR-9290:
----------------------------------

bq. Interestingly, the number of connections stuck in CLOSE_WAIT decrease during indexing and increase again about 10 or so seconds after the indexing is stopped.

I've observed that too. It's not that they decrease, but rather that the connections change their state from CLOSE_WAIT to ESTABLISHED, then, when indexing is done, to TIME_WAIT, and finally to CLOSE_WAIT again. I believe this aligns with what the HC documentation says: the connections are not necessarily released, but kept in the pool. When you re-index, they are reused and then go back to the pool.

bq. However, this commit only increases the limits on how many update connections that can be open

That's interesting and might be a temporary workaround for the problem, which I intend to test shortly. In 5.4.1 they were both modified to 100,000:

{noformat}
-  public static final int DEFAULT_MAXUPDATECONNECTIONS = 10000;
-  public static final int DEFAULT_MAXUPDATECONNECTIONSPERHOST = 100;
+  public static final int DEFAULT_MAXUPDATECONNECTIONS = 100000;
+  public static final int DEFAULT_MAXUPDATECONNECTIONSPERHOST = 100000;
{noformat}

This can explain why we run into trouble with 5.5.1 but not with 5.4.1. Even in 5.4.1 there are a few hundred CLOSE_WAIT connections, but with 5.5.1 they reach (in our case) the order of 35-40K, at which point Solr became useless, unable to talk to the replica or pretty much anything else.

I see these can be defined in solr.xml, though it's not documented how, so I'm going to give it a try and will report back here.

> TCP-connections in CLOSE_WAIT spikes during heavy indexing when SSL is enabled
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-9290
>                 URL: https://issues.apache.org/jira/browse/SOLR-9290
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>    Affects Versions: 5.5.1, 5.5.2
>            Reporter: Anshum Gupta
>            Priority: Critical
>         Attachments: SOLR-9290-debug.patch, setup-solr.sh
>
>
> Heavy indexing on Solr with SSL leads to a lot of connections in CLOSE_WAIT
> state.
> At my workplace, we have seen this issue only with 5.5.1 and could not
> reproduce it with 5.4.1, but from my conversation with Shalin, he knows of
> users with 5.3.1 running into this issue too.
> Here's an excerpt from the email [~shaie] sent to the mailing list about
> what we see:
> {quote}
> 1) It consistently reproduces on 5.5.1, but *does not* reproduce on 5.4.1
> 2) It does not reproduce when SSL is disabled
> 3) Restarting the Solr process (sometimes both need to be restarted), the
> count drops to 0, but if indexing continues, they climb up again
> When it does happen, Solr seems stuck. The leader cannot talk to the
> replica, or vice versa, the replica is usually put in DOWN state and
> there's no way to fix it besides restarting the JVM.
> {quote}
> Here's the mail thread:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201607.mbox/%3c46cc66220a8143dc903fa34e79205...@vp-exc01.dips.local%3E
> Creating this issue so we could track this and have more people comment on
> what they see.
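
For anyone who wants to compare numbers on this issue, here is a minimal sketch of one way to tally connections per TCP state (CLOSE_WAIT, ESTABLISHED, TIME_WAIT) from captured netstat output. It is not part of Solr or of the attached patch. The class name TcpStateTally is hypothetical, the sample lines are made up, and the parser assumes the six-column layout that Linux "netstat -tan" prints (proto, recv-q, send-q, local address, foreign address, state); other platforms or flags may need a different column index.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TcpStateTally {

    // Counts how many TCP lines are in each state.
    // Assumption: the state is the 6th whitespace-separated column,
    // as in Linux `netstat -tan` output.
    static Map<String, Integer> countStates(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            // Skip headers, blank lines, and non-TCP sockets.
            if (cols.length < 6 || !cols[0].startsWith("tcp")) {
                continue;
            }
            counts.merge(cols[5], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines; in practice, feed in saved netstat output.
        List<String> sample = List.of(
            "Proto Recv-Q Send-Q Local Address     Foreign Address   State",
            "tcp   1      0      192.0.2.10:8983   192.0.2.11:50112  CLOSE_WAIT",
            "tcp   1      0      192.0.2.10:8983   192.0.2.11:50113  CLOSE_WAIT",
            "tcp   0      0      192.0.2.10:8983   192.0.2.11:50114  ESTABLISHED",
            "tcp   0      0      192.0.2.10:50200  192.0.2.11:8983   TIME_WAIT"
        );
        countStates(sample).forEach((state, n) -> System.out.println(state + " " + n));
    }
}
```

Running the tally on snapshots taken before, during, and about 10 seconds after an indexing run should make the CLOSE_WAIT to ESTABLISHED to TIME_WAIT transitions described in the comment visible as changes in the per-state counts.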