[ 
https://issues.apache.org/jira/browse/SOLR-8453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083141#comment-15083141
 ] 

Mark Miller commented on SOLR-8453:
-----------------------------------

I think it's more related to using HttpClient than http. We see random 
connection resets in many tests that go away with this, but looking at the 
consistent test we have that fails (SolrExampleStreamingTest#testUpdateField), 
we seem to hit the problem when HttpClient is cleaning up and closing the 
outputstream, which flushes a buffer.

{code}
java.net.SocketException: Connection reset
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
        at 
org.apache.http.impl.io.AbstractSessionOutputBuffer.flushBuffer(AbstractSessionOutputBuffer.java:159)
        at 
org.apache.http.impl.io.AbstractSessionOutputBuffer.flush(AbstractSessionOutputBuffer.java:166)
        at 
org.apache.http.impl.io.ChunkedOutputStream.close(ChunkedOutputStream.java:205)
        at 
org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:118)
        at 
org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:265)
        at 
org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:203)
        at 
org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:237)
        at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:122)
        at 
org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:685)
        at 
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487)
        at 
org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:882)
        at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
        at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
        at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:280)
        at 
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:161)
{code}

It may also be that it only happens with this chunked encoding. On close, it 
tries to write a 'closing chunk' and then flush: 
https://github.com/apache/httpcore/blob/4.0.x/httpcore/src/main/java/org/apache/http/impl/io/ChunkedOutputStream.java

If there is a problem here we get the connection reset.

It does actually seem like a bit of a race to me and I'm not sure how to 
address that yet (other than this patch). If you remove the 250ms poll that 
happens in 
ConcurrentUpdateSolrClient->sendUpdateStream->EntityTemplate->writeTo, it seems 
to go away. But that would indicate our connection management is a bit fragile, 
with the client kind of racing the server.

Still playing around to try and find other potential fixes.

> Local exceptions in DistributedUpdateProcessor should not cut off an ongoing 
> request.
> -------------------------------------------------------------------------------------
>
>                 Key: SOLR-8453
>                 URL: https://issues.apache.org/jira/browse/SOLR-8453
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>         Attachments: SOLR-8453.patch, SOLR-8453.patch, SOLR-8453.patch, 
> SOLR-8453.patch, SOLR-8453.patch, SOLR-8453.patch, SOLR-8453.patch
>
>
> The basic problem is that when we are streaming in updates via a client, an 
> update can fail in a way that further updates in the request will not be 
> processed, but not in a way that causes the client to stop and finish up the 
> request before the server does something else with that connection.
> This seems to mean that even after the server stops processing the request, 
> the concurrent update client is still in the process of sending the request. 
> It seems previously, Jetty would not go after the connection very quickly 
> after the server processing thread was stopped via exception, and the client 
> (usually?) had time to clean up properly. But after the Jetty upgrade from 
> 9.2 to 9.3, Jetty closes the connection on the server sooner than previous 
> versions (?), and the client does not end up getting notified of the original 
> exception at all and instead hits a connection reset exception. The result 
> was random fails due to connection reset throughout our tests and one 
> particular test failing consistently. Even before this update, it does not 
> seem like we are acting in a safe or 'behaved' manner, but our version of 
> Jetty was relaxed enough (or a bug was fixed?) for our tests to work out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to