[
https://issues.apache.org/jira/browse/SOLR-16099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621333#comment-17621333
]
Kevin Risden edited comment on SOLR-16099 at 11/10/22 5:45 PM:
---------------------------------------------------------------
SOLR-15955 is trying to get to Jetty 10. Based on the fix commit upstream
(https://github.com/eclipse/jetty.project/commit/dab4fe60d305416dd1d4dbd8da855e0482b63de9),
the fix isn't in Jetty 10.0.12 yet.
> HTTP Client threads can hang in Jetty's InputStreamResponseListener when
> using HTTP2 - impacts intra-node communication
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-16099
> URL: https://issues.apache.org/jira/browse/SOLR-16099
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud, SolrJ
> Reporter: Chris M. Hostetter
> Priority: Major
>
> There appears to be a Jetty HttpClient bug that makes it possible for a
> request thread to hang indefinitely while waiting to parse the response from
> remote Jetty servers. The thread hangs because it calls {{.wait()}} on a
> monitor lock that _should_ be notified by another (internal Jetty client)
> thread when a chunk of data is available from the wire – but in some cases
> this evidently never happens.
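> For illustration, a minimal sketch of the guarded wait/notify pattern involved
> (hypothetical names, not Jetty's actual code): a reader blocked in {{wait()}}
> with no timeout stays WAITING forever if the producer's {{notify()}} is lost.
> {noformat}
> public class MissedNotifyHang {
>   private final Object monitor = new Object();
>   private boolean dataAvailable = false;
>
>   // Consumer side (analogous to InputStreamResponseListener.Input.read()):
>   // blocks until a producer thread signals that a chunk arrived. With no
>   // timeout, a lost wakeup leaves this thread WAITING forever.
>   public int read() throws InterruptedException {
>     synchronized (monitor) {
>       while (!dataAvailable) {
>         monitor.wait(); // hangs indefinitely if onContent() below never runs
>       }
>       dataAvailable = false;
>       return 1; // pretend one chunk was consumed
>     }
>   }
>
>   // Producer side (analogous to the internal Jetty client thread): must be
>   // called for the reader to wake up; if it is skipped, the reader is stuck.
>   public void onContent() {
>     synchronized (monitor) {
>       dataAvailable = true;
>       monitor.notifyAll();
>     }
>   }
> }
> {noformat}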
> In the case of {{distrib=true}} requests processed by Solr (aggregating
> multiple per-shard responses from other nodes), this can manifest with stack
> traces like the following (taken from Solr 8.8.2)...
> {noformat}
> "thread",{
>   "id":14253,
>   "name":"httpShardExecutor-7-thread-13819-...",
>   "state":"WAITING",
>   "lock":"org.eclipse.jetty.client.util.InputStreamResponseListener@12b59075",
>   "lock-waiting":{
>     "name":"org.eclipse.jetty.client.util.InputStreamResponseListener@12b59075",
>     "owner":null},
>   "synchronizers-locked":["java.util.concurrent.ThreadPoolExecutor$Worker@1ec1aed0"],
>   "cpuTime":"65.4882ms",
>   "userTime":"60.0000ms",
>   "stackTrace":["[email protected]/java.lang.Object.wait(Native Method)",
>     "[email protected]/java.lang.Object.wait(Unknown Source)",
>     "org.eclipse.jetty.client.util.InputStreamResponseListener$Input.read(InputStreamResponseListener.java:318)",
>     "org.apache.solr.common.util.FastInputStream.readWrappedStream(FastInputStream.java:90)",
>     "org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:99)",
>     "org.apache.solr.common.util.FastInputStream.readByte(FastInputStream.java:217)",
>     "org.apache.solr.common.util.JavaBinCodec._init(JavaBinCodec.java:211)",
>     "org.apache.solr.common.util.JavaBinCodec.initRead(JavaBinCodec.java:202)",
>     "org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:195)",
>     "org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:51)",
>     "org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:696)",
>     "org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:412)",
>     "org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:761)",
>     "org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:369)",
>     "org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:297)",
>     "org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(HttpShardHandlerFactory.java:371)",
>     "org.apache.solr.handler.component.ShardRequestor.call(ShardRequestor.java:132)",
>     "org.apache.solr.handler.component.ShardRequestor.call(ShardRequestor.java:41)",
>     "[email protected]/java.util.concurrent.FutureTask.run(Unknown Source)",
>     "[email protected]/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)",
>     "[email protected]/java.util.concurrent.FutureTask.run(Unknown Source)",
>     "com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:180)",
>     "org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218)",
>     "org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$175/0x0000000840243c40.run(Unknown Source)",
>     "[email protected]/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)",
>     "[email protected]/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)",
>     "[email protected]/java.lang.Thread.run(Unknown Source)"]},
> {noformat}
> ...these {{httpShardExecutor}} threads can stay hung, tying up system
> resources indefinitely (unless they get a spurious {{notify()}} from the
> JVM). (In one case, it seems to have caused a request to hang for {*}10.3
> hours{*}.)
> Anecdotally:
> * There is some evidence that this problem did _*NOT*_ affect Solr 8.6.3,
> but does affect later versions
> ** suggesting the bug didn't exist in Jetty until _after_ 9.4.27.v20200227
> * Forcing the Jetty HttpClient to use HTTP/1.1 transport seems to prevent
> this problem from happening
> ** In Solr this can be done by setting the {{"solr.http1"}} system property
> ** Or by using the {{Http2SolrClient.Builder.useHttp1_1()}} method in client
> application code (see the sketch after this list)
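> A minimal SolrJ sketch of both workarounds; the base URL is a placeholder,
> and the boolean-arg form of {{useHttp1_1}} is an assumption that may vary by
> SolrJ version:
> {noformat}
> import org.apache.solr.client.solrj.impl.Http2SolrClient;
>
> public class ForceHttp11 {
>   public static void main(String[] args) throws Exception {
>     // Workaround 1: the "solr.http1" system property forces HTTP/1.1 for
>     // clients created after it is set (on a Solr node, pass -Dsolr.http1=true
>     // at startup so the intra-node clients pick it up).
>     System.setProperty("solr.http1", "true");
>
>     // Workaround 2: force HTTP/1.1 for a single client in application code.
>     // Assumption: boolean-arg variant of the builder method named above;
>     // the base URL here is hypothetical.
>     try (Http2SolrClient client =
>         new Http2SolrClient.Builder("http://localhost:8983/solr")
>             .useHttp1_1(true)
>             .build()) {
>       // Requests made through this client use HTTP/1.1 transport, avoiding
>       // the HTTP/2 InputStreamResponseListener hang described above.
>     }
>   }
> }
> {noformat}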