[ https://issues.apache.org/jira/browse/SOLR-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18057623#comment-18057623 ]

Luke Kot-Zaniewski commented on SOLR-18087:
-------------------------------------------

[~nazerke] Thanks for taking a look. Were you able to reproduce the slowness 
and stalling with large payloads? Depending on that, if we decide the issue is:

{quote} Fix client side first, so consumes fast (maybe the client is slow due 
to parsing/gc) {quote}

Then it is still somewhat interesting that this parsing bug/inefficiency 
affects HTTP/2 _more_ than HTTP/1 with large document payloads. Making parsing 
more resource-efficient would strictly be a win. However, if we can't find 
such a massive gain there, the other option down this path would be to buffer 
_more_ payload in _application_ memory, which eagerly frees the client 
flow-control window but at the cost of keeping more on-heap. This is tricky, 
of course, because you still want some form of back-pressure; the question 
becomes how much buffering is enough to ensure stability.
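
For illustration only, here is a minimal sketch of that eager-drain idea (not 
anything in Solr today; the chunk size and queue bound are arbitrary numbers). 
A drainer thread copies the response stream into a bounded queue, so the 
flow-control window is released as fast as the drainer can read, while the 
queue bound caps heap usage and supplies the back-pressure:

{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Illustrative only: eagerly drains a response stream into a bounded queue. */
class EagerDrain {
  static final byte[] EOF = new byte[0]; // sentinel marking end of stream

  /** The queue bound (64 chunks x 64 KiB = 4 MiB, arbitrary) is the back-pressure. */
  static BlockingQueue<byte[]> drain(InputStream network) {
    BlockingQueue<byte[]> chunks = new ArrayBlockingQueue<>(64);
    Thread drainer = new Thread(() -> {
      try (network) {
        byte[] buf = new byte[64 * 1024];
        int n;
        while ((n = network.read(buf)) != -1) {
          byte[] chunk = new byte[n];
          System.arraycopy(buf, 0, chunk, 0, n);
          chunks.put(chunk); // blocks when the queue is full
        }
        chunks.put(EOF);
      } catch (IOException | InterruptedException e) {
        Thread.currentThread().interrupt(); // sketch: real code must signal the consumer
      }
    });
    drainer.setDaemon(true);
    drainer.start();
    return chunks; // a slow parser consumes from here at its own pace
  }
}
{code}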

I can confirm that tuning the flow control windows avoids the stalling (I 
believe I noted that somewhere in the report), but I was still surprised by 
how much slower HTTP/2 was in the test configurations I presented. This is why 
I would also appreciate knowing whether you were able to recreate those 
results.
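
For anyone wanting to experiment outside Solr, tuning those windows on a raw 
Jetty client looks roughly like this (window sizes are arbitrary examples, and 
the transport package moved between Jetty 11 and 12):

{code:java}
import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.http2.client.HTTP2Client;
// Jetty 10/11 package; Jetty 12 uses ...http2.client.transport
import org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2;

public class TunedHttp2 {
  public static HttpClient build() throws Exception {
    HTTP2Client http2 = new HTTP2Client();
    // Per-connection (session) window, shared by every multiplexed stream.
    http2.setInitialSessionRecvWindow(16 * 1024 * 1024);
    // Per-stream window; keeping it well below the session window leaves
    // room for other streams and avoids the cannibalization described here.
    http2.setInitialStreamRecvWindow(2 * 1024 * 1024);
    HttpClient client = new HttpClient(new HttpClientTransportOverHTTP2(http2));
    client.start();
    return client;
  }
}
{code}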


[~dsmiley] 

{quote} We have a work-around. Many people simply won't notice (I think) – we 
generally haven't noticed yet the problems seems to be for a number of 
releases. {quote}

The workaround is to use an older protocol, which may not be sustainable in 
the long term. I do wonder if some users are already overriding 
solr.http1=true in their launch configuration, because this workaround has 
been tribal knowledge for some time:

https://stackoverflow.com/questions/63335013/solr-max-requests-queued-per-destination-3000-exceeded-for-httpdestination-tim

https://issues.apache.org/jira/browse/SOLR-16229

https://lists.apache.org/thread/m0z7jnpll6fv110pw8mm8p9qx1dfodwn

https://apachesolr.slack.com/archives/C01GVPZSSK0/p1681395076935209?thread_ts=1681392590.337879&cid=C01GVPZSSK0
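
For reference, the override those threads describe typically lands in 
solr.in.sh along these lines (assuming the solr.http1 system property, which 
switches the internal clients to HTTP/1.1):

{code}
# Force Solr's internal HTTP clients onto HTTP/1.1
SOLR_OPTS="$SOLR_OPTS -Dsolr.http1=true"
{code}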

Btw, we still see the http2-related "SOLR Max requests queued per destination 
3000 exceeded" errors on clouds we haven't yet switched to http1 (while 
running the latest 9.x).

> HTTP/2 Struggles With Streaming Large Responses
> -----------------------------------------------
>
>                 Key: SOLR-18087
>                 URL: https://issues.apache.org/jira/browse/SOLR-18087
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Luke Kot-Zaniewski
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: flow-control-stall.log, index-recovery-tests.md, 
> stream-benchmark-results.md
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There appear to be some severe regressions after expansion of HTTP/2 client 
> usage since at least 9.8, most notably with the stream handler as well as 
> index recovery. The impact is at the very least slowness and in some cases 
> outright response stalling. The obvious thing these two very different 
> workloads share in common is that they stream large responses. This means, 
> among other things, that they may be more directly impacted by HTTP2's flow 
> control mechanism. More specifically, the response stalling appears to be 
> caused by session window "cannibalization", i.e. shards 1 and 2's responses 
> occupy the entirety of the session window *but* haven't been consumed yet, 
> and then, say, TupleStream calls next on shard N (because it is at the top of 
> the priority queue) but the server has nowhere to put this response since 
> shards 1 and 2 have exhausted the client buffer.
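> For illustration (made-up numbers): with a 1 MiB session window, if shards 1 
> and 2 each have 512 KiB of un-consumed response data buffered client-side, 
> the session window is fully spent; no WINDOW_UPDATE goes out because nothing 
> is being read, so the server cannot send a single DATA frame for shard N and 
> the blocking read on shard N never returns.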
> In my testing I have tweaked the following parameters:
>  # http1 vs http2 - as stated, http1 seems to be strictly better, i.e. 
> faster and more stable.
>  # shards per node - the greater the number of shards per node the more 
> (large, simultaneous) responses share a single connection during inter-node 
> communication. This has generally resulted in poorer performance.
>  # maxConcurrentStreams - reducing this to, say, 1 can effectively 
> circumvent multiplexing, which does seem to improve index recovery in 
> HTTP/2, but this is not a good setting to keep for production use because it 
> is global and affects *everything*, not just recovery or streaming.
>  # initialSessionRecvWindow - This is the amount of buffer the client gets 
> initially for each connection. This gets shared by the many responses that 
> share the multiplexed connection.
>  # initialStreamRecvWindow - This is the amount of buffer each stream gets 
> initially within a single HTTP/2 session. I've found that when this is too 
> big relative to initialSessionRecvWindow it can lead to stalling because of 
> flow control enforcement.
>  # Simple vs Buffering Flow Control Strategy - Controls how frequently the 
> client sends a WINDOW_UPDATE frame to signal the server to send more data. 
> "Simple" sends the frame after consuming any amount of bytes while 
> "Buffering" waits until a consumption threshold is met. So far "Simple" has 
> NOT worked reliably for me, which is probably why the default is 
> "Buffering". A rough sketch of switching strategies follows this list.
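> As a rough sketch (illustrative only; 0.5 is just an example buffer ratio), 
> that last knob maps to Jetty's flow control strategy factory:
> {code:java}
> import org.eclipse.jetty.http2.BufferingFlowControlStrategy;
> import org.eclipse.jetty.http2.SimpleFlowControlStrategy;
> import org.eclipse.jetty.http2.client.HTTP2Client;
> 
> public class FlowControlChoice {
>   public static HTTP2Client buffering() {
>     HTTP2Client http2 = new HTTP2Client();
>     // "Buffering": WINDOW_UPDATE only after a threshold of the window is consumed.
>     http2.setFlowControlStrategyFactory(() -> new BufferingFlowControlStrategy(0.5F));
>     // "Simple" alternative: WINDOW_UPDATE after every consumed chunk.
>     // http2.setFlowControlStrategyFactory(SimpleFlowControlStrategy::new);
>     return http2;
>   }
> }
> {code}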
> I’m attaching summaries of my findings, some of which can be reproduced by 
> running the appropriate benchmark in this 
> [branch|https://github.com/kotman12/solr/tree/http2-shenanigans].
>  The stream benchmark results md file includes the command I ran to achieve 
> the result described. 
> Next steps:
> Reproduce this in a pure Jetty example. I am beginning to think that 
> multiple large responses getting streamed simultaneously between the same 
> client and server may hit some kind of edge case in the library or the 
> protocol itself. It may have something to do with how Jetty's 
> InputStreamResponseListener is implemented, although according to the docs 
> it _should_ be compatible with HTTP/2. Furthermore, there may be some other 
> levers offered by HTTP/2 which are not yet exposed by the Jetty API. A 
> hypothetical repro sketch is below.
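> A pure-Jetty repro might look something like this (hypothetical URL and 
> endpoint; TunedHttp2 is the window-tuning sketch from the comment above, and 
> the InputStreamResponseListener package moved in Jetty 12):
> {code:java}
> import java.io.InputStream;
> import java.util.ArrayList;
> import java.util.List;
> import java.util.concurrent.TimeUnit;
> import org.eclipse.jetty.client.HttpClient;
> import org.eclipse.jetty.client.util.InputStreamResponseListener; // Jetty 10/11
> 
> public class StallRepro {
>   public static void main(String[] args) throws Exception {
>     // Start several large downloads multiplexed on ONE connection, but
>     // read only the last stream; if the first two pin the whole session
>     // window, this read should stall under flow control.
>     HttpClient client = TunedHttp2.build();
>     List<InputStreamResponseListener> listeners = new ArrayList<>();
>     for (int i = 0; i < 3; i++) {
>       InputStreamResponseListener l = new InputStreamResponseListener();
>       client.newRequest("https://localhost:8983/big-response").send(l);
>       l.get(5, TimeUnit.SECONDS); // wait for response headers only
>       listeners.add(l);
>     }
>     byte[] buf = new byte[8192];
>     try (InputStream last = listeners.get(2).getInputStream()) {
>       while (last.read(buf) != -1) { /* expect a stall here */ }
>     }
>     client.stop();
>   }
> }
> {code}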
> On the other hand, we could consider having separate connection pools for 
> HTTP clients that stream large responses. There seems to be at least [some 
> precedent|https://www.akamai.com/site/en/documents/research-paper/domain-sharding-for-faster-http2-in-lossy-cellular-networks.pdf]
>  for doing this.
> > We investigate and develop a new domain-sharding technique that isolates 
> > large downloads on separate TCP connections, while keeping downloads of 
> > small objects on a single connection.
> HTTP/2 seems designed for [bursty, small 
> traffic|https://hpbn.co/http2/?utm_source=chatgpt.com#one-connection-per-origin], 
> which is why flow control may not impact such workloads as much. Also, if 
> your payload is small relative to your headers then HTTP/2's header 
> compression might be a big win, but for large responses, not so much.
> > Most HTTP transfers are short and bursty, whereas TCP is optimized for 
> > long-lived, bulk data transfers.


