[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218982#comment-17218982
 ] 

David Smiley commented on SOLR-14354:
-------------------------------------

I started to revert but there are some difficult merge conflicts. Can you 
please handle this, [~caomanhdat]? If not, I shall, and will share a PR with you.

> HttpShardHandler send requests in async
> ---------------------------------------
>
>                 Key: SOLR-14354
>                 URL: https://issues.apache.org/jira/browse/SOLR-14354
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Cao Manh Dat
>            Assignee: Cao Manh Dat
>            Priority: Blocker
>             Fix For: master (9.0), 8.7
>
>         Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently 
> handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles a search request submits n requests (n equals 
> the number of shards) to an executor. Each request corresponds to a thread; 
> after sending its request, that thread basically does nothing but wait for 
> the response from the other side. The thread gets swapped out and the CPU 
> moves on to another thread (this is a context switch: the CPU saves the 
> context of the current thread and switches to another one). When some data 
> (not all of it) comes back, the thread is woken up to parse that data, then 
> it waits again for more data. So there is a lot of context switching in the 
> CPU, which is quite an inefficient use of threads. Basically we want fewer 
> threads, with most of them busy all the time, because threads are not free 
> and neither is context switching. That is the main idea behind constructs 
> like executors.
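> To make the problem concrete, here is a minimal sketch of the 
> thread-per-shard pattern described above (illustrative only; 
> {{sendBlockingRequest}} is a hypothetical helper standing in for the real 
> blocking call):
> {code:java}
> ExecutorService executor = Executors.newCachedThreadPool();
> List<Future<NamedList<Object>>> futures = new ArrayList<>();
> for (String shardUrl : shardUrls) {
>   futures.add(executor.submit(() -> {
>     // This pool thread blocks here until the whole response has arrived,
>     // being context-switched out while it waits
>     return sendBlockingRequest(shardUrl); // hypothetical blocking helper
>   }));
> }
> for (Future<NamedList<Object>> future : futures) {
>   NamedList<Object> rsp = future.get(); // the main thread blocks too
> } {code}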
> h2. 2. Async calls with Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
>         // Add request hooks
>         .onRequestQueued(request -> { ... })
>         .onRequestBegin(request -> { ... })
>         // Add response hooks
>         .onResponseBegin(response -> { ... })
>         .onResponseHeaders(response -> { ... })
>         .onResponseContent((response, buffer) -> { ... })
>         .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. Then when the client receives the headers from the other side, it 
> calls the {{onHeaders()}} listeners. When the client receives some 
> {{byte[]}} of the data (not the whole response), it calls the 
> {{onContent(buffer)}} listeners. When everything is finished, it calls the 
> {{onComplete}} listeners. One important thing to notice here is that all 
> listeners should finish quickly: if a listener blocks, all further data of 
> that request won't be handled until the listener finishes.
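> For example, if per-chunk processing is heavy, it can be handed off to an 
> executor so that the listener itself stays fast. A minimal sketch, where 
> {{executor}} and {{process}} are assumptions; the buffer is copied because 
> Jetty may reuse it after the callback returns:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
>         .onResponseContent((response, buffer) -> {
>           ByteBuffer copy = ByteBuffer.allocate(buffer.remaining());
>           copy.put(buffer).flip();   // copy: Jetty may recycle this buffer
>           executor.execute(() -> process(copy)); // heavy work off the I/O thread
>         })
>         .send(result -> { /* completed */ }); {code}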
> h2. 3. Solution 1: Send requests async but spin up one thread per response
>  Jetty HttpClient already provides several listeners; one of them is 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread trying to read the response content from the InputStream
>  * one thread (a short-lived task) feeding content to the above 
> InputStream whenever some byte[] is available. Note that if this thread is 
> unable to feed data into the InputStream, it will wait.
> Using this listener, the model of HttpShardHandler can be rewritten into 
> something like this
> {code:java}
> handler.sendReq(req, is -> {
>   executor.submit(() -> {
>     try (InputStream stream = is) {
>       // Read the content from the InputStream
>     } catch (IOException e) {
>       // handle the error
>     }
>   });
> }); {code}
>  The first diagram then changes into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is drawn wide, it won't take 
> long, since sending a request is a very quick operation. With this model, 
> handling threads won't be spun up until the first bytes are sent back. 
> Notice that in this approach we still have active threads waiting for more 
> data from the InputStream, as the sketch below shows.
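> Putting Solution 1 together, a minimal sketch of what such a {{sendReq}} 
> could look like (hypothetical glue code, not the actual HttpShardHandler 
> API):
> {code:java}
> void sendReq(HttpClient client, String url, ExecutorService executor,
>              Consumer<InputStream> consumer) {
>   InputStreamResponseListener listener = new InputStreamResponseListener();
>   client.newRequest(url).send(listener); // returns immediately, no blocking
>   executor.submit(() -> {
>     // This pool thread blocks while reading: exactly the "active threads
>     // waiting for more data" caveat above
>     try (InputStream is = listener.getInputStream()) {
>       consumer.accept(is);
>     }
>     return null;
>   });
> } {code}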
> h2. 4. Solution 2: Buffer the data and handle it inside Jetty's threads
> Jetty has another listener called BufferingResponseListener. This is how it 
> is used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   @Override
>   public void onComplete(Result result) {
>     if (result.isSucceeded()) {
>       byte[] response = getContent(); // the fully buffered response body
>       // handle the response
>     }
>   }
> }); {code}
> On receiving data, Jetty (one of its threads) calls the listener with the 
> given data (the data here is just a byte[] representing part of the 
> response). The listener then buffers that byte[] into an internal buffer. 
> When all the data has been received, Jetty calls the listener's onComplete, 
> and inside that method we have the whole response.
> Using this listener, the model of HttpShardHandler can be rewritten into 
> something like this
> {code:java}
> handler.send(req, (byte[] data) -> {
>   // handle the data here
> }); {code}
>  The first diagram then changes into this
> !image-2020-03-23-10-12-00-661.png!
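> Putting Solution 2 together with parsing, a minimal sketch (here {{url}} 
> and {{parser}} are assumed, and reusing an InputStream-based ResponseParser 
> on the buffered body is illustrative):
> {code:java}
> client.newRequest(url).send(new BufferingResponseListener() {
>   @Override
>   public void onComplete(Result result) {
>     if (result.isFailed()) {
>       return; // error handling omitted in this sketch
>     }
>     try (InputStream in = getContentAsInputStream()) {
>       NamedList<Object> rsp = parser.processResponse(in, "UTF-8");
>       // hand rsp back to the waiting logic
>     } catch (IOException e) {
>       // handle the error
>     }
>   }
> }); {code}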
> Pros:
>  * We don't need an additional thread for each request → fewer threads
>  * No threads are actively waiting for data from an InputStream → threads 
> are kept busy
> Cons:
>  * All data must be buffered before it can be parsed → roughly double the 
> memory is used to parse a response.
> h2. 5. Solution 3: Why not both?
> Solution 1 is good for parsing very large, or sometimes _unbounded_ (as in 
> StreamingExpression), responses.
> Solution 2 is good for parsing small responses (maybe < 10KB) since the 
> overhead is small.
> Should we combine both solutions above? After all, what HttpSolrClient has 
> returned so far for all requests is a NamedList<>, so as long as we return 
> a NamedList<>, whether Solution 1 or Solution 2 was used does not matter to 
> users.
> Therefore the idea here is to decide based on the Content-Length response 
> header: if the response body is smaller than a certain size we go with 
> Solution 2, otherwise with Solution 1.
> _Note:_ Solr does not seem to return Content-Length accurately; this needs 
> more investigation.
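> A sketch of how that dispatch could look, using a custom listener that 
> picks a strategy once the headers arrive (the class and the 10KB threshold 
> are illustrative):
> {code:java}
> class AdaptiveListener extends Response.Listener.Adapter {
>   private static final long THRESHOLD = 10 * 1024; // illustrative cutoff
>   private boolean buffer;
> 
>   @Override
>   public void onHeaders(Response response) {
>     long len = response.getHeaders()
>         .getLongField(HttpHeader.CONTENT_LENGTH.asString());
>     // Unknown length (-1) falls back to streaming (Solution 1)
>     buffer = len >= 0 && len < THRESHOLD;
>   }
> 
>   @Override
>   public void onContent(Response response, ByteBuffer content) {
>     if (buffer) {
>       // accumulate into an in-memory buffer (Solution 2)
>     } else {
>       // feed a blocking InputStream consumed by a pool thread (Solution 1)
>     }
>   }
> } {code}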
> h2. 6. Further improvement
>  The best approach to solving this problem: instead of converting an 
> InputStream to a NamedList, why don't we just consume the response byte by 
> byte and make the parsing resumable? Like this
> {code:java}
> Parser parser = new Parser();
> 
> public void onContent(ByteBuffer buffer) {
>   // Feed each chunk into the resumable parser; this must never block
>   parser.parse(buffer);
> }
> 
> public void onComplete() {
>   NamedList<Object> result = parser.getResult();
> } {code}
>  Therefore, there would be no blocking operation inside the parser, making 
> for a very efficient model. But doing this requires tons of changes in 
> Solr: rewriting all the ResponseParsers, not to mention that the flow here 
> must be rewritten. Not sure it is worth doing. The contract such a parser 
> would have to satisfy is sketched below.
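> A hypothetical contract for such a parser (entirely illustrative):
> {code:java}
> interface ResumableParser {
>   // Consume whatever bytes are currently available; must never block
>   void parse(ByteBuffer chunk);
>   // Valid once all chunks have been fed
>   NamedList<Object> getResult();
> } {code}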


