[jira] [Commented] (SOLR-14991) tag and remove obsolete branches

2020-11-12 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17230628#comment-17230628
 ] 

Cao Manh Dat commented on SOLR-14991:
-

Thank you Erick, I am OK with removing that!

> tag and remove obsolete branches
> 
>
> Key: SOLR-14991
> URL: https://issues.apache.org/jira/browse/SOLR-14991
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I'm going to gradually work through the branches, tagging and removing
> 1> anything with a Jira name that's fixed
> 2> anything that I'm certain will never be fixed (e.g. the various gradle 
> build branches)
> So the changes will still be available; they just won't pollute the branch list.
> I'll list the branches here; all the tags will be
> history/branches/lucene-solr/
>  
> This specifically will _not_ include
> 1> any release, e.g. branch_8_4
> 2> anything I'm unsure about. People who've created branches should expect 
> some pings about this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14354) HttpShardHandler send requests in async

2020-10-26 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14354:

Fix Version/s: (was: 8.7)

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Blocker
> Fix For: master (9.0)
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png, 
> image-2020-10-23-16-45-20-034.png, image-2020-10-23-16-45-21-789.png, 
> image-2020-10-23-16-45-37-628.png
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request submits n requests (n equals 
> the number of shards) to an executor. Each request therefore corresponds to a 
> thread; after sending its request, that thread basically does nothing but wait 
> for the response from the other side. The thread gets swapped out and the CPU 
> tries to handle another thread (this is called a context switch: the CPU saves 
> the context of the current thread and switches to another one). When some data 
> (not all) comes back, the thread is woken up to parse that data, then it waits 
> again until more data comes back. So there is a lot of context switching in the 
> CPU, which is quite an inefficient use of threads. Basically we want fewer 
> threads, and most of them should be busy all the time, because threads are not 
> free and neither are context switches. That is the main idea behind constructs 
> like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One important 
> thing to notice here is that all listeners should finish quickly; if a listener 
> blocks, further data of that request won't be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spinning up one thread per response
> Jetty HttpClient already provides several listeners, one of them being 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread reading the response content from the InputStream
>  * one thread (a short-lived task) feeding content into the above InputStream 
> whenever some byte[] is available. Note that if this thread is unable to feed 
> data into the InputStream, it will wait.
> Using this listener, the model of HttpShardHandler can be rewritten into 
> something like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from the InputStream
>     }
>   });
> }); {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is drawn wide, it won’t take a 
> long time, since sending a request is a very quick operation. With this 
> approach, handling threads won’t be spun up until the first bytes come back. 
> Notice also that in this approach we still have active threads waiting for 
> more data from the InputStream.
> h2. 4. Solution 2: Buffering data and handling it inside Jetty’s thread
> Jetty has another listener called BufferingResponseListener. This is how it is 
> used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] 
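
The quoted message is cut off inside the Solution 2 sample. For reference, a minimal sketch of how Jetty's BufferingResponseListener is typically used; the URL and the response handling inside onComplete are illustrative placeholders, not code from the patch:

{code:java}
client.newRequest("http://localhost:8983/solr/techproducts/select?q=*:*") // placeholder URL
  .send(new BufferingResponseListener() {
    @Override
    public void onComplete(Result result) {
      // Called on a Jetty thread once the entire response has been buffered in memory.
      if (!result.isFailed()) {
        byte[] response = getContent(); // the buffered response body
        // Parse the response here; keep this work short, since it runs on a Jetty thread.
      }
    }
  });
{code}

Compared to Solution 1 this avoids the extra reader thread, but the whole response is held in memory; the listener caps how much it will buffer, and a larger limit can be passed to its constructor.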

[jira] [Resolved] (SOLR-14354) HttpShardHandler send requests in async

2020-10-26 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat resolved SOLR-14354.
-
Resolution: Fixed

[~atri] yes, it is no longer a blocker.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Blocker
> Fix For: master (9.0)
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png, 
> image-2020-10-23-16-45-20-034.png, image-2020-10-23-16-45-21-789.png, 
> image-2020-10-23-16-45-37-628.png
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request submits n requests (n equals 
> the number of shards) to an executor. Each request therefore corresponds to a 
> thread; after sending its request, that thread basically does nothing but wait 
> for the response from the other side. The thread gets swapped out and the CPU 
> tries to handle another thread (this is called a context switch: the CPU saves 
> the context of the current thread and switches to another one). When some data 
> (not all) comes back, the thread is woken up to parse that data, then it waits 
> again until more data comes back. So there is a lot of context switching in the 
> CPU, which is quite an inefficient use of threads. Basically we want fewer 
> threads, and most of them should be busy all the time, because threads are not 
> free and neither are context switches. That is the main idea behind constructs 
> like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One important 
> thing to notice here is that all listeners should finish quickly; if a listener 
> blocks, further data of that request won't be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spinning up one thread per response
> Jetty HttpClient already provides several listeners, one of them being 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread reading the response content from the InputStream
>  * one thread (a short-lived task) feeding content into the above InputStream 
> whenever some byte[] is available. Note that if this thread is unable to feed 
> data into the InputStream, it will wait.
> Using this listener, the model of HttpShardHandler can be rewritten into 
> something like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from the InputStream
>     }
>   });
> }); {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is drawn wide, it won’t take a 
> long time, since sending a request is a very quick operation. With this 
> approach, handling threads won’t be spun up until the first bytes come back. 
> Notice also that in this approach we still have active threads waiting for 
> more data from the InputStream.
> h2. 4. Solution 2: Buffering data and handling it inside Jetty’s thread
> Jetty has another listener called BufferingResponseListener. This is how it is 
> used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) 

[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-10-25 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220313#comment-17220313
 ] 

Cao Manh Dat commented on SOLR-14354:
-

Done reverting, thank you [~dsmiley].

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Blocker
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png, 
> image-2020-10-23-16-45-20-034.png, image-2020-10-23-16-45-21-789.png, 
> image-2020-10-23-16-45-37-628.png
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request submits n requests (n equals 
> the number of shards) to an executor. Each request therefore corresponds to a 
> thread; after sending its request, that thread basically does nothing but wait 
> for the response from the other side. The thread gets swapped out and the CPU 
> tries to handle another thread (this is called a context switch: the CPU saves 
> the context of the current thread and switches to another one). When some data 
> (not all) comes back, the thread is woken up to parse that data, then it waits 
> again until more data comes back. So there is a lot of context switching in the 
> CPU, which is quite an inefficient use of threads. Basically we want fewer 
> threads, and most of them should be busy all the time, because threads are not 
> free and neither are context switches. That is the main idea behind constructs 
> like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One important 
> thing to notice here is that all listeners should finish quickly; if a listener 
> blocks, further data of that request won't be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spinning up one thread per response
> Jetty HttpClient already provides several listeners, one of them being 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread reading the response content from the InputStream
>  * one thread (a short-lived task) feeding content into the above InputStream 
> whenever some byte[] is available. Note that if this thread is unable to feed 
> data into the InputStream, it will wait.
> Using this listener, the model of HttpShardHandler can be rewritten into 
> something like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from the InputStream
>     }
>   });
> }); {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is drawn wide, it won’t take a 
> long time, since sending a request is a very quick operation. With this 
> approach, handling threads won’t be spun up until the first bytes come back. 
> Notice also that in this approach we still have active threads waiting for 
> more data from the InputStream.
> h2. 4. Solution 2: Buffering data and handling it inside Jetty’s thread
> Jetty has another listener called BufferingResponseListener. This is how it is 
> used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void 

[jira] [Issue Comment Deleted] (SOLR-14354) HttpShardHandler send requests in async

2020-10-22 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14354:

Comment: was deleted

(was: Commit 0226b16ce88918482d00b2cc49c092e96d0339bf in lucene-solr's branch 
refs/heads/jira/SOLR-14684-revert from Cao Manh Dat
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0226b16 ]

Revert "SOLR-14354: Fix compile errors after cherry-pick"

This reverts commit 21d811d29615ac47bc51e12a4f9f216af8463c3f.
)

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Blocker
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request submits n requests (n equals 
> the number of shards) to an executor. Each request therefore corresponds to a 
> thread; after sending its request, that thread basically does nothing but wait 
> for the response from the other side. The thread gets swapped out and the CPU 
> tries to handle another thread (this is called a context switch: the CPU saves 
> the context of the current thread and switches to another one). When some data 
> (not all) comes back, the thread is woken up to parse that data, then it waits 
> again until more data comes back. So there is a lot of context switching in the 
> CPU, which is quite an inefficient use of threads. Basically we want fewer 
> threads, and most of them should be busy all the time, because threads are not 
> free and neither are context switches. That is the main idea behind constructs 
> like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One important 
> thing to notice here is that all listeners should finish quickly; if a listener 
> blocks, further data of that request won't be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spinning up one thread per response
> Jetty HttpClient already provides several listeners, one of them being 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread reading the response content from the InputStream
>  * one thread (a short-lived task) feeding content into the above InputStream 
> whenever some byte[] is available. Note that if this thread is unable to feed 
> data into the InputStream, it will wait.
> Using this listener, the model of HttpShardHandler can be rewritten into 
> something like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from the InputStream
>     }
>   });
> }); {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is drawn wide, it won’t take a 
> long time, since sending a request is a very quick operation. With this 
> approach, handling threads won’t be spun up until the first bytes come back. 
> Notice also that in this approach we still have active threads waiting for 
> more data from the InputStream.
> h2. 4. Solution 2: Buffering data and handling it inside Jetty’s thread
> Jetty has another listener called 

[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-10-22 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219012#comment-17219012
 ] 

Cao Manh Dat commented on SOLR-14354:
-

[~dsmiley] can you review the PR that I just pushed? 
(https://github.com/apache/lucene-solr/pull/2019)

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Blocker
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request submits n requests (n equals 
> the number of shards) to an executor. Each request therefore corresponds to a 
> thread; after sending its request, that thread basically does nothing but wait 
> for the response from the other side. The thread gets swapped out and the CPU 
> tries to handle another thread (this is called a context switch: the CPU saves 
> the context of the current thread and switches to another one). When some data 
> (not all) comes back, the thread is woken up to parse that data, then it waits 
> again until more data comes back. So there is a lot of context switching in the 
> CPU, which is quite an inefficient use of threads. Basically we want fewer 
> threads, and most of them should be busy all the time, because threads are not 
> free and neither are context switches. That is the main idea behind constructs 
> like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One important 
> thing to notice here is that all listeners should finish quickly; if a listener 
> blocks, further data of that request won't be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spinning up one thread per response
> Jetty HttpClient already provides several listeners, one of them being 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread reading the response content from the InputStream
>  * one thread (a short-lived task) feeding content into the above InputStream 
> whenever some byte[] is available. Note that if this thread is unable to feed 
> data into the InputStream, it will wait.
> Using this listener, the model of HttpShardHandler can be rewritten into 
> something like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from the InputStream
>     }
>   });
> }); {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is drawn wide, it won’t take a 
> long time, since sending a request is a very quick operation. With this 
> approach, handling threads won’t be spun up until the first bytes come back. 
> Notice also that in this approach we still have active threads waiting for 
> more data from the InputStream.
> h2. 4. Solution 2: Buffering data and handling it inside Jetty’s thread
> Jetty has another listener called BufferingResponseListener. This is how it is 
> used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   

[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-10-19 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17216711#comment-17216711
 ] 

Cao Manh Dat commented on SOLR-14354:
-

Let's do a vote here, shall we?

I'm +1 on keeping this as it is now.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Blocker
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request submits n requests (n equals 
> the number of shards) to an executor. Each request therefore corresponds to a 
> thread; after sending its request, that thread basically does nothing but wait 
> for the response from the other side. The thread gets swapped out and the CPU 
> tries to handle another thread (this is called a context switch: the CPU saves 
> the context of the current thread and switches to another one). When some data 
> (not all) comes back, the thread is woken up to parse that data, then it waits 
> again until more data comes back. So there is a lot of context switching in the 
> CPU, which is quite an inefficient use of threads. Basically we want fewer 
> threads, and most of them should be busy all the time, because threads are not 
> free and neither are context switches. That is the main idea behind constructs 
> like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One important 
> thing to notice here is that all listeners should finish quickly; if a listener 
> blocks, further data of that request won't be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spinning up one thread per response
> Jetty HttpClient already provides several listeners, one of them being 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread reading the response content from the InputStream
>  * one thread (a short-lived task) feeding content into the above InputStream 
> whenever some byte[] is available. Note that if this thread is unable to feed 
> data into the InputStream, it will wait.
> Using this listener, the model of HttpShardHandler can be rewritten into 
> something like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from the InputStream
>     }
>   });
> }); {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is drawn wide, it won’t take a 
> long time, since sending a request is a very quick operation. With this 
> approach, handling threads won’t be spun up until the first bytes come back. 
> Notice also that in this approach we still have active threads waiting for 
> more data from the InputStream.
> h2. 4. Solution 2: Buffering data and handling it inside Jetty’s thread
> Jetty has another listener called BufferingResponseListener. This is how it is 
> used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response = getContent();
>   

[jira] [Resolved] (SOLR-10370) ReplicationHandler should fetch index at fixed delay instead of fixed rate

2020-10-07 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat resolved SOLR-10370.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

> ReplicationHandler should fetch index at fixed delay instead of fixed rate
> --
>
> Key: SOLR-10370
> URL: https://issues.apache.org/jira/browse/SOLR-10370
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0)
>
> Attachments: SOLR-10370.patch
>
>
> Right now polling is scheduled with scheduleAtFixedRate, so if a replication 
> process takes a long time to run and the poll interval is very short, many 
> replication processes will be started immediately one after another.
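
For illustration, a minimal sketch of the difference between the two ScheduledExecutorService modes; this is not the actual ReplicationHandler code, and fetchIndex() and pollIntervalSec are placeholders:

{code:java}
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

// scheduleAtFixedRate: the next run is due pollIntervalSec after the *start* of the
// previous one. If a fetch takes longer than the interval, overdue runs queue up and
// fire back to back as soon as the current fetch finishes.
scheduler.scheduleAtFixedRate(() -> fetchIndex(), 0, pollIntervalSec, TimeUnit.SECONDS);

// scheduleWithFixedDelay: the next run starts pollIntervalSec after the *end* of the
// previous one, so a slow replication cannot trigger back-to-back fetches.
scheduler.scheduleWithFixedDelay(() -> fetchIndex(), 0, pollIntervalSec, TimeUnit.SECONDS);
{code}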



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-09-24 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201437#comment-17201437
 ] 

Cao Manh Dat commented on SOLR-14354:
-

[~ichattopadhyaya] sorry, but I gave solr-bench a shot and with the default 
config-local.json it finishes too quickly – even when I tried increasing the 
queryLog to 56k queries, the total time seemed unchanged. With a total time of 
500ms for the query benchmark, the result can't tell us anything at all. 

Again, I'm OK with reverting, but I'm a bit sad when good things are unable to 
reach users soon. 

[~sarkaramr...@gmail.com] I heard that you have a toolbox for benchmarking under 
a heavy scenario. If possible, can you run that and post the result here? It 
would be a big help for us and our users.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Blocker
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request submits n requests (n equals 
> the number of shards) to an executor. Each request therefore corresponds to a 
> thread; after sending its request, that thread basically does nothing but wait 
> for the response from the other side. The thread gets swapped out and the CPU 
> tries to handle another thread (this is called a context switch: the CPU saves 
> the context of the current thread and switches to another one). When some data 
> (not all) comes back, the thread is woken up to parse that data, then it waits 
> again until more data comes back. So there is a lot of context switching in the 
> CPU, which is quite an inefficient use of threads. Basically we want fewer 
> threads, and most of them should be busy all the time, because threads are not 
> free and neither are context switches. That is the main idea behind constructs 
> like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One important 
> thing to notice here is that all listeners should finish quickly; if a listener 
> blocks, further data of that request won't be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spinning up one thread per response
> Jetty HttpClient already provides several listeners, one of them being 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread reading the response content from the InputStream
>  * one thread (a short-lived task) feeding content into the above InputStream 
> whenever some byte[] is available. Note that if this thread is unable to feed 
> data into the InputStream, it will wait.
> Using this listener, the model of HttpShardHandler can be rewritten into 
> something like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from the InputStream
>     }
>   });
> }); {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is drawn wide, it won’t take a 
> long time, since sending a request is a very quick operation. With 

[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-09-23 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201179#comment-17201179
 ] 

Cao Manh Dat commented on SOLR-14354:
-

Thanks Mark for your nice words. 

[~ichattopadhyaya] I will try to do a benchmark based on your project above. If 
I'm not able to finish it before the 8.7 release, then reverting it will be a 
good option.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Blocker
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request submits n requests (n equals 
> the number of shards) to an executor. Each request therefore corresponds to a 
> thread; after sending its request, that thread basically does nothing but wait 
> for the response from the other side. The thread gets swapped out and the CPU 
> tries to handle another thread (this is called a context switch: the CPU saves 
> the context of the current thread and switches to another one). When some data 
> (not all) comes back, the thread is woken up to parse that data, then it waits 
> again until more data comes back. So there is a lot of context switching in the 
> CPU, which is quite an inefficient use of threads. Basically we want fewer 
> threads, and most of them should be busy all the time, because threads are not 
> free and neither are context switches. That is the main idea behind constructs 
> like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response), it calls the {{onContent(buffer)}} listeners. When 
> everything is finished, it calls the {{onComplete}} listeners. One important 
> thing to notice here is that all listeners should finish quickly; if a listener 
> blocks, further data of that request won't be handled until the listener 
> finishes.
> h2. 3. Solution 1: Sending requests async but spinning up one thread per response
> Jetty HttpClient already provides several listeners, one of them being 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread reading the response content from the InputStream
>  * one thread (a short-lived task) feeding content into the above InputStream 
> whenever some byte[] is available. Note that if this thread is unable to feed 
> data into the InputStream, it will wait.
> Using this listener, the model of HttpShardHandler can be rewritten into 
> something like this:
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from the InputStream
>     }
>   });
> }); {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is drawn wide, it won’t take a 
> long time, since sending a request is a very quick operation. With this 
> approach, handling threads won’t be spun up until the first bytes come back. 
> Notice also that in this approach we still have active threads waiting for 
> more data from the InputStream.
> h2. 4. Solution 2: Buffering data and handling it inside Jetty’s thread
> Jetty has another listener called BufferingResponseListener. This is how it is 
> used:
> {code:java}
> client.newRequest(...).send(new 

[jira] [Commented] (SOLR-14776) Precompute the fingerprint during PeerSync

2020-09-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193951#comment-17193951
 ] 

Cao Manh Dat commented on SOLR-14776:
-

> Is there a chance that the core is changing between subsequent calls and so 
> it is not safe to use the same, maybe from incoming updates?

During leader election there shouldn't be any updates coming from the old 
leader (although we are not doing anything to prevent that); if there are, the 
current fingerprint strategy here also fails.

>  It looks like there is already caching on SolrCore.getIndexFingerprint, is 
>that broken or insufficient in some way?

Yeah, but we only compute the fingerprint during leader election. Therefore, 
after very heavy indexing, when the leader goes away the first call to compute 
the fingerprint is going to take a while, which slows down the leader election.

>  Thinking about this more, is the big win that we compute our own fingerprint 
>during the time that would normally be spent waiting, and we decrease latency 
>that way?

That is one way to look at it. Another solution here is to compute the 
fingerprint for each segment when it is committed, but I think that is worth 
doing in another issue.

> Precompute the fingerprint during PeerSync
> --
>
> Key: SOLR-14776
> URL: https://issues.apache.org/jira/browse/SOLR-14776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Computing a fingerprint can be very costly and take time. But the current 
> implementation sends requests to get the fingerprint from multiple replicas, 
> and only on the first response does it compute its own fingerprint for 
> comparison. A very simple but effective improvement here is to compute its own 
> fingerprint right after sending the requests to the other replicas.
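
As an illustration of the idea, a minimal sketch in which the local computation overlaps the in-flight requests; Fingerprint, Replica, requestFingerprintAsync and computeLocalFingerprint are hypothetical names, not the actual PeerSync API:

{code:java}
// Kick off the fingerprint requests to the other replicas first (non-blocking)...
List<CompletableFuture<Fingerprint>> remote = new ArrayList<>();
for (Replica replica : replicas) {
  remote.add(requestFingerprintAsync(replica));
}

// ...then compute our own fingerprint while those requests are still in flight,
// instead of waiting for the first response before starting the local computation.
Fingerprint ours = computeLocalFingerprint();

for (CompletableFuture<Fingerprint> f : remote) {
  if (!ours.equals(f.join())) {
    // Fingerprints differ: fall back to the usual sync path.
  }
}
{code}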



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14776) Precompute the fingerprint during PeerSync

2020-08-27 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14776:

Description: Computing a fingerprint can be very costly and take time. But the 
current implementation sends requests to get the fingerprint from multiple 
replicas, and only on the first response does it compute its own fingerprint for 
comparison. A very simple but effective improvement here is to compute its own 
fingerprint right after sending the requests to the other replicas.  (was: 
Computing fingerprint can very costly and take time. )

> Precompute the fingerprint during PeerSync
> --
>
> Key: SOLR-14776
> URL: https://issues.apache.org/jira/browse/SOLR-14776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> Computing a fingerprint can be very costly and take time. But the current 
> implementation sends requests to get the fingerprint from multiple replicas, 
> and only on the first response does it compute its own fingerprint for 
> comparison. A very simple but effective improvement here is to compute its own 
> fingerprint right after sending the requests to the other replicas.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time

2020-08-25 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183767#comment-17183767
 ] 

Cao Manh Dat commented on SOLR-14684:
-

I would like to wait a couple of days for any new failures caused by this 
change. If everything looks good, I will backport to branch_8x and close this 
issue.

> CloudExitableDirectoryReaderTest failing about 25% of the time
> --
>
> Key: SOLR-14684
> URL: https://issues.apache.org/jira/browse/SOLR-14684
> Project: Solr
>  Issue Type: Test
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: stdout
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If I beast this on my local machine, it fails (non-reproducibly, of course) 
> about 1/4 of the time. Log attached. The test itself hasn't changed in 11 
> months or so.
> It looks like occasionally the calls throw an error rather than returning 
> partial results with a message: "Time allowed to handle this request 
> exceeded:[]".
> It's been failing very intermittently for a couple of years, but the failure 
> rate really picked up in the last couple of weeks. IDK whether the failures 
> prior to the last couple of weeks have the same root cause.
> I'll do some spelunking to see if I can pinpoint the commit that made this 
> happen, but it'll take a while.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14776) Precompute the fingerprint during PeerSync

2020-08-24 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14776:
---

 Summary: Precompute the fingerprint during PeerSync
 Key: SOLR-14776
 URL: https://issues.apache.org/jira/browse/SOLR-14776
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat


Computing a fingerprint can be very costly and take time. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14771) Reproducible failure for LBSolrClientTest

2020-08-23 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat resolved SOLR-14771.
-
Resolution: Fixed

Solved by the latest commit of SOLR-14684.

> Reproducible failure for LBSolrClientTest
> -
>
> Key: SOLR-14771
> URL: https://issues.apache.org/jira/browse/SOLR-14771
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
>
> ./gradlew :solr:solrj:test --tests 
> "org.apache.solr.client.solrj.impl.LBSolrClientTest" 
> -Ptests.seed=E6AFE16CC61929A6 -Ptests.file.encoding=US-ASCII



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14713) Single thread on streaming updates

2020-08-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181534#comment-17181534
 ] 

Cao Manh Dat commented on SOLR-14713:
-

I will post a report generated by solr-bench, but our internal run by 
[~sarkaramr...@gmail.com] did not show any performance regression.

> Single thread on streaming updates
> --
>
> Key: SOLR-14713
> URL: https://issues.apache.org/jira/browse/SOLR-14713
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Or: greatly simplify SolrCmdDistributor.
> h2. Current way of fanning out updates in Solr
> Currently, on receiving an updateRequest, Solr creates new 
> UpdateProcessors for handling that request, then parses the documents from the 
> request one by one and lets the processor handle them.
> {code:java}
> onReceiving(UpdateRequest update):
>   processors = createNewProcessors();
>   for (Document doc : update) {
>     processors.handle(doc)
>   }
> {code}
> Let’s say the number of replicas in the current shard is N; the updateProcessor 
> will create N-1 queues and runners, one for each other replica.
>  A Runner is basically a thread that dequeues updates from its corresponding 
> queue and sends them to the corresponding replica endpoint.
> Note 1: all Runners share the same client, hence the same connection pool and 
> the same thread pool.
>  Note 2: a Runner sends all documents of its UpdateRequest in a single 
> HTTP POST request (to reduce the number of threads for handling requests on 
> the other side). Therefore its lifetime equals the total time of handling its 
> UpdateRequest. Below is a typical activity that happens in a runner's life 
> cycle.
> h2. Problems of the current approach
> The current approach has two problems:
>  - Problem 1: it uses lots of threads to fan out requests.
>  - Problem 2, which is more important: it is very complex. Solr uses 
> ConcurrentUpdateSolrClient (CUSC for short) for this, and the CUSC 
> implementation allows a single queue to be served by multiple runners 
> (although we only use one runner at most), which raises the complexity of the 
> whole flow. A single fix for one problem can raise multiple problems later; 
> e.g. in SOLR-13975, while trying to handle the case where the other endpoint 
> hangs for a long time, we introduced a bug that lets the runner keep running 
> even when the updateRequest is already fully handled on the leader.
> h2. Doing everything in a single thread
> Since we already support sending requests in an async manner, why don’t we let 
> the main thread that is handling the update request send the updates to all 
> the others, without the need for runners or queues. The code will be 
> something like this
> {code:java}
> Class UpdateProcessor:
>   Map pendingOutStreams
>
>   func handleAddDoc(doc):
>     for (replica : replicas):
>       pendingOutStreams.get(replica).send(doc)
>
>   func onEndUpdateRequest():
>     pendingOutStreams.values().forEach(out -> closeAndHandleResponse(out))
> {code}
> By doing this we will use fewer threads and the code will be much simpler and 
> cleaner. Of course there will be some degradation in the time for handling an 
> updateRequest, since we are sending serially instead of concurrently. Stated 
> formally:
> {code:java}
> oldTime = timeForIndexing(update) + timeForSendingUpdates(update)
> newTime = timeForIndexing(update) + (N-1) * timeForSendingUpdates(update)
> {code}
> But I believe that timeForIndexing is much larger than timeForSendingUpdates, 
> so we do not really need to be concerned about this. Even if that really is a 
> problem, users can simply create more threads for indexing.
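
A minimal Java sketch of the single-threaded fan-out described in the quoted description; OutStream, Replica, indexLocally and closeAndHandleResponse are illustrative names, not the actual patch:

{code:java}
// One streaming connection per other replica, opened when the update request starts.
Map<Replica, OutStream> pendingOutStreams = new HashMap<>();

void handleAddDoc(Document doc) {
  // The request-handling thread itself writes the document to every replica's open
  // stream; no per-replica queue or runner thread is involved.
  for (OutStream out : pendingOutStreams.values()) {
    out.send(doc); // async write, does not wait for a response
  }
  indexLocally(doc);
}

void onEndUpdateRequest() {
  // Close every stream and only now wait for and parse the replicas' responses.
  pendingOutStreams.values().forEach(out -> closeAndHandleResponse(out));
}
{code}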



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14713) Single thread on streaming updates

2020-08-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17181016#comment-17181016
 ] 

Cao Manh Dat commented on SOLR-14713:
-

[~dsmiley] The patch already introduces an async API like this
{code}
// this already makes a request to the url endpoint, without sending any data yet
Outstream outstream = client.initOutstream(url);
// subsequent calls are async; no need to wait for a response from the other
// endpoint
outstream.send(updateRequest) // many calls
// close the stream and parse the response
outstream.close()
{code}
This is a totally different thing compared to the async API for query.
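
For context, one way such a streaming request can be expressed with the Jetty client used elsewhere in these patches is OutputStreamContentProvider; this is only a sketch of the pattern (the URL, serialize() and the documents are placeholders), not the SolrClient API added by the patch:

{code:java}
OutputStreamContentProvider content = new OutputStreamContentProvider();

// The request is initiated by send(), but the body is streamed afterwards.
client.newRequest("http://localhost:8983/solr/techproducts/update") // placeholder URL
  .method(HttpMethod.POST)
  .content(content)
  .send(result -> {
    // Runs when the response is complete; parse it here.
  });

// Subsequent writes are pushed on the open connection without waiting for a response.
try (OutputStream out = content.getOutputStream()) {
  out.write(serialize(doc1)); // many calls, one per document
  out.write(serialize(doc2));
} // closing the stream ends the request body
{code}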


> Single thread on streaming updates
> --
>
> Key: SOLR-14713
> URL: https://issues.apache.org/jira/browse/SOLR-14713
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Or: greatly simplify SolrCmdDistributor.
> h2. Current way of fanning out updates in Solr
> Currently, on receiving an updateRequest, Solr creates new 
> UpdateProcessors for handling that request, then parses the documents from the 
> request one by one and lets the processor handle them.
> {code:java}
> onReceiving(UpdateRequest update):
>   processors = createNewProcessors();
>   for (Document doc : update) {
>     processors.handle(doc)
>   }
> {code}
> Let’s say the number of replicas in the current shard is N; the updateProcessor 
> will create N-1 queues and runners, one for each other replica.
>  A Runner is basically a thread that dequeues updates from its corresponding 
> queue and sends them to the corresponding replica endpoint.
> Note 1: all Runners share the same client, hence the same connection pool and 
> the same thread pool.
>  Note 2: a Runner sends all documents of its UpdateRequest in a single 
> HTTP POST request (to reduce the number of threads for handling requests on 
> the other side). Therefore its lifetime equals the total time of handling its 
> UpdateRequest. Below is a typical activity that happens in a runner's life 
> cycle.
> h2. Problems of the current approach
> The current approach has two problems:
>  - Problem 1: it uses lots of threads to fan out requests.
>  - Problem 2, which is more important: it is very complex. Solr uses 
> ConcurrentUpdateSolrClient (CUSC for short) for this, and the CUSC 
> implementation allows a single queue to be served by multiple runners 
> (although we only use one runner at most), which raises the complexity of the 
> whole flow. A single fix for one problem can raise multiple problems later; 
> e.g. in SOLR-13975, while trying to handle the case where the other endpoint 
> hangs for a long time, we introduced a bug that lets the runner keep running 
> even when the updateRequest is already fully handled on the leader.
> h2. Doing everything in a single thread
> Since we already support sending requests in an async manner, why don’t we let 
> the main thread that is handling the update request send the updates to all 
> the others, without the need for runners or queues. The code will be 
> something like this
> {code:java}
> Class UpdateProcessor:
>   Map pendingOutStreams
>
>   func handleAddDoc(doc):
>     for (replica : replicas):
>       pendingOutStreams.get(replica).send(doc)
>
>   func onEndUpdateRequest():
>     pendingOutStreams.values().forEach(out -> closeAndHandleResponse(out))
> {code}
> By doing this we will use fewer threads and the code will be much simpler and 
> cleaner. Of course there will be some degradation in the time for handling an 
> updateRequest, since we are sending serially instead of concurrently. Stated 
> formally:
> {code:java}
> oldTime = timeForIndexing(update) + timeForSendingUpdates(update)
> newTime = timeForIndexing(update) + (N-1) * timeForSendingUpdates(update)
> {code}
> But I believe that timeForIndexing is much larger than timeForSendingUpdates, 
> so we do not really need to be concerned about this. Even if that really is a 
> problem, users can simply create more threads for indexing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175118#comment-17175118
 ] 

Cao Manh Dat commented on SOLR-14354:
-

[~rishisankar] sure, and if you can also do the benchmark that Ishan asked for, that will 
be even better :D 

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite inefficient on using threads. Basically we 
> want less threads and most of them must busy all the time, because threads 
> are not free as well as context switching. That is the main idea behind 
> everything, like executor
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything finished it will call {{onComplete}} listeners. One main 
> thing that will must notice here is all listeners should finish quick, if the 
> listener block, all further data of that request won’t be handled until the 
> listener finish.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 thread
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take long time 
> since sending req is a very quick operation. With this operation, handling 
> threads won’t be spin up until first bytes are sent back. Notice that in this 
> approach we still have active threads waiting for more data from InputStream
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty have another listener called BufferingResponseListener. This is how it 
> is get used
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response 

[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174251#comment-17174251
 ] 

Cao Manh Dat commented on SOLR-14641:
-

bq. I disagree. In general, whoever wishes to introduce a change should own the 
performance testing, no matter who actually does it. Others can volunteer, but 
ultimate obligation should remain with the committer introducing the change.

I said that because I feel you did not even take a look at the commit; if 
you do, you will see that a perf run here is not necessary. 

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and 
> 7.0. To maintain backward compatibility at the time we introduce an endpoint 
> in RealTimeGetComponent to check whether a node support that feature or not. 
> It served well its purpose and it should be removed to reduce complexity and 
> a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174241#comment-17174241
 ] 

Cao Manh Dat commented on SOLR-14641:
-

But it does not quite make sense to me to expect whoever did the commit to run a 
performance test for this one, since this change basically removes deprecated code 
rather than adding an optimization. Basically, what we used to do here is
 * ask nodes whether they support versionRanges or not
 * if true (this is the default value since 7.0), go with versionRanges handling 
(instead of concrete versions).

The change made by this issue is
 * always go with versionRanges, since we know that all other nodes can support 
that, so it is wasteful to ask first (see the sketch below).

So if there is any performance regression, it already happened a long time ago.

Anyway, I'm ok with reverting the change and letting your benchmark work finish if 
that makes things easier.
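
To make that concrete, here is a rough, self-contained sketch of the round trip this 
issue removes (class and method names below are made up for illustration; they are 
not the actual PeerSync/RealTimeGetComponent code):
{code:java}
// Illustrative only: models the probe-then-fetch flow that existed before,
// and the always-use-ranges flow after this issue. Names are assumptions.
public class VersionRangeProbeSketch {

  // Stand-in for the "can you handle version ranges?" probe endpoint.
  static boolean canHandleVersionRanges(String replicaUrl) {
    return true; // every node since 7.0 answers yes, so the extra trip is wasted
  }

  // Stand-in for fetching updates, either by ranges or by concrete versions.
  static void requestUpdates(String replicaUrl, String versionsParam) {
    System.out.println("GET " + replicaUrl + "/get?getUpdates=" + versionsParam);
  }

  public static void main(String[] args) {
    String replicaUrl = "http://replica:8983/solr/collection1";

    // Old flow: one extra request/response just to choose the parameter format.
    if (canHandleVersionRanges(replicaUrl)) {
      requestUpdates(replicaUrl, "100...200");   // version ranges
    } else {
      requestUpdates(replicaUrl, "101,102,103"); // concrete versions
    }

    // Flow after this issue: skip the probe and always send version ranges.
    requestUpdates(replicaUrl, "100...200");
  }
}
{code}
Removing the probe only saves one lightweight request-response trip per PeerSync, 
which is why a dedicated perf run seems unnecessary here.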

 

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and 
> 7.0. To maintain backward compatibility at the time we introduce an endpoint 
> in RealTimeGetComponent to check whether a node support that feature or not. 
> It served well its purpose and it should be removed to reduce complexity and 
> a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174236#comment-17174236
 ] 

Cao Manh Dat commented on SOLR-14354:
-

Ok then I will try my best to run it.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite inefficient on using threads. Basically we 
> want less threads and most of them must busy all the time, because threads 
> are not free as well as context switching. That is the main idea behind 
> everything, like executor
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything finished it will call {{onComplete}} listeners. One main 
> thing that will must notice here is all listeners should finish quick, if the 
> listener block, all further data of that request won’t be handled until the 
> listener finish.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 thread
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take long time 
> since sending req is a very quick operation. With this operation, handling 
> threads won’t be spin up until first bytes are sent back. Notice that in this 
> approach we still have active threads waiting for more data from InputStream
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty have another listener called BufferingResponseListener. This is how it 
> is get used
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response = getContent();
>   //handling response
> }
>   }
> 

[jira] [Comment Edited] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174228#comment-17174228
 ] 

Cao Manh Dat edited comment on SOLR-14641 at 8/10/20, 10:29 AM:


I believe the right way to ensure performance is to come up with something like 
lucene bench, so every slowdown and speedup is recorded and can be 
watched (across multiple commits). It doesn't make sense to ask everyone to do a 
dedicated performance test before and after their commits.


was (Author: caomanhdat):
I believe the right way to ensure performance is coming up with something like 
lucene bench, so every downgrade and upgrade will be recorded and can be 
watched. It doesn't make sense to asking everyone do a dedicated performance 
test before and after their commits.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and 
> 7.0. To maintain backward compatibility at the time we introduce an endpoint 
> in RealTimeGetComponent to check whether a node support that feature or not. 
> It served well its purpose and it should be removed to reduce complexity and 
> a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174228#comment-17174228
 ] 

Cao Manh Dat commented on SOLR-14641:
-

I believe the right way to ensure performance is to come up with something like 
lucene bench, so every slowdown and speedup is recorded and can be 
watched. It doesn't make sense to ask everyone to do a dedicated performance 
test before and after their commits.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and 
> 7.0. To maintain backward compatibility at the time we introduce an endpoint 
> in RealTimeGetComponent to check whether a node support that feature or not. 
> It served well its purpose and it should be removed to reduce complexity and 
> a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174226#comment-17174226
 ] 

Cao Manh Dat commented on SOLR-14641:
-

I kinda hesitate to do such performance testing for this one; what is the 
reason behind that? This issue simply removes a code path that is no longer used.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and 
> 7.0. To maintain backward compatibility at the time we introduce an endpoint 
> in RealTimeGetComponent to check whether a node support that feature or not. 
> It served well its purpose and it should be removed to reduce complexity and 
> a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174224#comment-17174224
 ] 

Cao Manh Dat commented on SOLR-14354:
-

[~ichattopadhyaya], fair enough, do you want to do the benchmark?

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite inefficient on using threads. Basically we 
> want less threads and most of them must busy all the time, because threads 
> are not free as well as context switching. That is the main idea behind 
> everything, like executor
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything finished it will call {{onComplete}} listeners. One main 
> thing that will must notice here is all listeners should finish quick, if the 
> listener block, all further data of that request won’t be handled until the 
> listener finish.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 thread
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take long time 
> since sending req is a very quick operation. With this operation, handling 
> threads won’t be spin up until first bytes are sent back. Notice that in this 
> approach we still have active threads waiting for more data from InputStream
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty have another listener called BufferingResponseListener. This is how it 
> is get used
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response = getContent();
>   

[jira] [Commented] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174220#comment-17174220
 ] 

Cao Manh Dat commented on SOLR-14641:
-

[~ichattopadhyaya] I don't think this will be a noticeable boost in time, since 
this request is very lightweight.

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and 
> 7.0. To maintain backward compatibility at the time we introduce an endpoint 
> in RealTimeGetComponent to check whether a node support that feature or not. 
> It served well its purpose and it should be removed to reduce complexity and 
> a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-08-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17174184#comment-17174184
 ] 

Cao Manh Dat commented on SOLR-14354:
-

Hi [~dsmiley], I will try my best to find some time to do this, but I can't promise.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite inefficient on using threads. Basically we 
> want less threads and most of them must busy all the time, because threads 
> are not free as well as context switching. That is the main idea behind 
> everything, like executor
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything finished it will call {{onComplete}} listeners. One main 
> thing that will must notice here is all listeners should finish quick, if the 
> listener block, all further data of that request won’t be handled until the 
> listener finish.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 thread
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take long time 
> since sending req is a very quick operation. With this operation, handling 
> threads won’t be spin up until first bytes are sent back. Notice that in this 
> approach we still have active threads waiting for more data from InputStream
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty have another listener called BufferingResponseListener. This is how it 
> is get used
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response = getContent();
>   

[jira] [Resolved] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-08-10 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat resolved SOLR-14641.
-
Fix Version/s: 8.7
   Resolution: Fixed

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.7
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and 
> 7.0. To maintain backward compatibility at the time we introduce an endpoint 
> in RealTimeGetComponent to check whether a node support that feature or not. 
> It served well its purpose and it should be removed to reduce complexity and 
> a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time

2020-08-07 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173089#comment-17173089
 ] 

Cao Manh Dat commented on SOLR-14684:
-

I created a PR for this one; can you try to reproduce the problem, 
[~erickerickson]? Thanks a lot!

> CloudExitableDirectoryReaderTest failing about 25% of the time
> --
>
> Key: SOLR-14684
> URL: https://issues.apache.org/jira/browse/SOLR-14684
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: stdout
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If I beast this on my local machine, it fails (non reproducibly of course) 
> about 1/4 of the time. Log attached. The test itself hasn't changed in 11 
> months or so.
> It looks like occasionally the calls throw an error rather than return 
> partial results with a message: "Time allowed to handle this request 
> exceeded:[]".
> It's been failing very intermittently for a couple of years, but the failure 
> rate really picked up in the last couple of weeks. IDK whether the failures 
> prior to the last couple of weeks are the same root cause.
> I'll do some spelunking to see if I can pinpoint the commit that made this 
> happen, but it'll take a while.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time

2020-08-07 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172992#comment-17172992
 ] 

Cao Manh Dat commented on SOLR-14684:
-

Opened SOLR-14719 for that.

> CloudExitableDirectoryReaderTest failing about 25% of the time
> --
>
> Key: SOLR-14684
> URL: https://issues.apache.org/jira/browse/SOLR-14684
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: stdout
>
>
> If I beast this on my local machine, it fails (non reproducibly of course) 
> about 1/4 of the time. Log attached. The test itself hasn't changed in 11 
> months or so.
> It looks like occasionally the calls throw an error rather than return 
> partial results with a message: "Time allowed to handle this request 
> exceeded:[]".
> It's been failing very intermittently for a couple of years, but the failure 
> rate really picked up in the last couple of weeks. IDK whether the failures 
> prior to the last couple of weeks are the same root cause.
> I'll do some spelunking to see if I can pinpoint the commit that made this 
> happen, but it'll take a while.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14719) Handling exceeding timeAllowed consistently

2020-08-07 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14719:
---

 Summary: Handling exceeding timeAllowed consistently
 Key: SOLR-14719
 URL: https://issues.apache.org/jira/browse/SOLR-14719
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat


Continuing from SOLR-14684, where HttpShardHandler should skip routing 
requests to other shards if timeAllowed is already exceeded.

But I kinda feel we do not handle exceeding timeAllowed consistently 
between different places in the code (the node that aggregates the query result vs 
the nodes that execute the query). That leads to different errors/responses.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time

2020-08-07 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172990#comment-17172990
 ] 

Cao Manh Dat commented on SOLR-14684:
-

Ok, I figured out the problem for this issue. In the past we never checked whether 
a request had already timed out before sending requests to other shards, so 
in the test we could set timeAllowed to a minimal value like 1 (1ms). SOLR-14354 
actually does that check before sending query requests to other shards (a rough 
sketch of the difference is below).

That leads to the above failure. My gut feeling is that the change made by 
SOLR-14354 is correct, but I do not know a quick way to solve the test 
failure properly. Therefore I will revert to the previous behaviour and 
file another issue for this, which needs more thought and checks.
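
A rough sketch of that behavioural difference (the names below are made up for 
illustration; this is not the actual HttpShardHandler code):
{code:java}
import java.util.List;
import java.util.concurrent.TimeUnit;

// Illustrative only: the coordinator either dispatches sub-requests blindly
// (old behaviour) or checks the remaining timeAllowed budget first (the check
// introduced by SOLR-14354, which a test using timeAllowed=1ms immediately trips).
public class TimeAllowedCheckSketch {

  static void sendToShard(String shardUrl) {
    System.out.println("dispatching sub-request to " + shardUrl);
  }

  static void fanOut(List<String> shards, long startNanos, long timeAllowedMs) {
    for (String shard : shards) {
      long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
      if (timeAllowedMs > 0 && elapsedMs >= timeAllowedMs) {
        // Fail fast instead of letting each shard discover the timeout on its own.
        throw new RuntimeException("Time allowed to handle this request exceeded");
      }
      sendToShard(shard);
    }
  }

  public static void main(String[] args) {
    fanOut(List.of("http://shard1", "http://shard2"), System.nanoTime(), 1000);
  }
}
{code}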

> CloudExitableDirectoryReaderTest failing about 25% of the time
> --
>
> Key: SOLR-14684
> URL: https://issues.apache.org/jira/browse/SOLR-14684
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: stdout
>
>
> If I beast this on my local machine, it fails (non reproducibly of course) 
> about 1/4 of the time. Log attached. The test itself hasn't changed in 11 
> months or so.
> It looks like occasionally the calls throw an error rather than return 
> partial results with a message: "Time allowed to handle this request 
> exceeded:[]".
> It's been failing very intermittently for a couple of years, but the failure 
> rate really picked up in the last couple of weeks. IDK whether the failures 
> prior to the last couple of weeks are the same root cause.
> I'll do some spelunking to see if I can pinpoint the commit that made this 
> happen, but it'll take a while.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14713) Single thread on streaming updates

2020-08-06 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172203#comment-17172203
 ] 

Cao Manh Dat commented on SOLR-14713:
-

I created a PR for this; it is not finished yet and has no tests so far. But the PR 
also solves the problem of incorrectly handling retried requests (see the sketch 
after this list). Here is the scenario:
 * {{UpdateRequest}} is converted to multiple {{Req}}s
 * Solr fails to send the second Req
 * Solr retries the first Req (since we only refer/point to the first one)
 * It succeeds
 * The whole UpdateRequest becomes a success.
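
A tiny sketch of that bookkeeping mistake (illustrative only, not the real 
SolrCmdDistributor types): if success is judged from the first sub-request alone, 
a failed second sub-request is silently masked.
{code:java}
public class RetryBookkeepingSketch {
  public static void main(String[] args) {
    // Outcome of each sub-request: the first was retried and succeeded, the
    // second failed and was never retried because only the first is referenced.
    boolean[] subRequestSucceeded = {true, false};

    // Buggy aggregation described above: look only at the first sub-request.
    boolean buggy = subRequestSucceeded[0];

    // Correct aggregation: every sub-request must have succeeded.
    boolean correct = true;
    for (boolean ok : subRequestSucceeded) {
      correct &= ok;
    }

    System.out.println("buggy result:   " + buggy);   // true  (false positive)
    System.out.println("correct result: " + correct); // false
  }
}
{code}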

 

> Single thread on streaming updates
> --
>
> Key: SOLR-14713
> URL: https://issues.apache.org/jira/browse/SOLR-14713
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Or great simplify SolrCmdDistributor
> h2. Current way for fan out updates of Solr
> Currently on receiving an updateRequest, Solr will create a new 
> UpdateProcessors for handling that request, then it parses one by one 
> document from the request and let’s processor handle it.
> {code:java}
> onReceiving(UpdateRequest update):
>   processors = createNewProcessors();
>   for (Document doc : update) {
> processors.handle(doc)
> }
> {code}
> Let’s say the number of replicas in the current shard is N, updateProcessor 
> will create N-1 queues and runners for each other replica.
>  Runner is basically a thread that dequeues updates from its corresponding 
> queue and sends it to a corresponding replica endpoint.
> Note 1: all Runners share the same client hence connection pool and same 
> thread pool. 
>  Note 2: A runner will send all documents of its UpdateRequest in a single 
> HTTP POST request (to reduce the number of threads for handling requests on 
> the other side). Therefore its lifetime equals the total time of handling its 
> UpdateRequest. Below is a typical activity that happens in a runner's life 
> cycle.
> h2. Problems of current approach
> The current approach have two problems:
>  - Problem 1: It uses lots of threads for fan out requests.
>  - Problem 2 which is more important: it is very complex. Solr is also using 
> ConcurrentUpdateSolrClient (CUSC for short) for that, CUSC implementation 
> allows using a single queue but multiple runners for same queue (although we 
> only use one runner at max) this raise the complexity of the whole flow up to 
> the top. Single fix for a problem can raise multiple problems later, i.e: in 
> SOLR-13975 on trying to handle the problem when the other endpoint is hanging 
> out for so long, we introduced a bug that lets the runner keep running even 
> when the updateRequest is fully handled in the leader.
> h2. Doing everything in single thread
> Since we are already supporting sending requests in an async manner, why 
> don’t we let the main thread which is handling the update request to send 
> updates to all others without the need of runners or queues. The code will be 
> something like this
> {code:java}
>  Class UpdateProcessor:
>Map pendingOutStreams
>
>func handleAddDoc(doc):
>   for (replica: replicas):
>   pendingOutStreams.get(replica).send(doc)
>
>func onEndUpdateRequest():
>   pendingOutStreams.values().forEach(out -> 
> closeAndHandleResponse(out)){code}
>  
> By doing this we will use less threads and the code is much more simpler and 
> cleaner. Of course that there will be some downgrade in the time for handling 
> an updateRequest since we are doing it serially instead of concurrently. In a 
> formal way it will be like this
> {code:java}
>  oldTime = timeForIndexing(update) + timeForSendingUpdates(update)
>  newTime = timeForIndexing(update) + (N-1) * 
> timeForSendingUpdates(update){code}
> But I believe that timeForIndexing is much more than timeForSendingUpdates so 
> we do not really need to be concerned about this. Even that is really a 
> problem users can simply create more threads for indexing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14713) Single thread on streaming updates

2020-08-06 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14713:
---

 Summary: Single thread on streaming updates
 Key: SOLR-14713
 URL: https://issues.apache.org/jira/browse/SOLR-14713
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat


Or great simplify SolrCmdDistributor
h2. Current way for fan out updates of Solr

Currently, on receiving an updateRequest, Solr will create new 
UpdateProcessors for handling that request, then parse the documents from the 
request one by one and let the processors handle them.
{code:java}
onReceiving(UpdateRequest update):
  processors = createNewProcessors();
  for (Document doc : update) {
    processors.handle(doc)
  }
{code}
Let’s say the number of replicas in the current shard is N; the updateProcessor 
will create N-1 queues and runners, one for each other replica.
 A Runner is basically a thread that dequeues updates from its corresponding 
queue and sends them to the corresponding replica endpoint.

Note 1: all Runners share the same client, hence the same connection pool and the 
same thread pool.
 Note 2: a Runner will send all documents of its UpdateRequest in a single HTTP 
POST request (to reduce the number of threads for handling requests on the 
other side). Therefore its lifetime equals the total time of handling its 
UpdateRequest. Below is a typical activity that happens in a runner's life 
cycle, followed by a condensed sketch of this model.
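
A condensed, self-contained sketch of the queue-plus-runner pattern (plain Java 
stand-ins only, not the real ConcurrentUpdateSolrClient/SolrCmdDistributor classes):
{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueAndRunnerSketch {

  static final String END = "<end-of-update-request>";

  public static void main(String[] args) throws InterruptedException {
    // One queue (and one runner thread) would exist per remote replica.
    BlockingQueue<String> queueForReplica = new LinkedBlockingQueue<>();

    // Runner: stays alive for the whole UpdateRequest, draining the queue into
    // what would be a single streaming HTTP POST body.
    Thread runner = new Thread(() -> {
      try {
        String doc;
        while (!END.equals(doc = queueForReplica.take())) {
          System.out.println("streaming to replica: " + doc); // stand-in for the HTTP write
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    runner.start();

    // Leader thread: indexes locally, then enqueues each document for the runner.
    for (String doc : new String[] {"doc1", "doc2", "doc3"}) {
      queueForReplica.put(doc);
    }
    queueForReplica.put(END);
    runner.join();
  }
}
{code}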
h2. Problems of current approach

The current approach has two problems:
 - Problem 1: it uses lots of threads to fan out requests.
 - Problem 2, which is more important: it is very complex. Solr is also using 
ConcurrentUpdateSolrClient (CUSC for short) for that; the CUSC implementation 
allows using a single queue with multiple runners for the same queue (although we 
only use one runner at most), which raises the complexity of the whole flow. A 
single fix for one problem can raise multiple problems later, e.g. in 
SOLR-13975, on trying to handle the problem when the other endpoint hangs 
for too long, we introduced a bug that lets the runner keep running even 
when the updateRequest is fully handled on the leader.

h2. Doing everything in single thread

Since we are already supporting sending requests in an async manner, why don’t 
we let the main thread which is handling the update request send the updates to 
all the others, without the need for runners or queues? The code will be something 
like this
{code:java}
 Class UpdateProcessor:
   Map pendingOutStreams
   
   func handleAddDoc(doc):
  for (replica: replicas):
  pendingOutStreams.get(replica).send(doc)
   
   func onEndUpdateRequest():
  pendingOutStreams.values().forEach(out -> 
closeAndHandleResponse(out)){code}
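
A minimal runnable sketch of that flow (the in-memory streams below stand in for the 
async HTTP request bodies; names are illustrative, not the actual implementation):
{code:java}
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class SingleThreadFanOutSketch {

  public static void main(String[] args) throws IOException {
    // One pending output stream per remote replica, opened when the
    // UpdateRequest starts: no queues and no runner threads.
    Map<String, ByteArrayOutputStream> pendingOutStreams = new LinkedHashMap<>();
    pendingOutStreams.put("replica1", new ByteArrayOutputStream());
    pendingOutStreams.put("replica2", new ByteArrayOutputStream());

    // handleAddDoc: the request-handling thread itself writes each document to
    // every replica's stream right after indexing it locally.
    for (String doc : new String[] {"doc1", "doc2"}) {
      for (OutputStream out : pendingOutStreams.values()) {
        out.write((doc + "\n").getBytes(StandardCharsets.UTF_8));
      }
    }

    // onEndUpdateRequest: close every stream and handle the responses.
    for (Map.Entry<String, ByteArrayOutputStream> e : pendingOutStreams.entrySet()) {
      e.getValue().close();
      System.out.println(e.getKey() + " received: "
          + new String(e.getValue().toByteArray(), StandardCharsets.UTF_8).trim());
    }
  }
}
{code}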
 

By doing this we will use fewer threads and the code will be much simpler and 
cleaner. Of course, there will be some increase in the time for handling 
an updateRequest, since we are sending serially instead of concurrently. More 
formally, it will be like this
{code:java}
 oldTime = timeForIndexing(update) + timeForSendingUpdates(update)
 newTime = timeForIndexing(update) + (N-1) * timeForSendingUpdates(update){code}
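As a rough illustration with assumed numbers: if timeForIndexing is 50ms, 
timeForSendingUpdates is 2ms and N = 3, then oldTime ≈ 52ms while newTime ≈ 54ms, 
i.e. only a few percent slower.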
But I believe that timeForIndexing is much larger than timeForSendingUpdates, so 
we do not really need to be concerned about this. Even if that really is a problem, 
users can simply create more threads for indexing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time

2020-07-28 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166837#comment-17166837
 ] 

Cao Manh Dat commented on SOLR-14684:
-

I will take a look.

> CloudExitableDirectoryReaderTest failing about 25% of the time
> --
>
> Key: SOLR-14684
> URL: https://issues.apache.org/jira/browse/SOLR-14684
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: stdout
>
>
> If I beast this on my local machine, it fails (non reproducibly of course) 
> about 1/4 of the time. Log attached. The test itself hasn't changed in 11 
> months or so.
> It looks like occasionally the calls throw an error rather than return 
> partial results with a message: "Time allowed to handle this request 
> exceeded:[]".
> It's been failing very intermittently for a couple of years, but the failure 
> rate really picked up in the last couple of weeks. IDK whether the failures 
> prior to the last couple of weeks are the same root cause.
> I'll do some spelunking to see if I can pinpoint the commit that made this 
> happen, but it'll take a while.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-07-10 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155836#comment-17155836
 ] 

Cao Manh Dat commented on SOLR-14354:
-

Thank you [~erickerickson], I am going to push a fix for this soon. I should be more 
careful when cherry-picking changes.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite inefficient on using threads. Basically we 
> want less threads and most of them must busy all the time, because threads 
> are not free as well as context switching. That is the main idea behind 
> everything, like executor
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything finished it will call {{onComplete}} listeners. One main 
> thing that will must notice here is all listeners should finish quick, if the 
> listener block, all further data of that request won’t be handled until the 
> listener finish.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 thread
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take long time 
> since sending req is a very quick operation. With this operation, handling 
> threads won’t be spin up until first bytes are sent back. Notice that in this 
> approach we still have active threads waiting for more data from InputStream
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty have another listener called BufferingResponseListener. This is how it 
> is get used
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   

[jira] [Updated] (SOLR-14641) PeerSync Remove canHandleVersionRanges check

2020-07-10 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14641:

Summary: PeerSync Remove canHandleVersionRanges check  (was: Remove 
canHandleVersionRanges check)

> PeerSync Remove canHandleVersionRanges check
> 
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and 
> 7.0. To maintain backward compatibility at the time we introduce an endpoint 
> in RealTimeGetComponent to check whether a node support that feature or not. 
> It served well its purpose and it should be removed to reduce complexity and 
> a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14641) Remove canHandleVersionRanges check

2020-07-10 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14641:
---

 Summary: Remove canHandleVersionRanges check
 Key: SOLR-14641
 URL: https://issues.apache.org/jira/browse/SOLR-14641
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat


SOLR-9207 introduced PeerSync with update ranges, which was committed in 6.2 and 
7.0. To maintain backward compatibility at the time, we introduced an endpoint in 
RealTimeGetComponent to check whether a node supports that feature or not. 
It served its purpose well, and it should now be removed to reduce complexity and 
to save a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14641) PeerSync, remove canHandleVersionRanges check

2020-07-10 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14641:

Summary: PeerSync, remove canHandleVersionRanges check  (was: PeerSync 
Remove canHandleVersionRanges check)

> PeerSync, remove canHandleVersionRanges check
> -
>
> Key: SOLR-14641
> URL: https://issues.apache.org/jira/browse/SOLR-14641
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> SOLR-9207 introduces PeerSync with updates range which committed in 6.2 and 
> 7.0. To maintain backward compatibility at the time we introduce an endpoint 
> in RealTimeGetComponent to check whether a node support that feature or not. 
> It served well its purpose and it should be removed to reduce complexity and 
> a request-response trip for asking that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-07-07 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152611#comment-17152611
 ] 

Cao Manh Dat commented on SOLR-14354:
-

Hi [~noble.paul], your comment is true, but here are a few things:
* In the past, HttpShardHandler already sent requests asynchronously (it spun up a new 
thread and called the request in a sync manner) to prevent blocking the caller thread. 
After this commit HttpShardHandler still sends requests in an async manner, so there is 
no change from the caller's view.
* The difference is that in the past HttpShardHandler, based on the size of the 
{{urls}} input, decided whether to use LBClient or HttpClient; now we only use 
LBClient regardless of the case. I think that makes things clearer (you can see my 
comment on the PR for the reasons).
* I think HttpShardHandler should only be used for sending multiple 
independent requests. If some places want to send a request in a sync manner, 
they should use Http2SolrClient instead. And yes, we should fix these places.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite inefficient on using threads. Basically we 
> want less threads and most of them must busy all the time, because threads 
> are not free as well as context switching. That is the main idea behind 
> everything, like executor
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything finished it will call {{onComplete}} listeners. One main 
> thing that will must notice here is all listeners should finish quick, if the 
> listener block, all further data of that request won’t be handled until the 
> listener finish.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 thread
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The 

[jira] [Resolved] (SOLR-14354) HttpShardHandler send requests in async

2020-07-06 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat resolved SOLR-14354.
-
Resolution: Fixed

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite inefficient on using threads.Basically we 
> want less threads and most of them must busy all the time, because threads 
> are not free as well as context switching. That is the main idea behind 
> everything, like executor
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path;)
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything finished it will call {{onComplete}} listeners. One main 
> thing that will must notice here is all listeners should finish quick, if the 
> listener block, all further data of that request won’t be handled until the 
> listener finish.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 thread
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take long time 
> since sending req is a very quick operation. With this operation, handling 
> threads won’t be spin up until first bytes are sent back. Notice that in this 
> approach we still have active threads waiting for more data from InputStream
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty have another listener called BufferingResponseListener. This is how it 
> is get used
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response = getContent();
>   //handling response
> }
>   }
> }); {code}
> On receiving data, Jetty (one of its 

[jira] [Updated] (SOLR-14354) HttpShardHandler send requests in async

2020-07-06 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14354:

Fix Version/s: 8.7
   master (9.0)

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite inefficient on using threads.Basically we 
> want less threads and most of them must busy all the time, because threads 
> are not free as well as context switching. That is the main idea behind 
> everything, like executor
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path;)
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything finished it will call {{onComplete}} listeners. One main 
> thing that will must notice here is all listeners should finish quick, if the 
> listener block, all further data of that request won’t be handled until the 
> listener finish.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 thread
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take long time 
> since sending req is a very quick operation. With this operation, handling 
> threads won’t be spin up until first bytes are sent back. Notice that in this 
> approach we still have active threads waiting for more data from InputStream
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty have another listener called BufferingResponseListener. This is how it 
> is get used
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response = getContent();
>   //handling response
> }
>   }
> }); {code}
> On 

[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-07-06 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152435#comment-17152435
 ] 

Cao Manh Dat commented on SOLR-14354:
-

The commit ended up with solution 1, leaving further exploration to another 
issue.

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite inefficient on using threads.Basically we 
> want less threads and most of them must busy all the time, because threads 
> are not free as well as context switching. That is the main idea behind 
> everything, like executor
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path;)
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything finished it will call {{onComplete}} listeners. One main 
> thing that will must notice here is all listeners should finish quick, if the 
> listener block, all further data of that request won’t be handled until the 
> listener finish.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 thread
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take long time 
> since sending req is a very quick operation. With this operation, handling 
> threads won’t be spin up until first bytes are sent back. Notice that in this 
> approach we still have active threads waiting for more data from InputStream
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty have another listener called BufferingResponseListener. This is how it 
> is get used
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response = getContent();
>   //handling response
> }
>   

[jira] [Updated] (SOLR-14354) HttpShardHandler send requests in async

2020-06-26 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14354:

Summary: HttpShardHandler send requests in async  (was: Async or using 
threads in better way for HttpShardHandler)

> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is the diagram describe the model on how currently handling a request.
> !image-2020-03-23-10-04-08-399.png!
> The main-thread that handles the search requests, will submit n requests (n 
> equals to number of shards) to an executor. So each request will correspond 
> to a thread, after sending a request that thread basically do nothing just 
> waiting for response from other side. That thread will be swapped out and CPU 
> will try to handle another thread (this is called context switch, CPU will 
> save the context of the current thread and switch to another one). When some 
> data (not all) come back, that thread will be called to parsing these data, 
> then it will wait until more data come back. So there will be lots of context 
> switching in CPU. That is quite inefficient on using threads.Basically we 
> want less threads and most of them must busy all the time, because threads 
> are not free as well as context switching. That is the main idea behind 
> everything, like executor
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path;)
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread will return immediately without 
> any block. Then when the client received the header from other side, it will 
> call {{onHeaders()}} listeners. When the client received some {{byte[]}} (not 
> all response) from the data it will call {{onContent(buffer)}} listeners. 
> When everything finished it will call {{onComplete}} listeners. One main 
> thing that will must notice here is all listeners should finish quick, if the 
> listener block, all further data of that request won’t be handled until the 
> listener finish.
> h2. 3. Solution 1: Sending requests async but spin one thread per response
>  Jetty HttpClient already provides several listeners, one of them is 
> InputStreamResponseListener. This is how it is get used
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be 2 thread
>  * one thread trying to read the response content from InputStream
>  * one thread (this is a short-live task) feeding content to above 
> InputStream whenever some byte[] is available. Note that if this thread 
> unable to feed data into InputStream, this thread will wait.
> By using this one, the model of HttpShardHandler can be written into 
> something like this
> {code:java}
> handler.sendReq(req, (is) -> {
>   executor.submit(() ->
> try (is) {
>   // Read the content from InputStream
> }
>   )
> }) {code}
>  The first diagram will be changed into this
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” is wide, it won’t take long time 
> since sending req is a very quick operation. With this operation, handling 
> threads won’t be spin up until first bytes are sent back. Notice that in this 
> approach we still have active threads waiting for more data from InputStream
> h2. 4. Solution 2: Buffering data and handle it inside jetty’s thread.
> Jetty have another listener called BufferingResponseListener. This is how it 
> is get used
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
> try {
>   byte[] response = getContent();
>   //handling response
> }
>   }

[jira] [Commented] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-21 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112881#comment-17112881
 ] 

Cao Manh Dat commented on SOLR-14419:
-

Ok then, it's a +1 from me.

> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112720#comment-17112720
 ] 

Cao Manh Dat edited comment on SOLR-14419 at 5/21/20, 2:12 AM:
---

{quote}
Query DSL objects need to go into dedicated {{queries}} property see SOLR-12490:
{quote}
If that is the case, won't it be confusing? It would be simpler for the user to 
assume that this part (inside the main query or filters)
{code}
{"param": "paramName"} 
{code}
will be translated to
{code}
paramValue // can be a string, a list, a json object picked from params
{code}

{quote}
 I recently get to solving this puzzle it's really tricky. I can share how to 
if you wish to see.
{quote}
Yes, this makes me curious.




was (Author: caomanhdat):
{quote}
Query DSL objects need to go into dedicated {{queries}} property see SOLR-12490:
{quote}
If that is the case, will it confusing? It will be simpler for user to assume 
that, this part
{code}
{"param": "paramName"} 
{code}
will be translated to
{code}
paramValue // can be a string, a list, a json object.
{code}

{quote}
 I recently get to solving this puzzle it's really tricky. I can share how to 
if you wish to see.
{quote}
Yes, this makes me curious.



> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112720#comment-17112720
 ] 

Cao Manh Dat commented on SOLR-14419:
-

{quote}
Query DSL objects need to go into dedicated {{queries}} property see SOLR-12490:
{quote}
If that is the case, won't it be confusing? It would be simpler for the user to 
assume that this part
{code}
{"param": "paramName"} 
{code}
will be translated to
{code}
paramValue // can be a string, a list, a json object.
{code}

{quote}
 I recently get to solving this puzzle it's really tricky. I can share how to 
if you wish to see.
{quote}
Yes, this makes me curious.



> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112182#comment-17112182
 ] 

Cao Manh Dat edited comment on SOLR-14419 at 5/20/20, 1:15 PM:
---

When I say paramValue as a JsonObject, I mean this:
{code:json}
{ "query": {
    "bool": {
      "must": {"param": "must_clauses"},
      "must_not": {"param": "must_not_clauses"}
    }
  },
  "params": {
    "must_clauses": ["type:parent", "type2:parent"],
    "must_not_clauses": {"bool": {...}}
  }
}
{code}
 


was (Author: caomanhdat):
When I say paramValue as a JsonObject, I mean this
{ "query": { "bool":{ "must":{"param":"must_clauses"}, 
"must_not":{"param":\{"must_not_clauses"  "params": {  
"must_clauses":["type:parent", "type2:parent"],
  "must_not_clauses" : \{"bool": {...}}
   }
}

> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112182#comment-17112182
 ] 

Cao Manh Dat commented on SOLR-14419:
-

When I say paramValue as a JsonObject, I mean this
{ "query": { "bool":{ "must":{"param":"must_clauses"}, 
"must_not":{"param":\{"must_not_clauses"  "params": {  
"must_clauses":["type:parent", "type2:parent"],
  "must_not_clauses" : \{"bool": {...}}
   }
}

> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112002#comment-17112002
 ] 

Cao Manh Dat edited comment on SOLR-14419 at 5/20/20, 9:58 AM:
---

 
{quote}recursive dependency
{quote}
My point here is that the paramValue here is just a String ("type:parent"). It 
would be nice if paramValue could be a JsonObject, but then recursive dependency 
would be a problem (a small illustrative sketch is below).
{quote}_feature is kinda limited_ I see no limits so far.
{quote}
I mean, I don't see many use cases where this feature will be useful?

Right, the $ will be a problem if the query starts with $. Then how do the 
traditional local params solve that problem?
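
To make the recursion concern concrete, here is a tiny, purely illustrative 
sketch of expanding param references against the params map once param values 
may themselves be JSON objects. It is not the JSON Request API implementation; 
{{ParamRefResolverSketch}} and {{resolve}} are made-up names, and the JSON is 
assumed to be already parsed into plain {{Map}}/{{List}} objects.
{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class ParamRefResolverSketch {

  // Replaces single-key {"param": "name"} nodes with the value registered under
  // "params". If param values can themselves contain param references, the names
  // currently being expanded must be tracked to detect cycles such as a -> b -> a.
  static Object resolve(Object node, Map<String, Object> params, Set<String> inProgress) {
    if (node instanceof Map) {
      Map<?, ?> map = (Map<?, ?>) node;
      if (map.size() == 1 && map.containsKey("param")) {
        String name = (String) map.get("param");
        if (!inProgress.add(name)) {
          throw new IllegalArgumentException("Recursive param reference: " + name);
        }
        Object value = resolve(params.get(name), params, inProgress);
        inProgress.remove(name);
        return value;
      }
      Map<String, Object> copy = new HashMap<>();
      for (Map.Entry<?, ?> e : map.entrySet()) {
        copy.put((String) e.getKey(), resolve(e.getValue(), params, inProgress));
      }
      return copy;
    }
    if (node instanceof List) {
      return ((List<?>) node).stream()
          .map(v -> resolve(v, params, inProgress))
          .collect(Collectors.toList());
    }
    return node; // strings, numbers, booleans and null pass through unchanged
  }
}
{code}
Usage would be something like {{resolve(queryJson, paramsJson, new HashSet<>())}}; 
without the {{inProgress}} set, a cycle like paramA -> paramB -> paramA would 
recurse forever.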


was (Author: caomanhdat):
 
{quote}recursive dependency
{quote}
My point here is the paramValue here is just a String ("type:parent"), It will 
be nice if paramValue is a JsonObject, then recurisve dependency will be a 
problem.
{quote}_feature is kinda limited_ I see no limits so far.
{quote}
I mean I don't see many usecase this feature will be useful?

Right, the $ will be a problem if the query start with $. 

> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112002#comment-17112002
 ] 

Cao Manh Dat edited comment on SOLR-14419 at 5/20/20, 9:57 AM:
---

 
{quote}recursive dependency
{quote}
My point here is that the paramValue here is just a String ("type:parent"). It 
would be nice if paramValue could be a JsonObject, but then recursive dependency 
would be a problem.
{quote}_feature is kinda limited_ I see no limits so far.
{quote}
I mean, I don't see many use cases where this feature will be useful?

Right, the $ will be a problem if the query starts with $. 


was (Author: caomanhdat):
 
{quote}recursive dependency
{quote}
My point here is the paramValue here is just a String ("type:parent"), It will 
be nice if paramValue is a JsonObject, then recurisve dependency will be a 
problem.
{quote}_feature is kinda limited_ I see no limits so far.
{quote}
I mean I don't see many usecase this feature will be useful?

 

> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17112002#comment-17112002
 ] 

Cao Manh Dat commented on SOLR-14419:
-

 
{quote}recursive dependency
{quote}
My point here is that the paramValue here is just a String ("type:parent"). It 
would be nice if paramValue could be a JsonObject, but then recursive dependency 
would be a problem.
{quote}_feature is kinda limited_ I see no limits so far.
{quote}
I mean, I don't see many use cases where this feature will be useful?

 

> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111882#comment-17111882
 ] 

Cao Manh Dat edited comment on SOLR-14419 at 5/20/20, 7:53 AM:
---

It seems {{{'param': 'paramName'}}} is too verbose and vague at the same time? 
Could it be {{$paramName}} only? (I don't like special characters, but we 
already have tags.)

It seems like paramValue can only be a String. If we support paramValue as a 
Json object, it may lead to a recursive dependency, i.e. paramA -> paramB -> 
paramA -> etc.

So the application of this feature is kinda limited, isn't it?


was (Author: caomanhdat):
It seems \{'param': 'paramName'} too verbose and vague at the same time? Can it 
be {{$paramName}} only (I don't like special character, but we already have 
tags).


 It seems like paramValue can only be a String. If we support paramValue as a 
Json object, it may leads to recursive dependency, i.e: paramA -> paramB -> 
paramA -> ...

So the application of this feature is kinda limited, is it?

> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111882#comment-17111882
 ] 

Cao Manh Dat edited comment on SOLR-14419 at 5/20/20, 7:53 AM:
---

It seems \{'param': 'paramName'} is too verbose and vague at the same time? 
Could it be {{$paramName}} only? (I don't like special characters, but we 
already have tags.)

It seems like paramValue can only be a String. If we support paramValue as a 
Json object, it may lead to a recursive dependency, i.e. paramA -> paramB -> 
paramA -> etc.

So the application of this feature is kinda limited, isn't it?


was (Author: caomanhdat):
It seems {{{'param': 'paramName'}}} too verbose and vague at the same time? Can 
it be {{$paramName}} only (I don't like special character, but we already have 
tags).

It seems like paramValue can only be a String. If we support paramValue as a 
Json object, it may leads to recursive dependency, i.e: paramA -> paramB -> 
paramA -> etc

So the application of this feature is kinda limited, is it?

> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111882#comment-17111882
 ] 

Cao Manh Dat commented on SOLR-14419:
-

It seems {'param': 'paramName'} is too verbose and vague at the same time? 
Could it be {{$paramName}} only? (I don't like special characters, but we 
already have tags.)
It seems like paramValue can only be a String. If we support paramValue as a 
Json object, it may lead to a recursive dependency, i.e. paramA -> paramB -> 
paramA -> ...
So the application of this feature is kinda limited, isn't it?


> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-20 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111882#comment-17111882
 ] 

Cao Manh Dat edited comment on SOLR-14419 at 5/20/20, 7:52 AM:
---

It seems \{'param': 'paramName'} is too verbose and vague at the same time? 
Could it be {{$paramName}} only? (I don't like special characters, but we 
already have tags.)


 It seems like paramValue can only be a String. If we support paramValue as a 
Json object, it may lead to a recursive dependency, i.e. paramA -> paramB -> 
paramA -> ...

So the application of this feature is kinda limited, isn't it?


was (Author: caomanhdat):
It seems {'param': 'paramName'} too verbose and vague at the same time? Can it 
be {{$paramName}} only (I don't like special character, but we already have 
tags).
It seems like paramValue can only be a String. If we support paramValue as a 
Json object, it may leads to recursive dependency, i.e: paramA -> paramB -> 
paramA -> ...
So the application of this feature is kinda limited, is it?


> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14419) Query DLS {"param":"ref"}

2020-05-19 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111061#comment-17111061
 ] 

Cao Manh Dat commented on SOLR-14419:
-

Hi [~mkhl], I'm going to look into the patch tomorrow.

> Query DLS {"param":"ref"}
> -
>
> Key: SOLR-14419
> URL: https://issues.apache.org/jira/browse/SOLR-14419
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: JSON Request API
>Reporter: Mikhail Khludnev
>Assignee: Mikhail Khludnev
>Priority: Major
> Fix For: 8.6
>
> Attachments: SOLR-14419.patch, SOLR-14419.patch, SOLR-14419.patch
>
>
> What we can do with plain params: 
> {{q=\{!parent which=$prnts}...=type:parent}}
> obviously I want to have something like this in Query DSL:
> {code}
> { "query": { "parents":{ "which":{"param":"prnts"}, "query":"..."}}
>   "params": {
>   "prnts":"type:parent"
>}
> }
> {code} 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Issue Comment Deleted] (SOLR-14488) Making replica from leader configurable

2020-05-19 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14488:

Comment: was deleted

(was: Hi [~cpoerschke], my plan here is to introduce another block for configuring 
tlog replicas and pull replicas. Putting everything under  of , seems counterintuitive to 
me. Something like this
{code}

  ... same content as  tag in  ...

{code})

> Making replica from leader configurable
> ---
>
> Key: SOLR-14488
> URL: https://issues.apache.org/jira/browse/SOLR-14488
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: replication (java)
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> Right now, users can't configure related parameters for replicating from 
> leader process. Like {{commitReserveDuration}}, throttling, etc.
> The default 10s value of {{commitReserveDuration}} can making replicate from 
> leader failed constantly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14488) Making replica from leader configurable

2020-05-19 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110894#comment-17110894
 ] 

Cao Manh Dat commented on SOLR-14488:
-

Hi [~cpoerschke], my plan here is to introduce another block for configuring 
tlog replicas and pull replicas. Putting everything under  of , seems counterintuitive to 
me. Something like this
{code}

  ... same content as  tag in  ...

{code}

> Making replica from leader configurable
> ---
>
> Key: SOLR-14488
> URL: https://issues.apache.org/jira/browse/SOLR-14488
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: replication (java)
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> Right now, users can't configure related parameters for replicating from 
> leader process. Like {{commitReserveDuration}}, throttling, etc.
> The default 10s value of {{commitReserveDuration}} can making replicate from 
> leader failed constantly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14488) Making replica from leader configurable

2020-05-14 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14488:
---

 Summary: Making replica from leader configurable
 Key: SOLR-14488
 URL: https://issues.apache.org/jira/browse/SOLR-14488
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat


Right now, users can't configure parameters related to the replicate-from-leader 
process, like {{commitReserveDuration}}, throttling, etc.
The default 10s value of {{commitReserveDuration}} can make replicating from the 
leader fail constantly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-12642) SolrCmdDistributor should send updates in batch when use Http2SolrClient?

2020-04-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-12642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094072#comment-17094072
 ] 

Cao Manh Dat commented on SOLR-12642:
-

The numbers above are not correct. Basically, the benchmark above was performed 
on a buggy Jetty version. During indexing, the leader failed to send updates to 
replicas quite frequently even without any networking issue, so the replicas 
were put into recovery mode. The replicas did not do much indexing, which 
explains the apparent gains in CPU time and garbage generated. On the leader's 
side the CPU time is 50% higher, since it needs to stream index files back to 
the replicas during recovery.

> SolrCmdDistributor should send updates in batch when use Http2SolrClient?
> -
>
> Key: SOLR-12642
> URL: https://issues.apache.org/jira/browse/SOLR-12642
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: 8.0
>
> Attachments: http2-branch.log, master-branch.log
>
>
> In the past, batch updates are sent in a single stream from the leader, the 
> replica will create a single thread to parse all the updates. For the 
> simplicity of {{SOLR-12605}}, the leader is now sending individual updates to 
> replicas, therefore they are now parsing updates in different threads which 
> increase the usage of memory and CPU.
> In the past, this is an unacceptable approach, because, for every update, we 
> must create different connections to replicas. But with the support of 
> HTTP/2, all updates will be sent in a single connection from leader to a 
> replica. Therefore the cost is not as high as it used to be.
> On the other hand, sending individual updates will improve the indexing 
> performance and better error-handling for failures of a single update in a 
> batch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14356) PeerSync should not fail with SocketTimeoutException from hanging nodes

2020-04-02 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14356:

Fix Version/s: 8.6
   master (9.0)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> PeerSync should not fail with SocketTimeoutException from hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.6
>
> Attachments: SOLR-14356.patch, SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), in case of exception on 
> requesting versions to a node, we will skip that node if exception is one the 
> following type
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometime the other node basically hang but still accept connection. In that 
> case SocketTimeoutException is thrown and we consider the {{PeerSync}} 
> process as failed and the whole shard just basically leaderless forever (as 
> long as the hang node still there).
> We can't just blindly adding {{SocketTimeoutException}} to above list, since 
> [~shalin] mentioned that sometimes timeout can happen because of genuine 
> reasons too e.g. temporary GC pause.
> I think the general idea here is we obey {{leaderVoteWait}} restriction and 
> retry doing sync with others in case of connection/timeout exception happen.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (SOLR-14356) PeerSync should not fail with SocketTimeoutException from hanging nodes

2020-04-02 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat reassigned SOLR-14356:
---

Assignee: Cao Manh Dat

> PeerSync should not fail with SocketTimeoutException from hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.6
>
> Attachments: SOLR-14356.patch, SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), in case of exception on 
> requesting versions to a node, we will skip that node if exception is one the 
> following type
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometime the other node basically hang but still accept connection. In that 
> case SocketTimeoutException is thrown and we consider the {{PeerSync}} 
> process as failed and the whole shard just basically leaderless forever (as 
> long as the hang node still there).
> We can't just blindly adding {{SocketTimeoutException}} to above list, since 
> [~shalin] mentioned that sometimes timeout can happen because of genuine 
> reasons too e.g. temporary GC pause.
> I think the general idea here is we obey {{leaderVoteWait}} restriction and 
> retry doing sync with others in case of connection/timeout exception happen.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14356) PeerSync should not fail with SocketTimeoutException from hanging nodes

2020-04-02 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14356:

Summary: PeerSync should not fail with SocketTimeoutException from hanging 
nodes  (was: PeerSync with hanging nodes)

> PeerSync should not fail with SocketTimeoutException from hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch, SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), in case of exception on 
> requesting versions to a node, we will skip that node if exception is one the 
> following type
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometime the other node basically hang but still accept connection. In that 
> case SocketTimeoutException is thrown and we consider the {{PeerSync}} 
> process as failed and the whole shard just basically leaderless forever (as 
> long as the hang node still there).
> We can't just blindly adding {{SocketTimeoutException}} to above list, since 
> [~shalin] mentioned that sometimes timeout can happen because of genuine 
> reasons too e.g. temporary GC pause.
> I think the general idea here is we obey {{leaderVoteWait}} restriction and 
> retry doing sync with others in case of connection/timeout exception happen.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14365) CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values

2020-03-31 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071619#comment-17071619
 ] 

Cao Manh Dat commented on SOLR-14365:
-

[~jbernste] [~shalin] please take a look at the patch.

> CollapsingQParser - Avoiding always allocate int[] and float[] with size 
> equals to number of unique values
> --
>
> Key: SOLR-14365
> URL: https://issues.apache.org/jira/browse/SOLR-14365
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 8.4.1
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14365.patch
>
>
> Since Collapsing is a PostFilter, documents that reach Collapsing must already match
> all filters and queries, so the number of documents Collapsing needs to
> collect/score is a small fraction of the total number of documents in
> the index. So why do we always need to consume the memory (for the int[] and
> float[] arrays) for all unique values of the collapsed field? If the number of
> unique values of the collapsed field found in the documents that match
> the queries and filters is 300, then we only need int[] and float[] arrays with
> a size of 300, not 1.2 million. However, we don't know which values
> of the collapsed field will show up in the results, so we cannot use a smaller
> array.
> The easy fix for this problem is to use only as much as we need, via an IntIntMap
> and an IntFloatMap that hold primitives and are much more space efficient than
> the Java HashMap. These maps can be slower (10x or 20x) than plain int[] and
> float[] if the number of matched documents is large (almost all documents match the queries
> and other filters). But our belief is that this does not happen very frequently
> (how often do we run collapsing on the entire index?).
> For this issue I propose adding 2 methods for collapsing:
> * array: the current implementation
> * hash: the new approach, which will be the default method
> Later we can add another method, {{smart}}, which automatically picks a method
> based on a comparison between the {{number of docs matched queries and filters}}
> and the {{number of unique values of the field}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14365) CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values

2020-03-31 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071610#comment-17071610
 ] 

Cao Manh Dat commented on SOLR-14365:
-

Attached a WIP patch that includes

* separating out the logic of creating the {{map}}s (backed by an {{array}} or a
{{hashMap}}) from how collapse uses them, so we only need to create the
{{mapFactory}} corresponding to a method (sketched below).
* a WIP benchmark test, to ensure that we actually gain something when
collapsing on only part of the result.
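
A rough sketch of that factory idea, with hypothetical names (not the classes in the attached patch): the collector only sees a small map interface, and the factory decides whether it is backed by a plain array sized to the number of unique values or by a hash map that only grows with the ordinals actually seen.

{code:java}
/** Minimal int->int map view, so the collector does not care how values are stored. */
interface IntIntMapView {
  int get(int key);             // returns the configured "empty" sentinel for unseen keys
  void put(int key, int value);
}

/** "array" method: one slot per unique value of the collapsed field, allocated up front. */
final class ArrayBackedIntIntMap implements IntIntMapView {
  private final int[] values;
  ArrayBackedIntIntMap(int numUniqueValues, int emptyValue) {
    values = new int[numUniqueValues];
    java.util.Arrays.fill(values, emptyValue);
  }
  @Override public int get(int key) { return values[key]; }
  @Override public void put(int key, int value) { values[key] = value; }
}

/** "hash" method: only pays for the ordinals the post filter actually sees. */
final class HashBackedIntIntMap implements IntIntMapView {
  private final java.util.Map<Integer, Integer> values = new java.util.HashMap<>();
  private final int emptyValue;
  HashBackedIntIntMap(int emptyValue) { this.emptyValue = emptyValue; }
  @Override public int get(int key) { return values.getOrDefault(key, emptyValue); }
  @Override public void put(int key, int value) { values.put(key, value); }
}

/** The factory the collapsing collector would ask for, chosen by the "method" parameter. */
interface MapFactory {
  IntIntMapView newIntIntMap(int numUniqueValues, int emptyValue);
}
{code}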

> CollapsingQParser - Avoiding always allocate int[] and float[] with size 
> equals to number of unique values
> --
>
> Key: SOLR-14365
> URL: https://issues.apache.org/jira/browse/SOLR-14365
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 8.4.1
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14365.patch
>
>
> Since Collapsing is a PostFilter, documents that reach Collapsing must already match
> all filters and queries, so the number of documents Collapsing needs to
> collect/score is a small fraction of the total number of documents in
> the index. So why do we always need to consume the memory (for the int[] and
> float[] arrays) for all unique values of the collapsed field? If the number of
> unique values of the collapsed field found in the documents that match
> the queries and filters is 300, then we only need int[] and float[] arrays with
> a size of 300, not 1.2 million. However, we don't know which values
> of the collapsed field will show up in the results, so we cannot use a smaller
> array.
> The easy fix for this problem is to use only as much as we need, via an IntIntMap
> and an IntFloatMap that hold primitives and are much more space efficient than
> the Java HashMap. These maps can be slower (10x or 20x) than plain int[] and
> float[] if the number of matched documents is large (almost all documents match the queries
> and other filters). But our belief is that this does not happen very frequently
> (how often do we run collapsing on the entire index?).
> For this issue I propose adding 2 methods for collapsing:
> * array: the current implementation
> * hash: the new approach, which will be the default method
> Later we can add another method, {{smart}}, which automatically picks a method
> based on a comparison between the {{number of docs matched queries and filters}}
> and the {{number of unique values of the field}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14365) CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values

2020-03-31 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14365:

Attachment: SOLR-14365.patch

> CollapsingQParser - Avoiding always allocate int[] and float[] with size 
> equals to number of unique values
> --
>
> Key: SOLR-14365
> URL: https://issues.apache.org/jira/browse/SOLR-14365
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 8.4.1
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14365.patch
>
>
> Since Collapsing is a PostFilter, documents that reach Collapsing must already match
> all filters and queries, so the number of documents Collapsing needs to
> collect/score is a small fraction of the total number of documents in
> the index. So why do we always need to consume the memory (for the int[] and
> float[] arrays) for all unique values of the collapsed field? If the number of
> unique values of the collapsed field found in the documents that match
> the queries and filters is 300, then we only need int[] and float[] arrays with
> a size of 300, not 1.2 million. However, we don't know which values
> of the collapsed field will show up in the results, so we cannot use a smaller
> array.
> The easy fix for this problem is to use only as much as we need, via an IntIntMap
> and an IntFloatMap that hold primitives and are much more space efficient than
> the Java HashMap. These maps can be slower (10x or 20x) than plain int[] and
> float[] if the number of matched documents is large (almost all documents match the queries
> and other filters). But our belief is that this does not happen very frequently
> (how often do we run collapsing on the entire index?).
> For this issue I propose adding 2 methods for collapsing:
> * array: the current implementation
> * hash: the new approach, which will be the default method
> Later we can add another method, {{smart}}, which automatically picks a method
> based on a comparison between the {{number of docs matched queries and filters}}
> and the {{number of unique values of the field}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14356) PeerSync with hanging nodes

2020-03-29 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14356:

Attachment: SOLR-14356.patch

> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch, SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14356) PeerSync with hanging nodes

2020-03-29 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14356:

Status: Patch Available  (was: Open)

> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch, SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14356) PeerSync with hanging nodes

2020-03-29 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14356:

Status: Open  (was: Patch Available)

> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch, SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14356) PeerSync with hanging nodes

2020-03-29 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14356:

Attachment: (was: SOLR-14356.patch)

> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch, SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14356) PeerSync with hanging nodes

2020-03-29 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14356:

Status: Patch Available  (was: Open)

> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch, SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14356) PeerSync with hanging nodes

2020-03-29 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14356:

Attachment: SOLR-14356.patch

> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch, SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14368) SyncStrategy result should not prevent a replica to become leader

2020-03-27 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14368:

Description: 
h2. History

In the beginning of SolrCloud, to become leader a replica needed to _sync_
with the other replicas. This process includes
 * Comparing the current replica's (the leader candidate's) tlog with the other replicas'.
For example, if the current candidate's data is too far behind the others, that replica
should not become leader.
 * Requesting the other replicas to sync back before becoming leader. Imagine
the old leader got shut down while it was trying to send multiple updates (u1,
u2, u3, u4) to the others:
 * Replica A may receive updates (u1, u2)
 * Replica B may receive updates (u3, u4)
 * If replica A becomes leader and does not request replica B to sync back,
replica B then needs to go into a recovery process, which is costly.

But this process has some problems
 # We only sync with live replicas, so if there are no other live replicas at
the time of the election, the current replica can blindly become leader -> data
loss. This problem was fixed with SOLR-11702.
 # Any IOException that is not caught properly during the communication
between the current replica and the others can prevent that replica from becoming
leader.

h2. Idea

Basically, with the new ShardTerms information, we can pick any replica with
the highest _term_ to become leader. The reason is that a replica's _term_
effectively represents how up-to-date that replica is with the leader.

The only meaning of _sync_ with other replicas now is to prevent costly
recovery processes from happening. Therefore SyncStrategy should not prevent a
replica from becoming a leader.

  was:Update later...
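
A rough sketch of the "highest term wins" selection, using a plain map of replica name to term as a stand-in for the real ShardTerms data (class and method names are illustrative only):

{code:java}
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class LeaderCandidates {
  /**
   * Returns the replicas whose term equals the maximum term in the shard.
   * Any of them is safe to elect, because the term reflects how up-to-date
   * a replica is relative to the last leader; a failed best-effort sync
   * would no longer veto such a replica.
   */
  static List<String> pickMostUpToDate(Map<String, Long> replicaTerms) {
    long maxTerm = replicaTerms.values().stream().mapToLong(Long::longValue).max().orElse(0L);
    return replicaTerms.entrySet().stream()
        .filter(e -> e.getValue() == maxTerm)
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }
}
{code}

For example, with terms {replicaA=5, replicaB=5, replicaC=3}, either replicaA or replicaB may become leader.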


> SyncStrategy result should not prevent a replica to become leader
> -
>
> Key: SOLR-14368
> URL: https://issues.apache.org/jira/browse/SOLR-14368
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> h2. History
> In the beginning of SolrCloud, to become leader a replica needed to _sync_
> with the other replicas. This process includes
>  * Comparing the current replica's (the leader candidate's) tlog with the other
> replicas'. For example, if the current candidate's data is too far behind the others, that
> replica should not become leader.
>  * Requesting the other replicas to sync back before becoming leader. Imagine
> the old leader got shut down while it was trying to send multiple
> updates (u1, u2, u3, u4) to the others:
>  * Replica A may receive updates (u1, u2)
>  * Replica B may receive updates (u3, u4)
>  * If replica A becomes leader and does not request replica B to sync
> back, replica B then needs to go into a recovery process, which is costly.
> But this process has some problems
>  # We only sync with live replicas, so if there are no other live replicas at
> the time of the election, the current replica can blindly become leader -> data
> loss. This problem was fixed with SOLR-11702.
>  # Any IOException that is not caught properly during the communication
> between the current replica and the others can prevent that replica from becoming
> leader.
> h2. Idea
> Basically, with the new ShardTerms information, we can pick any replica
> with the highest _term_ to become leader. The reason is that a replica's _term_
> effectively represents how up-to-date that replica is with the leader.
> The only meaning of _sync_ with other replicas now is to prevent costly
> recovery processes from happening. Therefore SyncStrategy should not prevent
> a replica from becoming a leader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14356) PeerSync with hanging nodes

2020-03-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068520#comment-17068520
 ] 

Cao Manh Dat edited comment on SOLR-14356 at 3/27/20, 10:27 AM:


On second thought, I think it is sufficient for now to just add the exception to the
list and revisit the retry problem in another issue. The reasons are
 * We already count ConnectTimeoutException as a success
 * The more I think about how a replica relies on the result of SyncStrategy to
become leader, the more error-prone it feels. Will open another issue for this.

[~shalin] WDYT? Opened SOLR-14368


was (Author: caomanhdat):
On second thought, I think it is sufficient for now to just add the exception to the
list and revisit the retry problem in another issue. The reasons are
 * We already count ConnectTimeoutException as a success
 * The more I think about how a replica relies on the result of SyncStrategy to
become leader, the more error-prone it feels. Will open another issue for this.

[~shalin] WDYT?

> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14368) SyncStrategy result should not prevent a replica to become leader

2020-03-27 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14368:
---

 Summary: SyncStrategy result should not prevent a replica to 
become leader
 Key: SOLR-14368
 URL: https://issues.apache.org/jira/browse/SOLR-14368
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat


Update later...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14356) PeerSync with hanging nodes

2020-03-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068520#comment-17068520
 ] 

Cao Manh Dat edited comment on SOLR-14356 at 3/27/20, 10:24 AM:


On second thought, I think it is sufficient for now to just add the exception to the
list and revisit the retry problem in another issue. The reasons are
 * We already count ConnectTimeoutException as a success
 * The more I think about how a replica relies on the result of SyncStrategy to
become leader, the more error-prone it feels. Will open another issue for this.

[~shalin] WDYT?


was (Author: caomanhdat):
On second thought, I think it is sufficient for now to just add the exception to the
list and revisit the retry problem in another issue. The reasons are
 * We already count ConnectTimeoutException as a success
 * The more I think about how a replica relies on the result of SyncStrategy to
become leader, the more error-prone it feels. Will open another issue for this.

> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14356) PeerSync with hanging nodes

2020-03-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068520#comment-17068520
 ] 

Cao Manh Dat commented on SOLR-14356:
-

On second thought, I think it is sufficient for now to just add the exception to the
list and revisit the retry problem in another issue. The reasons are
 * We already count ConnectTimeoutException as a success
 * The more I think about how a replica relies on the result of SyncStrategy to
become leader, the more error-prone it feels. Will open another issue for this.

> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14356) PeerSync with hanging nodes

2020-03-27 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14356:

Attachment: SOLR-14356.patch

> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
> Attachments: SOLR-14356.patch
>
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (SOLR-14366) Adding number of backed up documents to backup.properties

2020-03-27 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat reassigned SOLR-14366:
---

Assignee: Cao Manh Dat

> Adding number of backed up documents to backup.properties
> -
>
> Key: SOLR-14366
> URL: https://issues.apache.org/jira/browse/SOLR-14366
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> This is needed information, but it is non-trivial to obtain.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14366) Adding number of backed up documents to backup.properties

2020-03-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068382#comment-17068382
 ] 

Cao Manh Dat commented on SOLR-14366:
-

Basically, right now we ask the deletionPolicy to give us an IndexCommit, which contains
 * the list of files
 * the number of segments
 * the commit's userData

But no maxDoc, since that information is obtained by summing up maxDoc over all
segmentInfos. In theory we could get an IndexReader from the SolrCore and then get the
IndexCommit from that Reader instead of going through the deletionPolicy, but
getting the IndexCommit from the deletionPolicy gives us a guarantee that all files
related to that commit are not deleted. I'm not sure we can get the same guarantee from
the Reader.

Any ideas [~hossman] [~varun]?
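
For reference, a minimal sketch of the "sum up maxDoc over all segmentInfos" idea, assuming lucene-core on the classpath (the class and method names are illustrative, not the committed implementation; note that maxDoc also counts deleted documents):

{code:java}
import java.io.IOException;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.SegmentCommitInfo;
import org.apache.lucene.index.SegmentInfos;

final class BackupDocCount {
  /** Sums maxDoc over all segments referenced by the given commit point. */
  static int totalMaxDoc(IndexCommit commit) throws IOException {
    SegmentInfos infos =
        SegmentInfos.readCommit(commit.getDirectory(), commit.getSegmentsFileName());
    int maxDoc = 0;
    for (SegmentCommitInfo info : infos) {
      maxDoc += info.info.maxDoc();
    }
    return maxDoc;
  }
}
{code}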

> Adding number of backed up documents to backup.properties
> -
>
> Key: SOLR-14366
> URL: https://issues.apache.org/jira/browse/SOLR-14366
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
>
> This is needed information, but it is non-trivial to obtain.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14366) Adding number of backed up documents to backup.properties

2020-03-27 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14366:
---

 Summary: Adding number of backed up documents to backup.properties
 Key: SOLR-14366
 URL: https://issues.apache.org/jira/browse/SOLR-14366
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat


This is needed information, but it is non-trivial to obtain.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14365) CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values

2020-03-26 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068280#comment-17068280
 ] 

Cao Manh Dat commented on SOLR-14365:
-

Hi [~jbernste] [~shalin], should we add another method or just implicitly
change the default implementation?

> CollapsingQParser - Avoiding always allocate int[] and float[] with size 
> equals to number of unique values
> --
>
> Key: SOLR-14365
> URL: https://issues.apache.org/jira/browse/SOLR-14365
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 8.4.1
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> Since Collapsing is a PostFilter, documents that reach Collapsing must already match
> all filters and queries, so the number of documents Collapsing needs to
> collect/score is a small fraction of the total number of documents in
> the index. So why do we always need to consume the memory (for the int[] and
> float[] arrays) for all unique values of the collapsed field? If the number of
> unique values of the collapsed field found in the documents that match
> the queries and filters is 300, then we only need int[] and float[] arrays with
> a size of 300, not 1.2 million. However, we don't know which values
> of the collapsed field will show up in the results, so we cannot use a smaller
> array.
> The easy fix for this problem is to use only as much as we need, via an IntIntMap
> and an IntFloatMap that hold primitives and are much more space efficient than
> the Java HashMap. These maps can be slower (10x or 20x) than plain int[] and
> float[] if the number of matched documents is large (almost all documents match the queries
> and other filters). But our belief is that this does not happen very frequently
> (how often do we run collapsing on the entire index?).
> For this issue I propose adding 2 methods for collapsing:
> * array: the current implementation
> * hash: the new approach, which will be the default method
> Later we can add another method, {{smart}}, which automatically picks a method
> based on a comparison between the {{number of docs matched queries and filters}}
> and the {{number of unique values of the field}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14365) CollapsingQParser - Avoiding always allocate int[] and float[] with size equals to number of unique values

2020-03-26 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14365:
---

 Summary: CollapsingQParser - Avoiding always allocate int[] and 
float[] with size equals to number of unique values
 Key: SOLR-14365
 URL: https://issues.apache.org/jira/browse/SOLR-14365
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 8.4.1
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat


Since Collapsing is a PostFilter, documents that reach Collapsing must already match
all filters and queries, so the number of documents Collapsing needs to
collect/score is a small fraction of the total number of documents in the
index. So why do we always need to consume the memory (for the int[] and float[]
arrays) for all unique values of the collapsed field? If the number of unique
values of the collapsed field found in the documents that match the queries and
filters is 300, then we only need int[] and float[] arrays with a size of 300,
not 1.2 million. However, we don't know which values of the collapsed
field will show up in the results, so we cannot use a smaller array.

The easy fix for this problem is to use only as much as we need, via an IntIntMap
and an IntFloatMap that hold primitives and are much more space efficient than the
Java HashMap. These maps can be slower (10x or 20x) than plain int[] and
float[] if the number of matched documents is large (almost all documents match the queries and
other filters). But our belief is that this does not happen very frequently (how
often do we run collapsing on the entire index?).

For this issue I propose adding 2 methods for collapsing:
* array: the current implementation
* hash: the new approach, which will be the default method
Later we can add another method, {{smart}}, which automatically picks a method
based on a comparison between the {{number of docs matched queries and filters}} and
the {{number of unique values of the field}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-8274) Add per-request MDC logging based on user-provided value.

2020-03-23 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17064618#comment-17064618
 ] 

Cao Manh Dat commented on SOLR-8274:


[~dsmiley] I think it can; right now it encodes basic data like the trace-id and time
into HTTP headers, so we could use that to pass anything else we want.
But whether we should leverage it for this case or not, I'm not sure.
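
For illustration only, a minimal sketch of the kind of per-request MDC tagging the quoted issue proposes (the header name "X-Request-ID" and the MDC key "requestId" are assumptions, and this is not Solr's actual mechanism):

{code:java}
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import org.slf4j.MDC;

public class RequestIdMdcFilter implements Filter {
  @Override public void init(FilterConfig config) {}
  @Override public void destroy() {}

  @Override
  public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
      throws IOException, ServletException {
    String id = ((HttpServletRequest) req).getHeader("X-Request-ID");
    if (id != null) {
      MDC.put("requestId", id); // every log line on this thread now carries the id
    }
    try {
      chain.doFilter(req, res);
    } finally {
      MDC.remove("requestId"); // don't leak the value to the next request on this thread
    }
  }
}
{code}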

> Add per-request MDC logging based on user-provided value.
> -
>
> Key: SOLR-8274
> URL: https://issues.apache.org/jira/browse/SOLR-8274
> Project: Solr
>  Issue Type: Improvement
>  Components: logging
>Reporter: Jason Gerlowski
>Priority: Minor
> Attachments: SOLR-8274.patch
>
>
> *Problem 1* Currently, there's no way (AFAIK) to find all log messages 
> associated with a particular request.
> *Problem 2* There's also no easy way for multi-tenant Solr setups to find all 
> log messages associated with a particular customer/tenant.
> Both of these problems would be more manageable if Solr could be configured 
> to record an MDC tag based on a header, or some other user provided value.
> This would allow admins to group together logs about a single request.  If 
> the same header value is repeated multiple times this functionality could 
> also be used to group together arbitrary requests, such as those that come 
> from a particular user, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14356) PeerSync with hanging nodes

2020-03-23 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14356:
---

 Summary: PeerSync with hanging nodes
 Key: SOLR-14356
 URL: https://issues.apache.org/jira/browse/SOLR-14356
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat


Right now in {{PeerSync}} (during leader election), when an exception occurs while
requesting versions from a node, we skip that node if the exception is one of the
following types:
* ConnectTimeoutException
* NoHttpResponseException
* SocketException
Sometimes the other node basically hangs but still accepts connections. In that
case a SocketTimeoutException is thrown, we consider the {{PeerSync}} process
failed, and the whole shard is basically leaderless forever (as long as the
hanging node is still there).

We can't just blindly add {{SocketTimeoutException}} to the above list, since
[~shalin] mentioned that a timeout can also happen for genuine
reasons, e.g. a temporary GC pause.
I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
retry syncing with the others.
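
For illustration, a minimal sketch of the exception classification described above (the helper name is hypothetical, not the actual PeerSync code); the point is that SocketTimeoutException is not on the list:

{code:java}
import java.net.SocketException;
import org.apache.http.NoHttpResponseException;
import org.apache.http.conn.ConnectTimeoutException;

final class PeerSyncErrors {
  /**
   * Mirrors the list above: these exception types mean the peer is unreachable,
   * so PeerSync skips that node instead of failing the whole sync. A plain
   * SocketTimeoutException is deliberately absent, which is why a node that
   * hangs but still accepts connections currently fails the sync.
   */
  static boolean isPeerUnreachable(Throwable t) {
    return t instanceof ConnectTimeoutException
        || t instanceof NoHttpResponseException
        || t instanceof SocketException;
  }
}
{code}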




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14356) PeerSync with hanging nodes

2020-03-23 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14356:

Description: 
Right now in {{PeerSync}} (during leader election), when an exception occurs while
requesting versions from a node, we skip that node if the exception is one of the
following types:
* ConnectTimeoutException
* NoHttpResponseException
* SocketException
Sometimes the other node basically hangs but still accepts connections. In that
case a SocketTimeoutException is thrown, we consider the {{PeerSync}} process
failed, and the whole shard is basically leaderless forever (as long as the
hanging node is still there).

We can't just blindly add {{SocketTimeoutException}} to the above list, since
[~shalin] mentioned that a timeout can also happen for genuine
reasons, e.g. a temporary GC pause.
I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
retry syncing with the others when a connection/timeout exception happens.


  was:
Right now in {{PeerSync}} (during leader election), when an exception occurs while
requesting versions from a node, we skip that node if the exception is one of the
following types:
* ConnectTimeoutException
* NoHttpResponseException
* SocketException
Sometimes the other node basically hangs but still accepts connections. In that
case a SocketTimeoutException is thrown, we consider the {{PeerSync}} process
failed, and the whole shard is basically leaderless forever (as long as the
hanging node is still there).

We can't just blindly add {{SocketTimeoutException}} to the above list, since
[~shalin] mentioned that a timeout can also happen for genuine
reasons, e.g. a temporary GC pause.
I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
retry syncing with the others.



> PeerSync with hanging nodes
> ---
>
> Key: SOLR-14356
> URL: https://issues.apache.org/jira/browse/SOLR-14356
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Priority: Major
>
> Right now in {{PeerSync}} (during leader election), when an exception occurs while
> requesting versions from a node, we skip that node if the exception is one of the
> following types:
> * ConnectTimeoutException
> * NoHttpResponseException
> * SocketException
> Sometimes the other node basically hangs but still accepts connections. In that
> case a SocketTimeoutException is thrown, we consider the {{PeerSync}}
> process failed, and the whole shard is basically leaderless forever (as
> long as the hanging node is still there).
> We can't just blindly add {{SocketTimeoutException}} to the above list, since
> [~shalin] mentioned that a timeout can also happen for genuine
> reasons, e.g. a temporary GC pause.
> I think the general idea here is that we obey the {{leaderVoteWait}} restriction and
> retry syncing with the others when a connection/timeout exception happens.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14354) Async or using threads in better way for HttpShardHandler

2020-03-22 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14354:
---

 Summary: Async or using threads in better way for HttpShardHandler
 Key: SOLR-14354
 URL: https://issues.apache.org/jira/browse/SOLR-14354
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat
 Attachments: image-2020-03-23-10-04-08-399.png, 
image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png

h2. 1. Current approach (problem) of Solr

Below is a diagram describing how a request is currently handled.

!image-2020-03-23-10-04-08-399.png!

The main thread that handles the search request will submit n requests (n
equal to the number of shards) to an executor. Each request corresponds to
a thread; after sending its request, that thread basically does nothing but wait
for the response from the other side. The thread will be swapped out and the CPU will try
to handle another thread (this is called a context switch: the CPU saves the
context of the current thread and switches to another one). When some data (not
all) comes back, the thread is woken up to parse it, then it
waits until more data comes back. So there is a lot of context switching on the
CPU, which is quite an inefficient use of threads. Basically we want fewer threads,
and most of them should be busy all the time, because threads are not free and neither
is context switching. That is the main idea behind things like executors.
h2. 2. Async call of Jetty HttpClient

Jetty HttpClient offers an async API like this:
{code:java}
httpClient.newRequest("http://domain.com/path")
// Add request hooks
.onRequestQueued(request -> { ... })
.onRequestBegin(request -> { ... })

// Add response hooks
.onResponseBegin(response -> { ... })
.onResponseHeaders(response -> { ... })
.onResponseContent((response, buffer) -> { ... })

.send(result -> { ... }); {code}
Therefore, after calling {{send()}} the thread returns immediately without
blocking. When the client receives the headers from the other side, it
calls the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not
the whole response) it calls the {{onContent(buffer)}} listeners. When
everything has finished it calls the {{onComplete}} listeners. One main thing to
notice here is that all listeners should finish quickly; if a listener
blocks, no further data for that request will be handled until the listener
finishes.
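
To make the "listeners must finish quickly" point concrete, here is a minimal sketch (the URL, executor, and {{handleChunk}} are placeholders, not Solr code): copy the bytes out of Jetty's buffer and hand the real work to an application executor so the listener returns immediately.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.eclipse.jetty.client.HttpClient;

public class NonBlockingContentExample {
  public static void main(String[] args) throws Exception {
    HttpClient httpClient = new HttpClient();
    httpClient.start();
    ExecutorService appExecutor = Executors.newFixedThreadPool(4);

    httpClient.newRequest("http://localhost:8983/solr/collection1/select?q=*:*")
        .onResponseContent((response, buffer) -> {
          // Copy the bytes out; Jetty may reuse the buffer after this listener returns.
          byte[] chunk = new byte[buffer.remaining()];
          buffer.get(chunk);
          // Do the (possibly slow) work elsewhere so this listener stays quick.
          appExecutor.submit(() -> handleChunk(chunk));
        })
        .send(result -> System.out.println("request complete: " + result));
  }

  private static void handleChunk(byte[] chunk) {
    // parse / accumulate the partial response here
  }
}
{code}
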
h2. 3. Solution 1: Send requests async but spin up one thread per response


Jetty HttpClient already provides several listeners; one of them is
InputStreamResponseListener. This is how it is used:
{code:java}
InputStreamResponseListener listener = new InputStreamResponseListener();
client.newRequest(...).send(listener);

// Wait for the response headers to arrive
Response response = listener.get(5, TimeUnit.SECONDS);
if (response.getStatus() == 200) {
  // Obtain the input stream on the response content
  try (InputStream input = listener.getInputStream()) {
// Read the response content
  }
} {code}
In this case, there will be 2 threads
 * one thread trying to read the response content from the InputStream
 * one thread (a short-lived task) feeding content into the above InputStream
whenever some byte[] is available. Note that if this thread is unable to feed data
into the InputStream, it will wait.

Using this, the model of HttpShardHandler can be rewritten into something
like this:
{code:java}
handler.sendReq(req, (is) -> {
  executor.submit(() -> {
    try (is) {
      // Read the content from the InputStream
    } catch (IOException e) {
      // handle the failure
    }
  });
}); {code}

The first diagram then becomes this:

!image-2020-03-23-10-09-10-221.png!

Notice that although “sending req to shard1” is drawn wide, it does not take long,
since sending a request is a very quick operation. With this approach, handling
threads are not spun up until the first bytes come back. Notice that in this
approach we still have active threads waiting for more data from the InputStream.
h2. 4. Solution 2: Buffering the data and handling it inside Jetty’s thread

Jetty has another listener called BufferingResponseListener. This is how it is
used:
{code:java}
client.newRequest(...).send(new BufferingResponseListener() {
  @Override
  public void onComplete(Result result) {
    byte[] response = getContent();
    // handle the response
  }
}); {code}
On receiving data, Jetty (one of its threads) calls the listener with the
given data (the data here is just a byte[] representing part of the response). The
listener then buffers that byte[] into an internal buffer. When all the
data has been received, Jetty calls the listener's onComplete, and inside that
method we can access the whole response.

By using this one, the model of HttpShardHandler can be written 

[jira] [Comment Edited] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047131#comment-17047131
 ] 

Cao Manh Dat edited comment on SOLR-14286 at 2/28/20 1:45 AM:
--

Hoping that the above fixes solved the problem. The only difference between
"jeger-thrift-1.1.0.jar.sha1" and the above files is a newline at the end of the sha
file. This problem seems to have happened before:
https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ded726a

I think we need to write out all the steps needed to upgrade a library; there are
several mistakes that can be made easily. The above error could not be found during
precommit.


was (Author: caomanhdat):
Hoping that the above fixes solved the problem. The only difference between
"jeger-thrift-1.1.0.jar.sha1" and the above files is a newline at the end of the sha
file. This problem seems to have happened before:
https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ded726a

I think we need to write out all the steps needed to upgrade a library; there are
several mistakes that can be made easily.

> Upgrade Jaegar to 1.1.0
> ---
>
> Key: SOLR-14286
> URL: https://issues.apache.org/jira/browse/SOLR-14286
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.5
>
>
> Rohit Singh pointed out to me that we are using thrift 0.12.0 (in the
> JaegarTracer-Configurator module), which has several security issues. We
> should upgrade to Jaegar 1.1.0, which is compatible with the current version we
> are using.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047131#comment-17047131
 ] 

Cao Manh Dat commented on SOLR-14286:
-

Hoping that above fixes solved the problem. The only difference between 
"jeger-thrift-1.1.0.jar.sha1" and above files are a newline at the end of sha 
file. This problem seems happened before 
https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ded726a

I think we need to write out all steps needed to upgrade a library, there are 
several mistakes can be made easily.

> Upgrade Jaegar to 1.1.0
> ---
>
> Key: SOLR-14286
> URL: https://issues.apache.org/jira/browse/SOLR-14286
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.5
>
>
> Rohit Singh pointed out to me that we are using thrift 0.12.0 (in the
> JaegarTracer-Configurator module), which has several security issues. We
> should upgrade to Jaegar 1.1.0, which is compatible with the current version we
> are using.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046583#comment-17046583
 ] 

Cao Manh Dat commented on SOLR-14286:
-

I think so, because I tried running precommit with both gradlew and ant multiple
times.

> Upgrade Jaegar to 1.1.0
> ---
>
> Key: SOLR-14286
> URL: https://issues.apache.org/jira/browse/SOLR-14286
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.5
>
>
> Rohit Singh pointed out to me that we are using thrift 0.12.0 (in the
> JaegarTracer-Configurator module), which has several security issues. We
> should upgrade to Jaegar 1.1.0, which is compatible with the current version we
> are using.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-27 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat resolved SOLR-14286.
-
Resolution: Fixed

> Upgrade Jaegar to 1.1.0
> ---
>
> Key: SOLR-14286
> URL: https://issues.apache.org/jira/browse/SOLR-14286
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.5
>
>
> Rohit Singh pointed out to me that we are using thrift 0.12.0 (in the
> JaegarTracer-Configurator module), which has several security issues. We
> should upgrade to Jaegar 1.1.0, which is compatible with the current version we
> are using.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046457#comment-17046457
 ] 

Cao Manh Dat commented on SOLR-14286:
-

Hi [~janhoy], I'm seeing this problem when trying to run gradle precommit:
{code}
Execution failed for task ':verifyLocks'.
> Found dependencies whose dependents changed:
  -io.opentracing:opentracing-api:0.33.0 (5 constraints: 4c3c8052)
  +io.opentracing:opentracing-api:0.33.0 (5 constraints: 4d3cfe52)
  -io.opentracing:opentracing-util:0.33.0 (3 constraints: f61f583b)
  +io.opentracing:opentracing-util:0.33.0 (3 constraints: f71f843b)
  -org.slf4j:slf4j-api:1.7.24 (18 constraints: 6ef487eb)
  +org.slf4j:slf4j-api:1.7.24 (18 constraints: 74f4c3f7)
  Please run './gradlew --write-locks'.
{code}
But these were not changed by this issue, so should I run {{gradlew
--write-locks}}?

> Upgrade Jaegar to 1.1.0
> ---
>
> Key: SOLR-14286
> URL: https://issues.apache.org/jira/browse/SOLR-14286
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.5
>
>
> Rohit Singh pointed out to me that we are using thrift 0.12.0 (in the
> JaegarTracer-Configurator module), which has several security issues. We
> should upgrade to Jaegar 1.1.0, which is compatible with the current version we
> are using.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046434#comment-17046434
 ] 

Cao Manh Dat commented on SOLR-14286:
-

Thanks [~janhoy], I should move at a slower pace.

> Upgrade Jaegar to 1.1.0
> ---
>
> Key: SOLR-14286
> URL: https://issues.apache.org/jira/browse/SOLR-14286
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.5
>
>
> Rohit Singh pointed out to me that we are using thrift 0.12.0 (in the
> JaegarTracer-Configurator module), which has several security issues. We
> should upgrade to Jaegar 1.1.0, which is compatible with the current version we
> are using.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-27 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat resolved SOLR-14286.
-
Fix Version/s: 8.5
   master (9.0)
   Resolution: Fixed

> Upgrade Jaegar to 1.1.0
> ---
>
> Key: SOLR-14286
> URL: https://issues.apache.org/jira/browse/SOLR-14286
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.5
>
>
> Rohit Singh pointed out to me that we are using Thrift 0.12.0 (in the 
> JaegarTracer-Configurator module), which has several security issues. We 
> should upgrade to Jaegar 1.1.0, which is compatible with the current version 
> we are using. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046305#comment-17046305
 ] 

Cao Manh Dat edited comment on SOLR-14286 at 2/27/20 8:10 AM:
--

Yes [~janhoy], I'm doing the backporting. My bad, it should be in the 8.5.0 CHANGES.txt.


was (Author: caomanhdat):
yes [~janhoy], I'm doing backporting

> Upgrade Jaegar to 1.1.0
> ---
>
> Key: SOLR-14286
> URL: https://issues.apache.org/jira/browse/SOLR-14286
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> Rohit Singh pointed out to me that we are using Thrift 0.12.0 (in the 
> JaegarTracer-Configurator module), which has several security issues. We 
> should upgrade to Jaegar 1.1.0, which is compatible with the current version 
> we are using. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-27 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046305#comment-17046305
 ] 

Cao Manh Dat commented on SOLR-14286:
-

yes [~janhoy], I'm doing backporting

> Upgrade Jaegar to 1.1.0
> ---
>
> Key: SOLR-14286
> URL: https://issues.apache.org/jira/browse/SOLR-14286
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> Rohit Singh pointed out to me that we are using Thrift 0.12.0 (in the 
> JaegarTracer-Configurator module), which has several security issues. We 
> should upgrade to Jaegar 1.1.0, which is compatible with the current version 
> we are using. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-26 Thread Cao Manh Dat (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14286:

Issue Type: Improvement  (was: Bug)

> Upgrade Jaegar to 1.1.0
> ---
>
> Key: SOLR-14286
> URL: https://issues.apache.org/jira/browse/SOLR-14286
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> Rohit Singh pointed out to me that we are using Thrift 0.12.0 (in the 
> JaegarTracer-Configurator module), which has several security issues. We 
> should upgrade to Jaegar 1.1.0, which is compatible with the current version 
> we are using. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14286) Upgrade Jaegar to 1.1.0

2020-02-26 Thread Cao Manh Dat (Jira)
Cao Manh Dat created SOLR-14286:
---

 Summary: Upgrade Jaegar to 1.1.0
 Key: SOLR-14286
 URL: https://issues.apache.org/jira/browse/SOLR-14286
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Cao Manh Dat
Assignee: Cao Manh Dat


Rohit Singh pointed out to me that we are using Thrift 0.12.0 (in the 
JaegarTracer-Configurator module), which has several security issues. We should 
upgrade to Jaegar 1.1.0, which is compatible with the current version we are 
using. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-12859) DocExpirationUpdateProcessorFactory does not work with BasicAuth

2020-01-17 Thread Cao Manh Dat (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-12859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017840#comment-17017840
 ] 

Cao Manh Dat commented on SOLR-12859:
-

After having a chat with [~shalin], we kinda think that Hoss's initial 
approach is more valid than mine because
* it makes a less significant change to the code base.
* even though {{DefaultSolrThreadFactory}} belongs to solr-core, we can't force 
tests not to use it (a rough sketch of the thread-factory point follows below).
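
For context, here is a minimal, self-contained sketch of the thread-factory point 
being discussed. It is not the actual SOLR-12859 patch: the {{SERVER_THREAD}} flag, 
the {{serverThreadFactory}} helper and the {{autoExpireDocs}} thread-name prefix are 
illustrative stand-ins, and in Solr the real mechanism involves internals such as 
{{DefaultSolrThreadFactory}} and the PKI authentication plugin for internode requests.

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a ThreadFactory that marks the threads it creates so that
// code further down the call chain (e.g. an auth check) can tell that a request
// originated from a server-side maintenance thread rather than an external client.
public class TaggedThreadFactorySketch {

  // Stand-in for a "this thread belongs to the server" marker (not a Solr API).
  static final ThreadLocal<Boolean> SERVER_THREAD =
      ThreadLocal.withInitial(() -> Boolean.FALSE);

  static ThreadFactory serverThreadFactory(String prefix) {
    AtomicInteger counter = new AtomicInteger();
    return runnable -> {
      Thread t = new Thread(() -> {
        SERVER_THREAD.set(Boolean.TRUE); // mark the worker thread before it runs any task
        try {
          runnable.run();
        } finally {
          SERVER_THREAD.remove();
        }
      }, prefix + "-" + counter.incrementAndGet());
      t.setDaemon(true);
      return t;
    };
  }

  public static void main(String[] args) throws InterruptedException {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(serverThreadFactory("autoExpireDocs"));

    // Periodic job standing in for the expired-docs deletion; a real auth check
    // would consult the marker instead of just printing it.
    executor.scheduleAtFixedRate(
        () -> System.out.println("delete allowed as server request? " + SERVER_THREAD.get()),
        0, 1, TimeUnit.SECONDS);

    TimeUnit.SECONDS.sleep(2);
    executor.shutdownNow();
  }
}
{code}

The only point the sketch tries to make is that whichever component supplies the 
threads decides whether the marker gets set, which is why it matters where the 
thread factory lives and which code is allowed to use it.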

> DocExpirationUpdateProcessorFactory does not work with BasicAuth
> 
>
> Key: SOLR-12859
> URL: https://issues.apache.org/jira/browse/SOLR-12859
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 7.5
>Reporter: Varun Thacker
>Priority: Major
> Attachments: SOLR-12859.patch
>
>
> I setup a cluster with basic auth and then wanted to use Solr's TTL feature ( 
> DocExpirationUpdateProcessorFactory ) to auto-delete documents.
>  
> Turns out it doesn't work when Basic Auth is enabled. I get the following 
> stacktrace from the logs
> {code:java}
> 2018-10-12 22:06:38.967 ERROR (autoExpireDocs-42-thread-1) [   ] 
> o.a.s.u.p.DocExpirationUpdateProcessorFactory Runtime error in periodic 
> deletion of expired docs: Async exception during distributed update: Error 
> from server at http://192.168.0.8:8983/solr/gettingstarted_shard2_replica_n6: 
> require authentication
> request: 
> http://192.168.0.8:8983/solr/gettingstarted_shard2_replica_n6/update?update.distrib=TOLEADER=http%3A%2F%2F192.168.0.8%3A8983%2Fsolr%2Fgettingstarted_shard1_replica_n2%2F=javabin=2
> org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException:
>  Async exception during distributed update: Error from server at 
> http://192.168.0.8:8983/solr/gettingstarted_shard2_replica_n6: require 
> authentication
> request: 
> http://192.168.0.8:8983/solr/gettingstarted_shard2_replica_n6/update?update.distrib=TOLEADER=http%3A%2F%2F192.168.0.8%3A8983%2Fsolr%2Fgettingstarted_shard1_replica_n2%2F=javabin=2
>     at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:964)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1976)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:182)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 2018-09-18 13:07:55]
>     at 
> org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
>  ~[solr-core-7.5.0.jar:7.5.0 b5bf70b7e32d7ddd9742cc821d471c5fabd4e3df - 
> jimczi - 
