[jira] [Commented] (SOLR-14469) Removed deprecated code in solr/core (master only)

2020-07-09 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155126#comment-17155126
 ] 

David Smiley commented on SOLR-14469:
-

As you pursue this, remember to use the {{master-deprecations}} branch, which 
was set up by Alan; I've already done some significant work there for 
solr-core.

> Removed deprecated code in solr/core (master only)
> --
>
> Key: SOLR-14469
> URL: https://issues.apache.org/jira/browse/SOLR-14469
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I'm currently working on getting all the warnings out of the code, so this is 
> something of a placeholder for a week or two.
> There will be sub-tasks; please create them when you start working on a 
> project.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris opened a new pull request #1662: Harden TestBuildingUpMemoryPressure

2020-07-09 Thread GitBox


atris opened a new pull request #1662:
URL: https://github.com/apache/lucene-solr/pull/1662


   1. Add specific checks for the expected exception message.
   2. Ensure that a non-triggering value is returned when the fake circuit 
breaker should not be tripping.
   
   Also adds temporary logging for debugging, to be removed.
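A minimal sketch of check (1) in plain Java, assuming nothing about Solr's actual circuit-breaker classes (the class, method names, and message text below are all invented for illustration): assert on the specific exception message, not merely that some exception was thrown.

```java
// Hypothetical sketch: verify the *message* of the expected exception,
// not just its type. Names and message text are invented, not Solr's API.
public class MessageCheckSketch {
    // Stand-in for the fake circuit breaker tripping.
    static void tripBreaker() {
        throw new IllegalStateException("Circuit breaker tripped: memory pressure");
    }

    // Runs the action and returns the exception message, or null if none was thrown.
    static String messageFrom(Runnable action) {
        try {
            action.run();
            return null;
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String msg = messageFrom(MessageCheckSketch::tripBreaker);
        // Specific expectation: the message names what tripped, so an
        // unrelated exception cannot silently pass the test.
        if (msg == null || !msg.contains("memory pressure")) {
            throw new AssertionError("unexpected: " + msg);
        }
        System.out.println("ok");
    }
}
```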



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14404) CoreContainer level custom requesthandlers

2020-07-09 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155098#comment-17155098
 ] 

David Smiley commented on SOLR-14404:
-

I think the test is fine with respect to this matter.  A user might very well 
have similar logic in client code; the user shouldn't have to add sleeps or 
inspect ZK or similar.  The user uploads 2.0 and then expects to use 2.0 
immediately afterward.  It should be up to Solr to figure out how to make this 
work.

For example, maybe we need to add a call to 
{{org.apache.zookeeper.ZooKeeper#sync}} before a ZK read like we already do for 
{{AliasManager.update()}} which is called in a number of places to ensure the 
collection alias information is up to date.  Remember that ZK is eventually 
consistent but there are mechanisms like sync to help.
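To make the eventual-consistency hazard concrete, here is a toy model in plain Java (deliberately *not* ZooKeeper's real API): writes land in a pending queue that a plain read may miss until a sync forces pending updates to be applied before reading.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model of an eventually-consistent store. Writes are staged in a
// pending queue; a plain read sees stale state until sync() applies them,
// which is the role ZooKeeper#sync plays before a ZK read.
public class SyncBeforeReadModel {
    private final Map<String, String> visible = new HashMap<>();
    private final Queue<Map.Entry<String, String>> pending = new ArrayDeque<>();

    public void write(String key, String value) {
        pending.add(Map.entry(key, value)); // not yet visible to readers
    }

    public void sync() {
        // Apply all pending updates so a subsequent read is up to date.
        for (Map.Entry<String, String> e; (e = pending.poll()) != null; ) {
            visible.put(e.getKey(), e.getValue());
        }
    }

    public String read(String key) {
        return visible.get(key);
    }
}
```

In the model, a read issued immediately after a write returns the old value; a read after `sync()` returns the new one, which is exactly the gap the test was hitting.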

> CoreContainer level custom requesthandlers
> --
>
> Key: SOLR-14404
> URL: https://issues.apache.org/jira/browse/SOLR-14404
> Project: Solr
>  Issue Type: New Feature
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> caveats:
>  * The class should be annotated with {{org.apache.solr.api.EndPoint}}, 
> which means only V2 APIs are supported
>  * The path should have the prefix {{/api/plugin}}
> add a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin",
>   "class": "full.ClassName", 
>   "path-prefix" : "some-path-prefix"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> add a plugin from a package
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin", 
>   "class": "pkgName:full.ClassName" ,
>   "path-prefix" : "some-path-prefix"  ,  
>   "version": "1.0"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> remove a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "remove": "myplugin"
> }' http://localhost:8983/api/cluster/plugins
> {code}
> The configuration will be stored in the {{clusterprops.json}}
>  as
> {code:java}
> {
> "plugins" : {
> "myplugin" : {"class": "full.ClassName", "path-prefix" : "some-path-prefix" }
> }
> }
> {code}
> example plugin
> {code:java}
> public class MyPlugin {
>   private final CoreContainer coreContainer;
>   public MyPlugin(CoreContainer coreContainer) {
> this.coreContainer = coreContainer;
>   }
>   @EndPoint(path = "/$path-prefix/path1",
> method = METHOD.GET,
> permission = READ)
>   public void call(SolrQueryRequest req, SolrQueryResponse rsp){
> rsp.add("myplugin.version", "2.0");
>   }
> }
> {code}
> This plugin will be accessible on all nodes at 
> {{/api/some-path-prefix/path1}}. It's possible to add more methods at 
> different paths. Ensure that all paths start with {{$path-prefix}} because 
> that is the prefix with which the plugin is registered. So 
> {{/some-path-prefix/path2}} and {{/some-path-prefix/my/deeply/nested/path}} are 
> all valid paths. 
> It's possible that the user chooses to register the plugin with a different 
> name. In that case, use a template variable in paths as follows: 
> {{/cluster/some/other/path}}






[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155084#comment-17155084
 ] 

ASF subversion and git services commented on SOLR-14354:


Commit 5538879bbdecfda48eec6625cf877c49ceb12d39 in lucene-solr's branch 
refs/heads/branch_8x from Cao Manh Dat
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5538879 ]

SOLR-14354: HttpShardHandler send requests in async (#1470)


> HttpShardHandler send requests in async
> ---
>
> Key: SOLR-14354
> URL: https://issues.apache.org/jira/browse/SOLR-14354
> Project: Solr
>  Issue Type: Improvement
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: image-2020-03-23-10-04-08-399.png, 
> image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing the model of how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles the search request submits n requests (n equal 
> to the number of shards) to an executor, so each request corresponds to a 
> thread. After sending its request, that thread basically does nothing but 
> wait for the response from the other side. The thread is swapped out and the 
> CPU handles another thread (a context switch: the CPU saves the context of 
> the current thread and switches to another one). When some data (not all) 
> comes back, the thread is woken to parse that data, then waits again until 
> more data arrives. So there is a lot of context switching in the CPU, which 
> is quite an inefficient use of threads. Basically we want fewer threads, with 
> most of them busy all the time, because threads are not free and neither is 
> context switching. That is the main idea behind everything, like executors.
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers async API like this.
> {code:java}
> httpClient.newRequest("http://domain.com/path")
> // Add request hooks
> .onRequestQueued(request -> { ... })
> .onRequestBegin(request -> { ... })
> // Add response hooks
> .onResponseBegin(response -> { ... })
> .onResponseHeaders(response -> { ... })
> .onResponseContent((response, buffer) -> { ... })
> .send(result -> { ... }); {code}
> Therefore after calling {{send()}} the thread returns immediately without 
> blocking. When the client receives the headers from the other side, it calls 
> the {{onHeaders()}} listeners. When the client receives some {{byte[]}} (not 
> the whole response) it calls the {{onContent(buffer)}} listeners. When 
> everything is finished it calls the {{onComplete}} listeners. One important 
> thing to notice here is that all listeners should finish quickly; if a 
> listener blocks, further data for that request won't be handled until the 
> listener finishes.
> h2. 3. Solution 1: Sending requests async but spinning up one thread per response
>  Jetty HttpClient already provides several listeners, one of which is 
> InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
> // Read the response content
>   }
> } {code}
> In this case, there will be two threads:
>  * one thread trying to read the response content from the InputStream
>  * one thread (a short-lived task) feeding content into the above 
> InputStream whenever some byte[] is available. Note that if this thread is 
> unable to feed data into the InputStream, it will wait.
> By using this, the model of HttpShardHandler can be rewritten into 
> something like this:
> {code:java}
> handler.sendReq(req, is -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from the InputStream
>     }
>   });
> });
> {code}
>  The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although "sending req to shard1" is wide, it won't take a long 
> time, since sending a request is a very quick operation. With this approach, 
> handling threads won't be spun up until the first bytes are sent back. Note 
> that we still have active threads waiting for more data from the InputStream.
> h2. 4. Solution 2: Buffering data and handling it inside Jetty's thread
> Jetty has another listener called BufferingResponseListener. This 

[jira] [Resolved] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness

2020-07-09 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter resolved SOLR-13132.
---
Fix Version/s: 8.7
   master (9.0)
 Assignee: Chris M. Hostetter
   Resolution: Fixed

Woot! committed and backported.

Thanks for sticking with this [~mgibney].

> Improve JSON "terms" facet performance when sorted by relatedness 
> --
>
> Key: SOLR-13132
> URL: https://issues.apache.org/jira/browse/SOLR-13132
> Project: Solr
>  Issue Type: Improvement
>  Components: Facet Module
>Affects Versions: 7.4, master (9.0)
>Reporter: Michael Gibney
>Assignee: Chris M. Hostetter
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: SOLR-13132-benchmarks.tgz, 
> SOLR-13132-with-cache-01.patch, SOLR-13132-with-cache.patch, 
> SOLR-13132.patch, SOLR-13132_testSweep.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate 
> {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either 
> {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain 
> base docSet, and then uses that initial pass as a pre-filter for a 
> second-pass, inverted approach of fetching docSets for each relevant term 
> (i.e., {{count > minCount}}?) and calculating intersection size of those sets 
> with the domain base docSet.
> Over high-cardinality fields, the overhead of per-term docSet creation and 
> set intersection operations increases request latency to the point where 
> relatedness sort may not be usable in practice (for my use case, even after 
> applying the patch for SOLR-13108, for a field with ~220k unique terms per 
> core, QTime for high-cardinality domain docSets were, e.g.: cardinality 
> 1816684=9000ms, cardinality 5032902=18000ms).
> The attached patch brings the above example QTimes down to a manageable 
> ~300ms and ~250ms respectively. The approach calculates uninverted facet 
> counts over domain base, foreground, and background docSets in parallel in a 
> single pass. This allows us to take advantage of the efficiencies built into 
> the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}), and avoids 
> the per-term docSet creation and set intersection overhead.
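For context on the feature being optimized above, a minimal JSON Facet request sorting "terms" buckets by relatedness looks roughly like the following, per the JSON Facet API; the field name and the {{fore}}/{{back}} queries are hypothetical placeholders, not from this issue:

```json
{
  "params": { "fore": "category:electronics", "back": "*:*" },
  "facet": {
    "top_terms": {
      "type": "terms",
      "field": "brand_s",
      "sort": { "r": "desc" },
      "facet": { "r": "relatedness($fore,$back)" }
    }
  }
}
```

Because the sort key `r` must be computed for every candidate term, the per-term docSet work described above dominates latency on high-cardinality fields, which is what the sweeping approach avoids.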






[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155082#comment-17155082
 ] 

ASF subversion and git services commented on SOLR-13132:


Commit c20501a5044e55bb6bd35e926ed803bd77c38df2 in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c20501a ]

SOLR-13132: fix some small package visibility and javadoc glitches that were 
caught on backport by the java8/branch_8x precommit but slipped past the 
java11/master precommit


> Improve JSON "terms" facet performance when sorted by relatedness 
> --
>
> Key: SOLR-13132
> URL: https://issues.apache.org/jira/browse/SOLR-13132
> Project: Solr
>  Issue Type: Improvement
>  Components: Facet Module
>Affects Versions: 7.4, master (9.0)
>Reporter: Michael Gibney
>Priority: Major
> Attachments: SOLR-13132-benchmarks.tgz, 
> SOLR-13132-with-cache-01.patch, SOLR-13132-with-cache.patch, 
> SOLR-13132.patch, SOLR-13132_testSweep.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate 
> {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either 
> {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain 
> base docSet, and then uses that initial pass as a pre-filter for a 
> second-pass, inverted approach of fetching docSets for each relevant term 
> (i.e., {{count > minCount}}?) and calculating intersection size of those sets 
> with the domain base docSet.
> Over high-cardinality fields, the overhead of per-term docSet creation and 
> set intersection operations increases request latency to the point where 
> relatedness sort may not be usable in practice (for my use case, even after 
> applying the patch for SOLR-13108, for a field with ~220k unique terms per 
> core, QTime for high-cardinality domain docSets were, e.g.: cardinality 
> 1816684=9000ms, cardinality 5032902=18000ms).
> The attached patch brings the above example QTimes down to a manageable 
> ~300ms and ~250ms respectively. The approach calculates uninverted facet 
> counts over domain base, foreground, and background docSets in parallel in a 
> single pass. This allows us to take advantage of the efficiencies built into 
> the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}), and avoids 
> the per-term docSet creation and set intersection overhead.






[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155080#comment-17155080
 ] 

ASF subversion and git services commented on SOLR-13132:


Commit 499a4503de770980b8cbcdc4dc15a17fd1f94f74 in lucene-solr's branch 
refs/heads/branch_8x from Michael Gibney
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=499a450 ]

SOLR-13132: JSON Facet perf improvements to support "sweeping" collection of 
"relatedness()"

This adds a lot of "under the covers" improvements to how JSON Faceting 
FacetField processors work, to enable
"sweeping" support when the SlotAcc used for sorting support it (currently just 
"relatedness()")

This is a squash commit of all changes on 
https://github.com/magibney/lucene-solr/tree/SOLR-13132
Up to and including ca7a8e0b39840d00af9022c048346a7d84bf280d.

Co-authored-by: Chris Hostetter 
Co-authored-by: Michael Gibney 

(cherry picked from commit 40e2122b5a5b89f446e51692ef0d72e48c7b71e5 w/some 
small fixes for backporting issues)


> Improve JSON "terms" facet performance when sorted by relatedness 
> --
>
> Key: SOLR-13132
> URL: https://issues.apache.org/jira/browse/SOLR-13132
> Project: Solr
>  Issue Type: Improvement
>  Components: Facet Module
>Affects Versions: 7.4, master (9.0)
>Reporter: Michael Gibney
>Priority: Major
> Attachments: SOLR-13132-benchmarks.tgz, 
> SOLR-13132-with-cache-01.patch, SOLR-13132-with-cache.patch, 
> SOLR-13132.patch, SOLR-13132_testSweep.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate 
> {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either 
> {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain 
> base docSet, and then uses that initial pass as a pre-filter for a 
> second-pass, inverted approach of fetching docSets for each relevant term 
> (i.e., {{count > minCount}}?) and calculating intersection size of those sets 
> with the domain base docSet.
> Over high-cardinality fields, the overhead of per-term docSet creation and 
> set intersection operations increases request latency to the point where 
> relatedness sort may not be usable in practice (for my use case, even after 
> applying the patch for SOLR-13108, for a field with ~220k unique terms per 
> core, QTime for high-cardinality domain docSets were, e.g.: cardinality 
> 1816684=9000ms, cardinality 5032902=18000ms).
> The attached patch brings the above example QTimes down to a manageable 
> ~300ms and ~250ms respectively. The approach calculates uninverted facet 
> counts over domain base, foreground, and background docSets in parallel in a 
> single pass. This allows us to take advantage of the efficiencies built into 
> the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}), and avoids 
> the per-term docSet creation and set intersection overhead.






[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155079#comment-17155079
 ] 

ASF subversion and git services commented on SOLR-13132:


Commit 499a4503de770980b8cbcdc4dc15a17fd1f94f74 in lucene-solr's branch 
refs/heads/branch_8x from Michael Gibney
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=499a450 ]

SOLR-13132: JSON Facet perf improvements to support "sweeping" collection of 
"relatedness()"

This adds a lot of "under the covers" improvements to how JSON Faceting 
FacetField processors work, to enable
"sweeping" support when the SlotAcc used for sorting support it (currently just 
"relatedness()")

This is a squash commit of all changes on 
https://github.com/magibney/lucene-solr/tree/SOLR-13132
Up to and including ca7a8e0b39840d00af9022c048346a7d84bf280d.

Co-authored-by: Chris Hostetter 
Co-authored-by: Michael Gibney 

(cherry picked from commit 40e2122b5a5b89f446e51692ef0d72e48c7b71e5 w/some 
small fixes for backporting issues)


> Improve JSON "terms" facet performance when sorted by relatedness 
> --
>
> Key: SOLR-13132
> URL: https://issues.apache.org/jira/browse/SOLR-13132
> Project: Solr
>  Issue Type: Improvement
>  Components: Facet Module
>Affects Versions: 7.4, master (9.0)
>Reporter: Michael Gibney
>Priority: Major
> Attachments: SOLR-13132-benchmarks.tgz, 
> SOLR-13132-with-cache-01.patch, SOLR-13132-with-cache.patch, 
> SOLR-13132.patch, SOLR-13132_testSweep.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate 
> {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either 
> {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain 
> base docSet, and then uses that initial pass as a pre-filter for a 
> second-pass, inverted approach of fetching docSets for each relevant term 
> (i.e., {{count > minCount}}?) and calculating intersection size of those sets 
> with the domain base docSet.
> Over high-cardinality fields, the overhead of per-term docSet creation and 
> set intersection operations increases request latency to the point where 
> relatedness sort may not be usable in practice (for my use case, even after 
> applying the patch for SOLR-13108, for a field with ~220k unique terms per 
> core, QTime for high-cardinality domain docSets were, e.g.: cardinality 
> 1816684=9000ms, cardinality 5032902=18000ms).
> The attached patch brings the above example QTimes down to a manageable 
> ~300ms and ~250ms respectively. The approach calculates uninverted facet 
> counts over domain base, foreground, and background docSets in parallel in a 
> single pass. This allows us to take advantage of the efficiencies built into 
> the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}), and avoids 
> the per-term docSet creation and set intersection overhead.






[jira] [Commented] (SOLR-14404) CoreContainer level custom requesthandlers

2020-07-09 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155073#comment-17155073
 ] 

Noble Paul commented on SOLR-14404:
---

I have hardened the tests in the PR 
[https://github.com/apache/lucene-solr/pull/1661]

I beasted it for some time with no failures yet.  [~erickerickson], please try 
to beast the branch if possible.

> CoreContainer level custom requesthandlers
> --
>
> Key: SOLR-14404
> URL: https://issues.apache.org/jira/browse/SOLR-14404
> Project: Solr
>  Issue Type: New Feature
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> caveats:
>  * The class should be annotated with  {{org.apache.solr.api.EndPoint}}. 
> Which means only V2 APIs are supported
>  * The path should have prefix {{/api/plugin}}
> add a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin",
>   "class": "full.ClassName", 
>   "path-prefix" : "some-path-prefix"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> add a plugin from a package
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin", 
>   "class": "pkgName:full.ClassName" ,
>   "path-prefix" : "some-path-prefix"  ,  
>   "version": "1.0"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> remove a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "remove": "myplugin"
> }' http://localhost:8983/api/cluster/plugins
> {code}
> The configuration will be stored in the {{clusterprops.json}}
>  as
> {code:java}
> {
> "plugins" : {
> "myplugin" : {"class": "full.ClassName", "path-prefix" : "some-path-prefix" }
> }
> }
> {code}
> example plugin
> {code:java}
> public class MyPlugin {
>   private final CoreContainer coreContainer;
>   public MyPlugin(CoreContainer coreContainer) {
> this.coreContainer = coreContainer;
>   }
>   @EndPoint(path = "/$path-prefix/path1",
> method = METHOD.GET,
> permission = READ)
>   public void call(SolrQueryRequest req, SolrQueryResponse rsp){
> rsp.add("myplugin.version", "2.0");
>   }
> }
> {code}
> This plugin will be accessible on all nodes at 
> {{/api/some-path-prefix/path1}}. It's possible to add more methods at 
> different paths. Ensure that all paths start with {{$path-prefix}} because 
> that is the prefix in which the plugin is registered with. So 
> {{/some-path-prefix/path2}} , {{/some-path-prefix/my/deeply/nested/path}} are 
> all valid paths. 
> It's possible that the user chooses to register the plugin with a different 
> name. In that case , use a template variable as follows in paths 
> {{/cluster/some/other/path}}






[GitHub] [lucene-solr] noblepaul opened a new pull request #1661: SOLR-14404: test fix

2020-07-09 Thread GitBox


noblepaul opened a new pull request #1661:
URL: https://github.com/apache/lucene-solr/pull/1661


   






[jira] [Comment Edited] (SOLR-14404) CoreContainer level custom requesthandlers

2020-07-09 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155038#comment-17155038
 ] 

Noble Paul edited comment on SOLR-14404 at 7/10/20, 2:05 AM:
-

Thanks [~hossman]

Looks like a concurrency issue
 # package version 2.0 is created
 # immediately a plugin is updated to use version 2.0
 # The node does not have that version of the package yet

Before throwing an error with "Invalid package" or "No such package version:" , 
it should try to refresh the package versions from ZK
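The refresh-then-retry idea can be sketched in plain Java; this is a hypothetical stand-in (the class, the `Supplier` playing the role of a ZK read, and the message text are all invented), not Solr's actual package-loading code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Sketch of the proposed fix: on a cache miss, refresh the locally cached
// package versions once from the source of truth (ZK in Solr's case) and
// retry the lookup before failing with "No such package version".
public class RefreshBeforeFail {
    private final Map<String, String> cached = new ConcurrentHashMap<>();
    private final Supplier<Map<String, String>> source; // stand-in for a ZK read

    public RefreshBeforeFail(Supplier<Map<String, String>> source) {
        this.source = source;
    }

    public String lookup(String pkg) {
        String v = cached.get(pkg);
        if (v == null) {
            // Cache miss: the version may have just been created elsewhere.
            cached.putAll(source.get());
            v = cached.get(pkg);
        }
        if (v == null) {
            throw new IllegalArgumentException("No such package version: " + pkg);
        }
        return v;
    }
}
```

With this shape, step 2 of the race above succeeds: the plugin update misses the local cache, triggers one refresh, and then finds version 2.0.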


was (Author: noble.paul):
Thanks [~hossman]

Looks like a concurrency issue


 # package version 2.0 is created
 # immediately a plugin is updated to use version 2.0
 # The node does not have that version of the package yet

> CoreContainer level custom requesthandlers
> --
>
> Key: SOLR-14404
> URL: https://issues.apache.org/jira/browse/SOLR-14404
> Project: Solr
>  Issue Type: New Feature
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> caveats:
>  * The class should be annotated with  {{org.apache.solr.api.EndPoint}}. 
> Which means only V2 APIs are supported
>  * The path should have prefix {{/api/plugin}}
> add a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin",
>   "class": "full.ClassName", 
>   "path-prefix" : "some-path-prefix"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> add a plugin from a package
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin", 
>   "class": "pkgName:full.ClassName" ,
>   "path-prefix" : "some-path-prefix"  ,  
>   "version": "1.0"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> remove a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "remove": "myplugin"
> }' http://localhost:8983/api/cluster/plugins
> {code}
> The configuration will be stored in the {{clusterprops.json}}
>  as
> {code:java}
> {
> "plugins" : {
> "myplugin" : {"class": "full.ClassName", "path-prefix" : "some-path-prefix" }
> }
> }
> {code}
> example plugin
> {code:java}
> public class MyPlugin {
>   private final CoreContainer coreContainer;
>   public MyPlugin(CoreContainer coreContainer) {
> this.coreContainer = coreContainer;
>   }
>   @EndPoint(path = "/$path-prefix/path1",
> method = METHOD.GET,
> permission = READ)
>   public void call(SolrQueryRequest req, SolrQueryResponse rsp){
> rsp.add("myplugin.version", "2.0");
>   }
> }
> {code}
> This plugin will be accessible on all nodes at 
> {{/api/some-path-prefix/path1}}. It's possible to add more methods at 
> different paths. Ensure that all paths start with {{$path-prefix}} because 
> that is the prefix in which the plugin is registered with. So 
> {{/some-path-prefix/path2}} , {{/some-path-prefix/my/deeply/nested/path}} are 
> all valid paths. 
> It's possible that the user chooses to register the plugin with a different 
> name. In that case , use a template variable as follows in paths 
> {{/cluster/some/other/path}}






[jira] [Commented] (SOLR-14404) CoreContainer level custom requesthandlers

2020-07-09 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155038#comment-17155038
 ] 

Noble Paul commented on SOLR-14404:
---

Thanks [~hossman]

Looks like a concurrency issue


 # package version 2.0 is created
 # immediately a plugin is updated to use version 2.0
 # The node does not have that version of the package yet

> CoreContainer level custom requesthandlers
> --
>
> Key: SOLR-14404
> URL: https://issues.apache.org/jira/browse/SOLR-14404
> Project: Solr
>  Issue Type: New Feature
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> caveats:
>  * The class should be annotated with  {{org.apache.solr.api.EndPoint}}. 
> Which means only V2 APIs are supported
>  * The path should have prefix {{/api/plugin}}
> add a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin",
>   "class": "full.ClassName", 
>   "path-prefix" : "some-path-prefix"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> add a plugin from a package
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin", 
>   "class": "pkgName:full.ClassName" ,
>   "path-prefix" : "some-path-prefix"  ,  
>   "version": "1.0"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> remove a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "remove": "myplugin"
> }' http://localhost:8983/api/cluster/plugins
> {code}
> The configuration will be stored in the {{clusterprops.json}}
>  as
> {code:java}
> {
> "plugins" : {
> "myplugin" : {"class": "full.ClassName", "path-prefix" : "some-path-prefix" }
> }
> }
> {code}
> example plugin
> {code:java}
> public class MyPlugin {
>   private final CoreContainer coreContainer;
>   public MyPlugin(CoreContainer coreContainer) {
> this.coreContainer = coreContainer;
>   }
>   @EndPoint(path = "/$path-prefix/path1",
> method = METHOD.GET,
> permission = READ)
>   public void call(SolrQueryRequest req, SolrQueryResponse rsp){
> rsp.add("myplugin.version", "2.0");
>   }
> }
> {code}
> This plugin will be accessible on all nodes at 
> {{/api/some-path-prefix/path1}}. It's possible to add more methods at 
> different paths. Ensure that all paths start with {{$path-prefix}} because 
> that is the prefix with which the plugin is registered. So 
> {{/some-path-prefix/path2}} and {{/some-path-prefix/my/deeply/nested/path}} are 
> all valid paths. 
> It's possible that the user chooses to register the plugin with a different 
> name. In that case, use a template variable as follows in paths 
> {{/cluster/some/other/path}}
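The {{$path-prefix}} substitution described above can be illustrated with a tiny standalone sketch. This is hypothetical illustration only, not the actual Solr resolution code; the `resolve` helper is invented for this example:

```java
public class PathPrefixSketch {
  // Hypothetical: resolve a registered path template like "/$path-prefix/path1"
  // against the prefix chosen at plugin registration time.
  static String resolve(String template, String registeredPrefix) {
    return template.replace("$path-prefix", registeredPrefix);
  }

  public static void main(String[] args) {
    // A plugin registered with path-prefix "some-path-prefix" serves this
    // endpoint at /api/some-path-prefix/path1 on every node.
    System.out.println(resolve("/$path-prefix/path1", "some-path-prefix"));
    // prints /some-path-prefix/path1
  }
}
```

Registering the same class under a different name would simply change what the template resolves to, which is why all endpoint paths must start with {{$path-prefix}}.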



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155033#comment-17155033
 ] 

ASF subversion and git services commented on SOLR-13132:


Commit 40e2122b5a5b89f446e51692ef0d72e48c7b71e5 in lucene-solr's branch 
refs/heads/master from Michael Gibney
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=40e2122 ]

SOLR-13132: JSON Facet perf improvements to support "sweeping" collection of 
"relatedness()"

This adds a lot of "under the covers" improvements to how JSON Faceting 
FacetField processors work, to enable
"sweeping" support when the SlotAcc used for sorting support it (currently just 
"relatedness()")

This is a squash commit of all changes on 
https://github.com/magibney/lucene-solr/tree/SOLR-13132
Up to and including ca7a8e0b39840d00af9022c048346a7d84bf280d.

Co-authored-by: Chris Hostetter 
Co-authored-by: Michael Gibney 


> Improve JSON "terms" facet performance when sorted by relatedness 
> --
>
> Key: SOLR-13132
> URL: https://issues.apache.org/jira/browse/SOLR-13132
> Project: Solr
>  Issue Type: Improvement
>  Components: Facet Module
>Affects Versions: 7.4, master (9.0)
>Reporter: Michael Gibney
>Priority: Major
> Attachments: SOLR-13132-benchmarks.tgz, 
> SOLR-13132-with-cache-01.patch, SOLR-13132-with-cache.patch, 
> SOLR-13132.patch, SOLR-13132_testSweep.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate 
> {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either 
> {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain 
> base docSet, and then uses that initial pass as a pre-filter for a 
> second-pass, inverted approach of fetching docSets for each relevant term 
> (i.e., {{count > minCount}}?) and calculating intersection size of those sets 
> with the domain base docSet.
> Over high-cardinality fields, the overhead of per-term docSet creation and 
> set intersection operations increases request latency to the point where 
> relatedness sort may not be usable in practice (for my use case, even after 
> applying the patch for SOLR-13108, for a field with ~220k unique terms per 
> core, QTime for high-cardinality domain docSets were, e.g.: cardinality 
> 1816684=9000ms, cardinality 5032902=18000ms).
> The attached patch brings the above example QTimes down to a manageable 
> ~300ms and ~250ms respectively. The approach calculates uninverted facet 
> counts over domain base, foreground, and background docSets in parallel in a 
> single pass. This allows us to take advantage of the efficiencies built into 
> the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}, and avoids 
> the per-term docSet creation and set intersection overhead.
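The single-pass idea described above can be sketched in miniature. This is a standalone toy illustration, not the patch code; `sweepCounts`, the doc/term arrays, and the foreground flags are all invented for the example:

```java
import java.util.Arrays;

public class SweepCountSketch {
  // "Sweeping" collection: walk the domain once, incrementing per-term
  // counters for the base domain and the foreground set simultaneously,
  // instead of building a docSet per term and intersecting it with the domain.
  static int[][] sweepCounts(int[][] termsPerDoc, boolean[] inForeground, int numTerms) {
    int[] base = new int[numTerms];
    int[] fg = new int[numTerms];
    for (int doc = 0; doc < termsPerDoc.length; doc++) {
      for (int term : termsPerDoc[doc]) {
        base[term]++;                    // count over the whole domain
        if (inForeground[doc]) fg[term]++; // foreground count, same pass
      }
    }
    return new int[][] { base, fg };
  }

  public static void main(String[] args) {
    int[][] termsPerDoc = { {0, 1}, {1}, {0, 2}, {2}, {0} }; // docId -> term ordinals
    boolean[] inForeground = { true, false, true, false, false };
    int[][] counts = sweepCounts(termsPerDoc, inForeground, 3);
    System.out.println(Arrays.toString(counts[0])); // base:       [3, 2, 2]
    System.out.println(Arrays.toString(counts[1])); // foreground: [2, 1, 0]
  }
}
```

With all counts gathered in one sweep, a relatedness-style score per term can be computed from the parallel arrays without any per-term set intersections.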









[jira] [Commented] (SOLR-14635) improve ThreadDumpHandler to show more info related to locks

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155031#comment-17155031
 ] 

ASF subversion and git services commented on SOLR-14635:


Commit da975fc46cffca332855860b015e8cc52dc5ed66 in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=da975fc ]

SOLR-14635: disable test that has silly concurrency assumptions

(cherry picked from commit 5a422db60e26e8d5488edb0d575cb86a60023a5d)


> improve ThreadDumpHandler to show more info related to locks
> 
>
> Key: SOLR-14635
> URL: https://issues.apache.org/jira/browse/SOLR-14635
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: SOLR-14635.patch
>
>
> Having recently spent some time trying to use ThreadDumpHandler to diagnose a 
> "lock leak", I realized there are quite a few bits of info available from the 
> ThreadMXBean/ThreadInfo data structures that are not included in the response, 
> and I think we should add them:
> * switch from {{findMonitorDeadlockedThreads()}} to 
> {{findDeadlockedThreads()}} to also detect deadlocks from ownable 
> synchronizers (i.e., ReentrantLocks)
> * for each thread:
> ** in addition to outputting the current {{getLockName()}} when a thread is 
> blocked/waiting, return info about the lock owner when available.
> *** there's already dead code checking this and then throwing away the info
> ** return the list of all locks (both monitors and ownable synchronizers) 
> held by each thread
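The calls mentioned above are standard {{java.lang.management}} APIs. A minimal standalone sketch (not ThreadDumpHandler itself) showing what the richer response can draw on:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class LockInfoSketch {
  public static void main(String[] args) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    // findDeadlockedThreads() also covers ownable synchronizers such as
    // ReentrantLock, unlike the older findMonitorDeadlockedThreads().
    long[] deadlocked = mx.findDeadlockedThreads();
    System.out.println("deadlocked: " + (deadlocked == null ? 0 : deadlocked.length));
    // dumpAllThreads(true, true) includes locked monitors and synchronizers
    // per thread, plus the lock each thread is waiting on and its owner.
    for (ThreadInfo ti : mx.dumpAllThreads(true, true)) {
      System.out.println(ti.getThreadName()
          + " lock=" + ti.getLockName()
          + " owner=" + ti.getLockOwnerName()
          + " monitors=" + ti.getLockedMonitors().length
          + " synchronizers=" + ti.getLockedSynchronizers().length);
    }
  }
}
```

Note that {{dumpAllThreads(true, true)}} can throw {{UnsupportedOperationException}} on JVMs without monitor/synchronizer usage support, though mainstream JVMs support it.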






[jira] [Commented] (SOLR-14635) improve ThreadDumpHandler to show more info related to locks

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155027#comment-17155027
 ] 

ASF subversion and git services commented on SOLR-14635:


Commit 5a422db60e26e8d5488edb0d575cb86a60023a5d in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5a422db ]

SOLR-14635: disable test that has silly concurrency assumptions


> improve ThreadDumpHandler to show more info related to locks
> 
>
> Key: SOLR-14635
> URL: https://issues.apache.org/jira/browse/SOLR-14635
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: SOLR-14635.patch
>
>
> Having recently spent some time trying to use ThreadDumpHandler to diagnose a 
> "lock leak", I realized there are quite a few bits of info available from the 
> ThreadMXBean/ThreadInfo data structures that are not included in the response, 
> and I think we should add them:
> * switch from {{findMonitorDeadlockedThreads()}} to 
> {{findDeadlockedThreads()}} to also detect deadlocks from ownable 
> synchronizers (i.e., ReentrantLocks)
> * for each thread:
> ** in addition to outputting the current {{getLockName()}} when a thread is 
> blocked/waiting, return info about the lock owner when available.
> *** there's already dead code checking this and then throwing away the info
> ** return the list of all locks (both monitors and ownable synchronizers) 
> held by each thread






[jira] [Commented] (SOLR-14635) improve ThreadDumpHandler to show more info related to locks

2020-07-09 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155022#comment-17155022
 ] 

Chris M. Hostetter commented on SOLR-14635:
---

Gah ... yeah: I made some really silly concurrency mistakes in the test (I was so 
worried about the threads holding the "locks" until after the asserts, I forgot 
to consider that the threads might not even _start_ the "locked" blocks until 
after the assert!)

The handler improvements should still be good – it's the tests that need to be 
fixed, so I'll disable them for now. (In general, I'll probably have to scale back 
the test to only confirm that the "ownership" is listed properly ... other than 
polling, I'm not sure it's possible to confirm a thread is BLOCKED, either on a 
monitor or an ownable synchronizer.)
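The polling approach mentioned above can be shown with a tiny standalone example (hypothetical test scaffolding, not the actual Solr test): the only portable way to confirm contention is to spin until the JVM reports the contending thread as BLOCKED.

```java
public class BlockedPollSketch {
  public static void main(String[] args) {
    final Object monitor = new Object();
    synchronized (monitor) {
      // This thread will contend for the monitor held by main.
      Thread t = new Thread(() -> { synchronized (monitor) { } });
      t.start();
      // There is no notification hook for "thread became BLOCKED",
      // hence the polling.
      while (t.getState() != Thread.State.BLOCKED) {
        Thread.onSpinWait();
      }
      System.out.println(t.getState()); // prints BLOCKED
    }
  }
}
```

Since main holds the monitor for the whole loop, the spawned thread is guaranteed to reach BLOCKED eventually; the assert-before-the-lock-is-taken race described in the comment is exactly what this polling avoids.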

> improve ThreadDumpHandler to show more info related to locks
> 
>
> Key: SOLR-14635
> URL: https://issues.apache.org/jira/browse/SOLR-14635
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: SOLR-14635.patch
>
>
> Having recently spent some time trying to use ThreadDumpHandler to diagnose a 
> "lock leak", I realized there are quite a few bits of info available from the 
> ThreadMXBean/ThreadInfo data structures that are not included in the response, 
> and I think we should add them:
> * switch from {{findMonitorDeadlockedThreads()}} to 
> {{findDeadlockedThreads()}} to also detect deadlocks from ownable 
> synchronizers (i.e., ReentrantLocks)
> * for each thread:
> ** in addition to outputting the current {{getLockName()}} when a thread is 
> blocked/waiting, return info about the lock owner when available.
> *** there's already dead code checking this and then throwing away the info
> ** return the list of all locks (both monitors and ownable synchronizers) 
> held by each thread






[jira] [Reopened] (SOLR-14635) improve ThreadDumpHandler to show more info related to locks

2020-07-09 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter reopened SOLR-14635:
---

Jenkins found a test failure ... something I must have overlooked in the 
synchronization setting up the monitor locks ... need to dig

> improve ThreadDumpHandler to show more info related to locks
> 
>
> Key: SOLR-14635
> URL: https://issues.apache.org/jira/browse/SOLR-14635
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: SOLR-14635.patch
>
>
> Having recently spent some time trying to use ThreadDumpHandler to diagnose a 
> "lock leak", I realized there are quite a few bits of info available from the 
> ThreadMXBean/ThreadInfo data structures that are not included in the response, 
> and I think we should add them:
> * switch from {{findMonitorDeadlockedThreads()}} to 
> {{findDeadlockedThreads()}} to also detect deadlocks from ownable 
> synchronizers (i.e., ReentrantLocks)
> * for each thread:
> ** in addition to outputting the current {{getLockName()}} when a thread is 
> blocked/waiting, return info about the lock owner when available.
> *** there's already dead code checking this and then throwing away the info
> ** return the list of all locks (both monitors and ownable synchronizers) 
> held by each thread






[jira] [Updated] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-07-09 Thread Mark Robert Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Robert Miller updated SOLR-14636:
--
Attachment: solr-core-serial-run.gif
Status: Open  (was: Open)

!solr-core-serial-run.gif!

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
> Attachments: solr-core-serial-run.gif
>
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *tests***:
>  * *core*: passing with ignores (not solid*)
>  * *solrj*: tbd
>  * *test-framework*: tbd
>  * *contrib/analysis-extras*: tbd
>  * *contrib/analytics*: tbd
>  * *contrib/clustering*: tbd
>  * *contrib/dataimporthandler*: tbd
>  * *contrib/dataimporthandler-extras*: tbd
>  * *contrib/extraction*: tbd
>  * *contrib/jaegertracer-configurator*: tbd
>  * *contrib/langid*: tbd
>  * *contrib/prometheus-exporter*: tbd
>  * *contrib/velocity*: tbd
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-07-09 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154967#comment-17154967
 ] 

Mark Robert Miller commented on SOLR-14636:
---

I used up my focus hours to get here; it's a friggin' Rubik's cube, so a little 
slower pace for a bit.

This branch is the result of a process that came out of the culmination of all 
of my work on Lucene, Solr, and SolrCloud over the past 15 years.

The mistake is to think the current state of affairs can be addressed issue by 
issue. It's not a stupid mistake, it's a super common one. But it's a 
die-on-the-hamster-wheel mistake.

The current state of affairs can be addressed though.

I like to call it "speed is the light". There is an assumption that the 
SolrCloud tests in particular are slow by nature. You have ZK, god ... HDFS, 
Jetty, RRDB, 5 kitchen sinks, 3 more in the guest house, and like a bajillion 
3rd party libs. The assumption is wrong though. Each test has the potential to 
fly.

So make the tests fly. And it's a laborious process, because the code base is 
old and sprawling. But I stopped caring if I could find my way out of these 
excursions long ago, so I make the tests fly one by one. And the system falls 
apart. Because the system is built to survive a much more forgiving world and 
it is full of gremlins and bugs and really hideous stuff to look at. It thrives 
in this world where it can cause chaos and behave like a little black box of 
alchemy. I squeeze with my tools right down on those tests, and I fix the 
problems that start to so easily emerge. And the system starts to work. I 
prefer it about 10-1000x myself.

I'll have to spend some time doing more hardening, re-enabling tests, and then I 
have to add some new tests that start by adding one replica, one shard, one 
document. And then moving up, little by little. Finds great sh$$#.

 

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *tests***:
>  * *core*: passing with ignores (not solid*)
>  * *solrj*: tbd
>  * *test-framework*: tbd
>  * *contrib/analysis-extras*: tbd
>  * *contrib/analytics*: tbd
>  * *contrib/clustering*: tbd
>  * *contrib/dataimporthandler*: tbd
>  * *contrib/dataimporthandler-extras*: tbd
>  * *contrib/extraction*: tbd
>  * *contrib/jaegertracer-configurator*: tbd
>  * *contrib/langid*: tbd
>  * *contrib/prometheus-exporter*: tbd
>  * *contrib/velocity*: tbd
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[jira] [Updated] (SOLR-14635) improve ThreadDumpHandler to show more info related to locks

2020-07-09 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-14635:
--
Fix Version/s: 8.7
   master (9.0)
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> improve ThreadDumpHandler to show more info related to locks
> 
>
> Key: SOLR-14635
> URL: https://issues.apache.org/jira/browse/SOLR-14635
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Fix For: master (9.0), 8.7
>
> Attachments: SOLR-14635.patch
>
>
> Having recently spent some time trying to use ThreadDumpHandler to diagnose a 
> "lock leak", I realized there are quite a few bits of info available from the 
> ThreadMXBean/ThreadInfo data structures that are not included in the response, 
> and I think we should add them:
> * switch from {{findMonitorDeadlockedThreads()}} to 
> {{findDeadlockedThreads()}} to also detect deadlocks from ownable 
> synchronizers (i.e., ReentrantLocks)
> * for each thread:
> ** in addition to outputting the current {{getLockName()}} when a thread is 
> blocked/waiting, return info about the lock owner when available.
> *** there's already dead code checking this and then throwing away the info
> ** return the list of all locks (both monitors and ownable synchronizers) 
> held by each thread






[jira] [Commented] (SOLR-14635) improve ThreadDumpHandler to show more info related to locks

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154957#comment-17154957
 ] 

ASF subversion and git services commented on SOLR-14635:


Commit a2f49eec89180357c07154e6979dff9431369124 in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a2f49ee ]

SOLR-14635: ThreadDumpHandler has been enhanced to show lock ownership

(cherry picked from commit 5c6314a970f9a6a07aee5a14851f3b0f9fbe02fb)


> improve ThreadDumpHandler to show more info related to locks
> 
>
> Key: SOLR-14635
> URL: https://issues.apache.org/jira/browse/SOLR-14635
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14635.patch
>
>
> Having recently spent some time trying to use ThreadDumpHandler to diagnose a 
> "lock leak", I realized there are quite a few bits of info available from the 
> ThreadMXBean/ThreadInfo data structures that are not included in the response, 
> and I think we should add them:
> * switch from {{findMonitorDeadlockedThreads()}} to 
> {{findDeadlockedThreads()}} to also detect deadlocks from ownable 
> synchronizers (i.e., ReentrantLocks)
> * for each thread:
> ** in addition to outputting the current {{getLockName()}} when a thread is 
> blocked/waiting, return info about the lock owner when available.
> *** there's already dead code checking this and then throwing away the info
> ** return the list of all locks (both monitors and ownable synchronizers) 
> held by each thread






[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-07-09 Thread Ilan Ginzburg (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154952#comment-17154952
 ] 

Ilan Ginzburg commented on SOLR-14636:
--

Is there documentation (even if very high level) describing the 
changes/intention that can be used as a reading guide to approach this new 
branch? Thanks.

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *tests***:
>  * *core*: passing with ignores (not solid*)
>  * *solrj*: tbd
>  * *test-framework*: tbd
>  * *contrib/analysis-extras*: tbd
>  * *contrib/analytics*: tbd
>  * *contrib/clustering*: tbd
>  * *contrib/dataimporthandler*: tbd
>  * *contrib/dataimporthandler-extras*: tbd
>  * *contrib/extraction*: tbd
>  * *contrib/jaegertracer-configurator*: tbd
>  * *contrib/langid*: tbd
>  * *contrib/prometheus-exporter*: tbd
>  * *contrib/velocity*: tbd
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[jira] [Comment Edited] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-07-09 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154939#comment-17154939
 ] 

Mark Robert Miller edited comment on SOLR-14636 at 7/9/20, 9:06 PM:


I've been meeting with Noble and Ishan weekly for some time now to discuss 
various issues, and while I had been feeling rather burned out on this effort, 
one thing led to another and I caught a little wind.

The current *early* state is here: 
[https://github.com/apache/lucene-solr/tree/reference_impl]


was (Author: markrmiller):
I've been meeting with Noble and Ishan weekly for some time now to discuss 
various issues, and while I had been feeling rather burned out on this effort, 
one thing led to another and I caught a little wind.

The current *early* state us here: 
[https://github.com/apache/lucene-solr/tree/reference_impl]

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *tests***:
>  * *core*: passing with ignores (not solid*)
>  * *solrj*: tbd
>  * *test-framework*: tbd
>  * *contrib/analysis-extras*: tbd
>  * *contrib/analytics*: tbd
>  * *contrib/clustering*: tbd
>  * *contrib/dataimporthandler*: tbd
>  * *contrib/dataimporthandler-extras*: tbd
>  * *contrib/extraction*: tbd
>  * *contrib/jaegertracer-configurator*: tbd
>  * *contrib/langid*: tbd
>  * *contrib/prometheus-exporter*: tbd
>  * *contrib/velocity*: tbd
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[jira] [Updated] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-07-09 Thread Mark Robert Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Robert Miller updated SOLR-14636:
--
Description: 
SolrCloud powers critical infrastructure and needs the ability to run quickly 
with stability. This reference implementation will allow for this.

*location*: [https://github.com/apache/lucene-solr/tree/reference_impl]

*status*: alpha

*tests***:
 * *core*: passing with ignores (not solid*)
 * *solrj*: tbd
 * *test-framework*: tbd
 * *contrib/analysis-extras*: tbd
 * *contrib/analytics*: tbd
 * *contrib/clustering*: tbd
 * *contrib/dataimporthandler*: tbd
 * *contrib/dataimporthandler-extras*: tbd
 * *contrib/extraction*: tbd
 * *contrib/jaegertracer-configurator*: tbd
 * *contrib/langid*: tbd
 * *contrib/prometheus-exporter*: tbd
 * *contrib/velocity*: tbd

_* Running tests quickly and efficiently with strict policing will more 
frequently find bugs and requires a period of hardening._
 _** Non Nightly currently, Nightly comes last._

  was:
SolrCloud powers critical infrastructure and needs the ability to run quickly 
with stability. This reference implementation will allow for this.

 

*status*: alpha

*tests***:
 * *core*: passing with ignores (not solid*)
 * *solrj*: tbd
 * *test-framework*: tbd
 * *contrib/analysis-extras*: tbd
 * *contrib/analytics*: tbd
 * *contrib/clustering*: tbd
 * *contrib/dataimporthandler*: tbd
 * *contrib/dataimporthandler-extras*: tbd
 * *contrib/extraction*: tbd
 * *contrib/jaegertracer-configurator*: tbd
 * *contrib/langid*: tbd
 * *contrib/prometheus-exporter*: tbd
 * *contrib/velocity*: tbd

_* Running tests quickly and efficiently with strict policing will more 
frequently find bugs and requires a period of hardening._
_** Non Nightly currently, Nightly comes last._


> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
> *location*: [https://github.com/apache/lucene-solr/tree/reference_impl]
> *status*: alpha
> *tests***:
>  * *core*: passing with ignores (not solid*)
>  * *solrj*: tbd
>  * *test-framework*: tbd
>  * *contrib/analysis-extras*: tbd
>  * *contrib/analytics*: tbd
>  * *contrib/clustering*: tbd
>  * *contrib/dataimporthandler*: tbd
>  * *contrib/dataimporthandler-extras*: tbd
>  * *contrib/extraction*: tbd
>  * *contrib/jaegertracer-configurator*: tbd
>  * *contrib/langid*: tbd
>  * *contrib/prometheus-exporter*: tbd
>  * *contrib/velocity*: tbd
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
>  _** Non Nightly currently, Nightly comes last._






[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-07-09 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154939#comment-17154939
 ] 

Mark Robert Miller commented on SOLR-14636:
---

I've been meeting with Noble and Ishan weekly for some time now to discuss 
various issues, and while I had been feeling rather burned out on this effort, 
one thing led to another and I caught a little wind.

The current *early* state is here: 
[https://github.com/apache/lucene-solr/tree/reference_impl]

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
>  
> *status*: alpha
> *tests***:
>  * *core*: passing with ignores (not solid*)
>  * *solrj*: tbd
>  * *test-framework*: tbd
>  * *contrib/analysis-extras*: tbd
>  * *contrib/analytics*: tbd
>  * *contrib/clustering*: tbd
>  * *contrib/dataimporthandler*: tbd
>  * *contrib/dataimporthandler-extras*: tbd
>  * *contrib/extraction*: tbd
>  * *contrib/jaegertracer-configurator*: tbd
>  * *contrib/langid*: tbd
>  * *contrib/prometheus-exporter*: tbd
>  * *contrib/velocity*: tbd
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
> _** Non Nightly currently, Nightly comes last._






[jira] [Commented] (SOLR-14404) CoreContainer level custom requesthandlers

2020-07-09 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154917#comment-17154917
 ] 

Erick Erickson commented on SOLR-14404:
---

I can also beast it; I'll start off a run now to see if I can repro. If so, I'd 
be happy to try some patches.

> CoreContainer level custom requesthandlers
> --
>
> Key: SOLR-14404
> URL: https://issues.apache.org/jira/browse/SOLR-14404
> Project: Solr
>  Issue Type: New Feature
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> caveats:
>  * The class should be annotated with {{org.apache.solr.api.EndPoint}}, 
> which means only V2 APIs are supported
>  * The path should have prefix {{/api/plugin}}
> add a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin",
>   "class": "full.ClassName", 
>   "path-prefix" : "some-path-prefix"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> add a plugin from a package
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin", 
>   "class": "pkgName:full.ClassName" ,
>   "path-prefix" : "some-path-prefix"  ,  
>   "version": "1.0"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> remove a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "remove": "myplugin"
> }' http://localhost:8983/api/cluster/plugins
> {code}
> The configuration will be stored in the {{clusterprops.json}}
>  as
> {code:java}
> {
> "plugins" : {
> "myplugin" : {"class": "full.ClassName", "path-prefix" : "some-path-prefix" }
> }
> }
> {code}
> example plugin
> {code:java}
> public class MyPlugin {
>   private final CoreContainer coreContainer;
>   public MyPlugin(CoreContainer coreContainer) {
> this.coreContainer = coreContainer;
>   }
>   @EndPoint(path = "/$path-prefix/path1",
> method = METHOD.GET,
> permission = READ)
>   public void call(SolrQueryRequest req, SolrQueryResponse rsp){
> rsp.add("myplugin.version", "2.0");
>   }
> }
> {code}
> This plugin will be accessible on all nodes at 
> {{/api/some-path-prefix/path1}}. It's possible to add more methods at 
> different paths. Ensure that all paths start with {{$path-prefix}} because 
> that is the prefix under which the plugin is registered. So 
> {{/some-path-prefix/path2}} and {{/some-path-prefix/my/deeply/nested/path}} are 
> all valid paths. 
> It's possible that the user chooses to register the plugin with a different 
> name. In that case, use a template variable as follows in paths: 
> {{/cluster/some/other/path}}






[jira] [Commented] (SOLR-14608) Faster sorting for the /export handler

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154916#comment-17154916
 ] 

ASF subversion and git services commented on SOLR-14608:


Commit e0fc38f1b1093cd761da03e561df6395d3a79fc1 in lucene-solr's branch 
refs/heads/jira/SOLR-14608-export from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e0fc38f ]

SOLR-14608: Add method for creating the MergeIterator


> Faster sorting for the /export handler
> --
>
> Key: SOLR-14608
> URL: https://issues.apache.org/jira/browse/SOLR-14608
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Andrzej Bialecki
>Priority: Major
>
> The largest cost of the export handler is the sorting. This ticket will 
> implement an improved algorithm for sorting that should greatly increase 
> overall throughput for the export handler.
> *The current algorithm is as follows:*
> Collect a bitset of matching docs. Iterate over that bitset, materialize 
> the top level ordinals for the sort fields in each document, and add them to 
> a priority queue of size 30,000. Then export the top 30,000 docs, turn off their 
> bits in the bit set, and iterate again until all docs are sorted and sent. 
> There are two performance bottlenecks with this approach:
> 1) Materializing the top level ordinals adds a huge amount of overhead to the 
> sorting process.
> 2) The size of priority queue, 30,000, adds significant overhead to sorting 
> operations.
> *The new algorithm:*
> Has a top level *merge sort iterator* that wraps segment level iterators that 
> perform segment level priority queue sorts.
> *Segment level:*
> The segment level docset will be iterated and the segment level ordinals for 
> the sort fields will be materialized and added to a segment level priority 
> queue. As the segment level iterator pops docs from the priority queue the 
> top level ordinals for the sort fields are materialized. Because the top 
> level ordinals are materialized AFTER the sort, they only need to be looked 
> up when the segment level ordinal changes. This takes advantage of the sort 
> to limit the lookups into the top level ordinal structures. This also 
> eliminates redundant lookups of top level ordinals that occur during the 
> multiple passes over the matching docset.
> The segment level priority queues can be kept smaller than 30,000 to improve 
> performance of the sorting operations because the overall batch size will 
> still be 30,000 or greater when all the segment priority queue sizes are 
> added up. This allows for batch sizes much larger than 30,000 without using a 
> single large priority queue. The increased batch size means fewer iterations 
> over the matching docset and the decreased priority queue size means faster 
> sorting operations.
> *Top level:*
> A top level iterator does a merge sort over the segment level iterators by 
> comparing the top level ordinals materialized when the segment level docs are 
> popped from the segment level priority queues. This requires no extra memory 
> and will be very performant.
>  
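The two-level design described above (small per-segment priority-queue sorts wrapped by a top level merge iterator) can be sketched roughly as follows. This is an illustrative stand-in, not the actual export handler code; the class name {{ExportMergeSketch}} and the use of plain int ordinals are assumptions made for the example:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

public class ExportMergeSketch {

    // Each segment sorts its own ordinals (a stand-in for the per-segment
    // priority queue sort), then a heap of (segment, position) cursors
    // merges the sorted segment streams, as the top level iterator would.
    static int[] mergeSortedSegments(int[][] segments) {
        PriorityQueue<int[]> cursors = new PriorityQueue<>(
                Comparator.comparingInt(c -> segments[c[0]][c[1]]));
        int total = 0;
        for (int i = 0; i < segments.length; i++) {
            Arrays.sort(segments[i]); // per-segment sort
            total += segments[i].length;
            if (segments[i].length > 0) cursors.add(new int[] {i, 0});
        }
        int[] merged = new int[total];
        int n = 0;
        while (!cursors.isEmpty()) {
            int[] top = cursors.poll();                  // smallest current value
            merged[n++] = segments[top[0]][top[1]];
            if (top[1] + 1 < segments[top[0]].length) {  // advance that segment's cursor
                cursors.add(new int[] {top[0], top[1] + 1});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        int[][] segments = {{5, 1, 9}, {4, 2}, {8, 3}};
        System.out.println(Arrays.toString(mergeSortedSegments(segments)));
        // prints [1, 2, 3, 4, 5, 8, 9]
    }
}
```

Keeping each per-segment queue small while still batching a large combined total is what reduces both the sort cost and the number of top level ordinal lookups.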






[jira] [Commented] (SOLR-14588) Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker

2020-07-09 Thread Atri Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154897#comment-17154897
 ] 

Atri Sharma commented on SOLR-14588:


I beasted the test under high CPU stress but still could not reproduce it. 
I will push a change to add logging tomorrow.

> Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
> --
>
> Key: SOLR-14588
> URL: https://issues.apache.org/jira/browse/SOLR-14588
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Atri Sharma
>Assignee: Atri Sharma
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h
>
> This Jira tracks addition of circuit breakers in the search path and 
> implements JVM based circuit breaker which rejects incoming search requests 
> if the JVM heap usage exceeds a defined percentage.
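The rejection rule described here can be sketched with the standard JMX memory bean. This is a rough illustration under assumed names, not Solr's actual CircuitBreaker classes or configuration:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical sketch: trip when used heap exceeds a configured fraction
// of the maximum heap.
public class HeapCircuitBreakerSketch {
    private final double threshold; // e.g. 0.95 = trip above 95% of max heap
    private final MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

    HeapCircuitBreakerSketch(double threshold) { this.threshold = threshold; }

    boolean isTripped() {
        MemoryUsage heap = memory.getHeapMemoryUsage();
        long used = heap.getUsed();
        long max = heap.getMax();
        return max > 0 && (double) used / max > threshold;
    }

    public static void main(String[] args) {
        HeapCircuitBreakerSketch cb = new HeapCircuitBreakerSketch(0.95);
        // A request handler would check this before doing any work:
        if (cb.isTripped()) {
            System.out.println("rejecting search request, heap above threshold");
        } else {
            System.out.println("accepting request");
        }
    }
}
```

A search handler would call {{isTripped()}} before executing a query and return an error response when it reports true, which is the shape of behavior this Jira describes.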






[jira] [Created] (LUCENE-9424) Have a warning comment for AttributeSource.captureState

2020-07-09 Thread Haoyu Zhai (Jira)
Haoyu Zhai created LUCENE-9424:
--

 Summary: Have a warning comment for AttributeSource.captureState
 Key: LUCENE-9424
 URL: https://issues.apache.org/jira/browse/LUCENE-9424
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/javadocs
Reporter: Haoyu Zhai


{{AttributeSource.captureState}} is a powerful method that can be used to store 
and (later on) restore the current state, but it comes at the cost of copying 
all attributes in the source, which can add up to a big cost if it is called 
multiple times.

We could probably add a warning to indicate this cost, as this method is 
encapsulated quite well and sometimes people who use it won't be aware of the 
cost.
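To make the cost concrete, here is a toy model (not Lucene's actual {{AttributeSource}}; the class, the string map of attributes, and the copy accounting are all invented for illustration) showing how per-token capture multiplies the copy work:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: capturing state clones every attribute, so capturing once
// per token multiplies the copy cost by the token count.
public class CaptureStateCost {
    static int copies = 0;

    static Map<String, String> captureState(Map<String, String> attrs) {
        copies += attrs.size();       // each attribute gets cloned
        return new HashMap<>(attrs);  // stand-in for the attribute copy
    }

    public static void main(String[] args) {
        Map<String, String> attrs = Map.of("term", "foo", "offset", "0", "type", "word");
        for (int token = 0; token < 1000; token++) {
            captureState(attrs);      // per-token capture, as a naive caller might do
        }
        System.out.println(copies);   // 3 attributes * 1000 captures = 3000 copies
    }
}
```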






[GitHub] [lucene-solr] zhaih closed pull request #1613: LUCENE-8574 Cache ExpressionFunctionValues

2020-07-09 Thread GitBox


zhaih closed pull request #1613:
URL: https://github.com/apache/lucene-solr/pull/1613


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Commented] (LUCENE-8574) ExpressionFunctionValues should cache per-hit value

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154886#comment-17154886
 ] 

ASF subversion and git services commented on LUCENE-8574:
-

Commit 1791d7e44278e9d032f99822e82b25109f8952c0 in lucene-solr's branch 
refs/heads/branch_8x from Michael McCandless
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1791d7e ]

LUCENE-8574: the DoubleValues for dependent bindings for an expression are now 
cached and reused and no longer inefficiently recomputed per hit


> ExpressionFunctionValues should cache per-hit value
> ---
>
> Key: LUCENE-8574
> URL: https://issues.apache.org/jira/browse/LUCENE-8574
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 7.5, 8.0
>Reporter: Michael McCandless
>Assignee: Robert Muir
>Priority: Major
> Attachments: LUCENE-8574.patch, unit_test.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The original version of {{ExpressionFunctionValues}} had a simple per-hit 
> cache, so that nested expressions that reference the same common variable 
> would compute the value for that variable the first time it was referenced 
> and then use that cached value for all subsequent invocations, within one 
> hit.  I think it was accidentally removed in LUCENE-7609?
> This is quite important if you have non-trivial expressions that reference 
> the same variable multiple times.
> E.g. if I have these expressions:
> {noformat}
> x = c + d
> c = b + 2 
> d = b * 2{noformat}
> Then evaluating x should only cause b's value to be computed once (for a 
> given hit), but today it's computed twice.  The problem is combinatoric if b 
> then references another variable multiple times, etc.
> I think to fix this we just need to restore the per-hit cache?
>  
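The per-hit cache idea can be sketched with a toy example (illustrative names only, not Lucene's actual {{Expression}}/{{DoubleValues}} classes): memoize the shared dependency {{b}} so that evaluating {{x = c + d}} touches it once per document.

```java
import java.util.HashMap;
import java.util.Map;

public class PerHitCacheSketch {
    static int bEvaluations = 0;

    // Pretend "b" is an expensive per-document value lookup.
    static double b(int doc) { bEvaluations++; return doc * 1.5; }

    static double evaluateX(int doc) {
        Map<Integer, Double> perHitCache = new HashMap<>(); // fresh cache per hit
        // x = c + d, where c = b + 2 and d = b * 2: both reference b,
        // but the cache ensures b is computed only once for this doc.
        double c = perHitCache.computeIfAbsent(doc, PerHitCacheSketch::b) + 2;
        double d = perHitCache.computeIfAbsent(doc, PerHitCacheSketch::b) * 2;
        return c + d;
    }

    public static void main(String[] args) {
        double x = evaluateX(7); // b(7) = 10.5, so c = 12.5 and d = 21.0
        System.out.println(x + " computed with " + bEvaluations + " evaluation(s) of b");
        // prints 33.5 computed with 1 evaluation(s) of b
    }
}
```

Without the cache, each reference to {{b}} triggers a recomputation, and the waste compounds when {{b}} itself references other shared variables.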






[jira] [Resolved] (LUCENE-8574) ExpressionFunctionValues should cache per-hit value

2020-07-09 Thread Michael McCandless (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-8574.

Resolution: Fixed

Thanks [~zhai7631]!

> ExpressionFunctionValues should cache per-hit value
> ---
>
> Key: LUCENE-8574
> URL: https://issues.apache.org/jira/browse/LUCENE-8574
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 7.5, 8.0
>Reporter: Michael McCandless
>Assignee: Robert Muir
>Priority: Major
> Attachments: LUCENE-8574.patch, unit_test.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The original version of {{ExpressionFunctionValues}} had a simple per-hit 
> cache, so that nested expressions that reference the same common variable 
> would compute the value for that variable the first time it was referenced 
> and then use that cached value for all subsequent invocations, within one 
> hit.  I think it was accidentally removed in LUCENE-7609?
> This is quite important if you have non-trivial expressions that reference 
> the same variable multiple times.
> E.g. if I have these expressions:
> {noformat}
> x = c + d
> c = b + 2 
> d = b * 2{noformat}
> Then evaluating x should only cause b's value to be computed once (for a 
> given hit), but today it's computed twice.  The problem is combinatoric if b 
> then references another variable multiple times, etc.
> I think to fix this we just need to restore the per-hit cache?
>  






[jira] [Commented] (SOLR-14588) Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker

2020-07-09 Thread Atri Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154882#comment-17154882
 ] 

Atri Sharma commented on SOLR-14588:


Thanks for highlighting -- I am taking a look. Can you please point to where I 
can monitor future Jenkins builds?

> Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
> --
>
> Key: SOLR-14588
> URL: https://issues.apache.org/jira/browse/SOLR-14588
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Atri Sharma
>Assignee: Atri Sharma
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h
>
> This Jira tracks addition of circuit breakers in the search path and 
> implements JVM based circuit breaker which rejects incoming search requests 
> if the JVM heap usage exceeds a defined percentage.






[jira] [Commented] (LUCENE-8574) ExpressionFunctionValues should cache per-hit value

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154878#comment-17154878
 ] 

ASF subversion and git services commented on LUCENE-8574:
-

Commit 60e0d8ac6e512008d68770c358af8f057b03566d in lucene-solr's branch 
refs/heads/master from Michael McCandless
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=60e0d8a ]

LUCENE-8574: the DoubleValues for dependent bindings for an expression are now 
cached and reused and no longer inefficiently recomputed per hit


> ExpressionFunctionValues should cache per-hit value
> ---
>
> Key: LUCENE-8574
> URL: https://issues.apache.org/jira/browse/LUCENE-8574
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 7.5, 8.0
>Reporter: Michael McCandless
>Assignee: Robert Muir
>Priority: Major
> Attachments: LUCENE-8574.patch, unit_test.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The original version of {{ExpressionFunctionValues}} had a simple per-hit 
> cache, so that nested expressions that reference the same common variable 
> would compute the value for that variable the first time it was referenced 
> and then use that cached value for all subsequent invocations, within one 
> hit.  I think it was accidentally removed in LUCENE-7609?
> This is quite important if you have non-trivial expressions that reference 
> the same variable multiple times.
> E.g. if I have these expressions:
> {noformat}
> x = c + d
> c = b + 2 
> d = b * 2{noformat}
> Then evaluating x should only cause b's value to be computed once (for a 
> given hit), but today it's computed twice.  The problem is combinatoric if b 
> then references another variable multiple times, etc.
> I think to fix this we just need to restore the per-hit cache?
>  






[jira] [Commented] (SOLR-14635) improve ThreadDumpHandler to show more info related to locks

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154872#comment-17154872
 ] 

ASF subversion and git services commented on SOLR-14635:


Commit 5c6314a970f9a6a07aee5a14851f3b0f9fbe02fb in lucene-solr's branch 
refs/heads/master from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5c6314a ]

SOLR-14635: ThreadDumpHandler has been enhanced to show lock ownership


> improve ThreadDumpHandler to show more info related to locks
> 
>
> Key: SOLR-14635
> URL: https://issues.apache.org/jira/browse/SOLR-14635
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-14635.patch
>
>
> Having recently spent some time trying to use ThreadDumpHandler to diagnose a 
> "lock leak" I realized there are quite a few bits of info available from the 
> ThreadMXBean/ThreadInfo data structures that are not included in the response, 
> and I think we should add them:
> * switch from {{findMonitorDeadlockedThreads()}} to 
> {{findDeadlockedThreads()}} to also detect deadlocks from ownable 
> synchronizers (ie: ReentrantLocks)
> * for each thread:
> ** in addition to outputting the current {{getLockName()}} when a thread is 
> blocked/waiting, return info about the lock owner when available
> *** there's already dead code checking this and then throwing away the info
> ** return the list of all locks (both monitors and ownable synchronizers) 
> held by each thread
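The proposed changes map directly onto standard {{java.lang.management}} calls; here is a rough sketch of gathering that information (this is not the handler's actual code, just the underlying JMX APIs):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadLockInfoSketch {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();

        // Covers both monitors and ownable synchronizers (e.g. ReentrantLock),
        // unlike findMonitorDeadlockedThreads().
        long[] deadlocked = mx.findDeadlockedThreads();
        System.out.println("deadlocked threads: "
                + (deadlocked == null ? 0 : deadlocked.length));

        // Request locked monitors and locked synchronizers for each thread.
        for (ThreadInfo ti : mx.dumpAllThreads(true, true)) {
            if (ti.getLockName() != null) {
                // The lock a thread waits on, plus its owner when known.
                System.out.println(ti.getThreadName() + " waiting on "
                        + ti.getLockName() + " owned by " + ti.getLockOwnerName());
            }
            if (ti.getLockedMonitors().length > 0
                    || ti.getLockedSynchronizers().length > 0) {
                // All locks a thread currently holds.
                System.out.println(ti.getThreadName() + " holds "
                        + ti.getLockedMonitors().length + " monitor(s), "
                        + ti.getLockedSynchronizers().length + " synchronizer(s)");
            }
        }
    }
}
```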






[jira] [Commented] (SOLR-14404) CoreContainer level custom requesthandlers

2020-07-09 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154840#comment-17154840
 ] 

Ishan Chattopadhyaya commented on SOLR-14404:
-

Thanks for the analysis, [~hossman]. [~noble.paul], can you please beast the 
test to see if they can be reproduced?

> CoreContainer level custom requesthandlers
> --
>
> Key: SOLR-14404
> URL: https://issues.apache.org/jira/browse/SOLR-14404
> Project: Solr
>  Issue Type: New Feature
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> caveats:
>  * The class should be annotated with {{org.apache.solr.api.EndPoint}}, 
> which means only V2 APIs are supported
>  * The path should have prefix {{/api/plugin}}
> add a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin",
>   "class": "full.ClassName", 
>   "path-prefix" : "some-path-prefix"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> add a plugin from a package
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "add": {
>   "name":"myplugin", 
>   "class": "pkgName:full.ClassName" ,
>   "path-prefix" : "some-path-prefix"  ,  
>   "version": "1.0"
>   }
> }' http://localhost:8983/api/cluster/plugins
> {code}
> remove a plugin
> {code:java}
> curl -X POST -H 'Content-type:application/json' --data-binary '
> {
>   "remove": "myplugin"
> }' http://localhost:8983/api/cluster/plugins
> {code}
> The configuration will be stored in the {{clusterprops.json}}
>  as
> {code:java}
> {
> "plugins" : {
> "myplugin" : {"class": "full.ClassName", "path-prefix" : "some-path-prefix" }
> }
> }
> {code}
> example plugin
> {code:java}
> public class MyPlugin {
>   private final CoreContainer coreContainer;
>   public MyPlugin(CoreContainer coreContainer) {
> this.coreContainer = coreContainer;
>   }
>   @EndPoint(path = "/$path-prefix/path1",
> method = METHOD.GET,
> permission = READ)
>   public void call(SolrQueryRequest req, SolrQueryResponse rsp){
> rsp.add("myplugin.version", "2.0");
>   }
> }
> {code}
> This plugin will be accessible on all nodes at 
> {{/api/some-path-prefix/path1}}. It's possible to add more methods at 
> different paths. Ensure that all paths start with {{$path-prefix}} because 
> that is the prefix under which the plugin is registered. So 
> {{/some-path-prefix/path2}} and {{/some-path-prefix/my/deeply/nested/path}} are 
> all valid paths. 
> It's possible that the user chooses to register the plugin with a different 
> name. In that case, use a template variable as follows in paths: 
> {{/cluster/some/other/path}}






[jira] [Commented] (SOLR-11208) Usage SynchronousQueue in Executors prevent large scale operations

2020-07-09 Thread Ilan Ginzburg (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154804#comment-17154804
 ] 

Ilan Ginzburg commented on SOLR-11208:
--

What would be the issue implementing the OP's suggestion of getting rid of the 
{{SynchronousQueue}} in the {{ExecutorUtil.MDCAwareThreadPoolExecutor}} built 
in {{OverseerCollectionMessageHandler}} to use a {{LinkedBlockingQueue}} (can 
be unbounded or bounded) or {{ArrayBlockingQueue}} (bounded)?
If we go for bounded, we can dimension the queue to a very large size without 
the penalty of having too many threads. If we go unbounded and if the queue 
grows to infinity like it theoretically could, then we have other issues anyway 
and the system is not functional regardless.

I'd go unbounded so one less configuration parameter to worry about. Possibly 
issue warn logs if queue exceeds 20k entries (something not expected to happen 
but if it does we'll know).

Given we already use multiple threads in the executor, there are no constraints 
on execution order that would be relaxed by using a real queue. Currently 
already a {{CollectionOperation}} submitted earlier can get executed after a 
{{CollectionOperation}} submitted later. Clients must be careful not to submit 
a subsequent operation before they know a previous one completed. Having a 
queue is not going to change anything there.

If nobody objects, I'd replace the {{SynchronousQueue}} by an unbounded 
{{LinkedBlockingQueue}} in {{OverseerCollectionMessageHandler}} (and change the 
value of 5 currently passed as {{corePoolSize}} to 10 to match 
{{maximumPoolSize}}, given that {{maximumPoolSize}} will no longer be used).
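A minimal sketch of the proposed replacement, using a plain {{ThreadPoolExecutor}} in place of {{ExecutorUtil.MDCAwareThreadPoolExecutor}} for illustration: with an unbounded {{LinkedBlockingQueue}}, submissions beyond the thread count queue up instead of being rejected, and {{maximumPoolSize}} no longer grows the pool.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueuedExecutorSketch {
    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor tpe = new ThreadPoolExecutor(
                10, 10,                       // core == max: the pool never grows past core
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>()); // unbounded: excess tasks wait in the queue
        // 100 tasks against only 10 threads: with a SynchronousQueue this would
        // throw RejectedExecutionException; with a real queue all are accepted.
        for (int i = 0; i < 100; i++) {
            tpe.submit(() -> { });
        }
        tpe.shutdown();
        System.out.println(tpe.awaitTermination(10, TimeUnit.SECONDS));
    }
}
```

Note that with an unbounded queue the executor never creates more than {{corePoolSize}} threads, which is why the sketch raises core to 10 to match the old maximum.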

> Usage SynchronousQueue in Executors prevent large scale operations
> --
>
> Key: SOLR-11208
> URL: https://issues.apache.org/jira/browse/SOLR-11208
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 6.6
>Reporter: Björn Häuser
>Priority: Major
> Attachments: response.json
>
>
> I am not sure where to start with this one.
> I tried to post this already on the mailing list: 
> https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201708.mbox/%3c48c49426-33a2-4d79-ae26-a4515b8f8...@gmail.com%3e
> In short: the usage of a SynchronousQueue as the workQueue prevents accepting 
> more tasks than there are max threads.
> For example, taken from OverseerCollectionMessageHandler:
> {code:java}
>   ExecutorService tpe = new ExecutorUtil.MDCAwareThreadPoolExecutor(5, 10, 
> 0L, TimeUnit.MILLISECONDS,
>   new SynchronousQueue<>(),
>   new 
> DefaultSolrThreadFactory("OverseerCollectionMessageHandlerThreadFactory"));
> {code}
> This Executor is used when doing a REPLACENODE (= ADDREPLICA) command. When 
> the node has more than 10 collections this will fail with the mentioned 
> java.util.concurrent.RejectedExecutionException.
> I am also not sure how to fix this. Just replacing the queue with a different 
> implementation feels wrong to me or could cause unwanted side effects.
> Thanks






[jira] [Commented] (SOLR-14608) Faster sorting for the /export handler

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154784#comment-17154784
 ] 

ASF subversion and git services commented on SOLR-14608:


Commit 9b01320ddd3800607fa0197df6ac66bfd27e148a in lucene-solr's branch 
refs/heads/jira/SOLR-14608-export from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9b01320d ]

SOLR-14608: Add skeleton algorithm for segment level iterator


> Faster sorting for the /export handler
> --
>
> Key: SOLR-14608
> URL: https://issues.apache.org/jira/browse/SOLR-14608
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Andrzej Bialecki
>Priority: Major
>
> The largest cost of the export handler is the sorting. This ticket will 
> implement an improved algorithm for sorting that should greatly increase 
> overall throughput for the export handler.
> *The current algorithm is as follows:*
> Collect a bitset of matching docs. Iterate over that bitset, materialize 
> the top level ordinals for the sort fields in each document, and add them to 
> a priority queue of size 30,000. Then export the top 30,000 docs, turn off their 
> bits in the bit set, and iterate again until all docs are sorted and sent. 
> There are two performance bottlenecks with this approach:
> 1) Materializing the top level ordinals adds a huge amount of overhead to the 
> sorting process.
> 2) The size of priority queue, 30,000, adds significant overhead to sorting 
> operations.
> *The new algorithm:*
> Has a top level *merge sort iterator* that wraps segment level iterators that 
> perform segment level priority queue sorts.
> *Segment level:*
> The segment level docset will be iterated and the segment level ordinals for 
> the sort fields will be materialized and added to a segment level priority 
> queue. As the segment level iterator pops docs from the priority queue the 
> top level ordinals for the sort fields are materialized. Because the top 
> level ordinals are materialized AFTER the sort, they only need to be looked 
> up when the segment level ordinal changes. This takes advantage of the sort 
> to limit the lookups into the top level ordinal structures. This also 
> eliminates redundant lookups of top level ordinals that occur during the 
> multiple passes over the matching docset.
> The segment level priority queues can be kept smaller than 30,000 to improve 
> performance of the sorting operations because the overall batch size will 
> still be 30,000 or greater when all the segment priority queue sizes are 
> added up. This allows for batch sizes much larger than 30,000 without using a 
> single large priority queue. The increased batch size means fewer iterations 
> over the matching docset and the decreased priority queue size means faster 
> sorting operations.
> *Top level:*
> A top level iterator does a merge sort over the segment level iterators by 
> comparing the top level ordinals materialized when the segment level docs are 
> popped from the segment level priority queues. This requires no extra memory 
> and will be very performant.
>  






[jira] [Commented] (SOLR-14404) CoreContainer level custom requesthandlers

2020-07-09 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154782#comment-17154782
 ] 

Chris M. Hostetter commented on SOLR-14404:
---

we're seeing sporadic failures of TestContainerPlugin.testApiFromPackage in 
jenkins and Jira patch review builds.  these don't seem to reliably reproduce, 
and based on how/where they fail i'm guessing this is either a concurrency 
problem in the "real" code or a timing assumption in the test client code that 
isn't valid: the client sends a request to upload a new jar and that request 
finishes, but when the test client subsequently tries to send a request to 
"use" that new version, it fails because the version isn't valid.

Here's what an example test failure looks like from junits perspective...
{noformat}
[junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TestContainerPlugin -Dtests.method=testApiFromPackage 
-Dtests.seed=52BF77C417B002F6 -Dtests.multiplier=2 -Dtests.slow=true 
-Dtests.locale=en-GB -Dtests.timezone=America/La_Paz -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] ERROR   11.6s J1 | TestContainerPlugin.testApiFromPackage <<<
   [junit4]> Throwable #1: 
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteExecutionException: 
Error from server at https://127.0.0.1:45705/solr: Error executing command
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([52BF77C417B002F6:BFE18FB525FAD57F]:0)
   [junit4]>at 
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteExecutionException.create(BaseHttpSolrClient.java:67)
   [junit4]>at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:647)
   [junit4]>at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
   [junit4]>at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
   [junit4]>at 
org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:389)
   [junit4]>at 
org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:359)
   [junit4]>at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1155)
   [junit4]>at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:916)
   [junit4]>at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:850)
   [junit4]>at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:210)
   [junit4]>at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:227)
   [junit4]>at 
org.apache.solr.handler.TestContainerPlugin.testApiFromPackage(TestContainerPlugin.java:265)
   [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
   [junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
{noformat}
And here's a suspicious error from the log file (just before the test starts 
shutting down the nodes/zk) that seems to be the cause of the problem ...
{noformat}
[junit4]   2> 1392190 INFO  (zkCallback-17473-thread-1) [ ] 
o.a.s.c.SolrResourceLoader Added 1 libs to classloader, from paths: 
[/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/solr/build/solr-core/test/J1/temp/solr.handler.TestContainerPlugin_52BF77C417B002F6-001/tempDir-002/node2/filestore/myplugin]
   [junit4]   2> 1392190 INFO  (zkCallback-17473-thread-1) [ ] 
o.a.s.p.PackageLoader version: 2.0 is the new latest in package: mypkg
   [junit4]   2> 1392191 INFO  (zkCallback-17435-thread-1) [ ] 
o.a.s.c.SolrResourceLoader Added 1 libs to classloader, from paths: 
[/home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/solr/build/solr-core/test/J1/temp/solr.handler.TestContainerPlugin_52BF77C417B002F6-001/tempDir-002/node3/filestore/myplugin]
   [junit4]   2> 1392191 INFO  (zkCallback-17435-thread-1) [ ] 
o.a.s.p.PackageLoader version: 2.0 is the new latest in package: mypkg
   [junit4]   2> 1392191 INFO  (qtp1162953234-28500) [n:127.0.0.1:33293_solr
 ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/cluster/package 
params={wt=javabin=2} status=0 QTime=28
   [junit4]   2> 1392191 INFO  (zkCallback-17418-thread-1) [ ] 
o.a.s.f.DistribPackageStore pub_key512.der does not exist locally, 
downloading.. 
   [junit4]   2> 1392192 INFO  (zkCallback-17438-thread-1) [ ] 
o.a.s.f.DistribPackageStore 

[jira] [Reopened] (SOLR-14588) Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker

2020-07-09 Thread Chris M. Hostetter (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter reopened SOLR-14588:
---

TestCircuitBreaker.testBuildingMemoryPressure has failed several times in 
non-reproducible ways on both regular jenkins run, and on Jira patch review 
builds (when the patches do not in any way affect code executed by the test)

Based on the nature of the test, i'm guessing the problem is related to either: 
concurrency bugs in the "real" code; timing assumptions that are violated by 
the test (ie: is registering the new circuit breaker a non-blocking method 
call?); or something about the combination of the test code + real code that is 
finicky when dealing with low-resource build servers.

The failures always look like this...
{noformat}
[junit4]   2> 2568622 INFO  
(TEST-TestCircuitBreaker.testBuildingMemoryPressure-seed#[52BF77C417B002F6]) [  
   ] o.a.s.SolrTestCaseJ4 ###Starting testBuildingMemoryPressure
   [junit4]   2> 2568623 INFO  (TestCircuitBreaker-32014-thread-3) [ ] 
o.a.s.c.S.Request [collection1]  webapp=null path=null 
params={q=name:"john+smith"==0=20=2.2} status=503 QTime=0
   [junit4]   2> 2568623 INFO  (TestCircuitBreaker-32014-thread-1) [ ] 
o.a.s.c.S.Request [collection1]  webapp=null path=null 
params={q=name:"john+smith"==0=20=2.2} status=503 QTime=0
   [junit4]   2> 2568623 INFO  (TestCircuitBreaker-32014-thread-2) [ ] 
o.a.s.c.S.Request [collection1]  webapp=null path=null 
params={q=name:"john+smith"==0=20=2.2} status=503 QTime=0
   [junit4]   2> 2568623 INFO  (TestCircuitBreaker-32014-thread-4) [ ] 
o.a.s.c.S.Request [collection1]  webapp=null path=null 
params={q=name:"john+smith"==0=20=2.2} status=503 QTime=0
   [junit4]   2> 2568624 INFO  (TestCircuitBreaker-32014-thread-5) [ ] 
o.a.s.c.S.Request [collection1]  webapp=null path=null 
params={q=name:"john+smith"==0=20=2.2} status=503 QTime=0
   [junit4]   2> 2568625 INFO  
(TEST-TestCircuitBreaker.testBuildingMemoryPressure-seed#[52BF77C417B002F6]) [  
   ] o.a.s.SolrTestCaseJ4 ###Ending testBuildingMemoryPressure
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestCircuitBreaker 
-Dtests.method=testBuildingMemoryPressure -Dtests.seed=52BF77C417B002F6 
-Dtests.multiplier=2 -Dtests.slow=true -Dtests.locale=bas-CM 
-Dtests.timezone=America/Creston -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] FAILURE 0.01s J1 | TestCircuitBreaker.testBuildingMemoryPressure <<<
   [junit4]> Throwable #1: java.lang.AssertionError: Number of failed 
queries is not correct expected:<1> but was:<5>
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([52BF77C417B002F6:ED325B14383CEE5F]:0)
   [junit4]>at 
org.apache.solr.util.TestCircuitBreaker.testBuildingMemoryPressure(TestCircuitBreaker.java:141)
   [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:566)
   [junit4]>at java.base/java.lang.Thread.run(Thread.java:834)
{noformat}
If the fix for this problem is not obvious to the folks who worked on this 
jira, then please PLEASE at least:
 * update the test to log the specific details of these "expected" exceptions 
that increment the fail counter
 * keep an eye on future jenkins builds, looking for failures with the modified 
logging, to confirm what exactly is happening if/when this "all 5 requests 
failed" situation occurs.

 

> Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
> --
>
> Key: SOLR-14588
> URL: https://issues.apache.org/jira/browse/SOLR-14588
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Atri Sharma
>Assignee: Atri Sharma
>Priority: Major
> Fix For: master (9.0)
>
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h
>
> This Jira tracks addition of circuit breakers in the search path and 
> implements JVM based circuit breaker which rejects incoming search requests 
> if the JVM heap usage exceeds a defined percentage.
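
The heap-based check described in the issue can be sketched as follows. This is a minimal illustration using the JMX {{MemoryMXBean}}, not Solr's actual circuit breaker classes; the class name, constructor, and threshold handling are assumptions:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hedged sketch of a JVM-heap circuit breaker: trip when used heap
// exceeds a configured fraction of max heap. Not the Solr implementation.
public class HeapCircuitBreakerSketch {
    private final double thresholdPct; // e.g. 0.95 = trip at 95% of max heap

    public HeapCircuitBreakerSketch(double thresholdPct) {
        this.thresholdPct = thresholdPct;
    }

    // Pure threshold logic, separated out so it is easy to test.
    // MemoryUsage.getMax() may be -1 (undefined), in which case never trip.
    static boolean isTripped(long used, long max, double thresholdPct) {
        return max > 0 && ((double) used / max) >= thresholdPct;
    }

    public boolean shouldReject() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return isTripped(heap.getUsed(), heap.getMax(), thresholdPct);
    }
}
```

A request handler consulting such a breaker would reject the query with a 503 when it trips, which matches the {{status=503}} lines visible in the test logs above.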






[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-07-09 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154771#comment-17154771
 ] 

Mark Robert Miller commented on SOLR-14636:
---

Okay, I'm about ready to share the current state of this reference branch. Some 
tests still misbehave, I have not tackled core reload or mutable schema / 
schemaless yet for a start.

Anyway, here is an example of how Solr tests should be running in non-Nightly 
mode. I run them serially, none in parallel; you should not need a 32-core 
machine and 128GB RAM to run Solr tests fast, the Solr core module should run 
in under 10 minutes even serially when things are all sensible (they are not 
yet, but they are working their way there): [https://youtu.be/w7DtCNh0R9s]

SolrCloud tests should run just as fast as non SolrCloud tests and every test 
should move, move, move. Nightly runs can take more time and dig deeper, but 
useful time, not wasteful.

There is still misbehavior here, it takes some more policing and beasting 
before I become the God of this branch, but in the end, all violators will be 
found and prosecuted.

> Provide a reference implementation for SolrCloud that is stable and fast.
> -
>
> Key: SOLR-14636
> URL: https://issues.apache.org/jira/browse/SOLR-14636
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Robert Miller
>Assignee: Mark Robert Miller
>Priority: Major
>
> SolrCloud powers critical infrastructure and needs the ability to run quickly 
> with stability. This reference implementation will allow for this.
>  
> *status*: alpha
> *tests***:
>  * *core*: passing with ignores (not solid*)
>  * *solrj*: tbd
>  * *test-framework*: tbd
>  * *contrib/analysis-extras*: tbd
>  * *contrib/analytics*: tbd
>  * *contrib/clustering*: tbd
>  * *contrib/dataimporthandler*: tbd
>  * *contrib/dataimporthandler-extras*: tbd
>  * *contrib/extraction*: tbd
>  * *contrib/jaegertracer-configurator*: tbd
>  * *contrib/langid*: tbd
>  * *contrib/prometheus-exporter*: tbd
>  * *contrib/velocity*: tbd
> _* Running tests quickly and efficiently with strict policing will more 
> frequently find bugs and requires a period of hardening._
> _** Non Nightly currently, Nightly comes last._






[jira] [Resolved] (SOLR-11390) Trie* field javadocs to @see *Point equivalent

2020-07-09 Thread Mike Drob (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob resolved SOLR-11390.
--
Fix Version/s: mino
 Assignee: Mike Drob
   Resolution: Fixed

> Trie* field javadocs to @see *Point equivalent
> --
>
> Key: SOLR-11390
> URL: https://issues.apache.org/jira/browse/SOLR-11390
> Project: Solr
>  Issue Type: Wish
>Reporter: Christine Poerschke
>Assignee: Mike Drob
>Priority: Minor
> Fix For: mino
>
> Attachments: SOLR-11390.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It would be helpful I think for the deprecated Trie* field classes' javadocs 
> to {{@see}} link to the replacement classes.
> And also perhaps to check for (undeprecated) point classes still backwards 
> {{@see}} linking to the deprecated classes, e.g.
> {code}
> DatePointField.java: * @see TrieDateField
> TrieDateField.java: * @see TrieField
> {code}






[jira] [Commented] (SOLR-11390) Trie* field javadocs to @see *Point equivalent

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-11390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154745#comment-17154745
 ] 

ASF subversion and git services commented on SOLR-11390:


Commit 2341c220ceb41412fbd74ad89db8d4fee9a097eb in lucene-solr's branch 
refs/heads/master from Mike Drob
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2341c22 ]

SOLR-11390 Trie* field javadocs to @see *Point (#1612)

Co-authored-by: Christine Poerschke 

> Trie* field javadocs to @see *Point equivalent
> --
>
> Key: SOLR-11390
> URL: https://issues.apache.org/jira/browse/SOLR-11390
> Project: Solr
>  Issue Type: Wish
>Reporter: Christine Poerschke
>Priority: Minor
> Attachments: SOLR-11390.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It would be helpful I think for the deprecated Trie* field classes' javadocs 
> to {{@see}} link to the replacement classes.
> And also perhaps to check for (undeprecated) point classes still backwards 
> {{@see}} linking to the deprecated classes, e.g.
> {code}
> DatePointField.java: * @see TrieDateField
> TrieDateField.java: * @see TrieField
> {code}






[GitHub] [lucene-solr] madrob merged pull request #1612: SOLR-11390 Trie* field javadocs to @see *Point

2020-07-09 Thread GitBox


madrob merged pull request #1612:
URL: https://github.com/apache/lucene-solr/pull/1612


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Commented] (SOLR-14608) Faster sorting for the /export handler

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154740#comment-17154740
 ] 

ASF subversion and git services commented on SOLR-14608:


Commit bb4ae51c1c3e54c976bd1d449d5264afa3d74ec2 in lucene-solr's branch 
refs/heads/jira/SOLR-14608-export from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bb4ae51 ]

SOLR-14608: Add basic top level merge sort iterator


> Faster sorting for the /export handler
> --
>
> Key: SOLR-14608
> URL: https://issues.apache.org/jira/browse/SOLR-14608
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Andrzej Bialecki
>Priority: Major
>
> The largest cost of the export handler is the sorting. This ticket will 
> implement an improved algorithm for sorting that should greatly increase 
> overall throughput for the export handler.
> *The current algorithm is as follows:*
> Collect a bitset of matching docs. Iterate over that bitset and materialize 
> the top level ordinals for the sort fields in the document and add them to a 
> priority queue of size 30,000. Then export the top 30,000 docs, turn off the 
> bits in the bit set and iterate again until all docs are sorted and sent. 
> There are two performance bottlenecks with this approach:
> 1) Materializing the top level ordinals adds a huge amount of overhead to the 
> sorting process.
> 2) The size of priority queue, 30,000, adds significant overhead to sorting 
> operations.
> *The new algorithm:*
> Has a top level *merge sort iterator* that wraps segment level iterators that 
> perform segment level priority queue sorts.
> *Segment level:*
> The segment level docset will be iterated and the segment level ordinals for 
> the sort fields will be materialized and added to a segment level priority 
> queue. As the segment level iterator pops docs from the priority queue the 
> top level ordinals for the sort fields are materialized. Because the top 
> level ordinals are materialized AFTER the sort, they only need to be looked 
> up when the segment level ordinal changes. This takes advantage of the sort 
> to limit the lookups into the top level ordinal structures. This also 
> eliminates redundant lookups of top level ordinals that occur during the 
> multiple passes over the matching docset.
> The segment level priority queues can be kept smaller than 30,000 to improve 
> performance of the sorting operations because the overall batch size will 
> still be 30,000 or greater when all the segment priority queue sizes are 
> added up. This allows for batch sizes much larger then 30,000 without using a 
> single large priority queue. The increased batch size means fewer iterations 
> over the matching docset and the decreased priority queue size means faster 
> sorting operations.
> *Top level:*
> A top level iterator does a merge sort over the segment level iterators by 
> comparing the top level ordinals materialized when the segment level docs are 
> popped from the segment level priority queues. This requires no extra memory 
> and will be very performant.
>  
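
The segment-level/top-level split described above can be illustrated with a small merge-sort sketch. The class and method names here are hypothetical, plain int values stand in for materialized sort ordinals, and {{Arrays.sort}} stands in for the per-segment priority queue sort:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch of the two-level sort: each "segment" sorts its own
// values independently, and a top-level iterator merge-sorts the segment
// iterators. Not Solr's actual export handler code.
public class TwoLevelSortSketch {
    static List<Integer> mergeSortedSegments(List<int[]> segments) {
        // Segment level: sort each segment on its own (standing in for the
        // per-segment priority queue over segment-level ordinals).
        List<int[]> sorted = new ArrayList<>();
        for (int[] seg : segments) {
            int[] copy = seg.clone();
            Arrays.sort(copy);
            sorted.add(copy);
        }
        // Top level: merge sort over the segment cursors. Each queue entry
        // is {currentValue, segmentIndex, positionInSegment}.
        PriorityQueue<int[]> pq =
            new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[0]));
        for (int i = 0; i < sorted.size(); i++) {
            if (sorted.get(i).length > 0) {
                pq.add(new int[] {sorted.get(i)[0], i, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            int[] e = pq.poll();
            out.add(e[0]);                      // emit the globally-smallest value
            int seg = e[1], pos = e[2] + 1;     // advance that segment's cursor
            if (pos < sorted.get(seg).length) {
                pq.add(new int[] {sorted.get(seg)[pos], seg, pos});
            }
        }
        return out;
    }
}
```

The merge needs only one queue entry per segment, which is why the top level requires no extra memory proportional to the batch size.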






[jira] [Commented] (SOLR-14469) Removed deprecated code in solr/core (master only)

2020-07-09 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154692#comment-17154692
 ] 

Erick Erickson commented on SOLR-14469:
---

"The negatives are currently no-ops" Yeah, I understand that. Patience! I'll 
get to this JIRA soon ;) I hope.

FWIW, I consider this JIRA something of a placeholder. I intend to dive into 
this and hack around for a while to see what patterns emerge. Your suggestions 
here are probably the way it'll go; this discussion has saved me some fumbling 
around.

Part of the motivation is code clean-up, I think we have stuff deprecated in 4x 
or earlier that's never been cleaned up. It's also extremely frustrating to see 
deprecation annotations with no indication of _when_ they were deprecated or 
_what_ to use in their place. I think of it as another barrier to keeping code 
clean that we can/should remove. I'm also certain that the person doing the 
deprecation has the very best chance of efficiently updating the usages at the 
time the call is deprecated! 

So my tentative thoughts here are:

1> do some labeling enforcement. Require the deprecation to say when and what 
to use instead. TBD is how that interacts with javadocs checks.

2> Yeah, I don't see how to fail compilations and still evolve the code. I like 
in-your-face failures way early in the process, but the end goal of not 
allowing cruft to accumulate is served by check failures.

3> I like the forbiddenAPIs idea and I know you put some effort into that in 
Gradle that I haven't looked at how to use yet ;). We can use this approach to 
build up one fix at a time; that'd allow gradual improvement rather than the 
sledgehammer of using compile failures which is untenable.

It also provides a mechanism for cleaning things up when the deprecations are 
made. I'd really like to _strongly_ encourage people who add deprecations to 
clean up the internal usages at the same time, since they know exactly what to 
do (at least they'd better!). For use-cases that require the deprecated calls to 
stay (say changing a method's visibility, as someone pointed out) we can use 
SuppressForbidden. 

4> As far as actually removing deprecation annotations, I think there are two 
phases:
4.1> I see no reason we can't "do the right thing" in trunk for any deprecation 
added prior to 8.0. It won't always be removing the code.
4.2> For deprecations added in 8x, fix up all the internal calls and add them 
to ForbiddenAPIs.
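
As a hypothetical illustration of the labeling in point 1> (the class, method, and version here are made up), a deprecation carrying both the "when" and the "what" might look like:

```java
// Hypothetical example of "when and what" deprecation labeling: the Javadoc
// records the replacement, and the annotation records the version in which
// the member was deprecated (the since attribute requires Java 9+).
public class LegacyUtils {
    /**
     * @deprecated Deprecated in 8.6; use {@link #newHelper()} instead.
     */
    @Deprecated(since = "8.6", forRemoval = true)
    public static String oldHelper() {
        // Delegate so callers of the old entry point get the new behavior.
        return newHelper();
    }

    public static String newHelper() {
        return "ok";
    }
}
```

A labeling check could then flag any {{@Deprecated}} annotation that lacks a {{since}} attribute or a matching {{@deprecated}} Javadoc tag.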

This is my straw-man proposal, as I mentioned before I'll try doing some of it 
and see if it makes sense and report back.

Thanks again for your suggestions, they really helped me clarify a path 
forward. Any plan is better than fumbling around blindly, even if it turns out 
to be sub-optimal...

> Removed deprecated code in solr/core (master only)
> --
>
> Key: SOLR-14469
> URL: https://issues.apache.org/jira/browse/SOLR-14469
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> I'm currently working on getting all the warnings out of the code, so this is 
> something of a placeholder for a week or two.
> There will be sub-tasks, please create them when you start working on a 
> project.






[jira] [Commented] (SOLR-13132) Improve JSON "terms" facet performance when sorted by relatedness

2020-07-09 Thread Michael Gibney (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154661#comment-17154661
 ] 

Michael Gibney commented on SOLR-13132:
---

Sorry, yes; the "MASTER" results were for "filterCacheSize=0", so 
apples-to-apples with "SOLR-13132 sweep_collection=false, filterCacheSize=0". 
And yes, I'll update the ref guide shortly.

bq.Assuming i'm understanding correctly...

Yes, that's my takeaway as well.

My only remaining questions are around what's considered a "common" vs. 
"uncommon" case, and regarding the negative impact of sweep collection on 
low-cardinality fields, what impact we consider to be "small". To explore this 
a little bit: I think it's hard to say what the common vs. uncommon use case 
is. But the worst-case negative impact of sweep collection (disregarding 
filterCache) is ~4x, for very-low-cardinality fields over low-recall FG sets, 
which are likely among the fastest queries in an absolute sense. This seems 
acceptable to me.

Considering the performance boost that filterCache can in some cases provide to 
non-sweep collection, the worst-case negative performance impact can go to 
~100x ... _but_ I still think that's ok, because it makes sense to consider 
reliance on filterCache as an opt-in performance optimization (analogous to how 
the {{enum}} facet method can outperform {{dv}} faceting for low-cardinality 
fields and a sufficiently-sized filterCache). Relying on filterCache in these 
cases can yield significant performance benefits, but is very 
situation-specific, and should be approached carefully to avoid system-wide 
negative effects. So particularly pending some way to make filterCache use more 
selective (e.g., SOLR-13108) it makes sense to default to sweep collection 
_even if only_ because it avoids accidental filterCache thrashing.

... all that being a long way of saying "yes, I think we're good to go". Now 
I'll go transform that into something refGuide-appropriate


> Improve JSON "terms" facet performance when sorted by relatedness 
> --
>
> Key: SOLR-13132
> URL: https://issues.apache.org/jira/browse/SOLR-13132
> Project: Solr
>  Issue Type: Improvement
>  Components: Facet Module
>Affects Versions: 7.4, master (9.0)
>Reporter: Michael Gibney
>Priority: Major
> Attachments: SOLR-13132-benchmarks.tgz, 
> SOLR-13132-with-cache-01.patch, SOLR-13132-with-cache.patch, 
> SOLR-13132.patch, SOLR-13132_testSweep.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When sorting buckets by {{relatedness}}, JSON "terms" facet must calculate 
> {{relatedness}} for every term. 
> The current implementation uses a standard uninverted approach (either 
> {{docValues}} or {{UnInvertedField}}) to get facet counts over the domain 
> base docSet, and then uses that initial pass as a pre-filter for a 
> second-pass, inverted approach of fetching docSets for each relevant term 
> (i.e., {{count > minCount}}?) and calculating intersection size of those sets 
> with the domain base docSet.
> Over high-cardinality fields, the overhead of per-term docSet creation and 
> set intersection operations increases request latency to the point where 
> relatedness sort may not be usable in practice (for my use case, even after 
> applying the patch for SOLR-13108, for a field with ~220k unique terms per 
> core, QTime for high-cardinality domain docSets were, e.g.: cardinality 
> 1816684=9000ms, cardinality 5032902=18000ms).
> The attached patch brings the above example QTimes down to a manageable 
> ~300ms and ~250ms respectively. The approach calculates uninverted facet 
> counts over domain base, foreground, and background docSets in parallel in a 
> single pass. This allows us to take advantage of the efficiencies built into 
> the standard uninverted {{FacetFieldProcessorByArray[DV|UIF]}}), and avoids 
> the per-term docSet creation and set intersection overhead.
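
The single-pass "sweep" idea in the description can be sketched as follows. The arrays here are simplified stand-ins for the uninverted field data and the base/foreground/background docsets, not the actual {{FacetFieldProcessor}} code:

```java
// Sketch of single-pass sweep counting: instead of one counting pass per
// docset (plus per-term docSet intersections), walk the docs once and
// increment base/foreground/background counters together.
public class SweepCountSketch {
    // docTerms[doc] = the term ordinal of that doc (one term per doc, for
    // simplicity); base/fg/bg mark docset membership per doc.
    static int[][] sweepCounts(int[] docTerms, boolean[] base, boolean[] fg,
                               boolean[] bg, int numTerms) {
        int[][] counts = new int[3][numTerms]; // 0=base, 1=foreground, 2=background
        for (int doc = 0; doc < docTerms.length; doc++) {
            int term = docTerms[doc];
            if (base[doc]) counts[0][term]++;
            if (fg[doc])   counts[1][term]++;
            if (bg[doc])   counts[2][term]++;
        }
        return counts;
    }
}
```

With all three counts accumulated in one pass, the relatedness score per term can be computed directly from the counters, with no per-term docSet creation or set intersections.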






[jira] [Comment Edited] (SOLR-14639) Improve concurrency of SlowCompositeReaderWrapper.terms

2020-07-09 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154652#comment-17154652
 ] 

Ishan Chattopadhyaya edited comment on SOLR-14639 at 7/9/20, 3:10 PM:
--

I have a client who is experiencing very similar symptoms with Collapse QParser 
(that uses SCRW). Higher latency during high concurrency, but no significant 
CPU load increase.


was (Author: ichattopadhyaya):
I have a client who is experiencing very similar symptoms with Collapse 
QParser. Higher latency during high concurrency, but no significant CPU load 
increase.

> Improve concurrency of SlowCompositeReaderWrapper.terms
> ---
>
> Key: SOLR-14639
> URL: https://issues.apache.org/jira/browse/SOLR-14639
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Affects Versions: 8.4.1
>Reporter: Shalin Shekhar Mangar
>Priority: Major
> Attachments: Screen Shot 2020-07-09 at 4.38.03 PM.png
>
>
> Under heavy query load, the ConcurrentHashMap.computeIfAbsent method inside 
> the SlowCompositeReaderWrapper.terms(String) method blocks searcher threads 
> (see attached screenshot of a java flight recording).
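
One common mitigation for this kind of contention (a sketch, not necessarily the fix chosen for this issue) is to attempt a lock-free {{get()}} before falling back to {{computeIfAbsent}}, since {{computeIfAbsent}} can block threads mapping to the same bin even when the value is already present:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of the get-before-compute pattern for read-mostly caches.
public class GetBeforeComputeSketch {
    static <K, V> V cached(ConcurrentHashMap<K, V> cache, K key,
                           Function<K, V> loader) {
        V v = cache.get(key);   // lock-free fast path for the common hit case
        if (v != null) {
            return v;
        }
        // Slow path: only contend on computeIfAbsent for genuine misses.
        return cache.computeIfAbsent(key, loader);
    }
}
```

Under heavy query load the hit case dominates, so moving it off the {{computeIfAbsent}} path avoids most of the blocking visible in the flight recording.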






[jira] [Commented] (SOLR-14639) Improve concurrency of SlowCompositeReaderWrapper.terms

2020-07-09 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154652#comment-17154652
 ] 

Ishan Chattopadhyaya commented on SOLR-14639:
-

I have a client who is experiencing very similar symptoms with Collapse 
QParser. Higher latency during high concurrency, but no significant CPU load 
increase.

> Improve concurrency of SlowCompositeReaderWrapper.terms
> ---
>
> Key: SOLR-14639
> URL: https://issues.apache.org/jira/browse/SOLR-14639
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Affects Versions: 8.4.1
>Reporter: Shalin Shekhar Mangar
>Priority: Major
> Attachments: Screen Shot 2020-07-09 at 4.38.03 PM.png
>
>
> Under heavy query load, the ConcurrentHashMap.computeIfAbsent method inside 
> the SlowCompositeReaderWrapper.terms(String) method blocks searcher threads 
> (see attached screenshot of a java flight recording).






[jira] [Updated] (LUCENE-9423) Leaking FileChannel in NIOFSDirectory#openInput

2020-07-09 Thread Nhat Nguyen (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nhat Nguyen updated LUCENE-9423:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Leaking FileChannel in NIOFSDirectory#openInput
> ---
>
> Key: LUCENE-9423
> URL: https://issues.apache.org/jira/browse/LUCENE-9423
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/store
>Affects Versions: master (9.0), 8.7
>Reporter: Nhat Nguyen
>Assignee: Nhat Nguyen
>Priority: Major
> Fix For: master (9.0), 8.7
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If we fail to get the 
> [size|https://github.com/apache/lucene-solr/blob/82692e76e054d3e6938034e96a4e9632bd9f7a70/lucene/core/src/java/org/apache/lucene/store/NIOFSDirectory.java#L107]
>  of a file in the constructor of NIOFSIndexInput, then we will leak a 
> FileChannel opened in NIOFSDirectory#openInput. This bug is discovered by a 
> test failure in 
> [Elasticsearch|https://github.com/elastic/elasticsearch/issues/39585#issuecomment-654995186].
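
The fix pattern for this kind of leak can be sketched as below; {{openAndSize}} is a made-up name standing in for the {{NIOFSIndexInput}} constructor path, and the point is simply to close the channel on any failure between {{open()}} and a successful return:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch of the leak fix: if anything after open() throws (here, size()),
// close the channel before propagating the exception so it cannot leak.
public class OpenInputSketch {
    static FileChannel openAndSize(Path path) throws IOException {
        FileChannel fc = FileChannel.open(path, StandardOpenOption.READ);
        boolean success = false;
        try {
            fc.size();      // may throw; previously this would leak fc
            success = true;
            return fc;
        } finally {
            if (!success) {
                fc.close(); // only close on the failure path
            }
        }
    }
}
```

The success-flag-in-finally idiom is common in Lucene's store code because it keeps the happy path returning the open resource while guaranteeing cleanup on every exceptional path.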






[jira] [Updated] (LUCENE-9423) Leaking FileChannel in NIOFSDirectory#openInput

2020-07-09 Thread Nhat Nguyen (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nhat Nguyen updated LUCENE-9423:

Fix Version/s: master (9.0)
   8.7

> Leaking FileChannel in NIOFSDirectory#openInput
> ---
>
> Key: LUCENE-9423
> URL: https://issues.apache.org/jira/browse/LUCENE-9423
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/store
>Affects Versions: master (9.0), 8.7
>Reporter: Nhat Nguyen
>Assignee: Nhat Nguyen
>Priority: Major
> Fix For: master (9.0), 8.7
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If we fail to get the 
> [size|https://github.com/apache/lucene-solr/blob/82692e76e054d3e6938034e96a4e9632bd9f7a70/lucene/core/src/java/org/apache/lucene/store/NIOFSDirectory.java#L107]
>  of a file in the constructor of NIOFSIndexInput, then we will leak a 
> FileChannel opened in NIOFSDirectory#openInput. This bug is discovered by a 
> test failure in 
> [Elasticsearch|https://github.com/elastic/elasticsearch/issues/39585#issuecomment-654995186].






[jira] [Commented] (LUCENE-9423) Leaking FileChannel in NIOFSDirectory#openInput

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154637#comment-17154637
 ] 

ASF subversion and git services commented on LUCENE-9423:
-

Commit 8b619d1267d84a5d7fc385ed1cc04af3cfa5a093 in lucene-solr's branch 
refs/heads/branch_8x from Nhat Nguyen
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8b619d1 ]

LUCENE-9423: Handle exc in NIOFSDirectory#openInput (#1658)

If we fail to get the size of a file in the constructor of 
NIOFSIndexInput, then we will leak a FileChannel opened in
NIOFSDirectory#openInput.

> Leaking FileChannel in NIOFSDirectory#openInput
> ---
>
> Key: LUCENE-9423
> URL: https://issues.apache.org/jira/browse/LUCENE-9423
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/store
>Affects Versions: master (9.0), 8.7
>Reporter: Nhat Nguyen
>Assignee: Nhat Nguyen
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If we fail to get the 
> [size|https://github.com/apache/lucene-solr/blob/82692e76e054d3e6938034e96a4e9632bd9f7a70/lucene/core/src/java/org/apache/lucene/store/NIOFSDirectory.java#L107]
>  of a file in the constructor of NIOFSIndexInput, then we will leak a 
> FileChannel opened in NIOFSDirectory#openInput. This bug is discovered by a 
> test failure in 
> [Elasticsearch|https://github.com/elastic/elasticsearch/issues/39585#issuecomment-654995186].






[jira] [Commented] (LUCENE-9386) RegExpQuery - add case insensitive matching option

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154596#comment-17154596
 ] 

ASF subversion and git services commented on LUCENE-9386:
-

Commit 26706eea656c7124099139595f1390e513c6b2f0 in lucene-solr's branch 
refs/heads/branch_8x from markharwood
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=26706ee ]

LUCENE-9386 Bug fix for 8x backport (#1660)

Constructor wasn't passing through flag choice

> RegExpQuery - add case insensitive matching option
> --
>
> Key: LUCENE-9386
> URL: https://issues.apache.org/jira/browse/LUCENE-9386
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Mark Harwood
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In searches sometimes case sensitivity is important and sometimes not. 
> However, users don't want to have to index two versions of their data 
> (lowercased and original) in order to service both case sensitive and case 
> insensitive queries. To get around this users have been commonly seen to take 
> a user query e.g. `powershell.exe` and search for it with the regex 
> `[Pp][Oo][Ww][Ee][Rr][Ss][Hh][Ee][Ll][Ll]\.[Ee][Xx][Ee]`.
> The proposal is that we add an extra "case insensitive" option to the RegExp 
> query flags to automatically do this sort of expansion when we create 
> Automatons.
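The manual expansion users resort to can be sketched like this (a standalone illustration of the workaround described above, not Lucene's implementation of the new flag):

```java
public class CaseInsensitiveRegex {
    // Expand a literal term into a case-insensitive character-class
    // regex, e.g. "powershell.exe" -> "[Pp][Oo]...\.[Ee][Xx][Ee]".
    public static String expand(String term) {
        StringBuilder sb = new StringBuilder();
        for (char c : term.toCharArray()) {
            char lower = Character.toLowerCase(c);
            char upper = Character.toUpperCase(c);
            if (Character.isLetter(c) && lower != upper) {
                sb.append('[').append(upper).append(lower).append(']');
            } else if ("\\.[]{}()*+-?^$|".indexOf(c) >= 0) {
                sb.append('\\').append(c); // escape regex metacharacters
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

A flag on the RegExp query would do this expansion at the automaton level instead, sparing users the string gymnastics.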






[GitHub] [lucene-solr] markharwood merged pull request #1660: LUCENE-9386 Bug fix for 8x backport

2020-07-09 Thread GitBox


markharwood merged pull request #1660:
URL: https://github.com/apache/lucene-solr/pull/1660


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] markharwood opened a new pull request #1660: LUCENE-9386 Bug fix for 8x backport

2020-07-09 Thread GitBox


markharwood opened a new pull request #1660:
URL: https://github.com/apache/lucene-solr/pull/1660


   Constructor passed the wrong flag






[jira] [Commented] (LUCENE-9386) RegExpQuery - add case insensitive matching option

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154557#comment-17154557
 ] 

ASF subversion and git services commented on LUCENE-9386:
-

Commit fec5d49112ab23ce67792ddc821b1addcc8eea5d in lucene-solr's branch 
refs/heads/branch_8x from markharwood
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fec5d49 ]

LUCENE-9386 add case insensitive RegExp matching option. (#1659)

Backport of 887fe4c83d4114c6238265ca7f05aa491525af9d

> RegExpQuery - add case insensitive matching option
> --
>
> Key: LUCENE-9386
> URL: https://issues.apache.org/jira/browse/LUCENE-9386
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Reporter: Mark Harwood
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In searches sometimes case sensitivity is important and sometimes not. 
> However, users don't want to have to index two versions of their data 
> (lowercased and original) in order to service both case sensitive and case 
> insensitive queries. To get around this users have been commonly seen to take 
> a user query e.g. `powershell.exe` and search for it with the regex 
> `[Pp][Oo][Ww][Ee][Rr][Ss][Hh][Ee][Ll][Ll]\.[Ee][Xx][Ee]`.
> The proposal is that we add an extra "case insensitive" option to the RegExp 
> query flags to automatically do this sort of expansion when we create 
> Automatons.






[GitHub] [lucene-solr] markharwood merged pull request #1659: LUCENE-9386 add case insensitive RegExp matching option

2020-07-09 Thread GitBox


markharwood merged pull request #1659:
URL: https://github.com/apache/lucene-solr/pull/1659


   






[jira] [Commented] (SOLR-14637) Update CloudSolrClient examples

2020-07-09 Thread David Eric Pugh (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154555#comment-17154555
 ] 

David Eric Pugh commented on SOLR-14637:


Great.  I'll look for that updated patch.  I'm out tomorrow, so it may be next 
week.  Don't hesitate to ping me if I miss it ;-)

> Update CloudSolrClient examples
> ---
>
> Key: SOLR-14637
> URL: https://issues.apache.org/jira/browse/SOLR-14637
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Priority: Minor
> Attachments: SOLR-14637.patch
>
>
> CloudSolrClient.Builder() is deprecated ( SOLR-11629 ) but in the 
> documentation we still use this constructor. I think it would be better to 
> use a non-deprecated constructor in the examples.






[jira] [Commented] (LUCENE-9423) Leaking FileChannel in NIOFSDirectory#openInput

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154548#comment-17154548
 ] 

ASF subversion and git services commented on LUCENE-9423:
-

Commit 20ec57a4fed3a6a691eca6c76a8e0e8977d5eb63 in lucene-solr's branch 
refs/heads/master from Nhat Nguyen
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=20ec57a ]

LUCENE-9423: Handle exc in NIOFSDirectory#openInput (#1658)

If we fail to get the size of a file in the constructor of 
NIOFSIndexInput, then we will leak a FileChannel opened in
NIOFSDirectory#openInput.

> Leaking FileChannel in NIOFSDirectory#openInput
> ---
>
> Key: LUCENE-9423
> URL: https://issues.apache.org/jira/browse/LUCENE-9423
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/store
>Affects Versions: master (9.0), 8.7
>Reporter: Nhat Nguyen
>Assignee: Nhat Nguyen
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If we fail to get the 
> [size|https://github.com/apache/lucene-solr/blob/82692e76e054d3e6938034e96a4e9632bd9f7a70/lucene/core/src/java/org/apache/lucene/store/NIOFSDirectory.java#L107]
>  of a file in the constructor of NIOFSIndexInput, then we will leak a 
> FileChannel opened in NIOFSDirectory#openInput. This bug is discovered by a 
> test failure in 
> [Elasticsearch|https://github.com/elastic/elasticsearch/issues/39585#issuecomment-654995186].






[GitHub] [lucene-solr] dnhatn merged pull request #1658: LUCENE-9423: Handle exception in NIOFSDirectory#openInput

2020-07-09 Thread GitBox


dnhatn merged pull request #1658:
URL: https://github.com/apache/lucene-solr/pull/1658


   






[jira] [Assigned] (SOLR-14244) Remove ReplicaInfo

2020-07-09 Thread Andrzej Bialecki (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki reassigned SOLR-14244:
---

Assignee: Andrzej Bialecki

> Remove ReplicaInfo
> --
>
> Key: SOLR-14244
> URL: https://issues.apache.org/jira/browse/SOLR-14244
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
>
> SolrCloud uses {{Replica}} and {{ReplicaInfo}} beans more or less 
> interchangeably and rather inconsistently across the code base. They seem to 
> mean exactly the same thing.
> We should get rid of one or the other.






[jira] [Updated] (SOLR-14244) Remove ReplicaInfo

2020-07-09 Thread Andrzej Bialecki (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki updated SOLR-14244:

Fix Version/s: master (9.0)

> Remove ReplicaInfo
> --
>
> Key: SOLR-14244
> URL: https://issues.apache.org/jira/browse/SOLR-14244
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
> Fix For: master (9.0)
>
>
> SolrCloud uses {{Replica}} and {{ReplicaInfo}} beans more or less 
> interchangeably and rather inconsistently across the code base. They seem to 
> mean exactly the same thing.
> We should get rid of one or the other.






[jira] [Created] (SOLR-14640) Improve concurrency of SlowCompositeReaderWrapper.getSortedDocValues

2020-07-09 Thread Shalin Shekhar Mangar (Jira)
Shalin Shekhar Mangar created SOLR-14640:


 Summary: Improve concurrency of 
SlowCompositeReaderWrapper.getSortedDocValues
 Key: SOLR-14640
 URL: https://issues.apache.org/jira/browse/SOLR-14640
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: search
Affects Versions: 8.4.1
Reporter: Shalin Shekhar Mangar
 Attachments: Screen Shot 2020-07-09 at 4.46.46 PM.png

Under heavy query load, the synchronized HashMap {{cachedOrdMaps}} inside 
SlowCompositeReaderWrapper.getSortedDocValues blocks search threads.

See attached screenshot of a java flight recording from an affected node. 






[jira] [Commented] (SOLR-14639) Improve concurrency of SlowCompositeReaderWrapper.terms

2020-07-09 Thread Shalin Shekhar Mangar (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154440#comment-17154440
 ] 

Shalin Shekhar Mangar commented on SOLR-14639:
--

The problem is that ConcurrentHashMap.computeIfAbsent can be costly under 
contention. In JDK8, computeIfAbsent locks the node in which the key should be 
present regardless of whether the key exists or not [1]. This means that 
computeIfAbsent is always blocking as compared to get() which is a non-blocking 
operation. In JDK9, this was slightly ameliorated by adding a fast-return in 
case the key was found in the first node without entering a synchronization 
block. But if there is a hash collision and the key is not in the first node, 
then computeIfAbsent enters into a synchronization block on the node to find 
the key. For a cache, we can expect that the key will exist in most of the 
lookups so it makes sense to avoid the cost of entering a synchronized block 
for retrieval.

Doug Lea wrote on the concurrency mailing list [2]:
{code}
With the current implementation,
if you are implementing a cache, it may be better to code cache.get
to itself do a pre-screen, as in:
   V v = map.get(key);
   return (v != null) ? v : map.computeIfAbsent(key, function);

However, the exact benefit depends on access patterns.
For example, I reran your benchmark cases (urls below) on a
32way x86, and got throughputs (ops/sec) that are dramatically
better with pre-screen for the case of a single key,
but worse with your Zipf-distributed keys.
{code}

I would like to implement this method or switch to caffeine which has a 
non-blocking return in case the keys already exist [3].

[1] - 
https://concurrency-interest.altair.cs.oswego.narkive.com/0Jfe1waD/computeifabsent-optimized-for-missing-entries
[2] - 
http://cs.oswego.edu/pipermail/concurrency-interest/2014-December/013360.html
[3] - https://github.com/ben-manes/caffeine/wiki/Benchmarks
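The pre-screen pattern Doug Lea describes can be sketched as a small cache wrapper (an illustration of the proposed change, not the actual SlowCompositeReaderWrapper code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class PreScreenCache<K, V> {
    private final Map<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public PreScreenCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Plain get() first: a non-blocking read that serves the common
    // cache-hit case without synchronization. Only on a miss do we pay
    // the potentially blocking computeIfAbsent.
    public V get(K key) {
        V v = map.get(key);
        return (v != null) ? v : map.computeIfAbsent(key, loader);
    }
}
```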

> Improve concurrency of SlowCompositeReaderWrapper.terms
> ---
>
> Key: SOLR-14639
> URL: https://issues.apache.org/jira/browse/SOLR-14639
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Affects Versions: 8.4.1
>Reporter: Shalin Shekhar Mangar
>Priority: Major
> Attachments: Screen Shot 2020-07-09 at 4.38.03 PM.png
>
>
> Under heavy query load, the ConcurrentHashMap.computeIfAbsent method inside 
> the SlowCompositeReaderWrapper.terms(String) method blocks searcher threads 
> (see attached screenshot of a java flight recording).






[jira] [Created] (SOLR-14639) Improve concurrency of SlowCompositeReaderWrapper.terms

2020-07-09 Thread Shalin Shekhar Mangar (Jira)
Shalin Shekhar Mangar created SOLR-14639:


 Summary: Improve concurrency of SlowCompositeReaderWrapper.terms
 Key: SOLR-14639
 URL: https://issues.apache.org/jira/browse/SOLR-14639
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: search
Affects Versions: 8.4.1
Reporter: Shalin Shekhar Mangar
 Attachments: Screen Shot 2020-07-09 at 4.38.03 PM.png

Under heavy query load, the ConcurrentHashMap.computeIfAbsent method inside the 
SlowCompositeReaderWrapper.terms(String) method blocks searcher threads (see 
attached screenshot of a java flight recording).






[jira] [Commented] (SOLR-14638) Edismax boost function zero score

2020-07-09 Thread Victor Zharikov (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154419#comment-17154419
 ] 

Victor Zharikov commented on SOLR-14638:


[~dsmiley] I'll try this. Thanks. So this is the expected behaviour for 7.7.2+ 
versions?

> Edismax boost function zero score
> -
>
> Key: SOLR-14638
> URL: https://issues.apache.org/jira/browse/SOLR-14638
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.7.1, 7.7.2
> Environment: I am using the [https://hub.docker.com/_/solr] image for 
> 7.7.1 and 7.7.2 versions without any major changes.
>Reporter: Victor Zharikov
>Priority: Critical
>
> I'm using edismax to boost a score. 
> Using the 'field(field_name)' function's boost, I multiply the points by the 
> value of this field.
> Its behavior without any patch notes became noticeably different between 
> 7.7.1 and 7.7.2.
> On version 7.7.1, the record for which the 'field_name' field is not filled 
> has regular score. And the record score with field_name = 2.0, for example, 
> is multiplied by two.
> On version 7.7.2, a record with the field 'field_name' not filled in has ZERO 
> score. And for the record with field_name = 2.0, the points are still 
> multiplied by two from the normal result.
> It completely breaks the score ranking.
> Example .
> Version 7.7.1
> no field_name "score": 32.586094
> field_name = 2.0 "score": 65.17219
> Version 7.7.2
> no field_name "score": 0.0
> field_name = 2.0 "score": 65.17219
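One common workaround for the zero score on documents missing the boost field (a sketch of standard Solr function-query usage, not necessarily the fix suggested in this thread) is to wrap the field access in Solr's def() function so missing values fall back to a neutral multiplier of 1:

```
boost=def(field(field_name),1)
```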






[jira] [Commented] (SOLR-14629) Invalid method name in Kerberos Authentication Plugin Documentation

2020-07-09 Thread Andras Salamon (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154364#comment-17154364
 ] 

Andras Salamon commented on SOLR-14629:
---

Jira number was not added to the commit message, so the bot cannot link the 
commit here.

Here is the commit: 
[https://github.com/apache/lucene-solr/commit/4e20986f89bfcd8961f85f2e41af46ea1b82cace]

> Invalid method name in Kerberos Authentication Plugin Documentation
> ---
>
> Key: SOLR-14629
> URL: https://issues.apache.org/jira/browse/SOLR-14629
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Andras Salamon
>Assignee: David Eric Pugh
>Priority: Minor
> Fix For: 8.6
>
> Attachments: SOLR-14629.patch
>
>
> In the [Kerberos Authentication Plugin 
> documentation|https://lucene.apache.org/solr/guide/8_5/kerberos-authentication-plugin.html]
>  there are two references to the {{withDelegationToken}} method.
> There is no such method, the correct name (also referenced in this docs) is 
> {{withKerberosDelegationToken}}.






[jira] [Commented] (SOLR-14637) Update CloudSolrClient examples

2020-07-09 Thread Andras Salamon (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154363#comment-17154363
 ] 

Andras Salamon commented on SOLR-14637:
---

Yes, it's a bit strange that the new non-deprecated constructor is much more 
complex than the original one. All the more reason to put a working example into 
the documentation.

Enhancing the using-solrj doc with a CloudSolrClient is a good idea, I'll 
upload a new patch soon. SOLR-12309 added some javadoc, I'll start from that 
info.

> Update CloudSolrClient examples
> ---
>
> Key: SOLR-14637
> URL: https://issues.apache.org/jira/browse/SOLR-14637
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andras Salamon
>Priority: Minor
> Attachments: SOLR-14637.patch
>
>
> CloudSolrClient.Builder() is deprecated ( SOLR-11629 ) but in the 
> documentation we still use this constructor. I think it would be better to 
> use a non-deprecated constructor in the examples.






[jira] [Commented] (SOLR-14610) ReflectMapWriter to use VarHandle instead of old legacy reflection

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154360#comment-17154360
 ] 

ASF subversion and git services commented on SOLR-14610:


Commit 4ae976bdf0b295edf366c3279ec3b0863c363cd2 in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4ae976b ]

SOLR-14610: CHANGES.txt


> ReflectMapWriter to use VarHandle instead of old legacy reflection
> --
>
> Key: SOLR-14610
> URL: https://issues.apache.org/jira/browse/SOLR-14610
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The same reason why we changed to MethodHandles in SOLR-14404






[jira] [Commented] (SOLR-14610) ReflectMapWriter to use VarHandle instead of old legacy reflection

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154361#comment-17154361
 ] 

ASF subversion and git services commented on SOLR-14610:


Commit 4ae976bdf0b295edf366c3279ec3b0863c363cd2 in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4ae976b ]

SOLR-14610: CHANGES.txt


> ReflectMapWriter to use VarHandle instead of old legacy reflection
> --
>
> Key: SOLR-14610
> URL: https://issues.apache.org/jira/browse/SOLR-14610
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The same reason why we changed to MethodHandles in SOLR-14404






[jira] [Commented] (SOLR-14610) ReflectMapWriter to use VarHandle instead of old legacy reflection

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154337#comment-17154337
 ] 

ASF subversion and git services commented on SOLR-14610:


Commit 35d39ee097c7160b7f8d86f321d245fc77af9be6 in lucene-solr's branch 
refs/heads/branch_8x from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=35d39ee ]

SOLR-14610: ReflectMapWriter to use MethodHandle instead of old reflection


> ReflectMapWriter to use VarHandle instead of old legacy reflection
> --
>
> Key: SOLR-14610
> URL: https://issues.apache.org/jira/browse/SOLR-14610
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The same reason why we changed to MethodHandles in SOLR-14404






[jira] [Commented] (SOLR-14610) ReflectMapWriter to use VarHandle instead of old legacy reflection

2020-07-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154313#comment-17154313
 ] 

ASF subversion and git services commented on SOLR-14610:


Commit 21552589749a7faaf6ab457920daed0eb4b60c0a in lucene-solr's branch 
refs/heads/master from noblepaul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2155258 ]

SOLR-14610 : Use Methodhandles instead of VarHandle. Works with java8 as well


> ReflectMapWriter to use VarHandle instead of old legacy reflection
> --
>
> Key: SOLR-14610
> URL: https://issues.apache.org/jira/browse/SOLR-14610
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The same reason why we changed to MethodHandles in SOLR-14404






[jira] [Commented] (SOLR-14610) ReflectMapWriter to use VarHandle instead of old legacy reflection

2020-07-09 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154295#comment-17154295
 ] 

Noble Paul commented on SOLR-14610:
---

Thanks [~uschindler]


> ReflectMapWriter to use VarHandle instead of old legacy reflection
> --
>
> Key: SOLR-14610
> URL: https://issues.apache.org/jira/browse/SOLR-14610
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The same reason why we changed to MethodHandles in SOLR-14404






[jira] [Commented] (SOLR-14610) ReflectMapWriter to use VarHandle instead of old legacy reflection

2020-07-09 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154291#comment-17154291
 ] 

Uwe Schindler commented on SOLR-14610:
--

In addition: You can completely remove all this IllegalAccess throws and catch 
from the FieldWriter interface. The varhandle does not throw any exception!

> ReflectMapWriter to use VarHandle instead of old legacy reflection
> --
>
> Key: SOLR-14610
> URL: https://issues.apache.org/jira/browse/SOLR-14610
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The same reason why we changed to MethodHandles in SOLR-14404






[jira] [Commented] (SOLR-14610) ReflectMapWriter to use VarHandle instead of old legacy reflection

2020-07-09 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154259#comment-17154259
 ] 

Uwe Schindler commented on SOLR-14610:
--

VarHandles do not exist in Java 8!

> ReflectMapWriter to use VarHandle instead of old legacy reflection
> --
>
> Key: SOLR-14610
> URL: https://issues.apache.org/jira/browse/SOLR-14610
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The same reason why we changed to MethodHandles in SOLR-14404






[jira] [Commented] (SOLR-14610) ReflectMapWriter to use VarHandle instead of old legacy reflection

2020-07-09 Thread Uwe Schindler (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154257#comment-17154257
 ] 

Uwe Schindler commented on SOLR-14610:
--

I am not sure if you need a VarHandle at all, because VarHandles are mostly 
created for the special access modes like volatile accesses! Maybe a simple 
MethodHandle would have been enough (and could be backported to 8.x, you can 
use Lookup.unreflectGetter and Lookup.unreflectSetter).
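A minimal sketch of what this describes (illustrative only, not the committed Solr change): unreflect a Field into a MethodHandle once, then invoke the cached handle instead of going through Field.get() on every access.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Field;

public class HandleExample {
    public String name = "solr";

    // Resolve a public field into a MethodHandle getter and invoke it.
    // In real code the MethodHandle would be resolved once and cached,
    // which is what makes it faster than repeated legacy reflection.
    public static Object readField(Object target, String fieldName) throws Throwable {
        Field f = target.getClass().getField(fieldName);
        MethodHandle getter = MethodHandles.lookup().unreflectGetter(f);
        return getter.invoke(target);
    }
}
```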

> ReflectMapWriter to use VarHandle instead of old legacy reflection
> --
>
> Key: SOLR-14610
> URL: https://issues.apache.org/jira/browse/SOLR-14610
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The same reason why we changed to MethodHandles in SOLR-14404






[jira] [Commented] (SOLR-14610) ReflectMapWriter to use VarHandle instead of old legacy reflection

2020-07-09 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154210#comment-17154210
 ] 

Noble Paul commented on SOLR-14610:
---

Now it is all cached in a static Map, which means it is a lot faster than 
doing reflection all over again. So, in the cached object, the VarHandle is 
resolved and stored.

> ReflectMapWriter to use VarHandle instead of old legacy reflection
> --
>
> Key: SOLR-14610
> URL: https://issues.apache.org/jira/browse/SOLR-14610
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The same reason why we changed to MethodHandles in SOLR-14404






[jira] [Commented] (SOLR-13989) Move all hadoop related code to a contrib module

2020-07-09 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17154207#comment-17154207
 ] 

David Smiley commented on SOLR-13989:
-

The package manager might not currently be able to load certain things 
mentioned above but my approach in 
[https://github.com/apache/lucene-solr/pull/1109] made it near universal what 
it could load. While that particular PR got into a tug of war over how to do 
things, I hope to resurrect another incarnation of it after some further 
SolrResourceLoader refactorings I'll undergo.  That may then unblock the issue 
here (HDFS).

> Move all hadoop related code to a contrib module
> 
>
> Key: SOLR-13989
> URL: https://issues.apache.org/jira/browse/SOLR-13989
> Project: Solr
>  Issue Type: Task
>  Components: Hadoop Integration
>Reporter: Shalin Shekhar Mangar
>Priority: Major
> Fix For: master (9.0)
>
>
> Spin off from SOLR-13986:
> {quote}
> It seems really important to move or remove this hadoop shit out of the solr 
> core: It is really unreasonable that solr core depends on hadoop. that's 
> gonna simply block any progress improving its security, because solr code 
> will get dragged down by hadoop's code.
> {quote}
> We should move all hadoop related dependencies to a separate contrib module


